In a decision that may be as far-reaching as the 2006 changes to the rules for civil proceedings that essentially created the e-discovery market, Southern District of New York Magistrate Judge Andrew Peck has issued a ruling that litigants may (that word is important) use computer-assisted review software that employs predictive technology to help determine the relevance of documents.
Ironically, this all happens almost exactly a year after the New York Times published an article on the subject which, though it didn't use the term "predictive coding," described the practice and its effect on the legal community. Studies have also found that computer programs are better at document review than legal staff.
The "may" is important for two reasons. The first is that, due to some confusion, some people believed that Peck's ruling, in the case of Monique Da Silva Moore, et al., Plaintiffs, v. Publicis Groupe & MSL Group, Defendants, 11 Civ. 1279 (ALC)(AJP), required the use of predictive coding, which it does not do. The second is that a different case, Kleen Products LLC v. Packaging Corporation of America, et al., still in court, does hinge on the question of requiring predictive coding.
Indeed, in the particular case to which Peck refers, the litigants agreed between themselves to use predictive coding in principle — but have been unable to agree on the details, and in fact the plaintiffs have filed an objection to Peck’s ruling, saying they are concerned that the software process is not transparent enough.
Peck's opinion is not a surprise; last October, he wrote an article describing predictive coding and its role in e-discovery. While he uses charming phrases such as "A basic problem is that absent cooperation, the way most lawyers engage in keyword searches is, as Ralph Losey suggests, the equivalent of 'Go Fish,'" one hopes he is a better judge than a prophet:
Perhaps they are looking for an opinion concluding that: “It is the opinion of this court that the use of predictive coding is a proper and acceptable means of conducting searches under the Federal Rules of Civil Procedure, and furthermore that the software provided for this purpose by [insert name of your favorite vendor] is the software of choice in this court.” If so, it will be a long wait.
Four months isn’t all that long.
Needless to say, e-discovery vendors are kvelling about the ruling, and not just because Peck uses charming phrases such as, “The Court recognizes that computer-assisted review is not a magic, Staples-Easy-Button, solution appropriate for all cases.” (Peck emphasizes that he isn’t endorsing any particular vendor.)
Clearwell, for example — recently purchased by Symantec (which had named growth in technology-assisted review as one of its 2012 predictions) in one of the first e-discovery acquisition dominoes to fall — noted five major points about the decision:
- The Court did not order the use of predictive coding
- Computer-assisted review is not required in all cases
- The opinion should not be considered an endorsement of any particular vendors or tools
- Predictive coding technology can still be expensive
- Process and methodology are as important as the technology utilized
Organizations that have held off on implementing predictive coding now have a green light to proceed.
Remember when Facebook started designing its own servers and data center?
Now it’s designing its own disk drives.
This is all supposed to be part of the company’s Open Compute initiative, according to Wired, though it’s not yet included on the website, and details were thin. (For example, it isn’t clear whether they include the hard drive thermostat the project described last summer.) However, the company said it will release its new storage designs in early May at the next Open Compute Summit.
Facebook is doing all this because it has such a heavy load — 845 million users and 140 billion digital photographs, Wired said — so savings that it can achieve in hardware, whether in the hardware itself, the power it uses, or the cooling it requires, can aggregate to quite a lot. The company has already made a number of changes to its servers to save cost, space, and heat.
For example, in its Prineville, Ore., data center, the company has eliminated chillers and uninterruptible power supplies, Wired said. The article quoted a Facebook engineer, originally from Dell, as saying that the really valuable parts of storage are the disk drives themselves and the software that controls how data gets distributed to and recovered from those drives; the company would do what it could to eliminate the ancillary parts, as well as make the valuable parts easier to get at and fix. One target: the handles and screws that are currently part of some disk drives.
So why does this matter to you? Because Facebook intends to open source the storage design when it’s finished, meaning it could end up in the marketplace, as it has with its servers. So chances are, what Facebook decides will affect your data center, too.
Contradicting earlier court actions in other states, the Atlanta-based U.S. Court of Appeals for the 11th Circuit has ruled that a man suspected of holding child pornography on his hard disk drive doesn't have to reveal the code needed to decrypt it for law enforcement, saying that forcing him to do so would violate his Fifth Amendment protection against self-incrimination.
By comparison, in January a U.S. District judge ordered a woman suspected of bank fraud to give up her password.
A few weeks ago, we were hearing all about how IBM researchers were developing teeny-weeny disk storage. Now we're hearing about how other researchers are developing really fast disk storage. Unfortunately, the two technologies aren't compatible, so you'll have to settle for small or fast, not both. Noted one University of York researcher in the multinational team:
Instead of using a magnetic field to record information on a magnetic medium, we harnessed much stronger internal forces and recorded information using only heat. This revolutionary method allows the recording of Terabytes (thousands of Gigabytes) of information per second, hundreds of times faster than present hard drive technology. As there is no need for a magnetic field, there is also less energy consumption.
According to ScienceNOW, this is how it works:
[L]aser light heats up the gadolinium-iron alloy so incredibly fast—in 1/10,000 of a nanosecond—that at first only the iron atoms lose their magnetic orientation. The gadolinium atoms react more slowly in losing their magnetization. And once the iron atoms get hot enough and are free to pivot around, they prefer to align in the same direction as the gadolinium atoms. Then, as the material quickly cools and the orientations of the atoms freeze up, the iron and gadolinium atoms again prefer to point in opposite directions. But this time, it's the slow-cooling gadoliniums that flip, leading to a predictable overall reversal in the material's magnetization.
There’s only one problem. Remember the jokes about “write-only memory“? Turns out that, at least for the moment, that’s what the laser storage produces, because it isn’t clear how to read it again. “The only problem, at this point, is that while lasers are great at writing magnetic data, reading it is another challenge entirely,” notes DVICE.com. “The researchers seem to have used a fancy type of X-ray spectrometer that can read magnetic fields to check and see if they were writing the data that they thought they were, but until those get shrunk down to HDD component size (or someone comes up with something clever), we may be stuck just writing our data really really fast and not reading it ever again.”
Neither the storage industry nor the state of Idaho is known for having flashy technical CEOs like Larry Ellison and Steve Jobs, but both lost one last Friday when Micron CEO Steve Appleton died unexpectedly in the crash of his plane.
A daredevil and adrenaline junkie who excelled in tennis, scuba diving, surfing, wakeboarding, motorcycling, off-road car racing, taekwondo, and aviation — and who had already survived a crash in 2004 — the 51-year-old Appleton was named one of the worst CEOs in the country by Forbes at the same time that Fortune was naming Micron one of the most admired companies in the nation. Some criticized him for his salary, while others said it was not out of line in the heavily cyclical DRAM industry.
Raised in California, Appleton attended Boise State University and began working for Micron soon after graduation, eventually working his way up to president, chairman, and CEO in 1994, which made him one of the nation's youngest CEOs. According to Jim Handy of Objective Analysis:
Under his guidance the company became the last surviving US DRAM manufacturer and turned around a number of failing DRAM businesses it acquired from Texas Instruments, Toshiba, Qimonda, and others, while investing in businesses outside of its core DRAM strength including a recent acquisition of NOR maker Numonyx. One particularly successful investment has been Micron’s IMFT joint venture with Intel for the manufacture of NAND flash.
While the company's chips are used in a variety of products, its own consumer brand is Lexar.
In Idaho, Micron was a major employer and, along with HP, helped form Boise’s nascent technology community. Due to the company’s innovations and the state’s small population, Idaho often ranked at or near the top in lists of numbers of patents per capita.
Unlike some other superstar tech CEOs, however, Appleton was known for his philanthropic efforts — for example, donating money to Boise State for its tennis courts and for a business and economics building to be named after Micron, still under construction. The company’s Micron Foundation also donated to the College of Western Idaho community college, founded just a few years ago.
Appleton is survived by his wife and four children. The board has named former president and COO Mark Durcan — who had announced his retirement just a week earlier — as CEO.
In case you needed proof of what the Stop Online Piracy Act (SOPA) bill could have done, the U.S. government went ahead, a few days after SOPA was withdrawn, and shut down a website, claiming it was used to disseminate copyrighted content such as movies and television programs.
“The domain name associated with the website Megaupload.com has been seized pursuant to an order issued by a U.S. District Court. A federal grand jury has indicted several individuals and entities allegedly involved in the operation of Megaupload.com and related websites charging them with the following federal crimes,” including copyright infringement, racketeering, and conspiracy, reads a notice on the website.
Regardless of the merits of that specific case, the bigger issue is: what about the users of the site — reportedly up to 50 million of them — who were using it for completely legitimate purposes?
Or, as we wrote last year:
And think of how this would play with the new PROTECT-IP bill that's being proposed, which would let a third party shut down a site for having a copy of its intellectual property: Viacom, say, uploads a copy of a movie it suspects is available on Dropbox, finds it's already there, demands to know who owns it, and then shuts down that company's site — potentially all without ever getting a warrant, because if Dropbox won't tell, Viacom can shut *it* down for having a copy of the file. And if Dropbox gets shut down, what happens to all its other, innocent users' files?
Data on the MegaUpload servers was scheduled to be deleted as soon as February 2, but the companies that own the servers have agreed to wait at least two weeks in hopes of developing a way for legitimate users to get access to their files. The companies are working with the Electronic Frontier Foundation and have set up a website to gather reports from users who may have lost access to their data.
But the logistics of this might be complex, noted Time.
It’s also unclear how users would get their data back even if Megaupload and the government came to an agreement. Would they simply open the site again with uploads and sign ups disabled, or come up with some other way to access the data? And how would they ensure that users weren’t helping themselves to content that infringes copyrights? Any method would require time and development efforts — the process could easily get messy.
It also means that every e-discovery company will be crawling out of the woodwork to tout (as opposed to tort — did you see what I did there?) its wares. Announcements include the following:
- AccessData will launch its redesigned Summation product line.
- Avansic announces Avansic Tracker.
- BlumbergExcelsior, Inc. has introduced Blumberg Entity Tracker.
- Business Intelligence Associates, Inc. (BIA) will be previewing the TotalDiscovery.com Apple Data Collection feature.
- Cabinet NG (CNG) will debut a legal targeted website, PaperlessAttorney.com.
- C2C has upgraded its suite of products for simplifying email and PST file management.
- Equivio will be showing Equivio Zoom, an integrated platform for predictive coding and analytics.
- IPRO Tech will be demonstrating its line of products, as well as new versions of the review, early case assessment, and processing components of IPRO Enterprise.
- Levit & James, Inc. is preannouncing Best Authority version 3.0.
- MerlinOne will demonstrate Legal Review 3.0, a hosted e-discovery review tool.
- Nuix will announce Visual Analytics, Contract Discovery, and Defensible Deletion.
- Orion Law Management Systems, Inc. will announce its AR Collection Manager Module.
- Venio Systems announces Venio FRP version 3.5.
- World Software Corp. launches Worldox GX3 Professional, a new release of its document management system.
Remember, what happens in New York… could be held against you in a court of law.
Quis custodiet ipsos custodes?
Or, in this case, who protects you from the person who protects your data? According to a recent study by the Ponemon Institute, Trends in Security of Data Recovery Operations, the very third-party data recovery services that can help you get your data back might be helping themselves to your data, too.
We surveyed 769 IT security and IT support practitioners who are involved in their organization's data security or data recovery operations. According to the findings, 85 percent of these respondents report their organizations have used or will continue to use a third-party data recovery service provider to recover lost data. This is an increase from 79 percent in the previous study. We also learned that organizations are frequently using a third party when a device crashes. In fact, 37 percent use multiple third parties and 39 percent say they use third parties at least once each week or more. However, the vetting of these data recovery service providers is considered fair by 30 percent of respondents and 9 percent say it is poor.
This sort of problem isn’t new, and isn’t limited to corporations, but the problem is getting worse, Ponemon says:
A large percentage of respondents in this study report their organization has had at least one data breach (87 percent) in the past two years. (This is consistent with other Ponemon Institute studies about the prevalence of data breaches). Of the 87 percent who say their organization had a data breach, 21 percent say the breach occurred when a drive was in the possession of a third-party data recovery service provider. This is an increase from 19 percent in the previous study. In many cases, respondents point to the data recovery service provider's lack of security that led to the data breach.
Note, too, that this doesn't mean the third-party data recovery service itself hires crooks, but that security at the service might be lacking, making it an enticing honeypot for criminal hackers. For example, in May 2011, Co-operative Life Planning's funeral planning division discovered that the personal data of 83,000 customers had been leaked after a data recovery firm was called in following a hard disk failure. Although the recovery work was successful, the data was retained on the recovery company's servers, which were then hacked. (But no doubt it's the owner of the data, not the recovery company, that has to deal with notifying the users involved.)
So, what to do? The Ponemon report offers some suggestions on how to pick a reputable firm, and DriveSavers offers a (somewhat dated, 2009) white paper with similar suggestions.
The important thing, Ponemon says, is that organizations need to consider security as a primary factor in selecting such companies. Notes the study:
The majority of respondents in our study either report to the Chief Information Officer or Chief Information Security Officer. Fifty-nine percent are at or above the supervisory level. These individuals believe that their organizations are making decisions about who will handle the data recovery process based on the speed of service, successful rate of recovery and overall quality of service rather than data security. As a result, only 28 percent see data security as a main criterion for determining the adequacy of third-party data recovery service providers.
To give you an idea of IBM's accomplishment of storing one bit of data in 12 atoms, or one byte of data in 96 atoms, of iron on a surface of copper nitride, the equivalent would be a 10,416-terabyte drive in the size of a 1 TB drive today.* That's because, according to the New York Times, "Until now, the most advanced magnetic storage systems have needed about one million atoms** to store a digital 1 or 0."
This was all done at IBM's Almaden Research Center in San Jose, Calif., and took five years. Which means, because it was IBM, the scientists then used the teeny-weeny storage device — where each atom has to be manipulated by hand using a device the size of a room — to spell out IBM's motto, "Think." Good thing they didn't work at Microsoft, where they would have had to painstakingly spell out, "Where do you want to go today?"
Of course, this is one of those “ignore friction” dealies that couldn’t happen in the real world; for one thing, it was performed at close to absolute zero, which is going to be difficult to achieve in an overheated press room at CES, for example. Still, according to the scientists, it could be done at room temperature with just 150 atoms, the New York Times said. Realistically, though, it is likely that the technology could at most produce a drive of 100 TB in the space of 1 TB today.
Something I haven’t seen in the reportage is just how sensitive such a system would be to cosmic rays, sunspots, droppage, and shocks from shuffling on carpeting. “Dude! You just wiped out the Library of Congress!” “Sorry, dude!”
The operative part of the technology was best explained by the Financial Times: “They did this by using an antiferromagnetic, instead of a ferromagnetic, structure – in other words switching the atoms in the structure from pointing towards each other (like in a fridge magnet) to pointing away from each other. This allows for less interference, which is important when storing data in 12-atom blocks.”
*Yes, yes, I know, technically a terabyte has 1099511627776 bytes. Hush. You get the point. Incidentally, CNET said it would be 83,000 disk drives, because CNET forgot to divide by 8.
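For the curious, the dueling figures fall out of a few lines of arithmetic (a sketch; the disagreement hinges on whether the million-atom baseline is compared per bit or divided by a further 8 for bytes):

```python
# Old magnetic storage: roughly 1,000,000 atoms per bit (per the NYT).
# IBM's demonstration: 12 atoms per bit, i.e., 96 atoms per byte.
OLD_ATOMS_PER_BIT = 1_000_000
NEW_ATOMS_PER_BIT = 12
NEW_ATOMS_PER_BYTE = NEW_ATOMS_PER_BIT * 8  # 96

# CNET's ~83,000x figure: the straight per-bit ratio.
cnet_factor = OLD_ATOMS_PER_BIT // NEW_ATOMS_PER_BIT      # 83,333
# This article's 10,416x figure: the same ratio divided by 8.
article_factor = OLD_ATOMS_PER_BIT // NEW_ATOMS_PER_BYTE  # 10,416

print(f"CNET: {cnet_factor}x; this article: {article_factor}x")
```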
**The New York Times' story originally referred to "copper nitride atoms" until people made fun of them in the comments and "atoms" was deleted. Science is hard. Noted one commenter, "I have been both enlightened, and entertained. I also now know there is no such thing as a copper nitride atom, whereas previously, I had never wondered whether there was a copper nitride atom. Now I do, I'm not sure what to do with that."
Think getting your backup right is a case of life and death? Here’s an incident where it really is.
In a criminal case in Miami in 2009, a man named Randy Chaviano was convicted of second-degree murder committed in 2005 and sentenced to life in prison. As usual, a court stenographer was taking notes at the trial. But then there was a string of coincidences worthy of a Law & Order script.
- The stenographer didn’t have enough paper for her machine — a mistake she’d apparently made before
- Consequently, the notes she took were recorded only in the machine’s internal memory
- She transferred the stenography machine’s records to her own PC
- She deleted the records from the stenography machine
- She didn’t do a backup of the PC
- A virus hit the PC and deleted what was by then the only record of the trial, leaving only a pretrial hearing and closing arguments; it wasn’t clear when this happened
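The obvious moral is that a sole copy on one PC is not a backup. As a minimal sketch (the paths are hypothetical, and a real court-reporting workflow would also want offsite or cloud copies), an automated timestamped archive takes only a few lines:

```python
import tarfile
import time
from pathlib import Path

def backup(source_dir: str, backup_dir: str) -> Path:
    """Archive source_dir into a timestamped .tar.gz inside backup_dir."""
    src = Path(source_dir)
    dest = Path(backup_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the backup folder if needed
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)  # keep paths relative to the source folder
    return archive

# Hypothetical usage, scheduled nightly via cron or Task Scheduler:
# backup("C:/StenoNotes", "E:/Backups")
```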
This all came to light recently, when the case was appealed and the notes were found to no longer exist — meaning the case will have to be retried from scratch, according to the Miami Herald. The paper didn't say how much retrying the case would cost.
The court stenographer has since been fired — in fact, courts in Miami are now moving toward using digital recorders and no stenographers at all. Moreover, cost-cutting may have caused the problem in the first place, noted the Herald:
Court reporters in criminal court have also complained that plunging rates paid by the state have driven away experienced stenographers and forced firms to hang on to aging equipment.
"It seems very sloppy to allow the only record of a trial's proceedings to be held on an individual's PC — it's like asking for trouble if it isn't at the very least held securely as a backup elsewhere," noted Graham Cluley on Sophos's security blog. You think?
No word on the fate of the IT person who should have been responsible for doing backups on the PCs.