Note to self: When you’re doing your backups, make sure you keep them somewhere other than your production network.
That’s a lesson learned the hard way by VFEmail.net, a worldwide email service provider, which recently lost not only its subscribers’ email messages, but also all its backups, because they were all on the same network.
“We have suffered catastrophic destruction at the hands of a hacker, last seen as firstname.lastname@example.org,” noted the VFEmail.net website. “This person has destroyed all data in the US, both primary and backup systems. We are working to recover what data we can.”
The exact details are sketchy, because the people running VFEmail.net are, naturally, kind of busy trying to put it back together. Thus far they’ve found a single offline backup dating from August 2016, so, hurray, VFEmail.net users are now only missing their last two-and-a-half years of email messages.
But apparently someone hacked into the system and zapped not only the primary mail servers, but the backups as well. Speculation, and there’s plenty, is that it was either an inside job or someone – perhaps even a foreign government – thinking there was something incriminating on the server and deleting everything on it, just in case.
And the deletions were reportedly very thorough, in a way that couldn’t be recovered. “At this time, the attacker has formatted all the disks on every server. Every VM is lost. Every file server is lost, every backup server is lost.. Strangely, not all VMs shared the same authentication, but all were destroyed,” noted the VFEmail.net Twitter handle. “This was more than a multi-password via ssh exploit, and there was no ransom. Just attack and destroy.”
Ironically, VFEmail.net was originally set up in response to an email virus. “VFEmail started in 2001 by Rick Romero in direct response to the ‘ILOVEYOU’ virus,” notes the company’s website. “At the time, anti-virus was not integrated into email systems. After writing a set of batch files to integrate Norton AntiVirus Corporate Edition A/V scanning into Mercury/32 on Windows, Rick turned his attention to helping regular users and local small businesses avoid email-based viruses. VFEmail started with a single FreeBSD server, and thanks to Rick’s broad and extensive IT experience, frugal purchasing, and long-term planning, VFEmail has grown into the site you see today. While other services have shut down, or been exposed as not delivering on their promises, VFEmail keeps chugging along.”
The small number of servers may have been part of the problem. The company offered free as well as paid email accounts, and consequently wanted to save money. “We strive to build an economical and redundant system, to provide our users with as much uptime as possible,” the website continues. “As mentioned, VFEmail started with a single machine, but over time we’ve built out, adding systems for load balancing/failover and separating services. Most recently we’ve made use of Virtual Machines in order to keep hardware acquisitions at a minimum, in those cases where it would not impact performance. By separating vital functions, upgrades, updates, and system problems can quickly and easily be isolated from the rest of the system and provide you with uninterrupted accessibility.”
Yeah, well, not so much.
It all just goes to show that simply making a single backup is not enough. The rule of thumb some people use is 3-2-1: three copies of the data, on two different types of storage media, with one copy offsite. Keeping at least one copy offline, where an intruder on the network can’t reach it, is a sensible addition. (Not to mention, checking the backups periodically to make sure you can actually recover from them.) While that requires a lot of hard disk drives and coordination, it at least protects against the majority of problems.
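For the technically inclined, here is a minimal sketch of what checking a backup plan against that rule of thumb might look like. It assumes one common reading of 3-2-1 (at least three copies, at least two media types, at least one copy offsite) plus the offline check mentioned above. The `BackupCopy` type, the `check_3_2_1` function, and the sample inventory are all made up for illustration; nothing here describes how VFEmail.net actually stored its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str      # e.g. "disk", "tape", "cloud" (labels are arbitrary)
    offsite: bool   # stored away from the production site
    offline: bool   # not reachable from the production network


def check_3_2_1(copies):
    """Return a list of problems; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies are on the same media type")
    if not any(c.offsite for c in copies):
        problems.append("no copy is offsite")
    # Extra check beyond the classic rule: something must be unreachable
    # to an attacker who already controls the network.
    if not any(c.offline for c in copies):
        problems.append("no copy is offline")
    return problems


# Hypothetical inventory: everything lives on the same network.
same_network = [
    BackupCopy("primary mail store", "disk", offsite=False, offline=False),
    BackupCopy("backup server", "disk", offsite=False, offline=False),
]

# Hypothetical inventory that satisfies every check.
better = same_network + [
    BackupCopy("monthly tape in a vault", "tape", offsite=True, offline=True),
]

for label, plan in (("same network", same_network), ("better", better)):
    issues = check_3_2_1(plan)
    print(label, "->", issues if issues else "OK")
```

A check like this only says whether the copies exist in the right places; it says nothing about whether any of them can actually be restored, which still has to be tested by hand.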
We’ve written before about the challenges in storing bitcoin, and how if you’re not careful, you can lose access to $7.5 million by accidentally throwing the hard drive containing the cryptographic key away. We’ve also written about how people can lose access to data when someone dies without revealing passwords.
Now we’ve got a story of both. Or do we?
It all started on December 9, when Gerald Cotten, CEO of crypto exchange QuadrigaCX, died; his death was announced January 14. According to his widow, Jennifer Robertson, the company owes its customers some $190 million, and it has filed for creditor protection because it says it doesn’t have access to the majority of its bitcoin.
Cotten was admirably conscious about security, writes Doug Alexander in Bloomberg. “The laptop, email addresses and messaging system he used to run the 5-year-old business were encrypted,” he writes. “He took sole responsibility for the handling of funds and coins and the banking and accounting side of the business and, to avoid being hacked, moved the ‘majority’ of digital coins into cold storage,” which was not connected to the Internet. He also reportedly had a USB key that was also encrypted.
Apparently, this actually happens more often than people like to admit, writes Michael Kaplan in the New York Post. In addition to James Howells, who accidentally threw away the wrong hard drive, there’s Matthew Mellon, whose family was reportedly unable to locate the cryptographic key required to retrieve as much as $1 billion in bitcoin, he writes, going on to describe several other cases – including, potentially, the guy who invented bitcoin itself. “Losing passwords is the kind of nightmare that haunts bitcoin investors,” he writes. “In fact, there are an estimated 3 million bitcoins – totaling nearly $25 billion – lost because the retrieval codes have gone missing or the currency owners died without passing the codes onto their next of kin.”
According to the Wall Street Journal, as much as 20 percent of all bitcoin has been lost.
Now, however, there are all sorts of new wrinkles, like a new will that the CEO wrote a few days before he died, whether bitcoin had been moved out of the accounts, and suspicion about whether the company actually had that amount of bitcoin at all. It didn’t help that the company had had issues several times in 2018 with people not being able to gain access to the bitcoin they had on deposit with the company.
“To a lot of people it’s strange, because two weeks before his death he had left a will leaving what is said to be a plane, two houses, and $100,000 for the care of his two Chihuahuas,” Elvis Cavalic, an investor with the company, said in an interview with CBC Radio. “Why wasn’t there a conversation had over that if there was a conversation over the dogs?”
“On the Quadriga sub-Reddit, rumour mixes with fact,” writes Don Pittis with CBC News. “One post claims that accounts of Quadriga’s litecoin, for which passwords were supposed to be lost, are showing activity. Others insist the millions never really were there and the trading platform was being used as a Ponzi scheme, where people were being paid out from new investors’ deposits.”
Meanwhile, the legal case is still going on.
Incidentally, the guy who threw away the hard disk with access to $7.5 million in bitcoin on it – which has been worth up to $75 million – is still trying to get access to the dump where he believes his hard drive ended up, Kaplan writes. He’s offered the dump 10 percent of the bitcoin’s worth if they let him go look for it, but so far, no dice. In the meantime, he considers the dump the “ultimate safe,” he writes.
Shocked, shocked as they were to learn that user cellphone location data was being sold, major cellphone service providers have pledged to stop the practice, for reals this time. At least, by March. For sure.
The major carriers had already pledged last year to stop selling location data, except for useful services such as helping customers with roadside assistance or fraud protection, writes Tali Arbel for the AP. However, when it was demonstrated that the data was still readily available, the companies pledged to stop selling it to those providers, too.
“Last year we decided to end our arrangements with data aggregators, but assessed that the negative impacts to customers for services like roadside assistance and bank fraud alerts/protection that would result required a different approach,” Sprint said in a statement quoted by The Hill, in a nice show of passive aggressiveness. “We implemented new, more stringent safeguards to help protect customer location data, but as a result of recent events, we have decided to end our arrangements with data aggregators.”
In other words, when AAA can’t find you next time you’re on the highway with a flat, don’t blame us.
Shocked legislators, most of them Democrats, also wanted to know from the Federal Communications Commission about the meaning of all this, and demanded that FCC chair Ajit Pai show up and tell them. Oh, sorry, Pai said, in a fine show of passive aggressiveness himself. Can’t come by because of the government shutdown. I can only handle issues of immediate threat of life and limb. Let me know when the government’s open again.
This really all started in May of last year when the New York Times pointed out that cellphone location data was readily available through vendors. That’s what led to the vendors’ initial pledge to stop sharing such data.
“These aggregators, barnacles of the telecom industry, depend on cellular giants, like AT&T, Verizon, Sprint, and T-Mobile, for their livelihood,” intoned Robert Hackett in Fortune. “They sell data access to other companies, which sell them to others still. Phone holders have no choice but to opt-in. People’s devices beacon out to cell towers at all times, triangulating their positions, simply by virtue of being on the grid. There is no hiding; everyone’s back bears a target.”
If this seems like much ado about nothing, do you really want the data about how often you visit the liquor store, the legal marijuana dispensary, or McDonald’s to be available to your insurance company? Also, keep in mind that in some cases, location data is a matter of security. You may recall that a year ago, people were able to discover the locations of all sorts of secret military bases due to location tracking on Fitbits.
Cellphone location data is so important that, as you may recall, they made a federal case out of it. The Supreme Court’s Carpenter ruling – also, coincidentally, last June – was all about how law enforcement needed to get a warrant before going to a cellphone provider to get location data about a suspect. The issue of whether law enforcement can ask Google for anonymized cellphone location data near crimes and then use that as a basis for a warrant is also working its way through the courts.
Wouldn’t it be a lot easier for law enforcement just to go to a data aggregator that buys such location data wholesale from the cellphone providers, and get the data that way? (To be fair, those aggregators also asked for warrants, but according to the New York Times, they didn’t check them very carefully.)
And yes, it’s true that the typical consumer doesn’t realize that this is going on – though chances are they clicked on some multipage contract at some point that allowed companies to collect this data and sell it. Keep in mind that every few years someone freaks out upon discovering their Google location data.
No doubt this decision is actually making some companies sad. Location data was supposed to be one of the neat new things marketers could use, such as ads for “Hey, you’re about to pass by a Starbucks! Here’s a 10 percent coupon!” And some people would actually like that kind of service. Urban planners, among others, were also using location data to help them in their jobs.
Why it’s taking until March to stop selling this data, the companies aren’t saying, but presumably it has to do with contracts and such.
It’s been a big few weeks for acquisitions and investments in the eDiscovery marketplace.
It’s not like the old days, when major vendors were being acquired every few months. One way or another, most of the big vendors are already gone, acquired by bigger vendors, with varying degrees of success. Many of the companies these days are smaller, specific to the legal industry, and often include services as well. That said, that’s where the market is at these days.
So here’s what’s new:
DISCO, which is not a dance music company but an Austin-based eDiscovery company that uses artificial intelligence (AI), got an investment of $83 million from K-1 Investment Management, for a total of $135 million this round. According to Robert Ambrogi at Lawsites Blog, “The investment was led by Georgian Partners, a Toronto-based venture-capital firm with expertise in applied artificial intelligence. Existing investors Bessemer Venture Partners, LiveOak Venture Partners, The Stephens Group, and venture-debt provider Comerica all participated in the round. Tyson Baber, a partner at Georgian Partners, joined DISCO’s board of directors.” The company plans to use the money to scale up U.S. operations, doubling its headcount from 200 to 400 employees, as well as to develop new products and pursue international growth, writes Khari Johnson in VentureBeat.
In addition, HaystackID, a Washington, DC-based eDiscovery services firm, acquired eTERA Consulting, an eDiscovery managed services company. The companies also received additional investment from Knox Capital, ORIX Mezzanine & Private Equity, Maranon Capital, L.P., and Baird Principal Group. HaystackID also acquired Inspired Review and Envision Discovery in 2018.
All of this is on top of similar investments and acquisitions in 2018, such as $100 million in Beaverton, Ore., company Exterro by New York private-equity firm Leeds Equity Partners, Ambrogi writes. Other eDiscovery investments in 2018 include $25 million to Logikcull; $25 million to Everlaw; the merger of two major e-discovery companies, Consilio and Advanced Discovery; and eDiscovery company Catalyst’s acquisition of TotalDiscovery, a legal hold and data collection platform, he adds.
There are also two additional trends. First of all, as with DISCO, is the emphasis on AI in the legal industry. Of the $1 billion invested in legal technology alone in 2018, $362 million of this funding has been invested in legal solutions that make use of AI, writes Lawgeex. “This AI-focused funding alone in 2018 represents a bigger sum than the investment across all legal technology in 2017,” the blog notes.
While most of these investments aren’t in eDiscovery per se, it was AI’s use in eDiscovery – called “predictive coding,” or “technology-assisted review” and first permitted in 2012 – that paved the way for the use of AI in other forms of legal technology.
Second is simply the emphasis on technology in the legal field in general. Ambrogi and Lawgeex have gigundo lists of investments in various kinds of legal software and services in 2018, and many of them are not necessarily about eDiscovery. On the other hand, it’s clear that eDiscovery has made lawyers realize the value of computers in the legal field.
“Lawyers claim that much of the work they do is too ‘special’ for automation,” Lawgeex writes. However, the profession is “undoubtedly waking up to the reality and opportunities for investment and the increased adoption of tech in every corner of their profession.”
It’s also clear that there’s a lot more room for investment, Lawgeex continues, noting that financial technology saw $41.8 billion in investment in 2018 and that according to Top Healthcare AI Trends to Watch, a report from CB Insights, healthcare saw $4.3 billion across 576 funding rounds in the last five years.
At the same time, it’s also clear that all these teeny companies aren’t going to continue to stand on their own. Also following the lead of the eDiscovery industry, there’s likely to be a lot of merger and acquisition efforts going forward. Stay tuned.
For some time now, it’s been true that, while people may or may not be required to give their cell phone passwords to law enforcement, they were required to give fingerprints and other biometric identifiers. That’s because a fingerprint is something you have, similar to the way that you can be compelled to give up a blood sample to test for alcohol. And just last August, law enforcement forced a suspect to unlock their iPhone with their face.
But due to a recent court ruling, that may be changing, and people might not be forced to unlock their phones using biometric identifiers, either.
“Judge [Kandis] Westmore declared that the government did not have the right, even with a warrant, to force suspects to incriminate themselves by unlocking their devices with their biological features,” writes Thomas Brewster in Forbes. “Previously, courts had decided biometric features, unlike passcodes, were not ‘testimonial.’ That was because a suspect would have to willingly and verbally give up a passcode, which is not the case with biometrics. A password was therefore deemed testimony, but body parts were not, and so not granted Fifth Amendment protections against self-incrimination.”
But the judge didn’t agree with this, Brewster writes. “That created a paradox: How could a passcode be treated differently to a finger or face, when any of the three could be used to unlock a device and expose a user’s private life? And that’s just what Westmore focused on in her ruling. Declaring that ‘technology is outpacing the law,’ the judge wrote that fingerprints and face scans were not the same as ‘physical evidence’ when considered in a context where those body features would be used to unlock a phone. ‘If a person cannot be compelled to provide a passcode because it is a testimonial communication, a person cannot be compelled to provide one’s finger, thumb, iris, face, or other biometric feature to unlock that same device,’ the judge wrote.”
Oh my. Isn’t that going to be interesting.
Of course, we’re a long way from this case changing anything universally. “The magistrate judge decision could, of course, be overturned by a district court judge, as happened in Illinois in 2017 with a similar ruling,” Brewster points out.
That ruling was when a U.S. Magistrate Judge in the Northern District of Illinois used the Fourth and Fifth Amendments to deny a warrant to compel individuals present at the scene of an investigation to use their “fingerprints and/or thumbprints” to unlock Apple devices, writes Ian Lopez in the Recorder.
“By using a finger to unlock a phone’s contents, a suspect is producing the contents on the phone,” the Illinois judge noted. “With a touch of a finger, a suspect is testifying that he or she has accessed the phone before, at a minimum, to set up the fingerprint password capabilities, and that he or she currently has some level of control over or relatively significant connection to the phone and its contents.”
Not everyone agrees with the judge’s ruling, which also can’t be used as a precedent. Orin Kerr, of the Volokh Conspiracy, who has written about a number of these issues, doesn’t agree that providing a fingerprint violates the Fifth Amendment, for example. “Westmore’s opinion will only make things less clear and more complicated,” writes Josephine Wolff in Slate. “All of her reasoning completely ignores the fundamental idea that what the Fifth Amendment protects is the contents of your mind—not the pattern of your fingertip or anything else about your physical attributes. Just because fingerprints and passwords can both be used for the same purpose when it comes to encryption does not mean that they are both testimony or should both be treated in the same way under the law.”
Eventually, the whole case could end up in the Supreme Court’s lap.
Interestingly, the ruling cited a recent Supreme Court case, Carpenter, about cell phone location data, as well as another one, Riley, requiring a warrant to search a cell phone. The judge also used the Fifth Amendment argument that providing a biometric was self-incrimination, just as courts have recently been deciding that knowing an encryption password wasn’t on its face self-incrimination.
It’s also likely to get civil liberties organizations such as the American Civil Liberties Union (ACLU), the Electronic Frontier Foundation, and the Electronic Privacy Information Center pretty excited, because up until now the “have to provide a fingerprint” thing was fairly settled. Lopez quoted an ACLU representative as saying that he expected to see a lot more of these cases going forward.
We’ve written before about Rekognition, Amazon’s facial recognition software, and how organizations such as the American Civil Liberties Union (ACLU) have asked Amazon to stop selling it to law enforcement organizations. So Amazon is trying another tactic: Developing its own database, which it would collect through its Ring video doorbell.
“A patent application filed by Amazon offers a vision of how doorbell cameras could be equipped with new technology that would allow the devices to gather data and identify people considered to be ‘suspicious,’” writes Peter Holley in the Washington Post. “The application describes how a series of cameras could be used to piece together a composite image of an individual’s face, giving homeowners and police the ability to more easily identify someone who has engaged in potential criminal activity.” A visitor could be added to either an “authorized” list or a database of “suspicious people.” Such information could also be shared among neighborhood residents, perhaps using the company’s “Neighbors” app, which lets its 1 million users view and comment on crime and security information in their communities, he writes.
Or, as described by the ACLU, “a massive, decentralized surveillance network.” And once collected, the information could be subpoenaed by law enforcement, writes MyNorthwest.
“Just imagine if a person who has a criminal record is delivering a package, but the system has been set to automatically recognize anyone who has a prior criminal history as a ‘suspicious person’ and then the cops show up at this place when this person is just doing their job,” Jake Snow, a technology and civil liberties attorney at the American Civil Liberties Union of Northern California, told Holley. “Then you have an interaction between police and this individual, and we’ve seen how interactions between people of color and the police can turn deadly for any reason or for no reason at all.”
Or it could go further. “Imagine a group of volunteers approach a neighborhood as a part of a voter registration drive,” writes Tanvi Misra in CityLab. “If any of them match the ‘database of suspicious persons,’ the system could ping police or other neighbors. Or, in another iteration, if a caller’s face doesn’t match with a list of ‘authorized people’ created by a user, the system could add that image to the user’s own list of suspicious persons and raise the alarm accordingly.”
The Federal Bureau of Investigation (FBI) has also said that it is using the software as an automated way of searching through surveillance footage, such as that of Las Vegas mass shooter Stephen Paddock, writes Frank Konkel in NextGov. Amazon Rekognition could have gone through the same data in 24 hours, or three weeks faster than human FBI agents, he writes.
Last June, the company also reportedly shopped its facial recognition software to, not law enforcement organizations exactly, but Immigration and Customs Enforcement, according to documents obtained through a Freedom of Information Act request by the advocacy group Project on Government Oversight, writes Drew Harwell in the Washington Post. This also led eight Democratic legislators to write to the company with questions about privacy.
“We have serious concerns that this type of product has significant accuracy issues, places disproportionate burdens on communities of color, and could stifle Americans’ willingness to exercise their First Amendment rights in public,” the Congressional representatives wrote.
Even Amazon employees are complaining about the company’s facial recognition actions.
Taylor Swift is also reportedly using facial recognition software, but it’s to compare concert attendees with a database of her stalkers, which is slightly less heinous. (What’s it like to be so famous that you have so many stalkers – including some you find asleep in your own bed — that you have to store them in a database?) It isn’t clear whether she’s using Amazon’s Rekognition.
Amazon made a point of saying that the patent application had been started by Ring before it was acquired by Amazon, and that it didn’t necessarily represent a product direction for the company.
It’s kind of hard to write a year in review about something that’s become a commodity.
“Hard disk drives, whether spinning disks or solid state, keep getting bigger, denser, and cheaper.” That’s pretty much it. Zzz.
Really, the most exciting thing in storage was Dropbox finally going public, years after anyone expected it. After opening in March and spiking in June, the stock has gradually been declining since then, even during the bull market. At this point, it’s lower than its IPO price, and not far above its low for the year. (In comparison, Box, which went public in 2015, hit its all-time high in May, and has also been steadily declining since then.)
Sure, there are occasional new technologies. We’re going to store data in glass. We’re going to store data in DNA. Yep, sure we are. Not anytime soon, though.
Meanwhile, magnetic tape is still a thing.
What really ends up being news in storage is what we do with it. And, sadly, we’re not getting a lot smarter with it.
We’re still losing hard disk drives, or letting them get stolen, or letting them get hacked, or not wiping them before discarding them. And, of course, without encrypting the data on them. (Not to mention, still using really dumb passwords.)
We’re still poking strange USB sticks in things, even with PCs that are supposed to be so secure that they’re “airgapped,” or not connected to the Internet. Even when the USB things in question – cute little fans, in this case — come from North Korea.
We’re still getting our personal information added to giant databases, whether it’s through Facebook posts and quizzes, genetics, or — not necessarily willingly — through drone footage or required facial scanning to attend events.
At the same time, government data to which we should have access continues to disappear, whether it’s by deleting police bodycam footage due to lack of space (or, realistically, budget to pay for the space), politicians who conduct the people’s business on private communications channels or ones that automatically delete information, or government data that just disappears or, at least, can no longer be reached.
Most of all, we have attorneys and judges trying to figure out how to balance people’s right to privacy (if there even is such a thing – remember, it’s not written in the Constitution) with protecting the public against crime. We have people being required to submit their fingerprint or their face to unlock their phone, but not always being required to submit a password or an encryption key. We have people still sometimes being required to give up their devices for search anytime they cross a border, even when they’re attorneys or journalists. And we have companies, law enforcement, and governments increasingly able to track our every move.
This is all based on laws that are different in every country and, at least in the U.S., date back to the 1980s, when we barely had an Internet, let alone a tiny computer in our pockets with more processing power than it took to send humans to the moon – not to mention toys, cars, and speakers we talk to. And in an increasingly mobile society and global business world, we’re still trying to figure out how to determine jurisdiction. Is it based on where the data lives at the moment? Where the owner of the data lives? The location of the company that provides the storage service?
Storage itself might not be interesting, but what we do with it remains endlessly fascinating.
eDiscovery is kind of a funny thing. Every few years, the rules governing it change, but it takes a couple of years after that to see the effect, based on case law.
As you may recall, a new set of amendments to the Federal Rules of Civil Procedure (FRCP) took effect in December 2015. (That was after the original set, in 2006.) These included a number of amendments intended to streamline the preliminary steps of the legal process by as much as half. Several other amendments reduced the number and length of depositions, required more specificity in objections, and required that participants consider proportionality – basically, be reasonable in their e-discovery demands.
So, how are they working out?
Sadly, Gibson Dunn has not yet released its frighteningly complete set of ediscovery case law for 2018. Nonetheless, there are still some conclusions that can be made. (Heck, iDiscover released its Top 5 eDiscovery Trends for 2018 in August.)
- It’s still possible to get ginormous sanctions for having the court believe that you’re withholding documents. In Klipsch Group, Inc. v. ePRO E-Commerce (2d Cir., Jan 25, 2018), the company allegedly spoliated discoverable information, writes Michael Hamilton in Legaltech News. “Namely, the defendant:
- Failed to place adequate legal holds on electronic data including emails;
- Did not disclose 40,000 relevant sales documents; and
- Manually deleted thousands of files and emails,” he writes.
As a result, the company was slapped with a $2.7 million fine. To add insult to injury, it was only a $20,000 case to begin with!
- The legal system is still trying to figure out the nuances of technology-assisted review (TAR), or the notion of using artificial intelligence to help weed out documents in eDiscovery. In particular, the current question is whether that needs to be disclosed, writes Casey Sullivan – a really funny guy – in the Logikcull blog. “If you are going to use robots to ‘review’ documents without actually having a human being put eyes on them, do you need to disclose this to the other side beforehand?” he writes. “It’s a debate that still rages with staunch proponents on either side – the human sides (the robots don’t seem to care) – which came to light most recently, with a side of dry, English wit in Triumph Controls UK Ltd & Anor v Primus International Holding Co & Ors EWHC 176 (TCC) (07 February 2018).”
In response to figuring out the nuances, some courts are going into voluminous detail, Sullivan writes. “In Re Broiler Chicken Antitrust Litigation, 1:16-cv-08637 (N.D. Ill. Jan. 3, 2018) has been hailed as the ne plus ultra of TAR protocols, with eight pages of exacting detail that appear, at least to some, as the ultimate means to avoid further TAR disputes,” he writes. “Yet, to others, the very precision of the In Re Broiler Chicken protocol is the precise reason that it will be the sine qua non cause of endless discovery disputes.” But the case did have one advantage, he adds. “The one thing that we can know at this time is that the case certainly has been the cause of endless and endlessly awful chicken-related puns, a temptation which we will, perhaps surprisingly, ourselves resist (if only because that it’s just too… easy).”
- Courts are still learning to figure out how eDiscovery relates to social media, texting, and other communications systems, especially for ones intended to be ephemeral. If a company (or a government) is using an app that automatically destroys messages, is that just good document hygiene or a way to evade detection? And just how long do you need to keep texts and social media posts around, anyway?
In the high-profile case Waymo v. Uber Technologies, Uber used ephemeral apps to quickly erase any messages its employees sent, writes Victoria Hudgins in Legaltech News. “Waymo claimed Uber used the apps to minimize its paper trail,” she writes. “[U.S. District Judge Xavier] Rodriguez said the case questioned when companies can circumvent the duty to preserve and if there’s a duty to preserve messages in message-deleting apps. If companies allow employees to communicate through a message-deleting app about a product at issue, they must ensure the messages aren’t deleted, Rodriguez advised.”
It’s always interesting to read about how the BackBlaze backup service is doing with its hard disk drives. Like companies such as Facebook and Google, it uses so many hard disk drives that it ends up stripping off extraneous parts and building its own structures with them. But because the company is so much more forthcoming than Facebook or Google about what it’s doing, it’s a lot more instructive to the industry. It’s fun – if you’re a certain kind of person, anyway – to read over the years of BackBlaze reports as they migrate from 2 TB to 3 TB and on up to 12 TB hard disk drives.
As you may recall, BackBlaze used to use “pods” of 45 hard disk drives, but a year or so ago started using “vaults” made up of up to 20 even bigger “pods,” each of which holds up to 60 hard disk drives. Gradually, as technology improves, the company replaces the hard disk drives in each of those structures. The result is that the capacity of the pods and vaults keeps going up over time.
BackBlaze also uses commodity hard disk drives rather than the latest and greatest bleeding-edge technology, not only because they’re cheaper but because they’re easier to get in large quantities. And because the company works with such large quantities of a single model of hard disk drive, it can calculate longevity and failure statistics more easily than if it had a whole lot of different models.
BackBlaze has now migrated out the last of its 3 TB hard disk drives, and its most popular hard disk drive is now 12 TB. Seems like only yesterday that it was talking about testing 6 TB hard disk drives, but that was actually back in 2014. Time flies. Similarly, the company plans to upgrade all its 4 TB hard disk drives to larger capacities over the next couple of years.
- It now has 99,636 hard disk drives – 1,866 boot drives and 97,770 data drives. That’s 584 fewer drives than last quarter, but the company has actually added 40 petabytes of storage, because it replaced 3 TB, 4 TB, and some 6 TB drives – all about four years old – with 3,600 new 12 TB drives. In fact, one 12 TB Seagate model is now its most common hard disk drive, with 25,101 units.
- The least reliable hard disk drives during the quarter – that is, the ones with the highest percentage of drive failures – are 6 TB Western Digital drives, with a failure rate of 4.64 percent. Several hard disk drives, including 4 TB from Hitachi, Toshiba, and Western Digital, 5 TB from Toshiba, and 12 TB from Hitachi, haven’t failed at all during the quarter. (Though, to be fair, the 12 TB Hitachi hard disk drives were in service only nine days during that quarter, BackBlaze points out.)
- The least reliable hard disk drives that the company is still using are also the 6 TB Western Digital ones. Other than the new 12 TB Hitachi ones, the most reliable drives are 4 TB Hitachi ones. Incidentally, the company’s overall failure rate is 1.71 percent, which it said was the lowest it had ever achieved.
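That caveat about nine days of service is worth a quick worked example. The sketch below assumes the percentages are annualized failure rates computed from drive-days (failures divided by drive-years, times 100), which is how Backblaze's reports generally describe their method; treat the formula as my reading of it rather than the company's exact code. The fleet sizes and failure counts are made up purely for illustration.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return failures per drive-year, expressed as a percentage.

    Assumed formula: failures / (drive_days / 365) * 100.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical fleets of 1,000 drives each; these are NOT Backblaze figures.
nine_days_no_failures = annualized_failure_rate(failures=0, drive_days=1_000 * 9)
nine_days_one_failure = annualized_failure_rate(failures=1, drive_days=1_000 * 9)
ninety_days_one_failure = annualized_failure_rate(failures=1, drive_days=1_000 * 90)

print(f"9 days, 0 failures:  {nine_days_no_failures:.2f}%")
print(f"9 days, 1 failure:   {nine_days_one_failure:.2f}%")
print(f"90 days, 1 failure:  {ninety_days_one_failure:.2f}%")
```

With so few drive-days behind it, a new model's rate can jump from zero to several percent on a single failure, which is why a perfect score after nine days doesn't tell you much yet.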
As always, BackBlaze makes its full data set available to people who want to play with it. “All we ask are three things,” the company writes. “1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone. It is free.”
Does this all mean that the models that work well for BackBlaze will work well for you, too? Not necessarily. Obviously, hard disk drives kept in pods or vaults are much more heavily and steadily used than an everyday hard disk drive for home use or even for a company. But it’s a good way to bet. The results you get might be different, but in general, a hard disk drive with a failure rate of 4 percent isn’t likely to work out as well as a hard disk drive with a failure rate of 0.5 percent.
Disclaimer: I am a BackBlaze customer.
What with World Backup Day, Electronic Records Day, Ask an Archivist Day, and Sysadmin Day, I suppose it’s no surprise that there’s an E-Discovery Day. Incidentally, it’s Tuesday.
There was, apparently, some controversy about when to schedule E-Discovery Day this year. It is typically held on December 1, but that date fell on a Saturday this year, so it was moved to December 4. Why a Tuesday and not a Monday? Organizers didn’t say.
(Why it’s December 1 in the first place isn’t specified, either. In comparison, World Backup Day falls on March 31, the day before April Fools’ Day, presumably in case someone loses data due to a puckish prank, while Electronic Records Day is October 10 so it can be 10-10 to symbolize digital data.)
Like those other days, E-Discovery Day is sponsored by a number of vendors and organizations that could be said to have some investment in the technology. That said, promoters swear that the webcasts scheduled for the day are informational and not sales promotions. And some of them actually sound interesting, such as how GDPR will affect e-discovery, controversial issues in e-discovery, and people’s e-discovery wish lists.
(To judge by the list of in-person events, one of the things e-discovery professionals like to do is drink. About half of them are happy hours in various cities.)
Naturally, there’s a Twitter feed and even an Instagram page, but, oddly, no Facebook page. And, notably, some of the webinars and in-person events count for continuing legal education (CLE) credit, for people who need to worry about such things. There is also, apparently, a Women in E-Discovery organization – TIL – as well as an Association of Certified E-Discovery Specialists. I was crushed and dismayed, however, to get a 404 on the latter’s page that was supposed to contain “E-Discovery Day themed E-Cards, badges, and memes.”
In any event, E-Discovery (not EDiscovery, though things like the Twitter feed drop the hyphen) Day, which has been going on for four years now, is intended to raise awareness of the critical issues surrounding E-Discovery and, like the other days, to provide a focal point for discussion. “More e-discovery in one day than the rest of the year combined,” notes the event’s web page.
“All too often, e-discovery professionals operate in the background,” the webpage notes. “Hot-shot litigators argue cases in court. Judges command attention from the bench. Even IT security pros and hackers get occasional headlines when there’s a data breach. In 2015, we decided that enough was enough. E-Discovery plays a critical – and growing – role in the legal process. After all, organizations spend almost $10 billion per year on e-discovery services. To get e-discovery, and the hard-working professionals who make it happen, the attention they deserve, we established E-Discovery Day.”
The event seems to have ramped up this year, which is especially interesting because the market itself seems to have slowed; Gartner doesn’t even seem to produce a Magic Quadrant for E-Discovery any more. Up until now, the event had accumulated a total of 37 webcasts and 20 live events. Just this year, there are 19 webcasts and 14 live events, as well as 25 supporting organizations, compared with 15 online webinars and 13 in-person events around the United States last year. The prediction that “E-Discovery Day 2018 will certainly eclipse last year’s record of over 3,000 participants attending live and online educational events” will undoubtedly prove true.