Yottabytes: Storage and Disaster Recovery

January 13, 2020  9:53 AM

Year in Preview 2020: Storage, the New Electricity

Sharon Fisher
cloud, Storage

The problem with storage becoming a commodity is that people stop thinking about how important it is. If you look at the various predictions for 2020 – and people love to make them, because “2020” is such a cool number – hardly any of them explicitly mention storage, and the ones that do typically limit their predictions to the cloud.

Yet, many of the other predictions they make are predicated on having easy access to reliable, secure, inexpensive and, most of all, plentiful storage.

Gartner, for example – which, to show how forward-thinking it is, makes its 2020 predictions in October during its Symposium conference at Walt Disney World – did include one storage-based prediction in its Gartner Top 10 Strategic Technology Trends for 2020, “distributed cloud.” “Distributed cloud refers to the distribution of public cloud services to locations outside the cloud provider’s physical data centers, but which are still controlled by the provider,” the company writes. “In distributed cloud, the cloud provider is responsible for all aspects of cloud service architecture, delivery, operations, governance and updates. The evolution from centralized public cloud to distributed public cloud ushers in a new era of cloud computing. Distributed cloud allows data centers to be located anywhere. This solves both technical issues like latency and also regulatory challenges like data sovereignty. It also offers the benefits of a public cloud service alongside the benefits of a private, local cloud.”

Yay! Storage, sort of! Even if it does take until Trend #7 to get to it. A related trend is #6, the “empowered edge.” “Edge computing is a topology where information processing and content collection and delivery are placed closer to the sources of the information, with the idea that keeping traffic local and distributed will reduce latency,” Gartner writes. “This includes all the technology on the Internet of Things (IoT). Empowered edge looks at how these devices are increasing and forming the foundations for smart spaces and moves key applications and services closer to the people and devices that use them.” One aspect of IoT devices is that they typically generate a horrendous amount of data, which has to be stored.

But a number of the other trends also touch on storage. Take Trend #1, “hyperautomation.” “Automation uses technology to automate tasks that once required humans,” the company writes. “Hyperautomation deals with the application of advanced technologies, including artificial intelligence (AI) and machine learning (ML), to increasingly automate processes and augment humans. Hyperautomation extends across a range of tools that can be automated, but also refers to the sophistication of the automation (i.e., discover, analyze, design, automate, measure, monitor, reassess.)”

Okay. How are you going to do that without storage?

Similarly, there’s Trend #3, “democratization.” “Democratization of technology means providing people with easy access to technical or business expertise without extensive (and costly) training,” Gartner writes. “It focuses on four key areas — application development, data and analytics, design and knowledge — and is often referred to as ‘citizen access,’ which has led to the rise of citizen data scientists, citizen programmers and more.  For example, democratization would enable developers to generate data models without having the skills of a data scientist. They would instead rely on AI-driven development to generate code and automate testing.”

Anytime you see “data scientists,” that means big data – and a place to put it. And artificial intelligence typically requires a large amount of data the computer can learn from.

And there’s also keeping track of the data once you get it, as in Trend #5, “transparency and traceability.” “The evolution of technology is creating a trust crisis. As consumers become more aware of how their data is being collected and used, organizations are also recognizing the increasing liability of storing and gathering the data,” Gartner writes. “Legislation, like the European Union’s General Data Protection Regulation (GDPR), is being enacted around the world, driving evolution and laying the ground rules for organizations.” In this trend, the question is no longer just how you store the data, but how you deal with it.

Autonomous things, blockchain, and AI security, three other Gartner trends, all also require storage to work.

Gartner’s not alone. Forrester made similar predictions – again, a number of them are predicated on having large amounts of storage without ever naming storage itself as a trend – such as “Advanced firms will double their data strategy budget,” “Data and AI will get weaponized,” and “Regulation will make and break markets.”

This all just goes to show how important storage is to our lives and how much we’re taking it for granted. You couldn’t do most of these predictions without storage, yet few of them mention it explicitly, just like none of them say, “Hey, you know, we’ll need electricity and telecommunications to do this stuff, too.”

December 31, 2019  12:07 PM

E-Discovery Trends for 2019: Acquisitions and Autonomy

Sharon Fisher
Autonomy, E-discovery, HP

It turns out that, for 2019, there were really only two E-discovery stories: The continuing consolidation of the E-discovery marketplace, and the ongoing train wreck that is the HP-Autonomy merger and the lawsuits that have followed in its wake.

Yes, that was a mixed metaphor.

Well, okay, there was one more. The Supreme Court ruled that winning litigants couldn’t necessarily count on being reimbursed for E-discovery costs. Considering that can be a major component of court cases these days, it will be interesting to see how often that gets used as a precedent in coming years.

Plus I forgot about E-Discovery Day in 2019. Oops.

The ongoing consolidation of the E-discovery marketplace has been a thing ever since 2011, when Gartner published its first Magic Quadrant for the E-discovery marketplace, giving larger vendors a handy shopping list for acquisition. So well did that shopping list work that, out of 22 vendors in the original one, only a handful still remain. In fact, Gartner stopped publishing the E-discovery Magic Quadrant altogether after 2015, having apparently concluded there is no longer an E-discovery marketplace per se.

And yet, vendors keep finding other vendors to acquire, because law firms have figured out that the best way they can make more money is to automate their current processes, so people keep starting up new legal software companies for the purpose of them acquiring each other – or, in a relatively new development, investing in them.

At least it keeps the M&A team busy.

Speaking of M&A, take HP-Autonomy.

Actually, don’t.

Considered to officially be the sixth-worst acquisition of all time, HP and Autonomy have been beating each other up in civil and criminal court for going on five years. So far, the only winners are the lawyers.

In case you’ve forgotten, in the Autonomy-HP merger – officially the sixth-worst merger and acquisition of all time – HP chairman and CEO Leo Apotheker (who was fired later that year) paid $11.1 billion to acquire Autonomy, a European e-discovery company. By the following year, HP claimed that Autonomy had cooked its books to overvalue itself, wrote down the purchase as a $9 billion loss, and sold off the company’s remaining assets in 2016.

Then came the lawsuits, starting with a shareholder lawsuit, which HP settled in 2015 for $100 million. Former Autonomy CFO Sushovan Hussain was found guilty on 16 counts of wire and securities fraud. HP also faced a $160 million countersuit from the former Autonomy CFO and an appeal by Hussain, while criminal fraud charges were filed against Lynch and later added to.

In March, a $5 billion civil lawsuit against Autonomy CEO Mike Lynch started. And that’s what we’ve been working on this year.

The basic story is the same: HP says Autonomy pumped up its value, and Autonomy says that HP doesn’t understand British accounting and is trying to overcome its own incompetence at not successfully integrating the company. It’s the details that make this a train wreck, like just how many times did Autonomy CEO Mike Lynch use the f-word in email messages to his subordinates?

The high point this year was when former HP CEO Meg Whitman threw former HP board chair Leo Apotheker under the bus.

We know this, because she wrote an email message saying “Happy to throw Leo under the bus.”

Remember, kids, in an E-discovery case, your email messages can come back to haunt you.

HP, in fact, has so much post-traumatic stress disorder around the whole thing that it’s been bleeding over into the potential HP-Xerox acquisition. In case you missed that, Xerox suggested that it merge with HP. But HP, after being mocked on the world stage by having done insufficient due diligence on the Autonomy acquisition, is refusing to reveal any information about itself to Xerox, while at the same time demanding that Xerox provide all sorts of due diligence before HP will even look at it. Whether the acquisition itself actually makes any sense is immaterial; what’s important is that nobody will ever be able to say that HP failed to do enough due diligence again.

And the various Autonomy cases aren’t over. After all, we need E-discovery news for 2020.

December 30, 2019  10:15 AM

Storage Year in Review 2019: Everything Old is New Again

Sharon Fisher
government, privacy, Security, Storage

The funny thing about writing a year in review for storage in 2019 is that it’s almost exactly the same as the year in review for storage in 2018. Only the links change.

Hard disks and other forms of data storage still get bigger, denser, and cheaper. Researchers look at new technologies for storage in the future, such as glass storage and storage in DNA. We still use magnetic tape. We still lose data and poke USB sticks in things. (Sometimes these two things are related.) Our data is still being added to government databases and various aspects of this keep going to court, such as whether the police can gain access to genetic databases without a warrant. Sometimes we even win, such as when courts finally decided – for good, I hope – that border agents couldn’t search laptops and other storage devices willy-nilly.

Really, the biggest trend in storage this past year was getting rid of it. Thank you, Marie Kondo.

To some people, this year also marks the end of a decade. (Those people would be wrong, but still.) What’s interesting about that is finding out that my doing storage trend pieces in December is actually a fairly recent development, when I thought I’d been doing it all along. It’s funny how easily we can convince ourselves that something has always been true, in a we’ve-always-been-at-war-with-Eastasia kind of way. Who needs 1984? We can do it ourselves.

Incidentally, 1984 was 36 years ago.

That’s actually what demonstrates the value of storage. Human memory is not only limited, but fallible. Humans have so many logical fallacies in the way that we remember and present information.

You know, once I learned about confirmation bias, I started seeing it everywhere.

Another one, not included in that list, is recentism, also known as “the curse of memory” or the “availability heuristic”: If we can remember it, it must be important. Conversely, if we can’t remember it, it must not be important.

The point of storage is to help us override that bias – to ensure that we don’t attribute too much importance to something recent, just because we remember it better, and to remind us of important things that happened in the past. Remember “history doesn’t repeat itself, but it rhymes”? Business and history travel in cycles, and it’s important for us to be able to go back and look at the last time we were in a similar place in the cycle, because that could help us get through it more easily the next time. Or, better still, prevent it from happening.

That’s actually one of the scarier trends in storage. It’s bad enough when a database remembers or tracks something that we’d just as soon it had forgotten, or, conversely, when data we had counted on being there suddenly no longer is, whether that’s by accident, such as when a system loses data, or on purpose, such as when politicians delete government data that is no longer convenient to have around. (Though, to be fair, President Donald Trump’s administration is doing this a whole lot less than people were afraid of at first.)

The bigger concern is whether we can trust that the data we have is actually accurate. Whether it’s “deep fakes,” or using technology to create believable audio and video of things that did not occur, or actually changing data, such as the concern about voting machines not accurately recording voting data while making it look as though they did, it seems to me that one of the biggest and scariest trends of the year, and perhaps even the decade, is not just the security and robustness of the data we have, but also its reliability. How can we trust that the data we have is actually what we believe it is? How do we know whether data was changed in the process, or made up out of whole cloth?

Perhaps what vendors should be working on now is not how to make data storage bigger, denser, or faster, but to help us find ways to ensure that the data we do have is actually accurate and unchanged.

December 29, 2019  10:56 PM

Remember Zork Text Games? Its Source Code is Available

Sharon Fisher

In case Entombed, Magnetic Scrolls, and Prince of Persia aren’t enough, you now have access to a number of the 1980s-era Infocom games, ranging from Zork to Leisure Suit Larry.

If you’ve forgotten, or weren’t alive at the time, Infocom games were text-based, because this was back in the day before graphics were particularly available in games.

“It was 1977, and home computers were big, expensive, heavy, and were almost entirely lacking in computing power by today’s standards,” writes Krypton Radio. “Yet, in this primitive environment, the first computer adventure games were born. Zork was the first commercial offering.” It was based on the very first text adventure game, Colossal Cave Adventure (or, as some people called it, just Adventure), and was originally written for the Digital Equipment Corp. DEC-10 minicomputer. “That’s right, it took a mainframe to run it!” Krypton Radio notes.

Now, the source code to some of the Infocom games has been posted to GitHub, so that people who have been longing for the days of text-based games can indulge themselves to their heart’s content.

However, that was also back in the day when there were multiple types of computers and each game needed to be written for each. Consequently, the games were all written in a proprietary language called ZIL.

“ZIL, or Zork Implementation Language, is the unique programming language used to make the Zork games and was based on another old coding language called MIT Design Language (MDL),” writes Matt Kim in US Gamer. “[ZIL] is written to create adventure games in an environment people haven’t used commercially in over 25 years. And even then, it was about 15 people. ZIL then is a pretty niche coding language with a niche group of followers. There are actual online communities that teach and carry on ZIL, but it’s not a modern coding language like C++.”

As with other games, looking at the source code reveals all sorts of things about the game. For example, when the source code had first become available a couple of years earlier, people learned that some aspects of Zork were completely random. “While Zork checks the player’s item count to determine if they’re carrying too much, it also uses a random roll just to mess with the player,” writes Logan Booker in Kotaku. “The roll used a number between 0 and 100, forcing players to keep trying to pick things up until it finally worked. I was skeptical at first — surely a system as important as inventory wouldn’t be so cavalier with capacity? My skepticism grew when searches of Zork‘s MDL code from MIT and the public domain source from Infocom came up empty. But, after checking various sources of decompiled code from Zork, it does indeed appear the game would fire out an overburdened message based solely on randomness.”
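To make that concrete, here is a minimal Python sketch of an inventory check that combines a hard item limit with a random roll. It is not Zork's code (that was written in ZIL and MDL); the limit, the per-item odds, the messages, and every name in it are assumptions made up for illustration.

```python
import random

# Hypothetical values for illustration only; not taken from the Zork source.
MAX_ITEMS = 7       # assumed hard limit on the number of carried items
FUMBLE_CHANCE = 8   # assumed percentage chance of a refusal per carried item


def try_take(inventory, item, rng=random):
    """Try to pick up an item; return (success, message)."""
    count = len(inventory)
    # Deterministic check: the player really is carrying too much.
    if count >= MAX_ITEMS:
        return False, "You're carrying too many things already."
    # Random check: roll 0..100 and refuse if the roll is too low.
    # The more the player carries, the more likely the refusal.
    roll = rng.randint(0, 100)
    if roll < count * FUMBLE_CHANCE:
        return False, "You're carrying too many things already."
    inventory.append(item)
    return True, "Taken."


if __name__ == "__main__":
    rng = random.Random(1984)
    pack = ["lamp", "sword", "leaflet"]
    # Keep trying, as players had to, until the roll goes our way.
    for attempt in range(1, 11):
        ok, message = try_take(pack, "jeweled egg", rng)
        print(attempt, message)
        if ok:
            break
```

The point of the sketch is that the same "too many things" message can come from either branch, so a player has no way to tell a real limit from bad luck.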

Source code for other Infocom games was posted as well. “Leisure Suit Larry, the complete source code, have been uploaded to GitHub – alas, not the assets too, so you can’t build Leisure Suit Larry from this, but you can certainly get a glimpse as to how the game was created and how the asset system worked with the game script itself,” Krypton Radio adds. The Hitchhiker’s Guide to the Galaxy and others were posted as well. “There are about two pages of listings, mostly Infocom, but there are some hidden gems there too, like an open source version of the engine Croteam created for Serious Sam, Peter Spronck’s Space Trader, as well as the complete source for Hexen and Heretic, both from Raven Software.”

If Fortnite has lost its appeal for you, it might be worth checking out.

December 11, 2019  9:59 AM

Another Glass Storage Milestone: Preserving Superman 1978 for Eternity

Sharon Fisher
Glass, Storage

As you may recall, every couple of years someone does a new experiment with glass storage, and everyone falls all over themselves talking about how it’s the wave of the future and never wears out and is completely indestructible.

Right. Tell that to the casserole dish that I took from the fridge to the oven too fast.

Remember, we were all supposed to be using glass storage by 2015, at least according to Hitachi, which announced it in 2012.

Anyway, Microsoft, which has been on the cutting edge – see what I did there? – of storage research for a while, recently announced a new breakthrough: It had stored an entire movie on glass. And which movie did it pick?

Superman 1978.

Of all the movies that could be preserved forever, they pick that one? Admittedly, they could do worse. The dorky “Can you read my mind?” scene aside, it’s not a bad movie, may Christopher Reeve and Margot Kidder rest in peace.

It turns out there’s a reason they picked that movie. Warner Brothers, which is partnering with Microsoft on this research to find better, more economical, safer ways to store its backlog (remember the 2008 Universal fire?), apparently had discovered some recordings of the 1940s-era Superman radio show on glass discs, and they took that as a Sign.

You know, that’s the story I really want to hear. How did those recordings get made? How did Warner Brothers find them? How did they figure out a way to play them? Sadly, I can’t find any information on that other than the offhand references in the Microsoft pieces.

Warner Brothers isn’t alone. GitHub is also partnering with Microsoft to store its archives on glass, among many other storage media, in a program called LOCKSS, or Lots of Copies Keeps Stuff Safe.

With the Project Silica technology, Microsoft has reportedly succeeded in storing Superman, all 75.6 gigabytes of it, on a piece of glass the size of a “drink coaster,” 75 x 75 x 2 millimeters, the company writes. I guess comparing it to the size of a CD or DVD didn’t occur to them.

So, it’s good to know that glass has now reached the same level of data density as a DVD. Earlier versions of glass storage could store only 40 megabytes per square inch, which was about the same level as a CD, but not as good as a hard disk.
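For the curious, the arithmetic behind a comparison like this is easy to sketch. The Python below works out the areal density implied by Microsoft's figures (75.6 gigabytes on a 75 x 75 millimeter face) and compares it with the 40 megabytes per square inch cited for earlier glass storage. It assumes decimal units (1 gigabyte = 1,000 megabytes) and ignores the 2 millimeter thickness, even though the data is written in layers; the function and variable names are mine.

```python
MM_PER_INCH = 25.4


def areal_density_gb_per_sq_in(capacity_gb, width_mm, height_mm):
    """Capacity divided by the area of one face, in GB per square inch."""
    area_sq_in = (width_mm / MM_PER_INCH) * (height_mm / MM_PER_INCH)
    return capacity_gb / area_sq_in


# Figures quoted in the post for the Project Silica "coaster".
silica_density = areal_density_gb_per_sq_in(75.6, 75, 75)

# Earlier glass storage: 40 MB per square inch, converted to GB (decimal).
earlier_density = 40 / 1000

print(f"Project Silica: {silica_density:.2f} GB per square inch")
print(f"Improvement over earlier glass: about {silica_density / earlier_density:.0f}x")
```

On those assumptions the coaster works out to roughly 8.7 gigabytes per square inch, a bit over 200 times the earlier figure.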

“A laser encodes data in glass by creating layers of three-dimensional nanoscale gratings and deformations at various depths and angles,” writes Jennifer Langston on the project website. “Machine learning algorithms read the data back by decoding images and patterns that are created as polarized light shines through the glass.”

In other words, this is not technology you’re going to be picking up on a thumb drive anytime soon. And it’s not intended to be. “It represents an investment by Microsoft Azure to develop storage technologies built specifically for cloud computing patterns, rather than relying on storage media designed to work in computers or other scenarios,” Langston writes. “We are not trying to build things that you put in your house or play movies from. We are building storage that operates at the cloud scale.”

And we get the usual song and dance about how indestructible it is. “The hard silica glass can withstand being boiled in hot water, baked in an oven, microwaved, flooded, scoured, demagnetized and other environmental threats that can destroy priceless historic archives or cultural treasures if things go wrong,” Langston writes.

Notice how she doesn’t mention “dropped.” “Sure, it is breakable if you try hard enough,” a Microsoft researcher told Janko Roettgers in Variety. “’If you take a hammer to it, you can smash glass.’ But absent of such brute force, the medium promises to be very, very safe, he argued: ‘I feel very confident in it.’”

And there’s still the formatting issue. “Long-term storage costs are driven up by the need to repeatedly transfer data onto newer media before the information is lost,” Langston writes. “Hard disk drives can wear out after three to five years. Magnetic tape may only last five to seven. File formats become obsolete, and upgrades are expensive. In its own digital archives, for instance, Warner Bros. proactively migrates content every three years to stay ahead of degradation issues. Glass storage has the potential to become a lower-cost option because you only write the data onto the glass once. Femtosecond lasers — ones that emit ultrashort optical pulses and that are commonly used in LASIK surgery — permanently change the structure of the glass, so the data can be preserved for centuries.”

Well, okay. But as Langston mentions, file formats become obsolete, and glass doesn’t solve that problem. All that gives you is a bunch of indestructible data that nobody can read because nobody has the readers or the software for it.

Though you could always use them for coasters.

November 30, 2019  9:28 PM

Another Bitcoin storage fraud, this time in China

Sharon Fisher
Bitcoin, privacy, Security

The CEO of a Chinese Bitcoin exchange, International Data Access Exchange (IDAX), has vanished with the keys, leaving all its balances inaccessible — to anyone but himself, presumably.

“Following the official announcement ‘Announcement of IDAX withdrawal channel congestion’ on November 24, We announce Urgent notice about current situation of IDAX Global,” noted the company’s website. “Since we have announced the announcement on November 24, IDAX Global CEO have gone missing with unknown cause and IDAX Global staffs were out of touch with IDAX Global CEO. For this reason, access to Cold wallet which is stored almost all cryptocurrency balances on IDAX has been restricted so in effect, deposit/withdrawal service cannot be provided.”

The action may be linked to crackdowns by the Chinese government in the cryptocurrency market, reported BeInCrypto. “The news from IDAX comes just days after the exchange suddenly announced its withdrawal from the Chinese market entirely,” writes Rick D. “Citing ‘policy reasons,’ a statement on November 25 explained that the company would no longer provide its services to China. Although not explicit, the sudden announcement seems almost certainly linked with recent news of a further clampdown on digital currency trading venues by the Chinese government.” However, as of yet, no bitcoin were reported missing, he added.

If this sounds familiar, it’s because in December 2018, Gerald Cotten, CEO of crypto exchange QuadrigaCX, reportedly died in India on his honeymoon without leaving access to the keys to anyone, including his new wife, Jennifer Robertson. Whether he’s actually dead has never been ascertained, but since then, a report by Ernst & Young has stated that much of the money was taken out of the exchange and used privately.

“In the course of its investigation, the Monitor identified significant transfers of Fiat from Quadriga to Mr. Cotten and his wife,” the report noted. “The Monitor understands that in the last few years, Mr. Cotten and his wife, either personally or through corporations controlled by them acquired significant assets including real and personal property. The Monitor also understands that they frequently travelled to multiple vacation destinations often making use of private jet services. The Monitor has been advised that neither Mr. Cotten nor his wife had any material source of income other than funds received from Quadriga.”

That real and personal property includes land in Canada, airplanes, and cars, amounting to about $12 million Canadian, or $9 million US, which the report said would be sold to help repay creditors.

The report noted a number of other accounting and financial problems with the company, adding, “In addition, the Monitor understands passwords were held by a single individual, Mr. Cotten and it appears that Quadriga failed to ensure adequate safeguard procedures were in place to transfer passwords and other critical operating data to other Quadriga representatives should a critical event materialize (such as the death of key management personnel).”

You think?

As it turns out, Quadriga might have been intended to be a fraud from the beginning, and Cotten might have started defrauding people as early as age 15.

In fact, Cotten might not even be dead. “The RCMP and the FBI have refused to comment, but some of their interview subjects have gotten the impression that they believe Cotten might not be dead,” writes Nathaniel Rich in Vanity Fair. “’They asked me about 20 times if he was alive,’ says one witness who has intimate knowledge of Quadriga’s workings and has been questioned by both agencies. ‘They always end our conversations with that question.’ QCXINT, the creditor and blockchain expert, said that the FBI’s Vander Veer told him that with hundreds of millions of dollars missing and no body, ‘it’s an open question.’ The only way to verify that the body Robertson brought home from India was Cotten is to exhume it. The RCMP, which has jurisdiction over the case, has thus far not done so.”

November 27, 2019  10:05 PM

32,768-Hour Hard Disk Drive Failure Strikes HPE

Sharon Fisher

People creating a new system sometimes underestimate how long it’ll be around. That was the core of the “Y2K Problem,” which is when people were concerned that computer programs around the world would fail because the designers had never considered the idea of a year after 1999.

Boy, that feels like a long time ago.

Most of the Y2K bugs got worked out before everything went poof at midnight on December 31, 1999, but it’s not unusual for there to be similar bugs related to data fields that get filled up. In addition, hackers have learned to create and exploit these bugs by putting a system into a vulnerable state through a buffer overflow, such as with the “heartbleed” bug from about five years ago.

But more recently, there’s a doozy.

“Bulletin: HPE SAS Solid State Drives – Critical Firmware Upgrade Required for Certain HPE SAS Solid State Drive Models to Prevent Drive Failure at 32,768 Hours of Operation,” reported the Hewlett Packard Enterprise Support Center earlier this month.

If that seems like an odd number, it’s not – literally, that is. It’s 2 to the 15th power.

So let’s take a guess – some field associated with the solid state drive is 15 bits long, and when the hour count gets beyond that (which is about 1,365 days, or 3 ¾ years), the field fills up and the system is froached.

“The power-on counter in the affected drives uses a 16-bit Two’s Complement value (which can range from −32,768 to 32,767). Once the counter exceeds the maximum value, it fails hard,” writes Marco Chiappetta in Forbes.
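Here is what that wraparound looks like, as a minimal Python sketch. It illustrates 16-bit two's complement arithmetic only, not HPE's or the drive maker's firmware, which isn't public; the function name wrap_int16 is made up for this example.

```python
INT16_MAX = 2**15 - 1  # 32,767, the largest signed 16-bit value


def wrap_int16(value):
    """Interpret an integer as a signed 16-bit two's complement value."""
    return (value + 2**15) % 2**16 - 2**15


hours = INT16_MAX
print(wrap_int16(hours))      # 32767: the last hour the counter can hold
print(wrap_int16(hours + 1))  # -32768: one more hour wraps to the minimum

# How long it takes a drive that is always on to get there.
days = 2**15 / 24
years = days / 365.25
print(round(days), round(years, 2))  # 1365 3.74
```

What the firmware actually does with a negative hour count isn't described in the bulletin; the sketch only shows why hour 32,768 is the one where things go wrong.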

And it gets really froached.

“After the SSD failure occurs, neither the SSD nor the data can be recovered,” HPE notes. “In addition, SSDs which were put into service at the same time will likely fail nearly simultaneously.”

Chiappetta goes into more detail about that aspect. “This issue can be particularly catastrophic because the affected enterprise-class drives were likely installed as part of a many-drive JBOD (Just A Bunch Of Disks) or RAID (Redundant Array of Independent Disks), so the potential for ALL of the drives to fail nearly simultaneously (assuming they were all powered on for the first time together) is very likely.”
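A rough way to see why redundancy doesn't help here: if the failure is triggered by power-on hours alone, drives that were switched on together run out of hours together. The Python sketch below assumes exactly that; the bay names and hour counts are invented, and real drives report power-on hours through their own tools rather than a dictionary.

```python
from collections import defaultdict

FAILURE_HOURS = 32_768  # threshold from the HPE bulletin


def hours_remaining(power_on_hours):
    """Hours left before a drive reaches the failure threshold."""
    return max(FAILURE_HOURS - power_on_hours, 0)


# Hypothetical array: three drives installed together, one replaced later.
array = {"bay1": 30_000, "bay2": 30_000, "bay3": 30_000, "bay4": 1_200}

# Group drives by how many hours they have left.
by_deadline = defaultdict(list)
for bay, hours in array.items():
    by_deadline[hours_remaining(hours)].append(bay)

for remaining, bays in sorted(by_deadline.items()):
    print(f"{remaining:>6} hours left: {', '.join(bays)}")
```

Three of the four hypothetical drives hit the threshold in the same hour, which is more simultaneous failures than RAID 5 or RAID 6 is designed to absorb.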

Oh goody.

HPE said that one of its vendors had discovered the problem. “HPE was notified by a Solid State Drive (SSD) manufacturer of a firmware defect affecting certain SAS SSD models (reference the table below) used in a number of HPE server and storage products (i.e., HPE ProLiant, Synergy, Apollo, JBOD D3xxx, D6xxx, D8xxx, MSA, StoreVirtual 4335 and StoreVirtual 3200 are affected).”

One wonders how this bug presented itself. Did someone happen to run across it just in time? How long had HPE drives been crashing and burning before this bug was tracked down and repaired?

And which vendor was this? HPE doesn’t say, but one would guess that HPE might not be using that vendor again in the future.

“This HPD8 firmware is considered a critical fix and is required to address the issue detailed below. HPE strongly recommends immediate application of this critical fix.”

You don’t say.

November 20, 2019  11:25 PM

Cops Now Using Warrants to Gain Access to Genetic Databases

Sharon Fisher
Database, Storage

As you may recall, last year police officers were able to track down a murderer through relatives in a genetic database. Now, it’s gone one step further: Police have succeeded in using warrants to gain access to genetic databases to search for suspects.

Police first started using genetic databases for law enforcement in 2015. In fact, in some cases, they started asking people for DNA samples to prove they weren’t suspects in cases.

In response to the 2018 case, genetic database companies started writing and following best practices guidelines regarding the use of their data in law enforcement. (The agreement, however, didn’t cover GEDmatch, the open source database used by law enforcement to track down the alleged “Golden State Killer.”) Even before that, people had started making their genetic records private.

In September, the U.S. Department of Justice issued a policy limiting searches by federal law enforcement agencies to violent crimes and DNA profiles with user consent, writes Jocelyn Kaiser in Science. But that wasn’t enough.

“What experts really worry about is that police may seek warrants to access all of GEDMatch’s data,” Tina Hesman Saey wrote – presciently, as it turns out – in Science News in June.

Now, a police officer in Florida actually has gotten a search warrant for all the records in a GEDmatch database – including those whose owners had made them private.

“A Florida detective announced at a police convention that he had obtained a warrant to penetrate GEDmatch and search its full database of nearly one million users,” write Kashmir Hill (who’s been writing about genetic databases since at least 2010) and Heather Murphy, in the New York Times. “Legal experts said that this appeared to be the first time a judge had approved such a warrant, and that the development could have profound implications for genetic privacy.”

You think?

While GEDmatch has about a million users, other genetic databases are much bigger, and now that a precedent has been set, law enforcement may go after those other databases as well, Hill and Murphy write. “DNA policy experts said the development was likely to encourage other agencies to request similar search warrants from 23andMe, which has 10 million users, and Ancestry.com, which has 15 million,” they write. “If that comes to pass, the Florida judge’s decision will affect not only the users of these sites but huge swaths of the population, including those who have never taken a DNA test. That’s because this emerging forensic technique makes it possible to identify a DNA profile even through distant family relationships.”

If GEDmatch isn’t very big, why did law enforcement professionals start there? Because GEDmatch is open source and was easiest to access, they add. (In fact, for the 2018 case, police didn’t even alert GEDmatch that they were searching it.)

That said, one researcher was surprised that GEDmatch didn’t fight back against the warrant, and felt that bigger genetic database companies would probably protest such warrants more strongly.

And, in fact, 23andMe did write a blog post saying it would fight such warrants. “If we had received a warrant, we would use every legal remedy possible,” writes Kathy Hibbs, the company’s chief legal and regulatory officer.

But not even that might help, Kaiser writes, quoting Natalie Ram, a law professor at the University of Maryland’s Carey School of Law in Baltimore.

“It’s not clear whether the DNA company or a criminal defendant would have the right kind of interest in the DNA and privacy rights at issue to even be able to challenge the warrant effectively. (That is, it’s not clear either has ‘standing’),” Ram says. “So, we might discover that this is a situation in which, as a practical matter, there is no one who can effectively challenge this warrant. And that’s not a good place for the law to be.”

What makes that an issue? “Last year, researchers calculated that a database of about 3 million people would allow for the identification of virtually any American of European descent,” Saey writes. With access to those two companies’ databases, law enforcement would be solving cases every day, she quotes one genetic genealogist as saying.

Moreover, at around the same time, a University of Washington study found that genetic databases were subject to fraud. In other words, it was possible to create a fake person who was related to a real person.

“Researchers at the University of Washington have found that GEDmatch is vulnerable to multiple kinds of security risks,” writes Sarah McQuate for UW News. “An adversary can use only a small number of comparisons to extract someone’s sensitive genetic markers.”

How many? Just 20 – and it would take about ten seconds to do, she writes.

“The team played a game of 20 questions: They created 20 extraction profiles that they used for one-to-one comparisons on a target profile that they created,” McQuate writes. “Based on how the pixel colors changed, they were able to pull out information about the target sequence. For five test profiles, the researchers extracted about 92% of a test’s unique sequences with about 98% accuracy.”

It doesn’t stop there. “A malicious user could also construct a fake genetic profile to impersonate someone’s relative,” McQuate writes. “Once someone’s profile is exposed, the adversary can use that information to create a profile for a false relative. The team tested this by creating a fake child for one of their experimental profiles. Because children receive half their DNA from each parent, the fake child’s profile had their DNA sequences half matching the parent profile. When the researchers did a one-to-one comparison of the two profiles, GEDmatch estimated a parent-child relationship. An adversary could generate any false relationship they wanted by changing the fraction of shared DNA.”
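To make the “fake child” idea concrete, here is a minimal toy sketch in Python. It is not GEDmatch’s matching algorithm and not the researchers’ code: the function names, the four-letter allele alphabet, and the “share at least one allele per site” measure are all assumptions made for illustration. Each profile is a list of allele pairs, and the fake relative copies one allele from the target at a chosen fraction of sites.

```python
import random

ALLELES = "ACGT"  # toy alphabet (assumption); real SNP data is mostly biallelic


def random_profile(n_sites, rng):
    """Toy diploid profile: one pair of alleles per site."""
    return [(rng.choice(ALLELES), rng.choice(ALLELES)) for _ in range(n_sites)]


def fake_relative(target, fraction_of_sites, rng):
    """Copy one allele from `target` at the first `fraction_of_sites` of sites.

    1.0 imitates a child (one allele inherited at every site); smaller
    values imitate more distant relatives. Everything else is random.
    """
    n_copy = round(len(target) * fraction_of_sites)
    profile = []
    for i, pair in enumerate(target):
        if i < n_copy:
            # One allele copied from the target, one from "the other parent".
            profile.append((rng.choice(pair), rng.choice(ALLELES)))
        else:
            profile.append((rng.choice(ALLELES), rng.choice(ALLELES)))
    return profile


def half_identical_fraction(p, q):
    """Fraction of sites where the two profiles share at least one allele."""
    hits = sum(1 for x, y in zip(p, q) if set(x) & set(y))
    return hits / len(p)


rng = random.Random(0)
parent = random_profile(1000, rng)
child = fake_relative(parent, 1.0, rng)
print(half_identical_fraction(parent, child))  # 1.0: one shared allele at every site
```

In this toy model, unrelated random profiles also match at many sites by chance, which is why real comparison tools look at long runs of matching sites rather than a raw count. The sketch only shows why dialing the copied fraction up or down changes the apparent relationship.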

Now, put those two things together. Will we have police creating fake relatives to justify gaining access to the DNA records of real suspects? The September policy is supposed to forbid that, but it applies only to federal searches, Kaiser writes.


November 14, 2019  9:56 AM

Laptop Border Searches Now Require Probable Cause

Sharon Fisher Sharon Fisher Profile: Sharon Fisher
government, privacy, Security

It’s safe to bring your cell phones and laptops into the United States again.

The Electronic Frontier Foundation (EFF) has for some time been pushing for a case to expand the provisions of the Riley case, which stated that law enforcement officials needed a warrant to search someone’s cell phone, to Customs and Border Protection (CBP) searches. “We are eager to further the law in this area – to make it clear that the Riley decision applies at the border,” the organization wrote at the time, urging people to let it know when they undergo a border search.

Now it has one, with a summary judgment ruling from the U.S. District Court for the District of Massachusetts, in Boston.

The result is that border officers must now demonstrate individualized suspicion of illegal contraband before they can search a traveler’s device, writes the EFF, which has published a guide on border searches and in general has collected information about such cases.

“The ruling came in a lawsuit, Alasaad v. McAleenan, filed by the American Civil Liberties Union (ACLU), Electronic Frontier Foundation (EFF), and ACLU of Massachusetts, on behalf of 11 travelers whose smartphones and laptops were searched without individualized suspicion at U.S. ports of entry,” the EFF writes.

Ten of the plaintiffs were U.S. citizens, while the other was a lawful permanent resident.

The U.S. has had a policy since 2009 that border agents can demand access to a smartphone within 100 miles of the border – which covers much more U.S. territory than you’d think. According to the American Civil Liberties Union (ACLU), as of 2006, more than two-thirds of the U.S. population lived within 100 miles of the border. Altogether, it meant that anyone in that area with a laptop could have that laptop seized without a warrant, at any time, taken to a lab anywhere in the U.S., have its data copied, and searched for as long as Customs deemed necessary. And despite their objections, the policy has largely been upheld.

In 2015, a judge ruled that – following the lead of the Supreme Court ruling on the Riley case – customs officials needed to have probable cause before they could search someone’s laptop. The problem with that ruling is that it applied just to that one case, not overall.

This new ruling applies to everyone – at least, for now. Presumably the federal government could appeal the case to the Supreme Court.

This case was filed in 2017, which is when a number of people started reporting anecdotally that they had had their devices searched. In one case, a US-born NASA engineer who worked with the federal government and was also a part of the Customs and Border Protection Global Entry program was told he couldn’t re-enter the U.S. until he unlocked his encrypted NASA phone. Several other incidents also happened over that summer, reported the Electronic Frontier Foundation.

In particular, this happened with the press. Even a Canadian journalist was denied entry to the U.S. for refusing to unlock his phone, and a Wall Street Journal reporter had the same experience, though customs agents backed down when she told them to call the paper. A BBC reporter also had to turn over his phone.

One of the plaintiffs, an incoming Harvard freshman, not only had his phone searched but had his visa denied because of what border officials said were anti-American posts in his social media.

In April, the ACLU and the EFF reported that searches were becoming so egregious that they asked for a summary judgment without a trial. That is what happened here.

In general, the number of searches has increased sharply in recent years. Last year, CBP conducted more than 33,000 searches, almost four times the number from just three years prior.

October 31, 2019  10:56 PM

Drivers Deal With Tesla Flash Memory Problem

Sharon Fisher Sharon Fisher Profile: Sharon Fisher
Flash memory

One of the criticisms of flash memory is that, while it’s fast to read, writing on it multiple times wears it out and its performance decays. Flash memory vendors have been saying that this is a problem they’ve been working on. But they might have a bit of a problem after a recent incident.

It turns out that Tesla cars, which use flash memory, log so much data that the logging wears out the cars’ memory and bricks the cars, which requires a repair that can cost $1,800 or more.


The problem first started being reported in May, when a video was posted to YouTube describing the problem, writes Jason Koebler in Vice.

Three different auto shops reported the problem, writes Gustavo Henrique Ruffo in Inside EVs. “They aim to warn Tesla owners that the clock is ticking for all of them,” he writes. “Regardless of your car, the logging will require replacing your MCU sooner or later.”

The problem is that the size of the firmware has grown, and it’s now starting to compete with the logs for space, Ruffo writes. That leaves no spare room on the chip for spreading writes more evenly across it, he writes.

“Apparently, Tesla is overworking these systems (at least on some models) to a point where they can’t take it anymore,” writes Matt Posky in The Truth About Cars. “It’s basically the same thing that would happen if you filled and wiped a USB drive hundreds of times every day. One morning you’d plug it in and find that it’s no longer functional due to being burnt out from overuse.”
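The wear-out arithmetic behind that comparison can be sketched in a few lines of Python. This is a back-of-the-envelope model, not anything from Tesla or the repair shops: the function name and every number in the example (chip capacity, rated program/erase cycles, daily log volume, write amplification) are hypothetical values chosen for illustration.

```python
def years_until_worn_out(capacity_gb, pe_cycles, host_writes_gb_per_day,
                         write_amplification=1.0):
    """Rough flash lifetime estimate in years.

    capacity_gb            -- usable size of the flash chip
    pe_cycles              -- rated program/erase cycles per cell
    host_writes_gb_per_day -- data the software asks to write each day
    write_amplification    -- physical writes per logical write; it tends to
                              rise when little free space is left for
                              spreading writes evenly
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("write volume and amplification must be positive")
    total_writable_gb = capacity_gb * pe_cycles
    physical_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_writable_gb / physical_gb_per_day / 365.0


# Hypothetical figures, not measurements from any car:
roomy = years_until_worn_out(8, 3000, 10, write_amplification=1.5)
crowded = years_until_worn_out(8, 3000, 10, write_amplification=3.0)
print(f"{roomy:.1f} years with spare space, {crowded:.1f} years when nearly full")
```

Under these made-up inputs, doubling the write amplification halves the estimated lifetime, which is the effect of firmware and logs crowding each other on the same chip. Cutting the daily log volume has the opposite effect.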

Each of the three repair shops said they had encountered at least a dozen cars with the problem in just the last couple of months.

Drivers have also been reporting the problem, which, in an annoying coincidence, apparently tends to happen around the time that the warranty runs out, after about four years or so.

Moreover, it’s not a problem that’s getting better with newer models, because the newer models do even more logging than older ones, Ruffo writes.

The other part of the problem is that the chip is soldered to the board, meaning the whole board has to be replaced. Some of the auto shops reported that they were creating sockets on the board to make it easier to replace the chips in the future.

In response to one Twitter discussion of the problem, Tesla founder Elon Musk said the problem should be “much better at this point,” Posky writes.

But people were dubious, writes Dan Robitzski in Futurism.com. “Without specifying how or why, Musk replied that the problem should ‘be much better at this point’ – drawing immediate skepticism from the engineer and others who didn’t see any evidence of a fix,” he writes.

Mechanics and drivers are suggesting that the company should reduce the amount of logging that the car does.

Tesla owners who are still under warranty are urged to try to update the faulty part.

Ultimately, it’s bad not only for Tesla cars but for flash memory in general.
