Chances are, you’ll be celebrating World Backup Day Eve tonight by assembling a NAS drive for your kids, who are snug in their beds after having left out chocolate-chip cookies and milk. Tomorrow morning they’ll come pounding down the stairs before dawn, to see what sort of backup hardware and software they find in their stockings.
You mean, you don’t Believe?
Whether or not you Believe, the day before April Fool’s Day is as good a day as any to remind yourself to make sure that your data is backed up and, more to the point, that you can retrieve it again afterwards. Like changing the batteries in your smoke detector when the time changes, it provides a useful mnemonic for one of those boring but important things to do.
A similar mnemonic this year is 3-2-1: Keep three copies, in two formats, with one copy off-site. “Step 1: This can be as simple as backing up to an external hard drive,” notes backup vendor Acronis. “Step 2: Use a different format such as cloud backup software, which can automatically backup all data to the cloud. It’s automated, so you don’t even have to do anything. Step 3: Store a backup copy in a secure, offsite location. This is where using the cloud is beneficial because it can join steps two and three together.”
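If you'd rather check the 3-2-1 rule than memorize it, here is a minimal sketch in Python. The `BackupCopy` class, the `check_321` function, and the medium labels are all made up for illustration; they don't come from Acronis or any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for the copy
    medium: str    # hypothetical format label, e.g. "external-disk" or "cloud"
    offsite: bool  # True if the copy lives somewhere other than your home or office


def check_321(copies):
    """Return a list of unmet 3-2-1 requirements (an empty list means the plan passes)."""
    problems = []
    if len(copies) < 3:
        problems.append(f"need 3 copies, have {len(copies)}")
    if len({c.medium for c in copies}) < 2:
        problems.append("need at least 2 different formats")
    if not any(c.offsite for c in copies):
        problems.append("need at least 1 off-site copy")
    return problems


# Example plan: the original, an external drive, and a cloud backup.
plan = [
    BackupCopy("laptop", "internal-disk", offsite=False),
    BackupCopy("usb drive", "external-disk", offsite=False),
    BackupCopy("cloud service", "cloud", offsite=True),
]
print(check_321(plan) or "plan satisfies 3-2-1")
```

Note that the cloud copy counts toward both the second format and the off-site requirement, which is the point Acronis makes about joining steps two and three.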
Acronis also conducted a survey that found that
- More than 75 percent of consumers store their data digitally
- Consumers would be nearly three times more upset if they lost their photos than if they lost their phone, computer or tablet
- More than 50 percent of consumers store their data only on their computer – or not at all
- Of those using a data backup system, only one third are protecting their entire computer system, while the rest are simply protecting some files.
- More than 50 percent believe their personal data are more valuable than their actual devices
- Almost half of the respondents value their data at over $1,000
- Only 5 percent of consumers surveyed are willing to actually spend that amount to recover their data once it is lost
- 94 percent of respondents said they are willing to spend up to $100 to preventively back up their data.
For its part, backup vendor Verbatim notes in its survey that 25 percent of respondents said they didn’t back up their data because they were lazy. 18 percent said they didn’t know how, 12 percent said it took too long, 9 percent said it was too expensive, 6 percent said the research took too long, and 5 percent simply felt that nothing would happen to them.
First held in 2011, World Backup Day was actually spawned by a reddit discussion and is primarily intended for consumers who might otherwise lose pictures, music, and so on. (Presumably the people would also set up an automated backup from then on; backing up your data once a year isn’t going to help much.)
The World Backup Day website also offers a variety of statistics on backups, though some of them date back to as far as 2001. People are also asked to pledge to back up their data and encourage their friends to do so; thus far, 2816 people have pledged, which is an improvement over last year’s 1800, especially since it’s not even The Day yet.
These days, World Backup Day is mostly an occasion for backup vendors – primarily services, rather than hardware and software – to promote their services and to offer sales. So if you’ve been looking into one of these services, now might be the time to do it.
Here are some of them:
- Carbonite has bonuses for its resellers, as well as a contest for users.
- CloudBerry Lab is selling its Cloudberry Box service for half price through April 2.
- Datto is offering 10% off a new, one-year contract for 25+ users for its Backupify service if purchased before World Backup Day comes to a close on March 31, 2015.
- RossBackup is giving away basketball tickets to new customers.
You can also order T-shirts, fridge magnets, or posters.
Meanwhile, tonight you can listen for the sound of little hard drives on the roof.
In the 1940s, during World War II, Japan attacked North America using paper balloons filled with hydrogen and carrying a payload of incendiary and anti-personnel bombs. The balloons were supposed to float across the Pacific Ocean using air currents and then set fire to the forests of the Northwest, as well as injure people upon landing. As it happens, the only casualties were a pregnant woman and five children in Oregon who found one of the things in 1945, but remnants of the devices are still being found even today.
This being more civilized times, these days people are attacking culturally, with American television and movies.
“We’re here to hack the North Korean government’s monopoly of information above the 38th parallel on the Korean peninsula,” wrote Thor Halvorssen and Alexander Lloyd in the Atlantic last year. “The embargo of information into and out of the country has forced human rights groups to be creative in their methods of reaching North Korean citizens.”
The particular group they were working with was called Fighters for a Free North Korea, led by Park Sang-hak, a defector and son of a former North Korean spy. The group sent a series of balloons – 20-foot long “transparent, cylindrical tubes covered in colorful Korean script,” each of which carried three large bundles wrapped in plastic containing “DVDs, USBs, transistor radios, and tens of thousands of leaflets printed with information about the world outside North Korea,” the Atlantic writes.
And what is stored on those USBs that is such a threat to the North Korean government? Not malware, but simple popular culture from outside the country. “Shows such as Desperate Housewives and The Mentalist, and films like Bad Boys, all of which defectors tell us are very popular in the North, provide a wildly different alternative to their daily lives,” explains the Atlantic. (Friends is reportedly also popular.)
The USB drives, up to 1,500 of them, also contained copies of a Korean-language version of Wikipedia, reported Business Insider. “USB keys are one of the most powerful tools, because they’re small, can be hidden and shared easily, and carry massive amounts of data,” Halvorssen said. USBs are also easier to hide than DVDs or other storage methods, and can even be swallowed, note other dissidents.
Park’s group is not the only one. The PBS news program Frontline interviewed a different group that is working to overthrow North Korea using thumb drives. “The men prefer watching action films. Men love their action films. I sent them Skyfall recently,” defector Jeong Kwang-Il said on the show. “The women enjoy watching soap operas and dramas. They like that kind of film. Now they’re sharing thumb drives a lot. Even officials have one or two. North Korea is trying to hunt them down because the thing that changes people’s mindsets is popular culture. It probably has the most important role in bringing about democracy in North Korea.”
So seriously does the country take this that some North Koreans have reportedly been executed for watching foreign television, Frontline reported.
This week, Park had planned another balloon drop with 10,000 copies of the movie The Interview, which as you may recall created an international incident in December when Sony at first dropped plans to release the movie after North Korean threats. However, the organization announced on Monday that it was delaying or scrapping the plan, and in fact was suspending the entire balloon program out of concerns about retaliation from North Korea.
While we talk a lot about the potential hazards of USB drives, they have a power as well.
We keep telling you and telling you: Don’t plug strange USBs into your computer! You don’t know where it’s been! Now, it could kill your computer.
It’s tough, because some things are so enticing. Even government workers, who should really know better, have a bad habit of picking up stray USBs just to see what they’d do. And there are other ways to propagate USB drives than scattering them in a parking lot.
Take dead drops, which we wrote about a couple of years back and which are making the rounds again. A dead drop is a USB drive literally cemented into a wall or curb that you can plug into your laptop to exchange data, whether it’s an art installation or seekrit messages. Since the initial ones in 2010, there are now 1,500 dead drops around the world, with nearly 10 terabytes of combined storage, according to Alex Hern in The Guardian. “There are dead drops on every continent in the world except Antarctica, as far north as eastern Iceland and as far south as Wellington, New Zealand,” he writes, with the most recent being added in Hong Kong, Baden-Württemberg in Germany and Xining, a city in western China.
“When cemented into place, each drive is empty except for a file explaining the group’s manifesto: ‘A Dead Drop is a naked piece of passively powered Universal Serial Bus technology embedded into the city, the only true public space,’” Hern writes. “But after a while, anything from photos to videos can be uploaded by anyone – which has led to some problems.” Examples include plans for a bomb, guides to producing crystal meth, and recipes for various deadly poisons, he describes.
Or, hypothetically, a virus or other malware, which is the problem with picking up unidentified USBs and plugging them into your computer to see what they do. (That’s probably how the International Space Station got a virus on it.)
We’ve also heard about USB drives with malware in the microcode, so the USB device can pretend to be something it isn’t and steal data – not to mention be almost impossible to remove.
But now poking a USB drive into your laptop won’t just give it a virus; it could literally destroy it. That’s because a Russian computer person nicknamed “Dark Purple,” just for grins, decided to design something in a USB drive form factor that could zap whatever laptop it was plugged into, according to the description on a Russian website (including pictures). Basically, it consists of lots and lots of capacitors that store energy and send it back out through the USB port, in a package that looks just like a regular USB drive.
“The basic idea of the USB drive is quite simple,” describes an English translation of the Russian website. “When we connect it up to the USB port, an inverting DC/DC converter runs and charges capacitors to -110V. When the voltage is reached, the DC/DC is switched off. At the same time, the filed transistor opens. It is used to apply the -110V to signal lines of the USB interface. When the voltage on capacitors increases to -7V, the transistor closes and the DC/DC starts. The loop runs till everything possible is broken down.”
Hacker News goes so far as to claim that with it, a laptop could be turned into a bomb, or at least set on fire.
The website didn’t include any imagery of the device in action, but it’s certainly a heads-up that such a device might be out there.
We’ve written before about the notion of Politicians Behaving Badly by using personal email systems while they served in office. Now former Secretary of State Hillary Clinton is under fire.
The New York Times broke the story on March 2, noting that in her time in office, Clinton used the email account email@example.com and, in fact, had never had an official .gov email account. Later stories ascertained that the email system in question, which ran Microsoft Outlook, lived on a server in her house in Chappaqua, N.Y.
There’ve been two major challenges in the reporting of this story. First, Clinton is a polarizing figure in politics, and it can be difficult to separate out fact from partisan issues. Second, the mass media members are not technical experts, and some of the articles have been, to put it kindly, lacking in technical details, as David Gewirtz so ably points out. The AP article on her server, for example, called it “homebrew,” as though she’d put it together in the basement with spare parts from Radio Shack, while Fox News and Bloomberg hired hackers to scour the Internet for references to other email accounts from that server and to look for security holes in her system, respectively.
Politicians ranging from Alaska Gov. Sarah Palin to the entire state of California have come under criticism for using personal email systems. What are the issues? The email isn’t secure. The system can be hacked. The owner of the system, if it’s a public mail service, has access to the government official’s email. The email messages might not be accessible to public records requests and legal issues.
Another major issue is concern that the government official can more easily delete potentially embarrassing email messages, either on purpose or by accident. This has been an issue with a number of government officials, ranging from President George W. Bush to Arkansas Governor Mike Huckabee to Massachusetts Governor Mitt Romney to Lois Lerner of the IRS. (Incidentally, they found her missing email. Right where it was supposed to be. Hmm.)
Clinton’s situation, however, is somewhat different. First of all, she wasn’t using personal email some of the time for certain issues; she used the personal email system all the time. Which raises the question: Why didn’t anybody talk about it before now? President Barack Obama reportedly said he didn’t know about this until he saw news reports. Really? He and his Secretary of State never exchanged email, or if they did, he never noticed her email address?
Second, the Secretary of State’s office has apparently not traditionally had an email system per se. Noting that only four Secretaries of State have been in office in the email era, the State Department asked them to send in copies of any email records they had from private email systems. Two, Madeleine Albright and Condoleezza Rice, said they didn’t use email.
Apparently this isn’t unusual in government; South Carolina Senator Lindsey Graham – who, incidentally, serves on the subcommittee on privacy, technology, and the law – says he’s never sent an email message.
Meanwhile, Colin Powell, Secretary of State under President George W. Bush until 2005, said he used a personal email account because the State Department system was “antiquated.” But it’s only since 2014 that rules about private email accounts for federal government business were implemented, which is why current Secretary of State John Kerry uses one.
Third, the State Department system was vulnerable to hacking. In fact, despite some security weaknesses such as a default encryption certificate, Clinton’s server may have been stronger than the official system, notes Clay Johnson, former director of the Sunlight Labs for the Sunlight Foundation (and others). “That personal email was probably far more secure than her state.gov email account,” he writes. “The State Department’s email system has been compromised for months.”
For example, the State Department doesn’t have a number of common security measures, such as malware detection for remote email, encryption, digital signatures, or two-factor personal identity verification cards, reports NextGov. And the “homebrew” system would likely have been more secure than a public system such as Gmail or Yahoo!, writes Slate.
But didn’t Clinton violate the law by not using the government email system? “There was no such ironclad rule when Clinton became Secretary of State,” writes Joe Conason in National Memo. “The former Secretary of State doesn’t appear to have breached security or violated any federal recordkeeping statutes, although those laws were tightened both before and after she left office. She didn’t use her personal email for classified materials, according to the State Department.”
“Federal regulations don’t outright ban the use of personal accounts,” confirms NextGov.
Regardless, Clinton’s actions have come under criticism, such as from Gov. Scott Walker of Wisconsin. “How can she ensure that that information wasn’t compromised?” he told The Weekly Standard, after an event with supporters in Des Moines. “I think that’s the bigger issue—is the audacity to think that someone would put their personal interest above classified, confidential, highly sensitive information that’s not only important to her but to the United States of America.”
This, of course, is the same Gov. Walker whose staff set up a private email system within his own County Executive office when he was running for Governor. But that was different, Walker argues, though The Weekly Standard didn’t explain how.
Clinton was also criticized by former Florida Gov. Jeb Bush. “For security purposes, you need to be behind a firewall that recognizes the world for what it is and it’s a dangerous world and security would mean that you couldn’t have a private server,” he told Radio Iowa. “It’s a little baffling, to be honest with you, that didn’t come up in Secretary Clinton’s thought process.”
This, of course, is the same Gov. Bush who also used a private email system and address, on a server that he owned, when he was Governor of Florida. But that was different, some argue, because people knew that he was using that email address.
Clinton is also being criticized by Utah Rep. Jason Chaffetz, chair of the House Oversight and Government Reform Committee, which is going to investigate the situation. ABC News, however, points out that Rep. Chaffetz’ own business card lists a Gmail address. The list of Clinton detractors who also use private email goes on.
Whether people see this as the death blow to Clinton’s candidacy or something to mock, as Saturday Night Live did, we should have our chance to judge for ourselves before long; Clinton has called on the State Department to release her 55,000 pages of email messages to the public.
Believe it or not, some organizations are still using Zip drives.
In case you don’t remember, or are too young to remember, Zip drives were developed by Iomega in 1994. They were a similar size to floppy disks – thicker – but held considerably more data; they started at 100 MB and eventually went up to 750 MB. Another interesting distinction about them is that they could be used for either PCs or Apple computers.
“A little over 20 years ago, however, when Iomega introduced the original 100MB Zip disk, that was staggeringly huge for a removable disk,” writes Christopher Phin in Macworld. “The wildly more common 3.5-inch floppies held 1.4MB. For context, the entry-level PowerBook 150, introduced in the same year, had a 120MB hard disk, and the base configurations of even 1994’s server Macs came with hard disks that were only five times the capacity of the Zip disk.”
Kids these days don’t remember how expensive storage used to be. “Today, when the most popular USB flash drive on Amazon is a $15 SanDisk Cruzer that stores 320 times the original 100MB Zip disk, we have a pretty blasé attitude to storage, but in the ’90s, you carefully counted the kilobytes when saving a JPEG out from Photoshop, because the literal cost of storage was so high,” Phin writes. He notes, for example, that the pile of Zip disks it would take to store the data on his 5.42 TB hard drive would be higher than the Eiffel Tower.
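Phin’s comparisons are easy to sanity-check with a few lines of Python. The capacities come from the article; the disk thickness (about a quarter inch) and the tower height are my own assumptions, so treat the result as a rough estimate rather than a measurement.

```python
DISK_CAPACITY_MB = 100      # original Zip disk, from the article
DRIVE_CAPACITY_TB = 5.42    # Phin's hard drive, from the article
FLASH_MULTIPLE = 320        # "320 times the original 100MB Zip disk"
DISK_THICKNESS_MM = 6.35    # ASSUMPTION: roughly a quarter inch per disk
EIFFEL_TOWER_M = 324        # ASSUMPTION: commonly cited height, antennas included


def disks_needed(drive_tb, disk_mb):
    """Number of disks needed to hold drive_tb, using decimal units (1 TB = 1,000,000 MB)."""
    return round(drive_tb * 1_000_000 / disk_mb)


def stack_height_m(disk_count, thickness_mm):
    """Height in meters of a pile of disk_count disks."""
    return disk_count * thickness_mm / 1000


flash_drive_gb = FLASH_MULTIPLE * DISK_CAPACITY_MB / 1000
disk_count = disks_needed(DRIVE_CAPACITY_TB, DISK_CAPACITY_MB)
pile_m = stack_height_m(disk_count, DISK_THICKNESS_MM)

print(f"Flash drive capacity: {flash_drive_gb:.0f} GB")
print(f"Zip disks needed: {disk_count:,}")
print(f"Pile height: about {pile_m:.0f} m (tower: {EIFFEL_TOWER_M} m)")
```

Under those assumptions the pile comes out a little taller than the tower, which squares with Phin’s claim.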
They were also known for a reliability problem known as the Click of Death. “Without any warning a Click Of Death drive begins emitting a series of audible and distinctive clicking sounds, either when a cartridge is first inserted or when attempting to read or write data to or from a previously inserted cartridge,” writes Steve Gibson, who has an entire FAQ devoted to the problem. “The word ‘Death’ appears in the names for this problem since that’s exactly what occurs in real life: Minutes, hours, or days after the clicking is first heard, the drive — and usually one or more of the user’s cartridges — suddenly dies without warning. And since people tend to rely heavily upon their Zip and Jaz cartridges for the storage of their important data, this typically results in spontaneous, catastrophic, irreversible, loss of all their data.”
On the other hand, Phin still used them for some time after they were superseded by technology. “Even once hard disks became so big in relation to the capacity of the original hundred-meg Zip disk, I still used them to store specific projects,” he writes. “There was and is something satisfying about compartmentalizing jobs, and there’s something far more conceptually agreeable about taking a case down from a shelf, slotting a disk into a drive and so being prompted mentally to change gears into a particular work mode than there is about just double-clicking a folder on a multi-terabyte external RAID or NAS.”
And even though they haven’t been made in more than ten years, they’re still in use – and not just for communicating with other outdated systems. In Ada County, Idaho, which contains the capital city of Boise, Zip disks are still used as part of the election system.
At times, this can be a challenge.
“The disks had a high failure rate, are no longer made and are hard to find,” writes Cynthia Sewell in the Idaho Statesman. “When the county heard the Boise School District was jettisoning its Zip disks, the county snatched them up. It also scours eBay and Craigslist for Zip drives.”
That said, people still feel nostalgic about Zip drives. “Nowadays, I can stuff a 32GB USB thumb drive in my pocket, making the bulky 100MB Zip disks seem even more antiquated,” writes Eric Bangeman in Ars Technica. “But for a few short years, the Zip Drive hit a sweet spot in the market, which is why I still have fond memories of it.”
If you’re worried about people spying on you, you might want to think about the sorts of surveillance you’re conducting on yourself.
In 2009, a company called Dropcam formed to sell surveillance cameras to people. But like razors and razor blades – and like the police body cams we wrote about earlier this month – the company was also in the business of selling cloud storage to the people who bought the cameras, so they could look up the footage the cameras recorded. Reportedly, 40 percent of Dropcam customers did this.
Last year, Dropcam was purchased by Nest, reportedly for $555 million. By this point, Nest itself had been bought by Google, for $3.2 billion. Since then, the companies have undergone some reorganization; former Dropcam CEO Greg Duffy left last month, and other reorganizations may follow.
In other developments, the company has said that some of its older models of camera will stop working in April, but is offering free updated versions to those users. While this might seem generous, recall how many people were paying for storage for their archived data, and keep in mind that only a few months’ worth of data storage would pay for a new camera.
Some people had always been a little weirded out about Dropcam. “Watching a room in your house 24/7?” wrote Liz Gannes in Re/Code in 2012. “Why would normal people want to do that?” The people who bought the cameras typically did it to watch over their houses, their babies, or their pets, she continued. Other people use them to keep track of what’s going on in their neighborhoods.
But as time goes on, some people are getting more interested in the sort of data that Dropcams collect. Police, for example. In several cases, law enforcement people have reportedly come to Dropcam with search warrants to gain access to stored data.
People have also found Dropcams in other places, such as in Airbnb rooms they’re renting – purportedly for security. And the law on monitoring people in your own home is not entirely settled. “You’re allowed to record yourself in your own home, of course,” writes Kashmir Hill in Fusion. “But when others share your space, the legal issues get murkier.”
But it’s the Google acquisition that is making some people nervous. “The reality of the situation, however, is that Google now has a way to look inside your home,” writes Simon Sharwood for Register UK. Not that that’s necessarily a bad thing, he hastens to add. “There’s plenty to like about that: a camera that can detect a very bright day and talk to home automation kit that moves powered louvres to block out extra light and cool a house to remove the need for air conditioning is a fine application. Other applications may be more … ahem … chilling.”
Sometimes, people even end up accidentally spying on themselves. “You still get periodic emails when the camera senses activity and it’ll send a medium sized low-res picture several times a day embedded in the message,” explains Dropcam user Matt Haughey. “I never thought much of this until I opened an email to see a photo of me completely naked walking by the camera, on my way to grab from a pile of recently folded clean clothes after I took a shower.”
Oops. (And yes, he included a copy of the picture – with strategically placed black bars – to back up his account.)
So that’s why the fact that Google now owns that data is concerning some people. “I realized that image is on Dropcam’s system,” Haughey continues. “And Google bought Dropcam so my photo is somewhere in Google’s cloud. There’s a web-accessible photo of my naked ass (with no black bar added above) somewhere and I have no idea where it is or how easy it is for anyone to find. Wonderful.”
A flurry of incidents involving police and suspects, and even innocent bystanders, is causing many police departments to implement body cameras to help collect records of the incidents – or, hopefully, forestall them. But police departments that have implemented the body cameras are finding out that the cameras themselves are just the half of it. The data they collect has a lot of cost and issues of its own.
“The storage expenses — running into millions of dollars in some cities — often get overlooked in the debates over using cameras as a way to hold officers accountable and to improve community relations,” write Brian Bakst and Ryan J. Foley for the AP. Some police departments are having to choose between hiring officers and storing the data, they continue.
- Baltimore officials estimated costs up to $2.6 million a year for storage and the extra staff needed to manage body camera data
- Duluth’s 110 officer-worn cameras generate 8,000 to 10,000 videos per month that are kept for at least 30 days
- Wichita estimates that its program will cost $6.4 million over the next ten years
- Berkeley expects to spend $45,000 a year to store and manage data, and estimates that the time required for officers to manage the cameras is the equivalent of five full-time officers, for a total of almost $1 million
- San Diego would pay $267,000 for the cameras over five years, but $3.6 million for storage contracts, software licenses, maintenance, warranties and related equipment
- Des Moines is looking for $300,000 to start a program
- Las Vegas estimates that data storage could cost $1 million per month
- Muskogee, Okla., paid $278,000 for cameras for 70 officers, as well as storage space for five years – which was most of the cost
In fact, like razors and razor blades, some companies are reportedly giving police departments the cameras for free or at a discount in return for contracts to store the data, which could amount to $20 to $100 per officer per month, the AP writes. Duluth, for example, paid $5,000 for its cameras but is paying $78,000 for data storage.
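To see how quickly the per-officer rate adds up, here is a rough sketch in Python. It uses the AP’s $20 to $100 range and Duluth’s 110 cameras; treating that as one camera per officer is my assumption, as is the function name. The article doesn’t say what period Duluth’s $78,000 covers, so it isn’t compared here.

```python
def annual_storage_cost(officers, per_officer_per_month):
    """Yearly storage bill in dollars for a department of the given size."""
    return officers * per_officer_per_month * 12


OFFICERS = 110            # Duluth's camera count; ASSUMPTION: one camera per officer
LOW_RATE, HIGH_RATE = 20, 100   # dollars per officer per month, per the AP

low = annual_storage_cost(OFFICERS, LOW_RATE)
high = annual_storage_cost(OFFICERS, HIGH_RATE)
print(f"Estimated annual storage cost: ${low:,} to ${high:,}")
```

Even at the low end, a single year of storage costs several times what Duluth paid for the cameras themselves.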
In addition to the cost of storage are all the security and privacy complications involved any time you have a lot of stored data. Who’s allowed to look at it? Who’s allowed to copy it? How do you keep people from hacking into it? How long are you supposed to retain it? What about the feelings of the families of the people shown in the films? What are the civil liberties issues associated with it? “Departments are being swamped with public records requests from watchdog groups,” reports ABC News. And police departments don’t always have the IT expertise to deal with these questions.
“Imagine a hacker who edits the data to change the identity of an assailant or leaks the footage of a victim immediately following a violent crime,” writes the Christian Science Monitor. “The concern is not speculative – at least one white hat hacker has shown he can break into a police video system and criminals have demonstrated the ability to penetrate police department networks.”
Just managing the data is a hassle. In Pittsburgh, for example, footage of a homicide scene is required to remain in the system forever, while traffic stops are automatically deleted after one year. But only a supervisor can manually delete footage — after the police chief and the lieutenant sign off on a memo, writes Action News. The city also hasn’t determined whether it has the bandwidth to send the camera data to the cloud.
Hastings, Minn., found that body camera data would be considered a public record, and was concerned about the privacy of crime victims, as well as records of innocent people. “I don’t want to have a bunch of pictures of Hastings residents doing nothing wrong sitting in our files,” Mayor Paul Hicks told the Hastings Star Gazette.
In response, the Minnesota state legislature is considering a bill to limit access to the data to law enforcement personnel and people actually in the video – though how you’re supposed to know if you’re in the video without looking at it, I don’t know. Organizations such as the ACLU are also concerned that such laws would defeat the purpose of having the body cameras in the first place.
As you may recall, former First Brother and potential GOP Presidential candidate Jeb Bush announced in December that he intended to release “all,” for some definition of “all,” the email messages from when he served as Governor of Florida from 1999 to 2007. And when this was announced, I came up with some questions about this email dump, wondering about some of the details – and the pitfalls.
Well, Governor Bush has now released his email. (Ironically, this is all happening against the backdrop of the revelation that the Bush political action committee’s newly hired CTO, Ethan Czahor, spent the weekend scrubbing his Twitter feed of some of his youthful indiscretions, like his belief that women were sluts and gay men were looking at him.) So we now have answers to some of the questions.
- “Will he really release all of the email?” Not even close, reports the Associated Press. “They account for a sliver of the Bush archive, and don’t include emails sent to and from his official government email address, as well as other records such as office notes and calendars.” The email messages had already been obtained, analyzed and published by media outlets, including CNN, and Democratic opposition research group American Bridge, noted CNN. Which makes sense. As we said in December, for someone who claims to have a 30-hour a week email habit, 250,000 or even 300,000 email messages for eight years doesn’t sound like much. I’ve had my Gmail account since April, 2004, and I have 336,404 messages in my inbox – and I’m not a Governor.
- “Did he ever use any unofficial or personal email address?” So far, the email messages appear to be to and from firstname.lastname@example.org. Was that really the official Florida gubernatorial email address? “Millions of emails came in through our website, but it was when I made my personal email – email@example.com – public that I earned the nickname ‘The eGovernor,’” Bush writes in his ebook. Current Gov. Rick Scott has a form on his website you fill out to send him email – though he also notes “Under Florida law, all correspondence sent to the Governor’s Office, which is not exempt or confidential pursuant to Chapter 119 of the Florida Statutes, is a public record. All public record electronic mail sent through this website will be posted to Project Sunburst at http://www.flgov.com/sunburst, and will be accessible to the public.”
- “What format will it be in?” There are two ways to get the email: You can search by day with the website (between January 4, 1999 and January 3, 2007, though there is in fact no email after December 31), or you can download a half dozen Outlook .pst files. Which, incidentally, have been compressed using .rar format, which is more advanced but more arcane than .zip files, so people will need to figure out how to unpack them first. Certainly setting it up that way makes it more challenging to find any good stuff in it; you can’t search by subject, and you have to know how to download the .pst files and set them up in Outlook to be able to search through them – not to mention the difficulty in juggling a half dozen of them. So it’s certainly not set up to make it easy for people to search for things.
- “Will it be full-text searchable?” It’s straight text. It shows you a screen of about 20 email subject lines, you click on one, and then you get a single email message per screen. You can click to the next one, or the previous one, without having to go back through the calendar. You can cut and paste it. But there’s no provision for searching for text that I found.
- “So, where is this email now?” Not clear.
- “How is it that the Governor has it in the first place?” It’s a public record. That means he can publish it? Is he paying for it? Can anyone else publish it, perhaps in a more usable interface? Hmm. Interesting questions.
- “Is personal information going to be redacted?” Apparently not. <facepalm> The stuff’s only been out a few hours and reporters have already found personally identifiable information (PII) such as names, addresses, and even Social Security numbers in clear text. “Bush not only published every email, he published every email address—and many personal names, physical addresses and personal phone numbers, that people include in their email footers,” writes Newsweek. “The archive contains thousands upon thousands of personal identifying details about Floridians.” Fortunately, Florida just updated its data breach legislation last year; we are sure that the Governor will rapidly be informing the state of his breach, as required by the new law. But hurry, identity thieves; having the problem called to their attention, the Bush campaign is apparently going to remove it somehow – though, a spokeswoman noted, it’s still available under public records laws.
- “Is the metadata going to be in there?” In the database, there’s the from email address, the to email address (including those of all the cc:ed people), the date, and the subject. The email messages quoted in his ebook don’t have email addresses. It’s certainly tempting to email people like Army Sgt. Travis van Buren, who emailed Governor Bush on December 31 to tell him “If you ever do decide to run for President, you’ve got my vote, hands down!” to see if he still feels the same way. Incidentally, the email doesn’t seem to include attachments; at least, there was no sign of Ed Moore’s dissertation questions, which he said on December 31 that he was sending.
There’s certainly a small army of people who have divided the messages up between them and are looking up anything good, at least if there’s anything that hasn’t already been revealed before. So the challenging aspects of searching this email trove will probably be dealt with through crowdsourcing. But since it’s already clear that it doesn’t really include everything, its “proof” of Bush’s transparency is, to put it kindly, limited.
Think your backup job is tough? How about backing up the entire Internet?
You may wonder, what’s the point of archiving the Internet? Do we really need to save all those memes and cat pictures? But the Internet is more than that, insist preservationists.
Jill Lepore leads off her New Yorker article by noting that the Internet Archive’s web preservation service, known as the Wayback Machine, was the only remaining source of evidence that Ukraine separatists had posted, on a Russian social media site, that they had shot down Malaysia Airlines Flight 17 – a site that the Internet Archive had begun saving just two weeks before. “On July 17th, at 3:22 P.M. G.M.T., the Wayback Machine saved a screenshot of Strelkov’s VKontakte post about downing a plane,” she writes. “Two hours and twenty-two minutes later, Arthur Bright, the Europe editor of the Christian Science Monitor, tweeted a picture of the screenshot, along with the message ‘Grab of Donetsk militant Strelkov’s claim of downing what appears to have been MH17.’ By then, Strelkov’s VKontakte page had already been edited: the claim about shooting down a plane was deleted. The only real evidence of the original claim lies in the Wayback Machine.”
In addition to web pages, the Internet Archive – which, incidentally, is hosted in a former Christian Science church because it looked like the organization’s logo – also hosts books, videos, “ephemeral” films such as advertising, audio recordings, concert recordings, audio books, television news broadcasts, and historical software (including Oregon Trail and Leisure Suit Larry in the Land of the Lounge Lizards), writes Andy Baio in Medium. Altogether, it includes 500,000 pieces of software, more than 2 million books, 3 million hours of TV, and 430 billion web pages, writes Justin Ellis. “In a single day, they digitize more than 1,000 books. They capture TV 24 hours a day. In a week, they save more than 1 billion URLs.”
So how do pages get saved into the Wayback Machine? There are three ways, Lepore writes:
- There’s a crawler that attempts to make a copy of every Web page it can find every two months or so, though she points out that the New Yorker’s home page gets saved about six times a day
- Librarians choose certain pages to be archived in certain subject areas, through a service called Archive It, at archive-it.org, which also lets individuals and institutions build their own archives
- Anyone who wants to can preserve a Web page, at any time, by going to archive.org/web, typing in a URL, and clicking “Save Page Now,” which is how five of the twelve screenshots of the Malaysian Airlines post were made
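For the curious, the third option can also be reached without clicking through the form. What follows is only a rough sketch: the `https://web.archive.org/save/` prefix is an assumption on my part (it isn’t mentioned in Lepore’s article), and `save_page_now_url` is a made-up helper name. The code just builds the address; it doesn’t contact the Internet Archive.

```python
from urllib.parse import quote

# Assumed prefix for the Wayback Machine's save endpoint; not taken from the article.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(page_url: str) -> str:
    """Build the address that would ask the Wayback Machine to capture page_url."""
    # Leave ordinary URL punctuation alone, but escape spaces and other unsafe characters.
    return SAVE_PREFIX + quote(page_url, safe=":/?&=#%")


if __name__ == "__main__":
    # Hypothetical page address, used only for illustration.
    print(save_page_now_url("http://example.com/some page"))
```

Opening the resulting address in a browser should, if the assumption about the prefix holds, have the same effect as typing the URL into the form.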
At this point, the Wayback Machine has archived more than 430 billion Web pages, comprising 20 petabytes of storage – which is double its 2012 figure, Lepore writes. Some 600,000 people use it every day, conducting 2,000 searches a second, she adds.
That said, it’s not difficult to keep the Wayback Machine from trawling a site; all it takes is a single text file, Lepore writes – which has the effect of deleting all the archives as well. “Blocking a Web crawler requires adding only a simple text file, ‘robots.txt,’ to the root of a Web site,” she writes. “The Wayback Machine will honor that file and not crawl that site, and it will also, when it comes across a robots.txt, remove all past versions of that site. When the Conservative Party in Britain deleted ten years’ worth of speeches from its Web site, it also added a robots.txt, which meant that, the next time the Wayback Machine tried to crawl the site, all its captures of those speeches went away, too.”
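To make the “single text file” concrete, here is a minimal sketch using Python’s standard `urllib.robotparser`. The robots.txt contents and the example.com address are hypothetical, and `ia_archiver` is assumed to be the user-agent name the Internet Archive’s crawler answers to; none of that comes from Lepore’s article. The sketch shows only the crawl-permission check, not the removal of past captures she describes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt placed at the root of a site.
# "ia_archiver" is an assumed name for the Internet Archive's crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/speeches/2004.html"  # hypothetical page
    print(is_allowed(ROBOTS_TXT, "ia_archiver", page))   # False: the archive crawler is shut out
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))  # True: no rule applies to other crawlers
```

Note that a two-line file like this one singles out one crawler; a site that wanted to turn away every crawler would use `User-agent: *` instead.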
The biggest problem with the Internet Archive is that it’s so big it’s really difficult to search, Lepore writes, because it lacks the tools. “You can do something more like keyword searching in smaller subject collections, but nothing like Google searching (there is no relevance ranking, for instance), because the tools for doing anything meaningful with Web archives are years behind the tools for creating those archives,” she writes. “Doing research in a paper archive is to doing research in a Web archive as going to a fish market is to being thrown in the middle of an ocean; the only thing they have in common is that both involve fish.”
To this end, the Internet Archive was recently one of 22 organizations to share in $3 million of grants from the Knight Foundation through the Knight News Challenge, towards projects that provide new tools and ideas for making libraries more accessible. “The Internet Archive will get $600,000 to develop new technology to give users more control over how materials are uploaded, categorized, and curated in the archive,” Ellis writes. “What they plan to do with the funding from Knight is create a simpler upload system that works across any browser, a contributor management system that lets one or many people work on collections, expanded search functions, and improved tools for organizing what material can be added to certain collections.”
Including cat pictures, one presumes.
Not with a bang, but with a whimper. After HP’s monstrous $10 billion acquisition of Autonomy in 2011, for which nearly everyone agreed it overpaid, it took an $8 billion writedown on the deal, a whole bunch of people threw lawyers at each other, and some of those proceedings are still dragging on.
First, there was the lawsuit in which HP stockholders sued HP. Turns out that some HP shareholders took exception to the whole sorry incident and sued, claiming that current and former HP executives and directors, including CEO Meg Whitman, failed to heed warning signs about problems with Autonomy’s business, writes the Wall Street Journal.
Because that’s the way these things are done, HP is attempting to settle, but keeps being shot down by the courts, because its proposed settlements have been too nice to HP. District Judge Charles Breyer said in December that “the proposed settlement improperly protected the H-P directors, officials and professional firms from a wide swath of potential future shareholder litigation, including some suits that might not be related to the Autonomy deal,” writes the Journal.
This is after a similar decision in August, where Judge Breyer criticized an earlier version of the settlement because of the proposed fees for the shareholders’ lawyers, and a different list of protections from future lawsuits against the H-P officials and others, the Journal continues.
Hoping that the third time’s the charm, HP filed a third settlement attempt last week. If you’re just dying to look it up for yourself, it’s In Re Hewlett-Packard Co. Shareholder Derivative Litigation, 12-cv-06003, U.S. District Court, Northern District of California (San Francisco), according to Bloomberg. Reportedly, it protects the company officers – including those of both of the new companies, too – only from future lawsuits that have to do with Autonomy.
Second, there was the matter of HP suing Autonomy, which was complicated by the fact that HP is based in the U.S. and Autonomy was based in the U.K. Earlier this month, the U.K.’s Serious Fraud Office (no word on whether there’s an Insignificant Fraud Office to go with) announced that it had closed its investigation, which it began in early 2013 following a referral from HP. “The SFO has concluded that, on the information available to it, there is insufficient evidence for a realistic prospect of conviction,” the organization reports.
Naturally, there’s still an ongoing investigation on the U.S. side, the SFO reports. The U.K. Financial Reporting Council is also still investigating, reports Bloomberg.
And in an amusing sidenote, the SFO (which has come under some criticism of its own) itself uses the Autonomy software, which the office assures us is not a conflict of interest. “Throughout the investigation we have kept the potential for conflict of interest under review,” the organization writes. “Such a conflict of interest does not exist, nor has it ever existed, and the matter played no part in any decision concerning this investigation.”
All righty then.
Heck, Autonomy’s still even listed in the Leaders section in the 2014 Gartner E-discovery Magic Quadrant.
But fear not, attorneys. The lawsuits are ongoing. Your jobs are still safe.