BOSTON – Sam Madden, professor of electrical engineering and computer science at MIT, is hoping to help advance the field of machine learning from dark art to principled science with an open source project. ModelDB, available on GitHub, is essentially a database system designed to help organize and manage machine learning models.
“These models are the engines of machine learning,” Madden said at the MassIntelligence conference, hosted by MassTLC and MIT’s Computer Science and Artificial Intelligence Laboratory. “They are the things that take the data and extract the insight out of it.”
When researchers build machine learning models, the process is highly iterative. Models are built using training data, and, if they’re supervised models, they are tested, evaluated and then tweaked (e.g., new features are added, parameters are adjusted) to improve their performance. That process is repeated — sometimes hundreds of thousands of times, according to Madden — until the models perform at an acceptable level.
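That build-evaluate-tweak loop is easy to caricature in code. Here is a minimal sketch; the data, the one-parameter "model" and the search strategy are all invented for illustration:

```python
import random

random.seed(0)

# Toy training data: points labeled by whether they exceed a hidden cutoff.
data = [(x, x > 0.6) for x in (random.random() for _ in range(200))]

def evaluate(threshold):
    """Accuracy of a one-parameter model that predicts label = (x > threshold)."""
    return sum((x > threshold) == label for x, label in data) / len(data)

# Iterate: tweak the parameter, evaluate, keep the best version so far.
best_threshold, best_score, history = 0.0, 0.0, []
for step in range(50):
    candidate = step / 50                # the "tweak" on this iteration
    score = evaluate(candidate)
    history.append((candidate, score))   # every iteration leaves a record behind
    if score > best_score:
        best_threshold, best_score = candidate, score
```

Even in this toy, 50 model versions pile up in `history` — multiply that by thousands of real experiments and the bookkeeping problem Madden describes becomes clear.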
But there is no way to manage the process. “You go through thousands of these models, you update the models all of the time, and there’s no sort of standardized way to track the history of the modeling process,” he said.
Madden likened it to the way people organize personal documents on their computers, which is to say not at all. “People are terrible at it,” he said. “And they don’t promote carefully organized data.”
ModelDB is a database system that acts as a central repository for machine learning models — all iterations — and is searchable, creating a system of record for researchers. “People can look and see what’s been done in the past and continue work that’s been partially completed,” Madden said.
Features include “experiment tracking,” so that models in the pipeline can be logged; “versioning,” or the ability to compare model performance; and “reproducibility,” so that any model can be rerun on any input data set.
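ModelDB’s actual client API lives on GitHub; purely to illustrate what those three features amount to in code, here is a hypothetical, minimal registry — none of the class, method or field names below are ModelDB’s:

```python
import hashlib
import json

class ModelRegistry:
    """Hypothetical stand-in for a model-management system like ModelDB."""

    def __init__(self):
        self.runs = []

    def log_run(self, name, params, metrics, dataset):
        """Experiment tracking: every model in the pipeline gets logged."""
        run = {
            "name": name,
            "version": len(self.runs) + 1,
            "params": params,
            "metrics": metrics,
            "dataset": dataset,
            # Reproducibility: a hash of the configuration identifies exactly
            # which setup to rerun, on this or any other input data set.
            "config_hash": hashlib.sha1(
                json.dumps(params, sort_keys=True).encode()).hexdigest()[:8],
        }
        self.runs.append(run)
        return run

    def compare(self, metric):
        """Versioning: rank all logged runs by a chosen metric."""
        return sorted(self.runs, key=lambda r: r["metrics"][metric], reverse=True)

registry = ModelRegistry()
registry.log_run("churn-rf", {"trees": 100}, {"auc": 0.81}, "train_v1.csv")
registry.log_run("churn-rf", {"trees": 500}, {"auc": 0.86}, "train_v1.csv")
best = registry.compare("auc")[0]
```

The point is the system of record: once every iteration is logged with its parameters, metrics and data set, “which version worked best?” becomes a query rather than an archaeology project.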
“This isn’t a deep or radically complicated idea,” he said. “But it’s one of the things that I think is needed in order for us to go from where we are now, which is sort of this [dark] art, to a much more principled scientific approach.”
We finally know which two big tech companies were conned out of millions by an email phishing scam, as reported last month, and you might recognize them.
The culprit — a Lithuanian man being charged with fraud, aggravated identity theft, and money laundering by the Department of Justice — swindled Google and Facebook out of $100 million collectively by pretending to be a popular Taiwanese electronics manufacturer.
The man allegedly forged emails from the manufacturer’s employees, along with invoices and contracts, and asked the tech giants to send payments to his bank accounts in Latvia and Cyprus instead of the real company’s bank accounts — and it was enough to convince employees at Google and Facebook.
“Humans are the most vulnerable point of any information system; even the world’s biggest tech companies aren’t immune to this,” said Neil Wynne, CISSP and Gartner analyst. “The vast majority of cyberattacks use social engineering, such as phishing, to trick employees into taking actions detrimental to the company. Many large and high-profile breaches have started with successful phishing attacks.”
A recent report from threat management provider PhishMe found that 91% of cyberattacks start with a phish. The top reasons that people fell for the emails: curiosity, fear and urgency. These are the things that attackers prey on — and upping technology-based defenses can’t address those kinds of vulnerabilities, said Wynne.
“There tends to be an overreliance on a technology-based approach,” he said. “Instead, CIOs should take a multipronged approach that spans technical, procedural and educational controls to effectively mitigate these attacks. The education aspect is a critical component because it increases employee resilience to social engineering.”
Bryce Austin, cybersecurity expert and CEO at IT consulting company TCE Strategy, agrees phishing scam detection hinges on training.
“I think the big takeaway from this incident is, first and foremost, that a cybersecurity awareness program is critical to all companies regardless of size — big or small,” said Austin. “Many of these fraudsters will try to get employees to break standard process and procedure by saying ‘this is very confidential’ or ‘this is related to some new merger or acquisition’ or something like that.”
Austin said the size of the scam suggests that the Lithuanian scammer got employees at Google and Facebook to break process and procedure, either by convincing them to do so with believable documentation and credentials, or by finding someone who wasn’t trained on what the process and procedure was.
Educate, educate, educate!
Time will tell if ReadyRefresh — Nestlé’s makeover of its century-old bottled water delivery business — becomes the UPS, Amazon or Uber of its industry. But these are the masters of the digital ecosystems Nestlé is building in order to meet changing customer expectations, said Aymeric Le Page, vice president, business strategy and transformation at Nestlé Waters North America.
“Customers are not just comparing us with other delivery companies, we are now being compared to everything you have on your phone. We are compared to Seamless,” Le Page said at the recent Digital Strategy Innovation Summit. Seamless is the online food ordering service that merged with GrubHub three years ago.
“It’s all about convenience,” Le Page said. Making sure a customer never ran out of water before the next company-determined delivery date was the old Nestlé’s service model. Digital titans like Amazon and UPS have raised the bar. “Now it’s, ‘Make sure you deliver what I want when I want it.'”
‘Your health, your home, your way’
Nestlé S.A. is the world’s largest producer of bottled water. Until recently, its bottled water unit functioned as a business-to-business supplier, delivering 5-gallon bottles to large enterprises on a set schedule. Two years into its “digital transformation journey,” Le Page said Nestlé is using digital technologies — cloud, mobile, analytics, geolocation, IoT — to customize its business service and build a direct-to-consumer “healthy hydration” service targeted at households.
“Your health, your home, your way is the slogan” for the consumer side of ReadyRefresh, Le Page said. “That represents a different way of doing business from, ‘I’ll come whenever I can to change your water bottles.'”
Thus, a new user-friendly website — “Just Click and Quench” is part of the logo — aims to make it easy for customers to order and personalize deliveries: they can reschedule or add a delivery 24/7. The ReadyRefresh website also exposes customers to Nestlé’s full portfolio of bottled beverages, from Poland Spring and Perrier to Pellegrino and Pure Life, among others. Meanwhile, the company’s 2,100 trucks, which literally drive brand visibility while en route, use the latest in telematics to optimize those routes.
Le Page said the new business model connects Nestlé to three digital ecosystems — e-commerce, where Amazon leads the pack; logistics, where UPS dominates; and the lifestyle digital ecosystem, where he claimed there is “no current winner.”
Digital ecosystems change the operating model
“It’s a big change for a company that has been in business for 100 years with a very linear, simple operating model,” Le Page said. A big change in customer focus, and a big change for Nestlé’s some 300,000 employees. “You have to change the culture, the ways of working, change the mindset.”
Le Page was talking to an audience of mainly digital strategists and mobile app developers, but it struck me that much of what he was saying was extremely relevant to CIOs — and not just because Nestlé is replacing its 30-year-old legacy system with a new ERP to support these new digital ecosystems. Or, as Le Page said, because the company is adopting Agile to keep up with the 20 strategic initiatives underpinning its digital transformation and the more than 500 different projects underway.
Information technology — the business of CIOs — has fundamentally changed customer expectations. Forward-looking CIOs have long recognized that IT can no longer be delivered on IT’s schedule. Today, as Le Page said, it’s all about “make sure you deliver what I want when I want it.” Like the nearly $100 billion Nestlé company, IT organizations everywhere should be thinking hard about how to do that.
If there was one message drilled into the heads of attendees at the Business of Blockchain event co-hosted by the MIT Technology Review and the MIT Media Lab it’s this: Blockchain looks like it could follow the same mind-blowing, world-altering trajectory of the internet.
The only problem is, presenters at the Business of Blockchain event couldn’t quite agree on just where blockchain technology is on the internet timeline. “We’re investing like it’s 1998,” said Joi Ito, director at the MIT Media Lab, which houses the Digital Currency Initiative. “But I think it’s like 1989 in terms of the level of standardizations we have.”
Amber Baldet, who is heading up the blockchain effort at JP Morgan, said Ito’s 1989 marker was actually optimistic and suggested we rewind the clock another 20 years.
“The joke I make is that we’re actually in ARPANET 1969,” she said, referring to a time when the early packet switching network was barely a network at all — it was just four university computers connected together. “I keep a diagram of ARPANET from 1969 behind my desk because it looks remarkably like the [blockchain] proof of concept and pilot diagram that I have where we’re connecting two banks and one market infrastructure provider.”
Plus, there are key differences between the two technologies — one of the biggest is, to use the conference’s wording, the business of blockchain.
“With the internet, we had a couple of decades where people basically left us alone,” Ito said. “And we could make very non-commercial decisions like the idea of carrying packets for each other. That’s a very hippie move.”
That isn’t the case for blockchain developers. Corporations and venture capitalists are pouring money into blockchain technology and demanding a return on investment. The demand is unprecedented, according to Baldet, and developers have to work at a pace that poses inherent risk. “With what other technology would we consider taking something to production with real money that’s never been tested in a real-world environment before? I mean, nobody picked up databases or relational databases without having seen them in plenty of other contexts first,” she said.
But the internet-blockchain comparison is not without merit. And, as middle school students no doubt learn in their civics and government classes: History has a tendency to repeat itself.
Indeed, one of Ito’s takeaway messages was that attendees consider the lessons learned from the development of the internet. The internet protocols that won the day were often affiliated with academic and government funding, he said. Companies that survived the booms and busts of the World Wide Web kept a pulse on the conversations happening in nonprofit developer communities (such as academia, which often creates the open standards for the private sector) and remained flexible enough to transition as the technology changed.
“So, my advice, if I had it: It’s a long game; you should build expertise, you should spend strategically in building models and ideas, but I think you have to be prepared for quite a bit of change and disruption,” Ito said. “I would pay attention to the open standards and layers where [there are] communities of expertise.”
The question of who’s the CISO’s boss is an old one, and there’s still no single answer. I reported on it a year ago. Some say the IT security chief should not report to the overseer of IT initiatives, the CIO, because cybersecurity could come into conflict with technology innovation. Others say the CISO should report directly to a business-side executive to “translate infosec risk into business risk,” said Nemertes Research founder Johna Till Johnson.
So when I spoke recently to Scott Weller, co-founder of Boston cloud startup SessionM, about a new IT security role he’s designing there, I thought it was a good occasion to reopen the debate. He’s a good one to ask. He’s the CTO — as well as the acting CISO — at the nearly six-year-old company.
“Your CISO needs to report directly to the CEO,” Weller said. “The CISO has to be very transparent around building an apparatus that can report issues and challenges and exposure to certain security issues.”
Hail to the new chief
SessionM sells a cloud platform that helps companies personalize marketing messages. The company is writing the job description for a CISO-like position it’s calling a chief cloud security officer. Weller described the role as an IT security person familiar with “the old world” of physical servers who also knows cloud computing inside and out and can identify cloud-specific security problems. Unlike a typical CISO, though, the executive won’t aim to protect just the immediate computing environment from threats — he or she will help the provider’s customers guard against them as well.
It’s a new IT security role, but it will likely fit into the CISO reporting structure SessionM already has in place: The boss is the chief executive, and the CTO and CISO are linked, of course, because Weller holds both positions. When the new hire is in place, Weller will be linked to the position through a dotted line. That means “their roadmaps are aligned,” and they will both be held accountable by the CEO to manage security problems as they emerge.
“Ultimately, it’s the role of CTO and that organization that executes technology implementation to actually take what the chief security officer is recommending and that strategy and build that apparatus into the organization,” Weller said.
‘Potential for ignorance’
He’s been in organizations in which IT security was the purview of engineering or technology execs — and sometimes less-than-ideal decisions regarding security were made.
“There is a potential for ignorance to emerge around, ‘What are our threats? What are our core priorities? How do we address those?'”
It’s important, Weller said, for a CISO — and the new IT security role — to keep the CEO and even the board of directors informed on what the risks are and on security incidents as they happen. They should know about attempted breaches, for example, or ransomware attacks, and how to fend off future offensives. That way, “the team together can make a collective decision on how they respond to those types of things.”
The chief cloud security officer position started at cloud providers such as Amazon and Microsoft. Learn more about it in this SearchCIO report.
The news that companies like Tesla, Google and Apple are in a race to develop Level 5 autonomous cars is stale by now. But when Intel bought Mobileye earlier this month, it refueled the self-driving car hype.
The Society of Automotive Engineers defines Level 5 automation as “full-time performance by an Automated Driving System of all aspects of the dynamic driving task under all roadway and environmental conditions that can be managed by a human driver.”
That means that even in the middle of a blizzard, Level 5 autonomous cars need to get people to work, said Bryan Reimer, research scientist at MIT AgeLab and associate director at New England University Transportation Center.
“Robots have to be far, far better than humans under all situations for that to happen,” Reimer said.
While Level 5 autonomous cars are a long way off, there has been accelerated progress in the autonomous space as more cars are fitted with the technologies, experts said.
As self-driving cars become the norm, they could potentially transform into mobile offices in the future, said Mike Ramsey, analyst at Gartner’s CIO research group.
Autonomy opens up a lot of productivity time for the people in the vehicle, and suppliers like Harman are working on integrating Microsoft Office 365 into their infotainment systems, he said. “If that’s enabled by an autonomous vehicle then you can work in the car, do video conferencing and other enterprise actions in the vehicle.”
Alan Lepofsky, vice president and principal analyst at Constellation Research, said the possibilities are endless.
“Is this just my individual vehicle, or if there are ten people that are driving to the same area, will our cars link up and drive to the same location and will we be able to have meetings while we are in those autonomous vehicles?”
If these “mobile offices” become the norm, CIOs would also have to think about how much productivity improves when employees can get more work done in their cars on the way to work, Lepofsky said.
They would also have to ensure that communication inside such vehicles is secure, he added, and there are also considerations from an HR standpoint.
“What are the expectations from employees going to be like?” Lepofsky said. “Is it too much to ask your employees to work during travel time that used to be personal? If you and I have the same job and you spend an extra hour working, then am I considered a worse employee because I want to FaceTime with my family?”
Employers and employees will need to figure out how they use this technology to make the most of their work day and still maintain a work-life balance, David Keith, assistant professor of system dynamics at MIT, said. How drivers react to vehicle autonomy is also yet to be seen, he added.
“Self-driving vehicles and autonomous technologies have emerged very quickly, but how soon we get to the more advanced level of autonomy that will change the game is hard to know,” he said.
Incorporating game mechanics into daily tasks has proven to be an effective way to motivate workers. As it turns out, gamification techniques don’t just work on us. Google DeepMind is applying the tactic to machine learning.
Prodded by gamification techniques, artificial intelligence (AI) systems are quickly becoming game masters: There’s IBM Watson and Jeopardy!, Google DeepMind and Go, and Carnegie Mellon University’s Libratus and no-limit Texas Hold ‘Em poker — the latter a landmark victory in the annals of AI gaming because poker involves bluffing and guesswork.
The triumphs don’t stop there. AI is also becoming a video game master. The Google DeepMind team has trained computational AI systems known as neural nets to play Atari video games such as Breakout, which was released in 1976.
The objective of Breakout, a single-player game, is to rid the top third of the video screen of bricks. But “the machine was not given the rules of Breakout,” Erik Brynjolfsson, professor of management at the MIT Sloan School of Management and director at the MIT Center for Digital Business, said at the recent MIT Disruption Timeline conference.
Instead, the machine was given the raw pixels of the screen; a controller, which moves left and right; and an objective to maximize the score. After 500 games, the neural nets performed better than humans, even developing new strategies, Brynjolfsson said.
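The setup — actions, a score and nothing else — is reinforcement learning. DeepMind’s actual system was a deep Q-network learning from raw pixels; a toy illustration of the same idea, learning which action pays off purely from a score signal, is a two-armed bandit with epsilon-greedy action selection (the payoffs here are invented):

```python
import random

random.seed(1)

# Two actions with unknown payoffs. Like DeepMind's agent, this one is told
# only to maximize its score, not how the "game" works.
def pull(action):
    return random.gauss(1.0 if action == 0 else 2.0, 0.5)

values, counts = [0.0, 0.0], [0, 0]
for _ in range(2000):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = values.index(max(values))
    reward = pull(action)
    counts[action] += 1
    # Update a running average of observed reward for the chosen action.
    values[action] += (reward - values[action]) / counts[action]
```

After enough plays the agent settles on the higher-paying action — the same trial-and-error principle, scaled up enormously, that let DeepMind’s neural nets discover Breakout strategies no one programmed in.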
Here’s the real punchline: Researchers at DeepMind then took the process of training neural nets on how to win at Atari video games and turned it into a gamification technique for energy efficiency. Researchers trained a system of neural nets on operating scenarios, historical data on energy consumption as well as prediction data and gave it access to all of the gauges and dials; this time, the objective of “the game” was to maximize energy efficiency, a huge cost center for the internet search giant.
“Now, this data center had already been heavily optimized by a bunch of very smart PhDs, some of the best in the world,” Brynjolfsson said. “So this is not an easy problem at all.”
Turns out, the neural nets bested the best, managing a 15% reduction in overall power usage and a 40% reduction in the energy used for cooling, one of a data center’s biggest consumers of energy.
“You can imagine if you take that level of improvement and apply it to all of our systems — our factories, our warehouses, our transportation systems, we could get a lot of improvement in our living standards,” Brynjolfsson said.
Robots are basking in the limelight these days, but the possibility of purchasing a Rosie the robot for the home is still a ways off. In fact, robots are used to solve only a few problems today despite their growing popularity. Three panelists at the recent MIT Tech Conference said that’s because programming robots to navigate in the real world, where the unexpected and the accidental are frequent, is complicated.
“The challenge with those sorts of environments is that there’s a really long tail of strange things that can happen where things go wrong and your robot doesn’t work anymore,” said Stefanie Tellex, assistant professor of engineering and computer science whose Humans to Robots Laboratory at Brown University is working to create collaborative robots. “And it’s really challenging to figure out all of these weird, different edge cases where things don’t quite work.”
In the lab, robot manipulators, or machines built and programmed to pick up objects and place them somewhere else, can accurately pick something up 90% of the time. “That might sound good, but if that robot is in your house picking stuff up for you, then it’s dropping your stuff and breaking your stuff one out of every 10 times,” Tellex said.
Programming robots to solve for the edge cases will require “a combination of better mobility, better sensing and perception,” said Helen Greiner, co-founder of iRobot, maker of the Roomba, and founder of CyPhy Works Inc., a drone company. “By sensing, I mean more data coming back; and by perception, I mean the interpretation of that data.”
But what may really be hampering the use of robots to solve more problems has nothing to do with programming robots to sense and interpret the data. Instead, it’s the use cases. “People seem hell-bent on trying to solve problems for everyone just to start,” said Ryan Gariepy, co-founder and CTO at Clearpath Robotics. The autonomous car is a good example, with startups and corporations working on building a Level 5 autonomous vehicle, one that would require no human interaction other than turning the car on and off.
Rather than jump on the Level 5 bandwagon, Gariepy said to start small and look for use cases that can be solved with robotics right now.
He pointed to Bosch and its agricultural robot as an example, and his own company, Clearpath, is bringing industrial self-driving vehicles to the factory. “A factory or warehouse is an indoor city,” he said. Factories have roads, traffic, signals and rules that need to be followed, but, unlike city streets where unpredictability abounds, the factory is a fairly controlled environment.
Clearpath’s technology can be deployed in days, and because the self-driving vehicles operate so similarly to cars (they have turn signals, for example), training the factory staff is fairly simple. By going the factory route, Gariepy said, they’re already in production and making money.
Smart city technology promises to make city living better on many fronts — from easing traffic jams to improving air quality. But figuring out how to finance these do-good projects is a conundrum for cities, according to the 2017 Smart Cities Innovation Accelerator.
In San Jose, located in the heart of tech-rich Silicon Valley, the message to smart city vendors is simple: The city is open for business — under certain conditions. Local government is happy to lend its architecture and man hours to help launch a smart city pilot project, but what it won’t do is help foot the bill.
Vendors pitching smart city technology are asked to make a case as to how their products would benefit the public, specify that the city won’t be required to cough up any funds, and provide an explicit beginning and end for the pilot project, according to Lloyd. “Within that construct, we can do some remarkable things,” he said.
A case in point is this year’s pilot of Facebook’s Terragraph system in downtown San Jose — part of the social networking giant’s mission to provide high-speed Internet access to everyone around the globe. Terragraph is a Wi-Fi based system that uses antennas or nodes installed on light poles and buildings within a city. The nodes connect to internet-ready devices and to each other using the 60 gigahertz frequency, which is an unlicensed and an often untapped band on the airwaves.
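The mesh idea — nodes on poles and rooftops relaying traffic to one another until it reaches a wired gateway — can be sketched as a shortest-path search over a graph. The topology and node names below are invented for illustration and have nothing to do with Terragraph’s actual design:

```python
from collections import deque

# Toy mesh: 60 GHz nodes on light poles and buildings, each linked to its
# nearest neighbors. One node has the wired backhaul to the internet.
links = {
    "fiber_gateway": ["pole_a", "pole_b"],
    "pole_a": ["fiber_gateway", "pole_c"],
    "pole_b": ["fiber_gateway", "pole_c"],
    "pole_c": ["pole_a", "pole_b", "rooftop_d"],
    "rooftop_d": ["pole_c"],
}

def route(src, dst):
    """Find a minimum-hop path through the mesh via breadth-first search."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in links[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = route("rooftop_d", "fiber_gateway")
```

Because every node can relay for its neighbors, a device far from the gateway still gets service — and losing one pole just means traffic routes around it.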
In his state of the city address last month, Sam Liccardo, the mayor of San Jose, said the high-speed service is part of a vision to transform the city “into a platform for the testing and demonstration of the most innovative technologies with civic impact.”
Erasing the digital divide
One of the biggest civic impacts Facebook’s Terragraph system could have is helping to close the digital divide between the city’s haves and have-nots. More than 12% of households in San Jose have no internet access, according to the city manager’s office.
The city is already partnering with the East Side Union High School District to erase the divide. With $2.7 million in school bonds over five years, the school district will pay the city to create “the first-ever school-district-wide network,” according to a report in The Mercury News.
The public funds will be used to install and maintain the district’s existing network as well as extend the city’s public Wi-Fi network to students’ neighborhoods. The partnership has Lloyd and the city thinking: Could the downtown Terragraph project be expanded as well?
Neither project will cost the city money and, yet, “they provide a great learning experience, a clear public benefit, and two wonderful partners where we can say we’re using each other’s efforts to keep on doing bigger and better things,” Lloyd said.
Ask Shutterstock CIO David Giambruno about building a software-defined data center (SDDC) at the stock photo company, and he’ll talk about what he calls “indiscriminate computing.” He means getting IT infrastructure “out of the way so the developers and the product teams can turn around and face forward.”
To do that, Giambruno is building new ways to deploy code on top of the SDDC, which essentially virtualizes all data center resources so IT can be delivered as a service. He didn’t want to talk about how he’s doing it before it’s complete — but he was eager to segue from the always-on capabilities he’s working toward to last week’s four-hour-plus downing of large swaths of the internet.
The public cloud outage was caused by a glitch in servers running Amazon’s cloud storage service in the provider’s eastern U.S. region. Companies storing photos, videos and other information there saw their sites slow and in some cases stop working.
Putting copies of that data in other geographic regions gives sites more durability — they’re less susceptible to what happens in one data center — but doing that takes more time and more money.
“I call it the difference between JV and varsity,” Giambruno said. Varsity players, to keep with his metaphor, build “multizoned” architecture — making data accessible across regions so they can absorb the jabs and blows public cloud outages can deliver.
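In miniature, “multizoned” looks like this: write every object to more than one region, and read from a fallback when the preferred region goes dark. The region names are AWS-style but the scheme below is a toy invented for illustration, not anyone’s production architecture:

```python
# Two storage "regions," each an independent copy of the data.
regions = {"us-east-1": {}, "us-west-2": {}}

def put(key, value):
    """Replicate every write to every region."""
    for store in regions.values():
        store[key] = value

def get(key, preferred="us-east-1"):
    """Read from the preferred region, falling back to the others."""
    for name in [preferred] + [r for r in regions if r != preferred]:
        store = regions[name]
        if store is not None and key in store:
            return store[key]
    raise KeyError(key)

put("photo-123", b"...")
regions["us-east-1"] = None       # simulate the regional outage
served = get("photo-123")         # still served, now from us-west-2
```

The trade-off Giambruno alludes to is exactly what the extra dictionary represents: replication costs more time and money on every write, but a single-region failure no longer takes the site down.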
Don’t call me junior
“Poorly architected clouds are Binary. When they fail it’s everything. My lesson 2007,” Giambruno said. That lesson was learned after he “stayed up just north of 50 hours piecing together tens of thousands of servers.”
No longer. At Shutterstock, for example, he has an expanded IT architecture — from having two domain name system providers to spreading out application data across multiple providers’ clouds. It’s insurance against events such as public cloud outages or cyberattacks.
Of course, Giambruno noted, such measures require “setting up that vision and that scope and also having the internal support to do it, which I do.” Shutterstock has those resources. Despite recent sluggish growth — and a Zacks Equity Research recommendation to ditch its stock — the company saw revenue climb 16% in 2016 to $494 million. The platform has 125 million royalty-free images and expanded its video library to 6.2 million.
Companies putting applications that are critical to business in the cloud need to decide at the outset how much downtime they can tolerate in case of public cloud outages, said Forrester Research analyst Dave Bartoletti in my Friday column on the Amazon outage. Then they can determine whether they need to spend the time and money to make those apps “highly, highly resilient.”
“That’s a business decision companies have to make,” he said.
Some just won’t put certain types of data in the public cloud. At the end of my story, I asked the question “How did the Amazon cloud outage affect you?” One reader, using the handle marcusapproyo, wrote this:
“It didn’t because we’re smart enough not to put production SAP into AWS.” But, he continued, “at the SAP conference I was at, a CIO stated he was losing $1M in revenue every 15 min. of down time! OUCH!!!”
Ouch indeed. Another reader, ITMgtConsultant, mused on the duration of the cloud outage and commented, “One would assume that after losing $16M of revenue the CIO is now an ex-CIO?”