TotalCIO

Oct 13 2017   12:40PM GMT

Feature engineering headache disappears with deep learning

Nicole Laskowski


NEW YORK — One of the biggest differences between machine learning and deep learning is the effort that goes into making the algorithms work.

With machine learning, data scientists have to perform a task called feature engineering. “People get the incoming data, and they prepare it, and they clean it, and they maybe manipulate it in a way that’s going to give them the relevant information,” said Edd Wilder-James, former vice president of technology strategy at Silicon Valley Data Science and now an open source strategist at Google’s TensorFlow, during a presentation at the Strata Data Conference.

Take a machine learning model that determines whether a photograph was taken during the day or at night, with photographs as its training data. Before the model is released into production, and before it’s even trained, data scientists have to determine which features in the data will help the model learn. “Our feature engineering might be as simple as counting the number of dark pixels at a certain threshold: What percentage of the image is dark?” he said.
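To make the dark-pixel idea concrete, here is a minimal Python sketch of that kind of hand-built feature. Everything in it is an assumption for illustration, not something from the talk: the function names, the grayscale list-of-rows image format, the darkness threshold of 64 and the 50% cutoff.

```python
def dark_fraction(image, dark_threshold=64):
    """Return the fraction of pixels darker than dark_threshold.

    image is a list of rows of grayscale values in 0..255 (assumed format).
    """
    total = sum(len(row) for row in image)
    if total == 0:
        raise ValueError("image has no pixels")
    dark = sum(1 for row in image for value in row if value < dark_threshold)
    return dark / total


def classify_day_or_night(image, dark_threshold=64, night_cutoff=0.5):
    """Label an image 'night' when at least night_cutoff of its pixels are dark."""
    if dark_fraction(image, dark_threshold) >= night_cutoff:
        return "night"
    return "day"


# Two tiny made-up images: mostly dark pixels, then mostly bright pixels.
night_photo = [[10, 20, 30], [5, 200, 15]]
day_photo = [[220, 180, 90], [240, 30, 200]]

print(classify_day_or_night(night_photo))  # night
print(classify_day_or_night(day_photo))    # day
```

Both numbers, the threshold and the cutoff, are exactly the kind of choice that has to be tuned by someone who knows the data.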

Pinning down features and thresholds is a difficult but vital process that requires domain expertise and knowledge of the data, according to Wilder-James. “With this kind of machine learning, a lot of the effort goes into … figuring out what are the features and making the damn thing work,” he said.

With deep learning, data scientists can skip the feature engineering step. The model instead relies on enormous training data sets, and plenty of training time, to figure out which is which on its own.

“It’s slow. We’re talking days, weeks even, maybe a month to train a model,” Wilder-James said. “It requires a large amount of training data to get right. This is definitely a big data problem, in that sense.”

And be warned, deep learning models can also be fooled. Adversarial examples, images altered in ways that can’t be detected by the human eye, can trick a model into seeing something that isn’t there. This creates big security implications, Wilder-James said.
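A toy Python sketch can show why tiny changes are enough to fool a model. It is not a deep network and not anything shown in the talk: it assumes a hypothetical linear day/night classifier with hand-set weights, and it nudges every pixel by a small fixed amount in the direction that lowers the score, the same sign-based idea used by the fast gradient sign method.

```python
def brightness_score(pixels, weights, bias):
    """Linear score: positive means 'day', otherwise 'night'."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def predict(pixels, weights, bias):
    return "day" if brightness_score(pixels, weights, bias) > 0 else "night"


def perturb_toward_night(pixels, weights, epsilon):
    """Move every pixel by at most epsilon in the direction that lowers the score."""
    adversarial = []
    for w, p in zip(weights, pixels):
        step = epsilon if w > 0 else (-epsilon if w < 0 else 0)
        # Keep the result a valid grayscale value.
        adversarial.append(min(255, max(0, p - step)))
    return adversarial


# Hypothetical 16-pixel grayscale image and a hand-set stand-in for a trained
# model that calls an image 'day' when its mean brightness exceeds 128.
pixels = [130] * 16
weights = [1 / 16] * 16
bias = -128.0

adversarial = perturb_toward_night(pixels, weights, epsilon=3)
largest_change = max(abs(a - p) for a, p in zip(adversarial, pixels))

print(predict(pixels, weights, bias))       # day
print(predict(adversarial, weights, bias))  # night
print(largest_change)                       # 3
```

A shift of 3 on a 0 to 255 scale is invisible to a person, yet the prediction flips because the image sat close to the model's decision boundary. Attacks on real deep learning models use the model's gradients to find such directions.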

