By James Kobielus (@jameskobielus)
Societies are massive learning machines. Cultures endure because groups refine and perpetuate repeatable learning processes, such as child rearing, community schooling, and organized religion.
In societies, everyone gets indoctrinated, whether they like it or not, and whether or not they agree with the “urtext” being instilled in their central nervous system. Indoctrination can be utterly essential if the lessons help us to survive and thrive. We can have confidence in what we’ve learned if it corresponds to what we experience in the world around us. And we can have confidence in machine learning (ML) if what it detects in the data corresponds to what’s being observed in the real world. And that, in turn, demands that the outputs of ML models align with some sort of authoritative urtext.
What I’m referring to is “training data,” which is the fundamental urtext in the dominant ML practice known as “supervised learning.” Without a baseline set of curated training data labeled by one or more knowledgeable humans, supervised-learning algorithms can’t work their magic. What these algorithms do, at heart, is search for correlations in the observational data that are consistent with those previously tagged and flagged in the training data. Another, more evocative term for training data is “ground truth.”
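To make the idea concrete, here's a minimal sketch of supervised learning in pure Python: a nearest-centroid classifier "trained" on a handful of human-labeled examples. The data, labels, and function names are invented for illustration; real systems use far richer features and algorithms.

```python
# Minimal sketch of supervised learning: a nearest-centroid classifier
# fit to human-labeled "ground truth". All data here is made up.

def train(labeled_examples):
    """Compute one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in labeled_examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def predict(model, features):
    """Assign the label whose centroid lies closest to the observation."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: sq_dist(model[label]))

# Ground truth: feature vectors tagged by a knowledgeable human.
ground_truth = [([1.0, 1.0], "cat"), ([1.2, 0.9], "cat"),
                ([5.0, 5.1], "dog"), ([4.8, 5.3], "dog")]
model = train(ground_truth)
print(predict(model, [1.1, 1.0]))  # -> cat
print(predict(model, [5.2, 4.9]))  # -> dog
```

The "magic" is just this: new observations are matched against the patterns a human already tagged and flagged in the training set.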
Ground truth, in supervised-learning ML applications, can come from groups just as readily as from individual human experts. Depending on the algorithmic task at hand, the knowledgeable individuals may in fact be a crowdsourced group of anonymous strangers who’ve been asked to respond to specific challenges. In other words, a species of “social” indoctrination can be used for improving the performance of ML algorithms. The upshot is that we can have greater confidence in ML models if collective human judgments are used to continuously vet, adjust, and improve their outputs.
That’s what this recent Computerworld article describes as “human-in-the-loop computing.” The article gives examples of supervised learning ML applications where crowdsourced inputs are essential. These include such challenges as identifying handwritten alphanumerics, annotating the narrative revealed in photographic images, and providing failsafe checkpoints to autonomous vehicles. As the article states, the benefits of this social-learning man-machine symbiosis are undeniable. “It’s often very easy,” states author Lukas Biewald, “to get an algorithm to 80 percent accuracy but near impossible to get an algorithm to 99 percent. The best machine learning lets humans handle that 20 percent since 80 percent accuracy is simply not good enough for most real-world applications.”
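The pattern Biewald describes can be sketched in a few lines: let the model keep its high-confidence predictions and route everything else to a human. The threshold, items, and confidence scores below are hypothetical, not from the article.

```python
# Hedged sketch of human-in-the-loop triage: the model handles the
# "easy 80 percent" and humans handle the rest. All values invented.

def triage(predictions, confidence_threshold=0.80):
    """Split (item, label, confidence) triples into auto-accepted
    results and a queue for human review."""
    auto, human_queue = [], []
    for item, label, confidence in predictions:
        if confidence >= confidence_threshold:
            auto.append((item, label))
        else:
            human_queue.append(item)  # a person supplies the label
    return auto, human_queue

preds = [("img1", "stop sign", 0.97),
         ("img2", "pedestrian", 0.55),
         ("img3", "cyclist", 0.91)]
auto, queue = triage(preds)
print(auto)   # confident machine labels
print(queue)  # ['img2'] -> sent to a human labeler
```

The human-corrected labels can then be folded back into the training data, which is what makes the loop a loop.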
But, strangely enough, social learning can even apply when there are no humans in the loop, as discussed in this recent MIT Technology Review article. It describes a research project in which groups of robots share data with one another to master a neuromuscular capability that most organic creatures have hardwired into their DNA. Researchers are using deep-learning algorithms to help robots collectively and iteratively figure out how to manipulate physical objects with greater control and precision. Drawing on a continuous stream of training data from cameras and infrared sensors, the robots are teaching themselves to manipulate an order of magnitude more objects, far more rapidly, than any one of them could in isolation from its peers.
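The shared-experience idea can be reduced to a toy sketch: each "robot" logs its grasp attempts to a pooled dataset, and every peer can query what worked. The objects, forces, and outcomes below are entirely invented; the real research uses deep networks, not a lookup over a list.

```python
# Toy sketch of robots pooling experience: each robot logs attempts
# to a shared dataset, so peers learn without repeating the trial.
# All robots, objects, and outcomes here are hypothetical.

shared_log = []  # (object, grip_force, success) tuples from every robot

def record_attempt(obj, grip_force, success):
    shared_log.append((obj, grip_force, success))

def best_known_force(obj):
    """Any robot can query the grip forces that worked for its peers."""
    successes = [f for o, f, ok in shared_log if o == obj and ok]
    return sum(successes) / len(successes) if successes else None

# Robots A and B each contribute experience with the same object.
record_attempt("mug", 3.0, False)  # robot A: too weak, mug slipped
record_attempt("mug", 6.0, True)   # robot A: firm grip worked
record_attempt("mug", 5.0, True)   # robot B: also worked

# Robot C, which has never touched a mug, benefits immediately.
print(best_known_force("mug"))  # -> 5.5
```

The point is the data flow, not the arithmetic: every attempt by any robot becomes training data for all of them.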
As global society embeds algorithmic processes in every element of our existence, we'll demand that social-learning approaches be used to continuously and automatically refine the underlying machine learning models that drive it all. If we extrapolate the robot-social-learning capability cited above out to the Internet of Things, it's clear that humans may someday leverage worldwide sensor grids to drive robotic social-learning applications that manipulate every last physical object in the world around us, or even in space. And if we leaven this robotic social learning with crowdsourced human judgments, we can extend human control over the physical environment to an awesomely cosmic extent.
Our confidence in the algorithmic economy will ride on the knowledge that it's continually learning from society as a whole. And by that I mean both the societies of crowdsourced humans and the colonies of intelligent machines everywhere.