Posted by: Jack Vaughan
data architecture, hadoop, middleware
Q: What is a data scientist? A: It’s a DBA from California. The joke underscores the fact that the world of big data skills right now is pretty much topsy-turvy. If you would like to see a short list of skills associated with big data initiatives, you are out of luck. Try a long list instead.
The skills list – courtesy of the IT skills specialists at Foote Partners, LLC – includes Apache Hadoop, MapReduce, HBase, Pig, Hive, Cassandra, MongoDB, CouchDB, XML, Membase, Java, .NET, Ruby, C++ and more.
Further, the ideal candidate needs to be familiar with sophisticated algorithms, analytics, ultra-high-speed computing and statistics – even artificial intelligence. The needs of big data, which arise in part from modern computing’s ability to produce more and more bits and bytes, mean that developers have to hone their skills significantly. Suddenly, SQL-savvy developers have to obtain NoSQL skills.
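To give a feel for the mental shift, here is a minimal sketch in plain Python – not an actual database driver – of how a familiar SQL query maps onto the document-store way of thinking used by stores such as MongoDB or CouchDB. The sample data and field names are hypothetical, invented for illustration.

```python
# Hypothetical sample data: a schema-less "collection" of JSON-like
# documents, the shape of data in a document database.
employees = [
    {"name": "Ada", "role": "developer", "skills": ["Java", "Hadoop"]},
    {"name": "Grace", "role": "dba"},  # note: fields may vary per document
]

# SQL mindset:      SELECT name FROM employees WHERE role = 'developer'
# Document mindset: filter documents by matching fields, no fixed schema
developers = [doc["name"] for doc in employees
              if doc.get("role") == "developer"]
# developers == ["Ada"]
```

The point is not the syntax but the absence of a rigid schema: each document carries its own fields, and the query logic must tolerate fields that may be missing.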
New technology like Hadoop is so raw that the developer is often forced to create his or her own software tools, which is a skill in itself. Writes the Foote crew:
Hadoop is an extremely complex system to master and requires intensive developer skills. There is a lack of an effective ecosystem and standards around this open source offering and generally poor tools available for using Hadoop.
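For readers new to the model, the programming paradigm underneath Hadoop can be sketched in a few lines of plain Python – a toy word count, not Hadoop itself, with the map, shuffle and reduce phases spelled out:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every input record
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group intermediate values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts emitted for each word
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data big skills", "big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts == {"big": 3, "data": 2, "skills": 1}
```

The hard part Foote describes is everything this toy omits: distributing those phases across a cluster, handling node failures, and doing it all without mature tooling.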
Foote warns that there is only more of the same to come, especially as unstructured data from sources such as sensors and social media piles up in the in-bin. Note to big data scientists of tomorrow: get ready for the deluge! – Jack Vaughan