A recent pop-up event at Lloyds Bank in London, hosted by DominoData, has highlighted one of the age-old ailments of IT: being misaligned with the business.
The latest example of this was succinctly described by MoneySuperMarket.com’s head of analytics, Harvinder Atwal, during a presentation at the event.
One of the surprises to come out of his presentation is that data scientists tend to do their work on their own laptops.
Laptop data models
Data scientists tend to use their own laptops, loaded with real customer data, to create data models. They see a need to do this because corporate IT puts in layer upon layer of process and procedure.
Data scientists have to request data access from IT. They need to negotiate with IT for the required compute resources, then wait for these resources to be provisioned. They may need to go back to IT to install query tools. “As a data scientist, you just want to use data as quick as possible,” Atwal said in his presentation. But in Atwal’s experience, IT is stuck in a 20th-century operating model. “People don’t have access to data warehouses.”
Clearly, having real customer data on a laptop may well infringe data regulations and is definitely a no-go area for IT security and the corporate governance, regulation and compliance teams. Nevertheless, without this data and the right data manipulation tools and environment, data scientists are unable to do their work effectively. Given that businesses hire data scientists in the hope they will discover some hidden meaning in the masses of data they collect, it seems illogical that getting access to the data and the tooling these experts need is so complex.
This becomes more of a challenge when the data scientists wish to improve the accuracy of their data model by having the data analytics equivalent of continuous testing and delivery. In an ideal world, a data model will continuously improve once deployed. This is because insights from new customer data can provide a feedback loop. The very structured approach to delivering corporate IT is not aligned with the need for business to gain insights from data rapidly.
DominoData is one of a number of companies hoping to tackle this problem by providing a containerised environment for data scientists to work in.