Data Matters


March 14, 2018  2:46 PM

Fourth Industrial Revolution rhetoric: mere cant?

Brian McKenna Profile: Brian McKenna

Philip Hammond’s Spring statement, as UK chancellor, reached, predictably, for the rhetoric of the so-called fourth industrial revolution.

Not for the first time. Whenever he gets the chance to say the UK is at the forefront of artificial intelligence, big data analytics, and so on and so forth, he takes it. He might be taking his “spreadsheet Phil” moniker a bit too seriously.

This nationalistic appropriation of AI/machine learning functions as a fig leaf for Brexodus, it almost goes without saying. “Don’t worry about Brexit, we’ve got the AIs and the hashtags to keep us warm”, is the gist of government patter here, whether from Hammond or Amber Rudd, home secretary. How much any of them know about technology is anyone’s guess.

Hammond seems to believe Matt Hancock, secretary of state for culture, media and (also) sport, is himself a product of the software industry — of which he is, admittedly, a scion. This is Hammond, speaking in the House of Commons this week:

“Our companies are in the vanguard of the technological revolution.

And our tech sector is attracting skills and capital from the four corners of the earth.

With a new tech business being founded somewhere in the UK every hour.

Producing world-class products including apps like TransferWise, CityMapper,

And Matt Hancock.”

Hilarious. And Theresa May, the prime minister, is always keen to get in on the 4IR act. Her speech in Davos, to a half-empty hall, was long on technology rhetoric, and short on detail about what the global elite are interested in – viz Brexit.

Now, there is no denying the UK does have some unusual strengths in AI, at least in terms of academic research, and the start-ups therefrom. One can only wonder at the world-class work undoubtedly going on at GCHQ under the AI banner. The UK must, surely, have an advantage to squander?

Hopefully, the forthcoming House of Lords Select Committee report on artificial intelligence will provide a balanced, cool, rational, non-flag waving description of the state of the art in the UK, and offer some policy that will make a positive difference to our economy. But it will only do so if it takes the measure of some of the AI scepticism expressed in the committee’s hearings towards the end of last year. And appreciates that there are different sides in the debate on AI among people who know what they are talking about. It’s not all Tiggerish enthusiasm, whether nescient or not.

February 15, 2018  2:28 PM

Machine Learning, what is it and why should I care?

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Luiz Aguiar, data scientist at GoCompare.

We produce a massive amount of data every day.

Not only that, our attitudes towards the data we produce are also changing. We’re becoming more comfortable sharing the data we produce with apps, businesses, and other entities, if it means getting better services.

Most of us are happy for companies like Google, Amazon or Netflix to know our preferences to better tailor the content we are served, or recommend the things we want to buy. We’re even inviting these companies into our homes by embracing AI systems, like Alexa, Google Home or Siri to make our lives easier, by using the data we provide them.

So if we produce data at exponential speed and are happy to share it to get tailored services, why aren’t more companies taking advantage of this? Why do so many still rely solely on traditional market research and guesswork?

The key problem is that the sheer amount of data available means it’s hard for companies to analyse it effectively. It would take forever for a person to analyse all the data we provide and draw some insight from it, let alone design better services as a result.

The problem of unstructured data

Not only is the sheer volume of information a problem for analysts; another issue is that the majority of this data is unstructured, making it incredibly hard to classify and compare.

That’s because the information we produce is often not in the right format or shape, or requires some enrichment before it can be used.

As an example, imagine you are in a restaurant deciding what to order. The likelihood is you’ll look through the menu and choose one of the options based on the information available – this is structured data.

In comparison, unstructured data would be like sitting down to a list of every single raw ingredient and cooking utensil available in the kitchen, then having to piece it all together to figure out what you want. All the information is there – but just not in an easily accessible way.

Obviously, the first option is the easier one to process, the second would be too daunting and complex for a person to analyse and make a quick decision – and this is where machine learning can help.

Machine learning runs information through a series of algorithms that classify and group data, then uses this to find patterns and subsequently predict future behaviours, all on an enormous scale. In short, machine learning techniques are able to extract insights deeply hidden inside your data that would otherwise be impossible to detect.
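As a rough, hedged illustration of that classify-and-group-then-predict flow, here is a minimal Python sketch using scikit-learn on synthetic data; the features, labels and cluster count are invented for the example and are not drawn from any real pipeline.

```python
# A minimal sketch of the classify/group-then-predict idea, using scikit-learn.
# The data here is synthetic; in practice it would come from your own records.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))            # 500 customers, 4 numeric features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # a hidden pattern we want to learn

# Unsupervised step: group similar records without using labels.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Supervised step: learn to predict future behaviour from labelled history.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

print("segment sizes:", np.bincount(segments))
print("hold-out accuracy:", model.score(X_test, y_test))
```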

Thinking back to our restaurant example, while a person might struggle to sift through the unstructured data for just one establishment, a well-trained AI could do this for any restaurant in the country, or even the world.

Then, using other information about you it could make an informed decision of what you should eat, when you should eat and where you should eat – giving you the best possible experience, without you having to even think about it.

And that’s just one example. Algorithms such as artificial neural networks, which try to mimic the functions of a biological neural network, are very powerful at pattern recognition and image classification. They have the potential to do a better job than humans at predicting stock market trends, house prices, insurance costs, medical diagnoses, you name it. The possibilities are almost endless.
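And as a toy sketch of one of those regression-style tasks (house prices), here is a small neural network built with scikit-learn’s MLPRegressor; the feature names and the pricing formula are made up purely for illustration.

```python
# A toy neural-network regression, in the spirit of a house-price prediction.
# The features (floor area, rooms, age) and pricing formula are invented here.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 1000
area = rng.uniform(30, 200, n)
rooms = rng.integers(1, 7, n)
age = rng.uniform(0, 100, n)
price = 2000 * area + 5000 * rooms - 300 * age + rng.normal(0, 5000, n)

X = np.column_stack([area, rooms, age])
X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=1)

scaler = StandardScaler().fit(X_train)
net = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=1)
net.fit(scaler.transform(X_train), y_train)

print("R^2 on hold-out data:", net.score(scaler.transform(X_test), y_test))
```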

This is why you should care about machine learning, and why over the next few years machine learning and AI won’t just be the buzzword that everyone is talking about, but will be the fundamental difference between successful tech companies and those that get left behind.

GoCompare has opened access to its APIs to other fintech organisations through a new community development, Machine Learning for Fintech. For more information, or to apply for a developer token, go to

Originally from Rio de Janeiro, Luiz holds an MSc in Computer Science (Optimisation and Machine Learning) from the Pontifical Catholic University of Rio de Janeiro.

Luiz moved to England in July 2015 and worked for Formisimo as lead data scientist on the Nudgr project and Perform Group as a Data Scientist, before joining the Data Science team at GoCompare.

February 13, 2018  3:51 PM

The data warehouse – why it’s time to stop writing its obituary and modernise your approach

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Dave Wells, practice director, data management at Eckerson Group.

If there’s one thing the IT industry is exceptionally good at, it’s proclaiming the death of a particular technology. In the mid 1980s industry observers sagely pronounced that COBOL was dead. Fast forward to today and COBOL is still playing a role in healthcare for 60 million patients daily, 95% of ATM transactions, and more than 100 million lines of code at the IRS and Social Security Administration alone. I can’t help but recall Mark Twain’s famous quote, ‘the reports of my death have been greatly exaggerated!’

It’s not only COBOL that people want to consign to history.  In 2013 SQL was declared dead, yet thousands of SQL job postings can be found on the web today.  Just recently I heard that popular programming language Ruby was on its last legs.  And then we have the data warehouse: over the last few years, there’s been a steady stream of obituaries announcing that the data warehouse was about to be consigned to the technology graveyard.  But when surveys such as that conducted by Dimensional Research show that 99% of respondents see their data warehouse as important for business operations and 70% are increasing their investment in data warehousing, it appears the data warehouse remains very much alive.

But here’s the issue: while the data warehouse is alive, it also faces many challenges today. The root of the “data warehouse is dying” claim comes from the opinion that it hasn’t ever completely delivered on its promised value. The original vision was a seductive one – got a ton of data but no way to leverage it? No problem. Put it in a data warehouse and you’ll be extracting valuable insights to drive competitive edge in hours. Except, you couldn’t. Companies found that, using traditional and very manual tools and processes, building and managing data warehouses wasn’t quite as easy as promised. Once built, the data warehouses typically didn’t scale well, weren’t particularly agile or easy to rely on (due to performance variability), and, later on as needs evolved, they weren’t particularly well equipped to cope with the challenges of big data.

Data warehousing in the cloud

But, but, but….  The very fact that so many companies have clung doggedly to their (imperfect) data warehouse tells us that they are extracting some value.  It’s just that it could be so much more.  Enter the data warehouse of the cloud computing age.  By migrating to the cloud, some classic data warehouse challenges disappear.  Can’t scale or be agile in providing data quickly to those who need it?  The cloud data warehouse changes that.  Need to deploy rapidly but also dial up (and down) investment?  The cloud data warehouse allows you to do that.  And if you’re faced with the argument that the cloud erodes confidence in data governance and compromises the reliability of the data warehouse, well, there’s an answer to that too.

However, if we’re to constructively stem the expert proclamations of data warehouse demise, we must re-evaluate the original simplistic expectation of data warehousing as a one-size-fits-all, never-evolving data infrastructure model for every organisation to reach its best use of data. Data warehousing must be fluid as organisational needs change and new data technologies and opportunities arise. And to accomplish that, we need to modernise how IT teams design, develop, deploy and operate data infrastructure. Expensive, redundant, laborious and time-intensive efforts, intertwined with the use of traditional, non-automated approaches, have greatly limited organisational value and cast a heavy cloud over data warehousing. However, organisations using automation software, such as Wherescape’s, to develop and operate data warehouses are providing far-reaching value to business leaders at greater speed and lower cost, while at the same time positioning IT to more easily incorporate timely technologies and new data sources, and to flex as business needs demand. With these adjustments, the reality of the data warehouse can better live up to the associated vision, and continue to deliver much more to organisations for many years to come.

February 2, 2018  10:59 AM

How to do AI if you are not Google

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Matt Jones, lead analytics strategist at Tessella, in which he argues that companies with physical products and infrastructure cannot simply cut and paste the tech giants’ AI strategies.

Much written about AI seems to assume everyone wants to emulate Google, Facebook, or other companies built around data.

But many organisations look nothing like these tech giants. Companies in manufacturing, energy, and engineering – long-standing, multi-billion-pound industries – derive revenues from physical products and infrastructure, not from targeting adverts at groups or individuals. Their data is usually collected from industrial machines and R&D processes, not people and internet spending habits. Their data collection is often bolted onto long-lived, decades-old internal processes, not built in by design.

This type of data will deliver insights such as whether a factory can operate safely, or predict the active properties of a new drug-like molecule; not whether clicks turn into sales. This is very different from the insights that companies like Google are generating and looking at, and these pre-digital companies must take a very different approach to deriving benefit from AI.

CIOs at these companies can learn from the tech giants but trying to cut and paste their approach is a route to AI failure. Based on our work with companies built in the pre-digital age, we at Tessella recently produced a white paper outlining 11 steps that these pre-digital companies must take if they are to drive growth and stay competitive with AI. Broadly, these steps fall into three categories: building trust into AI, finding the right skills, and building momentum for AI programme delivery.

Trust is important

A key difference between the digital native companies and pre-digital enterprises is that the latter are often looking for very specific insights. Digital companies can afford to experiment and accommodate imprecision; a badly targeted advert will do little harm. But an AI designed to spot when a plane engine or an off-shore oil rig’s subsurface structure might fail demands absolute certainty.

Pre-digital companies cannot simply let an AI loose on all their data and see what patterns emerge. Such unsupervised training experiments may provide estimations or suggestions, but they cannot be depended upon to inform an empirical solution. In these high-risk cases, there is a greater need to find the right data in order to effectively train AIs in a supervised learning regime.

Too many companies start by trying to pool all their data, perhaps looking admiringly at what Facebook and Amazon can do. For most, this is costly and unnecessary, at least in the short term. Companies should start by defining the problems AI can solve, identify the data needed to solve those problems, put people, technology and processes in place to collect and tag that data, and then turn it into AI training data.
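As a minimal sketch of that problem-first approach (assuming pandas and scikit-learn, with an invented maintenance example and made-up column names), the steps might look something like this:

```python
# Sketch: start from a defined problem ("will this pump fail within 30 days?"),
# keep only the fields needed for it, tag the outcome, and train on that.
# The column names and values are invented for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

raw = pd.DataFrame({
    "vibration_mm_s": [1.2, 3.4, 0.8, 4.1, 2.2, 5.0, 0.9, 3.8],
    "temp_c":         [60, 85, 55, 90, 70, 95, 58, 88],
    "failed_30d":     [0, 1, 0, 1, 0, 1, 0, 1],   # the tag added by engineers
})

features = raw[["vibration_mm_s", "temp_c"]]   # only the data the problem needs
labels = raw["failed_30d"]

scores = cross_val_score(LogisticRegression(), features, labels, cv=4)
print("cross-validated accuracy:", scores.mean())
```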

As AI is developed, there is also a need to maintain oversight to ensure the AI is delivering trustworthy results. Basic AI governance in high risk situations must include random sampling of AI outcomes and checking them against human experts for accuracy.
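A hedged, minimal sketch of that sampling check might look like the following, where the model decisions and expert labels are simulated stand-ins for real review data:

```python
# Sketch of a basic governance check: randomly sample model decisions,
# have experts re-label the sample, and track the agreement rate.
# The decisions and expert labels below are simulated stand-ins.
import random

random.seed(42)
model_decisions = [random.choice(["approve", "refer"]) for _ in range(1000)]

sample_ids = random.sample(range(len(model_decisions)), k=50)  # audit sample

# In practice the expert labels come from a manual review of those 50 cases.
expert_labels = {i: model_decisions[i] if random.random() > 0.1 else "refer"
                 for i in sample_ids}

agreement = sum(model_decisions[i] == expert_labels[i]
                for i in sample_ids) / len(sample_ids)
print(f"model/expert agreement on audited sample: {agreement:.0%}")
```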

Finally, AI interaction, the user experience, must be intuitive, or it will not be taken up. AI decision support must take advantage of data visualisation and search technologies to ensure results are presented in meaningful ways. We can learn from digital native companies here, who are experts at making things easy for users: Google Photos runs neural networks, image analysis, and natural language understanding, but all the user needs to master is a search bar.

People not platforms

The temptation can be to completely hand over the problem to so-called data experts, or to buy in expensive technology platforms. But this misses an important point: that AI isn’t about spotting patterns, it’s about understanding what those patterns mean.

AI needs people who understand that data represents something in the real world – material strain, a temperature readout, chemical reactions, maintenance schedules – and who can put together effective training regimes. AI should therefore be designed by people who understand the underlying data and what it represents within this business context. The best teams include representatives from IT, operations and business teams: domain experts partnered with embedded AI and data analytics experts who not only possess technical expertise but can also translate between these different roles.

We can again learn from the digital native companies. It is notable that these companies spend their budgets hiring the best people to design AIs which are right for them, not on buying in off the shelf technologies. Whilst the pre-digital companies will need different skill sets and more specific industry understanding in their AI teams, the focus must remain upon finding these right skills. This is the key to AI success, regardless of industry.

Build momentum

The digital native companies started from scratch and created the digital world, which they went on to lead. Longer established companies do not have this luxury – they come with decades of development in a pre-digital world, which has now been upturned and potentially disrupted. Many of their staff and processes are not ready for this new data-driven world. They cannot just switch overnight, however ambitious their CIOs might be.

Such companies should set long-term goals of digitalising processes and identifying where they see AI automating and advising. But they must work towards this goal determinedly and transparently, keeping their people informed and engaged with the digital transformation, gradually shifting the business model and bringing existing staff with them on the journey. Starting too big without a carefully planned digital roadmap often undermines effectiveness and impact.

Pre-digital companies should initially focus on well-understood opportunities that can be executed quickly, with clear measurable milestones to demonstrate success built into their roadmap. This should be accelerated by running multiple agile AI projects in parallel, ensuring the best ideas are progressed rapidly. This will build a critical momentum for AI change programmes.

As they go, they should monitor their many AI projects, checking relative performance of each, immediately abandoning the bad ideas, and using successes (and failures) to improve training regimes. This agility is how digital companies deliver innovation but is lacking in many pre-digital organisations.

To summarise: physical enterprises undergoing digital transformation can and must harness the disruptive potential of AI. If they don’t, they will quickly be outpaced by competitors, startups or even tech giants with an eye on expansion. They start from very different positions to digital native companies. If they want AI to deliver business impact, they must mindfully find their own approach to people, processes, technology and management and form close, strategic partnerships with those that will build momentum behind an AI enabled digital transformation.

January 25, 2018  11:39 AM

Empowering everyone with actionable analytics

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Arijit Sengupta, head of Einstein Discovery, Salesforce                                                                                    

The business world has largely forgotten why we need analytics: for action. Employees spend hours each week, month and quarter crunching hundreds of thousands of data points, but all too often the pretty charts and insights are never looked at again. Maybe those insights are on a slide that is presented and discussed in a meeting, but since analytics aren’t incorporated in the workstream, nothing ends up happening. Or maybe the data is about last quarter, and it’s too late to do anything with the knowledge. The point is, analysis is useless when it doesn’t result in specific actions.

We need analytics because it is supposed to guide us to the right decisions. Analytics has the power to tell us what we need to do and how to do it at the moment we need to make a business decision: to increase sales next month, double-down efforts on a certain region, decrease customer churn or recreate a successful campaign. Of course, analytics can guide decisions at the most senior, strategic levels — what new markets to tackle or new products to develop — but as with any technology, the most impact is going to come when everyone is empowered with the analytics needed to make the most intelligent decisions.

True technological revolutions happen when everyone is empowered. The invention of the computer was innovative but not revolutionary. It became revolutionary when the average person, with little experience with computers, could go to the store, pick a Mac off the shelf, and plug it in at home.

The Internet browser followed a similar trajectory. The Internet had existed for years before it truly became accessible to everyone. I remember learning how to use the earliest browsers, like Nexus and Lynx, that were text-oriented and required the user to write queries; the knowledge required barred the majority of people from ever using them. The Netscape browser drastically changed that. Whatever you saw on the screen, you could click and more would appear. This was the fundamental shift: when the Internet truly became easy enough for just about anyone who could read and write to understand and use it, it transformed society.

We’re at a similar inflection point with analytics today as it is rapidly moving from the realm of expert data scientists to the larger business community. The early days of analytics meant only specially trained experts could build models and perform statistical tests to ensure findings were accurate. Self-service business intelligence (BI) soon followed, giving everyone the chance to do basic analytics. It let you draw a graph and make your data look pretty, but that’s where we forgot the real significance of analytics — those graphs didn’t necessarily spur action, they were just visualizations of what had already happened. Not to mention, self-service BI could lead you down a wrong path all too easily. Say the standard deviation was so high that the whole graph could change if you removed one data point! The rigor of statistical testing that data scientists use never made it to the self-service technology.
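To make that pitfall concrete, here is a toy Python example (all numbers invented) showing how one extreme data point can swing the summary statistics a self-service chart would be built on:

```python
# A toy illustration of the pitfall: one extreme data point dominates the
# summary, so a chart built on it can mislead. All numbers are invented.
import statistics

monthly_sales = [100, 105, 98, 102, 101, 99, 103, 2000]  # one outlier month

with_outlier = (statistics.mean(monthly_sales), statistics.stdev(monthly_sales))
without_outlier = (statistics.mean(monthly_sales[:-1]),
                   statistics.stdev(monthly_sales[:-1]))

print("mean / stdev with the outlier:    %.1f / %.1f" % with_outlier)
print("mean / stdev without the outlier: %.1f / %.1f" % without_outlier)
```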

But today, artificial intelligence (AI) in advanced analytics is the key to spurring the democratization of data-driven insights. Now AI makes it possible for the computer to do complex statistical tests for you. AI can recommend which graph to look at if the one you’re using has unclean data. It can find insights across multiple variables that the untrained eye would not notice. It puts protective systems in place to ensure the user doesn’t make poor decisions based on a misleading graph.

Augmented analytics, also known as smart data discovery, takes it even further by using the power of machine learning to surface actionable insights and recommendations, and shows tremendous promise. In fact, Gartner calls it the “next wave of disruption in the data and analytics market.” Gartner states that automated insights embedded in enterprise applications will “enable operational workers to assist in business transformation.” Most business users do not have the adequate training needed to read complex charts and graphs, but leaders are discovering there is an untapped benefit to putting data-driven insights in the hands of employees in sales, service and marketing. If I’m a sales manager, my dashboard can tell me that the Western Region is underperforming. With a few clicks, I can understand why and how to fix it: I can see that we’re losing large deals against a certain competitor, and identify my top-performing AEs to train others on how to approach these deals in the future. Suddenly I’m able to trust the findings when I understand the context behind them. Every employee can feel confident they’re taking the most beneficial actions to achieve the desired outcome, thanks to an intelligent analytics experience that recommends what to do the moment they need to make a decision.

Like the PC or the browser of the 90s, the presence of AI in advanced analytics has untold promise to be the democratizing force for data-driven and actionable insights. The power of AI plus augmented analytics means that data becomes digestible and easy for anyone with a basic understanding of numbers and the business problems at hand. Imagine the potential of each and every employee empowered by data science at every decision point — that’s the reality of AI for everyone.

January 3, 2018  11:56 AM

Big data best practice means a best-of-breed approach

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Sebastian Darrington, UK MD at Exasol

The days of operating a single vendor IT software estate are behind us. Such is the pace of innovation and change, putting all your eggs in one basket simply won’t do.

Businesses need the ability to mix and match, leveraging the very best new and existing technologies for the task at hand. Nowhere is this more clearly the case than with big data, analytics and the cloud.

Data brings clear benefits to any business that harnesses it, from greater insight and more informed decision-making to better efficiencies of operation and execution. Big data that is used correctly can unlock opportunities far beyond those achievable with individual silos of information. However, achieving this requires the right tools, interoperability between systems and the organisational buy-in to use it properly. This is why the market for data warehouses alone is forecast to grow to $20 billion by 2022, according to Market Research Media, with the wider market for enterprise data management set to hit $100 billion by 2020, according to MarketsAndMarkets.

Single vendor vs multi-vendor

A data warehouse is a system used for reporting and data analysis, and is considered a core component of business intelligence. According to Wikipedia, “DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for knowledge workers throughout the enterprise.”

The implication, and indeed the precedent, has been that a data warehouse implied a single data store into which data was ingested, which in turn meant a single-vendor solution.

The burden of single-vendor lock-in, and the inevitable compromises that approach brings with it, is unconscionable in today’s agile, decentralised and increasingly data-heavy IT world. It is unlikely that any organisation can extract maximum performance, functionality and features from a single-vendor solution that has been designed to appeal to the broadest possible customer base. The emergence of highly customised data warehouses in the cloud and data warehouse as a service (DWaaS) is a case in point, as organisations demand lower latency, higher-speed analytics, cost-effective processing and the ability to scale on demand.

Back in 2012 Gartner introduced the term “Logical Data Warehouse”, the idea being that you didn’t have to have a single data store, but instead could leverage best of breed data stores such as Hadoop or NoSQL technologies and present them as a single aggregated data source without the need to necessarily ingest the data first.
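As a toy, hedged illustration of that “logical” idea (not any vendor’s actual implementation), the sketch below queries two separate stores in place, a small relational database and a document-style export, and presents them as one joined view rather than ingesting everything into a single physical warehouse first; the store names and fields are invented.

```python
# Toy illustration of a "logical" view over two separate stores: each source
# is queried in place and joined at a presentation layer, rather than being
# ingested into one physical warehouse first. Names and fields are invented.
import sqlite3
import pandas as pd

# Source 1: a relational store (an in-memory SQLite database for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (2, 75.5)])
orders = pd.read_sql_query("SELECT customer_id, amount FROM orders", conn)

# Source 2: a document-style store, represented here as exported records.
profiles = pd.DataFrame([
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "business"},
])

# The presentation layer: a single joined view over both sources.
unified = orders.merge(profiles, on="customer_id", how="left")
print(unified)
```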

The idea has evolved over the last five years, but the fundamental premise remains: organisations investing in data warehousing need to architect their solution based on a best-of-breed, multi-vendor strategy. One that allows for good deal-making and competitive tendering among vendors vying for your custom, cost- and time-effective incremental change, and the most extensive level of compatibility between systems and processing platforms. Doing so allows the resource to grow and shift with the business, rather than become a fixed-point release that ultimately ages into an impediment to progress.

Making the pieces fit

An effective data warehouse is the beating heart of your data strategy and consolidates or aggregates various raw data formats and multiple sources through a single presentation layer. This is why interoperability is so critical. This centralised data hub can then be used to correlate data and deliver a single version of the truth to all data consumers within an organisation whether they be BI analysts, data scientists, line of business users, analytics engines, visualisation systems, marketing communications platforms or even AI algorithms; either in near-real-time or on a periodic basis.

If you utilise best-of-breed standalone components, they need to talk to each other, as well as with the primary data store. With the emergence of the internet of things (IoT), data platforms are getting more fragmented as the sources of data grow in number. When building your data warehouse stack, whether you leverage cloud platforms such as Azure or AWS or visual analytics systems from the likes of Tableau, Qlik and MicroStrategy, the core of a best of breed data warehouse needs to be a database that can work across a wide variety of complementary applications, can straddle on-premise and cloud services and that does not insist on a single vendor investment or data format strategy. Ideally, a good logical data warehouse strategy will be complementary to the systems already operating in the business, maximising the return on investment in them, rather than displacing them.

These core component decisions will also need to take into account where the data is coming from, how much of it there is, how frequently it updates, how frequently it needs to be analysed and what else needs to be done with the data in order to extract value from it. From here you then have a base for adding on other commoditised and custom components and features that will bolster enterprise management, data visibility and value extraction.

Staying up-to-date without disrupting

The analytics space is evolving at breakneck speed. So much so that any integrated analytics solution is going to be out of date in less than the three-year lifecycle employed for most enterprise IT systems. Being able to extract individual solutions from a data warehouse will allow for efficient and cost-effective development and expansion of your data warehouse, while avoiding lock-in to obsolete systems and code bases due to reliance on them by other parts of a single-vendor system.

Good enough no longer cuts it

Ultimately, each solution needs to be proven and a market leader with regard to the functions you need. Good enough won’t cut it and will ultimately hold back the rest of the system. It is unrealistic to expect any one vendor to produce a fully featured, completely flexible, cost-effective logical data warehouse that will evolve the way every organisation needs. A decentralised, commoditised and component-based data warehouse, built to the specific needs of the organisation, will be best placed to deliver better performance, easier customisation and gradual evolution that keeps pace with innovation across the board, rather than from a locked-in vendor.

January 3, 2018  11:36 AM

Combining machine learning and scorecards to assess credit risk

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Shafi Rahman, principal scientist, FICO

Last year, artificial intelligence (AI) generated countless news headlines and ideas that fascinate us. Its predictive ability has been called on in a range of industries, especially in the financial sector.

However, the use of AI and machine learning in retail banking poses a special challenge. There are numerous regulations requiring that lenders be able to explain their analytic models, not only to regulators but often to consumers. AI is frequently a “black box” approach, so how do you tell a customer why an AI-developed algorithm gave them a certain risk score, for example?

Our work at FICO has focused on bridging that gap. This example shows what is possible if you want to blend the best of traditional analytics approaches with AI and machine learning.

Limitations of traditional credit risk models

A traditional credit risk scorecard model relies on inputs of various customer characteristics to generate a score reflecting the probability of default. These factors are put into different value ranges, with each “bin” being assigned a score weight. The score weights corresponding to the individual’s information are then added up to produce the final score.

Each bin’s score weight is computed by measuring the weight of evidence (WoE), which captures the separation between known good cases and known bad cases. A WoE of 0 means that the bin has the same distribution of good and bad cases as the overall population; the further the WoE is from 0, the more the bin is concentrated in one type of case relative to the overall population. A scorecard generally has a few bins with a smooth distribution of WoE.
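As a minimal sketch of that calculation, the snippet below computes WoE per bin using the common definition ln(share of all goods in the bin / share of all bads in the bin); the bins and counts are invented for illustration and are not FICO’s data.

```python
# Minimal WoE sketch: WoE = ln(share of all goods in the bin /
#                              share of all bads in the bin).
# The bins and counts are invented for illustration.
import math

bins = {                          # bin -> (good count, bad count)
    "age < 25":       (200, 80),
    "25 <= age < 40": (500, 70),
    "age >= 40":      (300, 50),
}

total_good = sum(good for good, bad in bins.values())
total_bad = sum(bad for good, bad in bins.values())

for name, (good, bad) in bins.items():
    woe = math.log((good / total_good) / (bad / total_bad))
    print(f"{name}: WoE = {woe:+.3f}")
```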

However, these analytics-driven scorecards cannot function effectively without a sufficient number of both known good and known bad cases. When data is limited, the result is a noisy, choppy WoE distribution across bins, leading to weak-performing scorecard models.

A Machine Learning alternative

One machine learning alternative to the scorecard model is an algorithm called Tree Ensemble Modelling (TEM). TEM involves building multiple “tree” models, where each node of the tree is a variable which is split into two further sub-trees.

Each tree model uses just a handful of characteristics as input, which produces a shallow tree and ensures a limited splitting of variables. With TEM, the minimum number of good and bad cases can be met more frequently, thus solving a key problem of the scorecard approach.
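As a rough stand-in for the kind of shallow-tree ensemble described above (this is scikit-learn’s gradient boosting on synthetic data, not FICO’s TEM implementation), a sketch might look like this:

```python
# An ensemble of shallow trees in the spirit of the TEM described above,
# using scikit-learn's gradient boosting as a stand-in on synthetic data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 6))                      # applicant characteristics
y = (X[:, 0] - 0.5 * X[:, 3] + rng.normal(0, 1, 2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=7)

# Many shallow trees, each splitting on only a couple of characteristics.
tem_like = GradientBoostingClassifier(n_estimators=300, max_depth=2,
                                      learning_rate=0.05, random_state=7)
tem_like.fit(X_tr, y_tr)
print("hold-out accuracy:", tem_like.score(X_te, y_te))
```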

However, unlike a scorecard approach, TEM cannot point to the reasons for giving someone a particular score. This lack of explainability is a big limitation of a purely machine learning approach, given that TEM models can have thousands of trees and tens of thousands of parameters with no simple interpretation.

Although TEM is not practical for production use, a comparison of the two approaches showed that the machine learning score outperformed the scorecard. The next challenge was to narrow the performance gap between the machine learning and scorecard models.

A hybrid approach

At FICO, we wanted to merge the practical benefits of a scorecard – explainability, the ability to input domain knowledge, and ease of execution in a production environment – with the deep insights of machine learning and AI, which can uncover patterns scorecard approaches cannot.

To do this, we developed a tool that recodes the patterns and insights discovered using machine learning or AI and turns them into a set of scorecards. Instead of directly computing the WoE from good and bad data points, the tool tries to match the score distribution generated by a machine learning algorithm like TEM, which ends up providing an estimate of the WoE for each bin.
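As a very rough illustration of that general idea (emphatically not FICO’s actual tool), the sketch below bins each characteristic, then fits one weight per bin so that the additive scorecard approximates the score distribution produced by a tree ensemble:

```python
# Rough illustration only: bin each characteristic, then learn one weight per
# bin so the additive "scorecard" approximates a tree ensemble's score.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(3)
X = rng.normal(size=(3000, 4))
y = (X[:, 0] + 0.7 * X[:, 1] + rng.normal(0, 1, 3000) > 0).astype(int)

# The "teacher": a tree ensemble producing a score for each case.
ensemble = GradientBoostingClassifier(max_depth=2, random_state=3).fit(X, y)
ml_score = ensemble.decision_function(X)

# The "student": per-bin weights fitted to match the ensemble's scores.
binner = KBinsDiscretizer(n_bins=5, encode="onehot-dense", strategy="quantile")
bins_onehot = binner.fit_transform(X)
scorecard = LinearRegression().fit(bins_onehot, ml_score)

approx = scorecard.predict(bins_onehot)
print("correlation with the ML score:", np.corrcoef(ml_score, approx)[0, 1])
```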

Significantly, this hybrid model is almost as predictive as the machine learning one and, we think, overcomes the limitations imposed by an insufficient number of cases. Whereas it was previously considered impossible to build powerful scorecards with sparse cases, our approach now allows us to do so, and to remain transparent as well.

Machine learning can expose powerful and predictive latent features that can be directly incorporated into a scorecard model to preserve transparency while improving prediction – a function that is not limited to credit risk modelling.

December 21, 2017  3:35 PM

Three things learning providers can glean from the private sector

Brian McKenna Profile: Brian McKenna

This is a guest blog by Jayne Wilcock, curriculum and data manager at East Riding of Yorkshire Council

‘Technology gives the quietest student a voice’ and few would argue with Jerry Blumengarten, the American writer and education consultant who said this. But is it time for adult and community learning providers to use technology not just in class, but in the same way that businesses do to predict trends and respond accordingly?

At East Riding of Yorkshire Council, there has been a push in recent years to explore how the methods adopted by successful businesses might help enhance our adult and community learning offering.

The council provides learners aged 16 and above with individual support to help build confidence, promote wellbeing and develop skills to ultimately boost their career prospects. But with four sites dotted across the region, it was previously a challenge to deliver the right courses in the right areas, while making sure learners were getting maximum benefit from their studies.

By adopting three strategies from the corporate world, we have transformed the way we support learners and lifted the burden of administration from our staff.

  1. Make life easy for the customer

Key to the success of the initiative has been to make it easier for learners to engage with us, achieved partly by moving our largely paper-based enrolment process online.

Before, it was expensive to produce and distribute paper course information and impossible to update or change anything without incurring further costs.

Now, potential learners simply log on and they are directed to the East Riding of Yorkshire website, where they can see details of the courses they are interested in alongside any additional, relevant information.  They may also be shown similar courses that could interest them, a common approach taken by online retail outlets.

The process can be compared to buying a new computer online where potential customers might be directed to other relevant information such as details of compatible software, printers or support options, for example.

With more information available, and the option to contact the authority directly if necessary, learners get the help they need to choose the course which is right for them and the instant payment option increases the chances of them signing up straight away.

  2. Use data well

In the corporate arena, data is scrutinised and used to make informed decisions. Moving from a paper-based to an electronic registration system has allowed us to analyse attendance data from each of our sites, which was not easily achievable before.

Falling attendance can be a sign that a learner might withdraw from a course, so to help identify and address issues they might be experiencing, staff use the tools available in our UNIT-e management information system to look at course attendance levels regularly for both individual learners, and across subject areas. It’s then possible to quickly spot gaps and intervene early to reduce the risk of withdrawal.
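A minimal sketch of that kind of early-warning check is shown below; the attendance table stands in for a generic export and the column names and threshold are invented (this is not the UNIT-e API):

```python
# Sketch of an early-warning check: flag learners whose recent attendance
# falls below a threshold. The table stands in for a generic attendance
# export; the column names and threshold are invented.
import pandas as pd

attendance = pd.DataFrame({
    "learner":  ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "week":     [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "attended": [1, 1, 1, 1, 0, 0, 1, 1, 0],
})

THRESHOLD = 0.75  # flag anyone attending fewer than 75% of recent sessions

rates = attendance.groupby("learner")["attended"].mean()
at_risk = rates[rates < THRESHOLD]
print("learners to contact early:")
print(at_risk)
```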

The ability to create a set of customisable reports to provide live data on everything from the number of enrolments to the latest qualification achievement rate allows managers to make informed and timely decisions.

  3. Measure the impact

Drop-out rates have been reduced as a result of this initiative – during the 2016-17 academic year, retention rose by four percentage points on the previous academic year, and a six-percentage-point rise was achieved over a three-year period. Overall achievement rates also rose by four percentage points over the same three-year period.

Being able to view the latest attendance figures, alongside learners’ achievement data, has helped us to focus on ensuring quality education is being delivered across the region.

Our key aim is to provide learners with the best possible experience, whilst helping them to gain new skills, confidence and achieve their goals – whether this is to find work or become more familiar with digital technology.

To succeed in business, there needs to be a tangible benefit to any action and a positive impact in the marketplace. It’s no different when you’re in the business of delivering learning opportunities.

Jayne Wilcock is curriculum and data manager at East Riding of Yorkshire Council, which uses the UNIT-e management information system from Capita’s further and higher education business. 

December 6, 2017  11:42 AM

Why not the North?

Brian McKenna Profile: Brian McKenna

This is a guest blogpost by Ted Dunning, chief application architect, MapR Technologies.

I am a foreigner to the UK. I am an engineer.

These characteristics are what shaped the first impressions I had of the north of England over twenty years ago. I came then to consult at the university in Sheffield and was stunned by the rich history of world-class engineering in the region. The deep culture of making and building across the north struck me at the time as ideal for building new ventures based on technology and engineering.

Twenty-five years on, when I come back to visit, I am surprised to see that the start-up culture in Britain is still centred around London, with small colonies in Edinburgh, Cambridge and Oxford. The north of England is, by comparison, a start-up vacuum.

The sprouting of technological seeds like the Advanced Manufacturing Research Centre (AMRC) at the University of Sheffield shows that the soil is fertile, but that success makes the lack of other examples all the more stark.

Drawing necessarily imperfect analogies with US cities, the former steel town of Pittsburgh has suddenly become a start-up mecca for self-driving cars, but Sheffield has not had a comparable result, in spite of scoring well in the last, 2014, Research Excellence Framework in Computer Science and Informatics – 47% of its submissions scored 4*: “quality that is world-leading in terms of originality, significance and rigour”. For comparison, Oxford scored 53%, Cambridge 48% and Manchester (with its Turing-related heritage in computer science) 48%, so Sheffield is in a similar bracket of excellence.

Invention and start-ups are like a rope: they cannot be pushed. The inventors and visionaries who would pull on that rope can, however, be inspired and encouraged. The real magic of Silicon Valley is a sense of optimism and willingness to attempt the impossible. Closely related to that optimism is a generosity of spirit and willingness to help others for no obvious short-term return. There are stories about places like the Wagon Wheel Restaurant in Mountain View where engineers from different companies used to share problems and solutions over beers. Unfortunately, it seems to be a common impression that this licence is somehow geographically bound.

It isn’t.

It is woven into all of our expectations of what can and cannot be done. The same sense of “yes, we can” can be applied in the north.  If that idea could turn sleepy California orchard towns like San Jose or Sunnyvale or a gritty steel town like Pittsburgh into technological powerhouses, it can do the same for Sheffield or Liverpool or Manchester.

The time to start is now.

December 6, 2017  10:58 AM

From the server to the edge: the evolution of analytics

Brian McKenna Profile: Brian McKenna

In a guest blogpost, Peter Pugh-Jones, head of technology at SAS UK & Ireland, reflects on how the analytics industry is evolving and what organisations need in a data-driven economy.

Forty years is a long time in analytics, and in that time much has changed. In the last four decades, analytics has become part of everyday life and helped solve some of society’s biggest challenges, from helping develop specialised medications to combatting crime networks and ensuring transport fleets are energy efficient.

Data analytics is playing an ever-increasing role in our businesses, economy and environment. In the beginning, data analytics was used to find the solution to an existing problem. Today this approach has been turned on its head. Now we start with the data to uncover patterns, spot anomalies and predict new opportunities.

Data now informs organisations about trends and problems they never knew existed. It shapes how people interact, share information and purchase goods, and how they’re entertained and how they work. It dictates political decisions and economic cycles. Data is the raw power that helps us optimise decisions and processes to iron out inefficiencies through the use of analytics. Analytics can be utterly transformative.

On the edge

For example, General Electric Transportation (GET) is a leading division in locomotive manufacturing and maintenance. It depends on the efficient running of its rail assets, with breakdowns and inefficient fuel usage threatening profits. To optimise its operations, each train has been equipped with devices that manage hundreds of data elements per second to improve operations. Analytics is then applied to the small, constrained devices that sit at the network’s edge to uncover use patterns that keep trains on track.
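As a toy sketch of what analytics on such a constrained edge device might look like (a generic rolling check on a simulated sensor stream, not GE’s or SAS’s actual system), consider the following:

```python
# Toy edge-style check: a lightweight rolling statistic computed on-device
# over a stream of sensor readings, flagging anomalies locally rather than
# shipping every raw reading back to a central server. Values are invented.
from collections import deque
import statistics

def edge_monitor(readings, window=20, z_limit=3.0):
    """Yield (index, value) for readings far outside the recent window."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mean = statistics.mean(recent)
            spread = statistics.stdev(recent) or 1e-9
            if abs(value - mean) / spread > z_limit:
                yield i, value
        recent.append(value)

# Simulated axle-temperature stream with one fault-like spike.
stream = [65.0 + 0.1 * (i % 7) for i in range(200)]
stream[150] = 120.0
print(list(edge_monitor(stream)))
```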

This ability to analyse and learn from data in transit is a game changer for all industries. Smart sensors on the production line are improving product quality by identifying faults before they happen or instantly as they occur. In turn, customer satisfaction and company competitiveness are increased.

Connected devices are now generating more data than ever before. At the same time, customer demands are rising and the complexity of modern, global supply chains is growing exponentially. To stay competitive and provide the best products and services, companies require an unprecedented level of control and the ability to positively intervene at every stage of the process.

Moving at speed

Yet most are not up to the task. Any inefficient processes between capture, insight and action squander valuable opportunities for the business. It’s obvious that the static analytics approach of the past is no longer tenable. The increasing volume of sensors and the limitless possibilities for the fusion of their data has changed the conversation. Analytics now needs to be applied at the right time and in the right place, for the right level of return.

Take energy consumption as an example. A single blade on a gas turbine can generate 500GB of data per day. Wind turbines constantly identify the best angles to catch the wind, and turbine-to-turbine communications allow turbine farms to align and operate as a single, maximised unit. By using analytics, data can be used to provide a detailed view of energy consumption patterns to understand energy usage, daily spikes and workload dependencies so that we can store more energy for use when the wind is light.

The challenge is that new connected devices, the Internet of Things (IoT) and artificial intelligence (AI) put an infinite level of insight in the hands of organisations. This means that the analytics of the present and future has to become instantaneous. The ability to gather and analyse an ever-growing amount of data to deliver relevant results in real-time will become the deciding factor for whether organisations win or lose. Analytics has to move at speed and make the development of the most promising technologies possible.

When we speculate how analytics will be used in the future, it is clear we are on the edge of something revolutionary. It is old hat to think that analytics still resides in the server – it has been brought to the edge.

Yet it would be unrealistic to assume that all businesses can run their analytics on this scale. Most organisations are a complex patchwork of legacy systems and siloed data infrastructures which do not always speak to each other. Integration is a key part of the puzzle. Organisations require analytics platforms that understand the different states of play and can consolidate data from the edge to the enterprise, from the equipment in the field to the data centre and the cloud.

Unified, open and scalable

In recent years analytics has been made open and accessible. No longer the preserve of data scientists, businesses have realised considerable gains when analytics and its insight can be communicated and used at every level of the organisation. This evolution is driven by necessity. Data is growing exponentially and becoming more complex every day, and there is no organisation with a blank cheque for technology investment. In modern, complex data environments a business’s analytics has to be flexible. It must be able to adapt to infrastructure changes and the daily challenges of businesses.

For organisations and industries, it will mean they must have access to a single, unified platform that is constantly evolving. Organisations need a platform that can scale to their needs and is delivered flexibly to achieve the latest advances that allow them to solve problems and create new value. This means being cloud-native and having access to scalable, elastic processing and accessible open interfaces. This means an environment where organisations can easily log in, access data, build models, deploy results and share visualisations.

Organisations now need platforms that are open and integrated, leveraging future technologies to scale and provide instant insight through a consolidated data environment. Platforms that provide joined-up data and faster, more accurate insight will transform organisations’ decision-making and facilitate better integrated planning. Above all, they will allow businesses to make good on the opportunities that data offers.
