Head in the Clouds: SaaS, PaaS, and Cloud Strategy

Jul 17 2014   10:24AM GMT

Big data in the cloud enables scientific discovery

Caroline de Lacvivier

Gordon E. Moore, co-founder of Intel, observed that the number of transistors on an integrated circuit, and with it available compute power, doubled roughly every two years. Later dubbed “Moore’s law,” his observation came to describe the general upward trend in processing power. Over the past few years, however, new content outlets, social media in particular, have caused an explosion of unstructured data, and that growth has far outpaced Moore’s law.

No single super-tool can tackle the growing volume of big data, according to Ben Butler, senior solutions marketing manager of big data at AWS. In his session at the AWS Summit in San Francisco, Butler advocated instead for a network of solutions (AWS solutions, to be specific) that leverage the flexibility, capacity and cost-effectiveness of the cloud.

Last week, Butler hosted another session at an AWS Summit in New York. His talk drilled down into AWS solutions a bit further, offering specific use cases from different industries.

Big data has been used for fraud detection, clickstream analysis and ad targeting, to name a few applications. One of the more exciting use cases is gene sequencing. This analysis of genetic variation can be used for disease research, personalized medicine and molecular testing. It is, in short, a tool that contributes to our understanding of disease and could be instrumental to the evolution of healthcare.

The sudden influx of big data has put pressure on on-premises systems that, just a few years ago, could store, analyze and share data without much trouble.

“DNA sequencing is scaling faster than Moore’s Law, so processing the sequence data is an increasingly significant barrier,” said Alex Dickinson, VP of strategic initiatives at Illumina, a genetic research company. Dickinson confirmed that the best solution for this processing bottleneck was cloud computing.

All of Illumina’s raw data streams from its sequencing instruments over the Internet to AWS, Dickinson explained. “There the data undergoes intensive processing to assemble final genomes from that raw data. It is then stored on AWS and made available to researchers for further analysis.” In other words, most of the big data lifecycle takes place on AWS.

Dickinson cited three reasons for selecting Amazon over other cloud providers. One, AWS has large instances that can handle big loads of raw data. Two, AWS has sites all over the globe. Three, AWS has competitive pricing.

Whether big data researchers choose AWS or not, the cloud is certainly the next frontier for processing massive datasets. In Illumina’s case, it is removing computational constraints and, by extension, generating more opportunities for scientific insight. As Dickinson put it, “the cloud enables raw instrument data to be transformed into disruptive healthcare discoveries.”
