By James Kobielus (@jameskobielus)
Speed isn’t always a value. Faster data is not necessarily better data. If the data whizzes by faster than you can extract value, it’s a waste.
Stream computing is much more than low-latency middleware. It adds value in several ways. It supports high-throughput filtering and analysis across disparate data streams. It delivers real-time updates to consuming applications. It enables rich query of high-velocity data. And it provides continuous updates of pre-processed intelligence to downstream repositories, ranging from small databases to big-data clusters.
In all of these ways, stream computing is a central component of any comprehensive big-data infrastructure. This recent article does a good job explaining how stream computing platforms, such as IBM InfoSphere Streams, can complement Hadoop, enterprise data warehouses (EDWs), in-memory databases, and other big-data platforms that are optimized for data that spans the latency spectrum from “at-rest” to “in-motion.”
What I found especially interesting was the discussion of “live data marts” that are refreshed by stream computing. Author Kai Wähner describes the concept as one of “provid[ing] end-user, ad-hoc continuous query access to this streaming data that’s aggregated in memory…. A live analytics front end slices, dices, and aggregates data dynamically in response to business users’ actions, and all in real time.”
What’s useful about this “live data mart” concept is that it blurs the increasingly arbitrary distinction between “in-motion” and “in-memory,” on the one hand; “in-motion” and “at-rest,” on the other; and (if it were possible to have a third hand) “in-motion” and “in-process.” The purpose of stream computing is to drive speedier results by delivering live intelligence into live business processes. Ideally, every “at-rest” big-data repository, be it an EDW, Hadoop, or whatever, can and should host live data in order to drive live decisions.
Live data marts should live on a converged infrastructure of stream computing, complex event processing, and various real-time-optimized big-data platforms, including the EDW. I’m happy that Wähner picked up on the notion that stream processing can figure into an EDW modernization strategy. I prefer to call this the “live EDW”:
- Using stream computing to filter and reduce EDW storage costs
- Leveraging the structured, unstructured, and streaming data sources required for deep analytics that are hubbed on the EDW
- Combining streaming and other unstructured data sources with existing EDW investments
- Delivering improved business insights from the EDW to operations for real-time decision-making
Essentially, the “live EDW” would aggregate at least one streaming source with other, higher-latency sources into a conformed, continually refreshed in-memory data structure that drives real-time business processes.
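To make the idea concrete, here is a minimal sketch (in Python, with hypothetical class and field names) of the pattern described above: events are filtered at ingest, held in an in-memory sliding window, and rolled up on demand by an ad-hoc query. A real deployment would run on a stream-computing platform such as InfoSphere Streams; this toy only illustrates the shape of a live, continually refreshed rollup.

```python
from collections import defaultdict, deque

class LiveDataMart:
    """In-memory rollup continuously refreshed from a stream (illustrative only)."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = deque()  # (timestamp, region, amount), oldest first

    def ingest(self, ts, region, amount):
        # Filter at ingest: events that fail the filter never reach storage.
        if amount <= 0:
            return
        self.events.append((ts, region, amount))
        self._expire(ts)

    def _expire(self, now):
        # Drop events that have aged out of the live window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()

    def query(self):
        """Ad-hoc continuous query: total amount per region over the live window."""
        totals = defaultdict(float)
        for _, region, amount in self.events:
            totals[region] += amount
        return dict(totals)

mart = LiveDataMart(window_seconds=60)
mart.ingest(0, "EMEA", 120.0)
mart.ingest(30, "APAC", 80.0)
mart.ingest(70, "EMEA", 40.0)  # the first EMEA event has now aged out
print(mart.query())  # {'APAC': 80.0, 'EMEA': 40.0}
```

The filter at ingest is what reduces downstream storage cost: events that fail it never reach the window, let alone the EDW.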
Should you be concerned about the Bash security vulnerability? Find out in this week’s roundup.
1. Attackers already targeting Bash security vulnerability – Brandan Blevins (SearchSecurity)
Exploits are already being written and rewritten for the ‘Shellshock’ Bash security vulnerability, which was announced just days ago, increasing the urgency for enterprises to remediate it quickly.
2. HP SDN app store is open for business with eight OpenFlow apps – Shamus McGillicuddy (SearchSDN)
HP’s marketplace for SDN apps is now open. Download apps from F5, Blue Cat, Kemp and others for HP’s OpenFlow controller.
3. Experts: Expect cloud breaches to endanger data privacy – Rob Wright (SearchCloudSecurity)
Attendees and speakers at the CSA Congress and IAPP Privacy Academy stressed the need for better data classification to reduce the effects of cloud breaches.
4. CloudBees move shows PaaS is no place for the little guy – Trevor Jones (SearchCloudComputing)
CloudBees is the latest small PaaS provider to bow out, leaving enterprise IT questioning the market as larger vendors squeeze out remaining players.
5. NetApp brings out new version of StorageGrid object storage – Dave Raffo (SearchStorage)
NetApp expands its low-profile StorageGrid object storage with a Webscale version to go beyond its healthcare niche.
By Greg Lord (@GregLord11)
Businesses are focused on pursuing the holy grail of higher revenue and lower costs. As they evaluate new technology solutions to help drive this growth, CIOs are changing the way they deploy and manage business-critical applications, striving to leverage the ubiquity and cost efficiencies of the Internet for application delivery. Although every organization’s IT strategy and approach to application delivery varies, the common requirement across all organizations is that end users need fast, reliable, and secure access to all their business applications. This requirement has become increasingly challenging given the complexity of application distribution across multiple data centers, end users located all over the world on a variety of devices, and a growing list of business applications, such as customer relationship management (CRM), collaboration, product lifecycle management (PLM), and support portals, that users rely on every day.
For CIOs to successfully deliver applications to end users within an organization, they need to understand the challenges of managing the Internet connection between an end user’s device and the data center where a particular application is hosted, specifically at the enterprise level. The Internet was not designed to handle the demands and requirements of business use. Given the Internet’s legacy architecture and routing logic, the selection of routes between end users and data centers is extremely inefficient, and once a route is selected, the transmission of data along it is slow and error prone. The Internet itself, and large Internet-connected cloud data centers, are prone to congestion and downtime. In addition, mobile devices introduce complexity with their different operating systems, browsers, and connection types. The Internet offers no inherent web security protection, and it can be very difficult to gain visibility into, let alone manage and control, applications delivered over it.
These challenges are what we call “the Enterprise Internet Problem,” which can result in lost revenue as partners and customers grow frustrated with poor response times and spotty availability. End-user productivity suffers from long load times, and data loss vulnerabilities emerge. Frustrated IT organizations struggle to troubleshoot issues and support complex application delivery architectures, let alone find the time to optimize the end-user experience.
To begin addressing the Enterprise Internet Problem, organizations typically try one of the following two approaches:
1. Implement a solution that lives within the four walls of the data center – either a physical hardware box or a virtual appliance.
The data center could be an organization’s own or that of its cloud or hosting provider. Any way you slice it, this approach doesn’t work, because organizations need a symmetrical solution that addresses both ends of the application delivery path, and IT organizations can’t possibly implement a box or virtual appliance in every data center and every end-user location. This approach also adds cost and complexity, because organizations need to purchase, implement, and support these solutions, a challenge compounded as applications inevitably move across data centers and cloud environments over time.
2. Continue to invest in maintaining private network infrastructure.
This approach works to a certain extent, in that it helps address Internet performance and reliability issues, but it doesn’t scale because it limits access to applications and restricts organizations from leveraging the cost efficiencies and ubiquity of the Internet.
To solve the Enterprise Internet Problem, organizations need to look at various options, including a move to the cloud. Instead of requiring IT organizations to take on the burden of deploying and managing these critical capabilities on their own, cloud-based platforms can provide optimal Internet route selection, connection offload, load balancing, real-time failover, web acceleration, front-end optimization, DDoS mitigation, and web application firewalls that are not constrained within the four walls of a few data centers. Deploying these capabilities on servers and networks distributed across the Internet can be effective, because it brings end users closer to the applications they need to operate a business.
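As a rough sketch of what optimal route selection with real-time failover means in practice, the following Python fragment (the route names and latency figures are invented for illustration) picks the lowest-latency path that is still passing health checks:

```python
# Hypothetical measured routes from an end user to an application origin.
routes = {
    "edge-nyc": {"latency_ms": 42, "healthy": True},
    "edge-lon": {"latency_ms": 95, "healthy": True},
    "edge-sfo": {"latency_ms": 18, "healthy": False},  # failed its health check
}

def pick_route(routes):
    """Return the lowest-latency route that is currently passing health checks."""
    healthy = {name: r for name, r in routes.items() if r["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy route to origin")
    return min(healthy, key=lambda name: healthy[name]["latency_ms"])

print(pick_route(routes))  # edge-nyc: edge-sfo is faster but unhealthy
```

The same selection runs continuously, so when a route fails or recovers, traffic shifts without any change on the client or in the data center.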
By understanding and addressing these problems, organizations can position themselves to instantly enter new markets, improve customer interactions, do business via lower-cost online channels, enable end-users to get more done in less time, and realize the holy grail of higher revenue and lower costs.
Greg Lord is the Sr. Product Marketing Manager responsible for Enterprise Solutions, including Enterprise Application Delivery and Cloud Solutions, at Akamai Technologies. Before joining Akamai, Greg held several enterprise sales and marketing roles at Intel Corporation, including leading Cloud & Data Center Marketing for Intel’s Americas business. Prior to Intel, Greg was an IT manager at both Reebok and Partners Healthcare. Greg is a certified Project Manager (PMP), has an undergraduate degree in Computer Information Systems from Bentley University, and holds an MBA from the University of Notre Dame.
Was staying independent the right move for Rackspace? Find out in this week’s roundup.
1. Rackspace goes all in with managed cloud – Trevor Jones (SearchCloudComputing)
Rackspace rebuffed its suitors and opted to stay independent. What that means for the long-term stability of a company pushing managed cloud remains unclear.
2. VMworld 2014 recap: The good, the bad and the ugly – Tom Walat (SearchVMware)
At VMware’s annual conference in San Francisco, the virtualization company announced a new hardware appliance and other offerings to further its goal to deliver its vision for a software-defined data center.
3. Home Depot data breach update: 56 million cards confirmed stolen – Brandan Blevins (SearchSecurity)
Home Depot said late Thursday that its recent breach involving 56 million payment cards was the result of custom-built malware, and that the company has since rolled out new POS encryption technology.
4. Ellison steps aside as Oracle CEO, becomes CTO and chairman – Mark Fontecchio, Jessica Sirkin and Craig Stedman (SearchOracle)
Oracle said founder Larry Ellison is giving up his CEO position but will continue to oversee product development as CTO, while also becoming the company’s executive chairman.
5. Cloudian adds appliances, flash to run HyperStore cloud software – Sonia Lelii (SearchCloudStorage)
Hybrid cloud vendor Cloudian bucks the software-defined storage trend, adding appliances to run its HyperStore object software.
Should you be worried about the cloud after the latest outage? Find out in this week’s roundup.
1. Azure outage drudging up concerns about cloud’s capabilities – David S. Linthicum (SearchCloudComputing)
Microsoft is the latest cloud provider to suffer an outage and send the IT world into a state of frenzy. Customers are questioning the capabilities of cloud computing, but are they overreacting?
2. Microsoft, Dell EMM updates take aim at crowded market – Diana Hwang and Jake O’Donnell (SearchConsumerization)
With new features on the way, Dell and Microsoft will vie for more EMM customers this fall. What will it take to bring them on board?
3. What you need to know about VMware EVO:RAIL – Ryan Lanigan (SearchServerVirtualization)
Users had high hopes when VMware announced its new hyper-converged infrastructure offering, EVO:RAIL. But questions still linger about its hardware partners.
4. Yahoo wins bid to shine more light on U.S. surveillance – Warwick Ashford (ComputerWeekly)
Yahoo has won the release of 1,500 pages of documents from a key 2008 case in a secretive US surveillance court.
5. Home Depot confirms data breach began in April – Eric Parizo (SearchSecurity)
The home improvement retailer confirms its customers’ payment card data was breached in an incident that is believed to have begun in April, likely compromising millions of card accounts.
Can OpenStack finally make its way into the corporate world? Take a look at this week’s roundup to find out.
1. OpenStack adoption creeps toward corporate acceptance – Trevor Jones and Ed Scannell (SearchCloudComputing)
Four years after OpenStack’s debut, many enterprises are still working through integration, upgrade and cost issues.
2. Windows 8.1 tablets swim — not sink — in hospitality industry – Diana Hwang (SearchConsumerization)
Big hotel and travel companies look to Windows 8.1 tablets as a platform to connect guests, customers and workers, but new deployments could pose IT problems not found in mainstream corporate settings.
3. Card trail shows Home Depot data breach could be huge – Brandan Blevins (SearchSecurity)
The reported Home Depot data breach may have affected stores nationwide over the course of several months if new data proves to be correct.
4. Ransomware on the rise, warns cyber threat report – Warwick Ashford (ComputerWeekly)
The first half of 2014 saw an increase in online attacks that lock up user data and hold it for ransom, reports F-Secure Labs.
5. EVO:RAIL and containers top VMworld highlights – Nick Martin (SearchServerVirtualization)
In this podcast, we talk with Simon Crosby, CTO of Bromium, about VMware’s support for containers and dig into how EVO:RAIL fits into the hyper-converged market.
By James Kobielus (@jameskobielus)
Data management professionals know that how you model the data directly constrains how flexibly you can analyze it.
When you consolidate relational sources that embody divergent data schemas and definitions, you are inviting a world of pain. Rollup of those sources for unified drilldown can’t take place until you run it all through a gauntlet of data integration, matching, merging, and cleansing. Even then, you generally have to make the resultant data set available in third normal form.
And when you add unstructured sources to the mix, watch out! Querying across multi-structured sources might involve unstructured-data integration to transform the nonrelational data to relational schemas that support SQL access. Or it might involve keeping data in its source formats and offering agile query access through an abstraction that can do justice to the myriad semantics.
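A minimal sketch of the first option, transforming nonrelational data so SQL can reach it, might look like this in Python (the document, table, and column names are hypothetical): a nested JSON document is shredded into flat rows and loaded into a relational table for ordinary SQL queries.

```python
import json
import sqlite3

# Hypothetical semi-structured source document (e.g., a product review feed).
doc = json.loads("""
{"user": {"id": 7, "name": "dana"},
 "reviews": [{"product": "widget", "stars": 4},
             {"product": "gadget", "stars": 5}]}
""")

# Shred the nested document into flat relational rows.
rows = [(doc["user"]["id"], doc["user"]["name"], r["product"], r["stars"])
        for r in doc["reviews"]]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE reviews (user_id INT, user_name TEXT, product TEXT, stars INT)")
con.executemany("INSERT INTO reviews VALUES (?, ?, ?, ?)", rows)

# Ordinary SQL now reaches what began as nonrelational data.
avg = con.execute("SELECT AVG(stars) FROM reviews WHERE user_id = 7").fetchone()[0]
print(avg)  # 4.5
```

The cost of this approach is that the target schema must be designed up front; the alternative, querying through a semantic abstraction, is where the ontologies discussed next come in.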
That’s where ontologies, taxonomies, and other data abstractions enter the picture. As multi-structured data moves into the mainstream, data scientists will increasingly require integration tools to help them analyze data within the semantic contexts expressed in these and other domain-specific abstractions. As noted in this recent article on ontologies, these and other abstractions have a clear analytic advantage over relational and other platform-specific models.
Ontologies, as author Malcolm Chisholm emphasizes, are principally oriented toward data’s analytical uses within and across disparate data-store implementations. Framed in Resource Description Format and other formats, ontologies are, he states, “analysis, not design, artifacts,” geared to semantic query and knowledge discovery. “An ontology is a view of the concepts, relations and rules for a particular area of business information, irrespective of how that information may be stored as data.”
In the broader perspective of multistructured analytics, ontologies support the following use cases:
- Building semantic models: Developers explicitly model semantics as RDF ontologies and/or related logical structures like taxonomies, thesauri, and topic maps. These ontologies are used to drive the creation of structured content that instantiates the entities, classes, relationships, attributes, and properties defined in the ontologies.
- Mediating between heterogeneous semantics: Developers use ontologies and other semantic models to drive the creation of mappings, transformations, and aggregations among existing, structured data sets.
- Mining the semantics implicit in unstructured formats: Developers use natural-language processing and pattern-recognition tools to extract the implicit semantics from unstructured text sources.
- Managing semantics in a consolidated repository: Application environments require repositories or libraries to manage ontologies and other semantic objects and maintain the rules, policies, service definitions, and other metadata to support the life-cycle management of application semantics.
- Governing semantics through comprehensive controls: Application environments require that various controls — on access, change, versioning, auditing, and so forth — be applied to ontologies; otherwise, it would be meaningless to refer to them as “controlled vocabularies.”
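The “building semantic models” and “mediating” use cases can be illustrated with a toy triple store. The sketch below is pure Python with invented class and instance names (a real system would use RDF serializations and SPARQL); it stores an is_a hierarchy alongside instance data and answers a semantic query that reasons over the hierarchy rather than over any particular physical schema.

```python
# A tiny set of (subject, predicate, object) triples: an ontology fragment
# plus instance data, all in one store.
triples = {
    ("Customer", "is_a", "Party"),
    ("Supplier", "is_a", "Party"),
    ("Acme",     "type", "Supplier"),
    ("Dana",     "type", "Customer"),
}

def subclasses(cls):
    """All classes transitively related to `cls` by is_a, including itself."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "is_a" and o in found and s not in found:
                found.add(s)
                changed = True
    return found

def instances_of(cls):
    """Semantic query: instances of `cls`, reasoning over the is_a hierarchy."""
    classes = subclasses(cls)
    return {s for s, p, o in triples if p == "type" and o in classes}

print(sorted(instances_of("Party")))  # ['Acme', 'Dana']
```

Because the query walks the is_a relation, adding a new subclass of Party automatically broadens the result with no change to the query, which is exactly the analysis-level flexibility Chisholm attributes to ontologies.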
You might regard ontologies as metadata applicable to the deep analytic meaning of data. As such, ontologies are a key semantic stratum within which all data-driven insights are rooted firmly, and from which they all exude like liberated liquid energy.
Missed all of the VMworld 2014 news? Not to worry, it’s all here in this week’s roundup.
1. With EVO: RAIL, VMware turns VSAN into a franchise – Dave Raffo (SearchVirtualStorage)
VMware’s EVO: RAIL allows hardware vendors to build hyper-converged appliances running VSAN and other VMware software.
2. Software-defined data centers pique IT’s interest – Margie Semilof (SearchServerVirtualization)
IT pros’ interest in software-defined data centers continues to grow as tools such as VMware’s EVO:RAIL offer IT an effective small-business option.
3. Backoff point-of-sale malware hits over 1,000 businesses – Brandan Blevins (SearchSecurity)
In an advisory Friday, the U.S. government estimated that the Backoff point-of-sale malware campaign has struck over 1,000 businesses to date.
4. Apple and FBI launch iCloud hack investigation – Warwick Ashford (ComputerWeekly)
Apple and the FBI investigate the breach of Apple’s iCloud, raising fresh business concerns over cloud security.
5. Maxta Inc. develops MaxDeploy, seeks hardware partners – Garry Kranz (SearchVirtualStorage)
Like VMware, Maxta wants to sell its software-only, hyper-converged storage platform integrated on standard industry hardware.
How will Google be able to combine IaaS and PaaS? Tune into this week’s roundup to find out.
1. Google fills the gap between IaaS and PaaS – Trevor Jones (SearchCloudComputing)
Google wants to merge the worlds of IaaS and PaaS to create a single continuum of services for customers. It’s likely a sign of things to come from all the major public cloud vendors as they look to cover their bases in the maturing market.
2. Two-year PC replacement saves cost, raises productivity – Diana Hwang (SearchEnterpriseDesktop)
IT pros debate whether companies should replace PCs every two years instead of following conventional wisdom of three to four years. In today’s world, one size doesn’t fit all, and a two-year cycle may work in some cases.
3. Community Health breach shows detecting Heartbleed exploits a struggle – Brandan Blevins (SearchSecurity)
The difficulty of detecting Heartbleed exploits means that the Community Health breach is unlikely to be the last incident linked to the OpenSSL flaw.
4. New partnerships, SLAs make Google Enterprise services a UC option – Gina Narcisi (SearchUnifiedCommunications)
Consumer Google services like Hangouts weren’t always an option for enterprises. New partnerships with UC providers are making Google Enterprise Solutions more appealing as UC tools.
5. FC Bayern Munich partners with SAP for help with sports analytics – Todd Morrison (SearchSAP)
In this roundup, SAP inks a deal with FC Bayern Munich that includes sports analytics, and an Austrian retailer looks for better inventory control.
Will Microsoft be able to make a dent in Amazon’s lead in the IaaS cloud market? Find out in this week’s roundup.
1. IaaS cloud race far from over – Adam Hughes (SearchCloudComputing)
Amazon Web Services remains the frontrunner in the IaaS cloud market, but Microsoft Azure has made strides to improve its cloud. Can Microsoft capitalize on its advantages and make a bigger dent?
2. Microsoft issues critical IE patch, introduces whitelisting – Jeremy Stanley (SearchWindowsServer)
Microsoft patched two publicly known vulnerabilities in the August Patch Tuesday update. The company also introduced plug-in whitelisting in IE.
3. OpenStack market size will cross $1.7bn by 2016, says 451 Research – Archana Venkatraman (ComputerWeekly)
Free and open-source cloud computing platform OpenStack could reach an estimated market size of $1.7bn by 2016.
4. Internet of Things security issues rise to the fore at Black Hat – Brandan Blevins (SearchSecurity)
This year’s Black Hat showed that the Internet of Things security issues are going to demand increased attention in the near future.
5. Data explosion poses storage challenges to universities – Carol Sliwa (SearchStorage)
The incoming Michigan State CIO discusses the data storage challenges universities have to deal with and how to address them with cloud storage.