Adventures in Data Center Automation:

Firescope

May 20 2008   10:47PM GMT

Performance and Availability vs. Analytics - Part 4 of 5



Posted by: Ryan Shopp
DataCenter, Integrien, Managed Objects, Firescope, Opnet

Sorry for the delay, but family time called as we were blessed with a baby boy a couple of weeks ago.  So, back on track: part one hit data collection, part two talked about applying analytics and business/service mapping, and part three covered evolving the Data Center Automation Blueprint from Performance & Availability to Service Assurance.  So what does that mean for analytics?

Well, here is where it gets tricky.  I believe there are two types of analytics that are sometimes confused or blended together…

Type 1: Per functional category - meaning software automation that uses algorithms, automated analysis, etc., focused on one of the 3 functional categories (e.g., Performance & Availability, Configuration & Change, Security & Protection).

Type 2: Cross-functional - like Process Orchestration & Resource Reconciliation, you have a rolled-up, aggregated view of metrics that are mapped to the business (beyond IT-specific metrics).  By most definitions, this is what is called Business Service Management (BSM).
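To make the type 2 "roll-up" idea a little more concrete, here is a minimal sketch in Python. It assumes a hand-written mapping from business services to the IT resources behind them and rolls each service up to the worst status among its resources. The service names, resource names, status levels and the `roll_up` function are all hypothetical, made up for illustration; this is not how any vendor named in this post actually does it.

```python
# Hypothetical severity ordering; higher number means worse.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def roll_up(service_map, resource_status):
    """Return the worst resource status for each business service.

    service_map: dict of service name -> list of resource names (assumed).
    resource_status: dict of resource name -> "ok" | "warning" | "critical".
    Resources with no reported status are treated as "ok" (a simplification).
    """
    result = {}
    for service, resources in service_map.items():
        statuses = [resource_status.get(name, "ok") for name in resources]
        # Pick the status with the highest severity; empty services are "ok".
        result[service] = max(statuses, key=SEVERITY.__getitem__, default="ok")
    return result


service_map = {
    "Online ordering": ["web01", "db01"],
    "Payroll": ["app02"],
}
resource_status = {"web01": "ok", "db01": "critical", "app02": "warning"}
service_view = roll_up(service_map, resource_status)
```

A worst-status roll-up is only one possible aggregation; weighting resources by business impact would be another.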

Some quick examples: companies like Integrien and Opnet fall into type 1, while companies like Managed Objects and Firescope map closer to type 2.  Now this all gets very confusing because there are overlaps: vendors who do mostly type 1 analytics and some type 2 claim both and even call themselves BSM vendors, while vendors who do mostly type 2 analytics (aka BSM) also claim to do some type 1.  I’m not a BSM guru, but I do exchange blogs/emails with some and would love to hear them chime in on this thread.  Based on that feedback and some further reading over at my favorite BSM blog, my next post will wrap up this series and I’ll update the Data Center Automation Blueprint.

Apr 17 2008   9:58PM GMT

Performance and Availability Management vs. Analytics - Part 1 of ?



Posted by: Ryan Shopp
nimsoft, cittio, eg innovations, Alcatel-Lucent, Analytics, Apparent Networks, Brix Networks, Compuware, Entuity, Fluke Networks, Gomez, Groundwork, Hyperic, Indicative, Application monitoring, DCAB, Firescope, HP Software, IBM Tivoli, InfoVista, Integrien, NetScout, Netuitive, Solarwinds, Systems monitoring, BMC, Quest Software, NetIQ, Network monitoring, Packet Design, Performance management, CA, Keynote, Nagios, NetQoS, Network Instruments, OpenNMS, Opnet, Xangati, ZenOSS

Over the past couple of months I’ve had an opportunity to be briefed by a number of the vendors currently in the Data Center Automation Blueprint’s Performance & Availability category (e.g., CITTIO, eG Innovations, InfoVista, Integrien, Nimsoft).  With that and some further research, I think I’m ready to take another pass at this area of the blueprint.

First up, all these vendors use a variety of techniques to collect a variety of data from as many points of view as possible.

  • Their own server agents that collect data about systems, services, applications, databases, etc. and then aggregate it back to a centralized console
  • Agent-less centralized consoles that leverage standard infrastructure communication protocols (e.g., SNMP, RPC, ODBC, WMI, SSH, TCP, UDP, HTTP) to query or connect remotely and collect data from networks, systems, services, applications, databases, etc.
  • Passive traffic flow collectors (which can be agents or appliances) that either sit in-line with the traffic flows or receive an exact copy of all traffic flows traversing a network connection (e.g., switch port uplink) through hardware vendor capabilities (e.g., spanning)

These data collection points can be statistics about a specific IT infrastructure resource: physical devices, virtual devices, physical connections, virtual connections, or resources running on physical or virtual devices such as services, processes, applications, databases, etc.

Or the data collection points can be traffic flows or end-to-end specifics, including passive traffic flows, synthetic transactions, or even something as simple as pinging from remote points.
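For readers who have not built one, a synthetic transaction boils down to "run a scripted action, time it, and record whether it worked." Here is a rough Python sketch of that idea. The `run_synthetic_check` function, its `probe` argument and the 2-second timeout are assumptions made for illustration, not any particular product's API.

```python
import time


def run_synthetic_check(probe, timeout_seconds=2.0):
    """Time one call to probe() and report up/down status plus latency.

    probe: zero-argument callable that performs the scripted action and
           raises an exception on failure (hypothetical interface).
    """
    started = time.perf_counter()
    try:
        probe()
        succeeded = True
    except Exception:
        succeeded = False
    latency = time.perf_counter() - started
    # Treat a successful but too-slow transaction as "down".
    return {
        "up": succeeded and latency <= timeout_seconds,
        "latency_seconds": latency,
    }


def failing_probe():
    # Stand-in for a transaction that cannot reach its target.
    raise OSError("connection refused")


good_result = run_synthetic_check(lambda: None)
bad_result = run_synthetic_check(failing_probe)
```

In practice the probe might wrap something like `socket.create_connection((host, port), timeout=2)` from the standard library, run on a schedule from several remote points.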

The metrics that are captured typically revolve around throughput, errors, utilization, latency, up/down status, etc. (there are way too many to mention here).

Having said all this, there is a list a mile long of vendors (a number already noted on the DCAB) that capture these predominantly time-series-oriented data points about performance, capacity and availability using any/all of these methods or vantage points (I know, passive traffic flows are not time-series data, but patterns/usage/performance etc. can be determined from them).
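As a rough illustration of how usage patterns can be pulled out of flow records, the Python sketch below buckets flows into fixed time windows and sums the bytes per window, which yields a simple time series. The `(start time, byte count)` record shape and the `flows_to_series` function are assumptions for this example; real flow exports carry many more fields, and this version simply credits all of a flow's bytes to the window in which the flow starts.

```python
from collections import defaultdict


def flows_to_series(flows, bucket_seconds=60):
    """Turn flow records into a byte-count time series.

    flows: iterable of (start_epoch_seconds, byte_count) tuples (assumed shape).
    Returns a sorted list of (bucket_start_seconds, total_bytes).
    """
    totals = defaultdict(int)
    for start, byte_count in flows:
        # Align the flow's start time to the beginning of its bucket.
        bucket = int(start // bucket_seconds) * bucket_seconds
        totals[bucket] += byte_count
    return sorted(totals.items())


flows = [(0, 100), (30, 50), (61, 10), (125, 5)]
series = flows_to_series(flows)
```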

So, with all that data, what most of these vendors offer is two primary types of functionality: 1) a variety of graphical reports and 2) metric thresholding capabilities that produce a list of outstanding issues/alerts/alarms/events/concerns (whatever you want to call them).
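Metric thresholding in its most basic form looks something like the Python sketch below: compare each collected sample against a static limit and emit an alert line for every breach. The metric names, limits, and the `Threshold` and `evaluate` names are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str          # e.g. "cpu_utilization_pct" (hypothetical name)
    limit: float
    above: bool = True   # True: alert when value > limit; False: when value < limit


def evaluate(samples, thresholds):
    """Return alert strings for every sample that breaches its threshold.

    samples: iterable of (resource, metric, value) tuples.
    Metrics with no configured threshold are ignored.
    """
    by_metric = {t.metric: t for t in thresholds}
    alerts = []
    for resource, metric, value in samples:
        threshold = by_metric.get(metric)
        if threshold is None:
            continue
        if threshold.above:
            breached = value > threshold.limit
        else:
            breached = value < threshold.limit
        if breached:
            direction = "above" if threshold.above else "below"
            alerts.append(
                f"{resource}: {metric}={value} is {direction} {threshold.limit}"
            )
    return alerts


samples = [
    ("web01", "cpu_utilization_pct", 97.0),
    ("web01", "latency_ms", 40.0),
    ("db01", "free_disk_pct", 4.0),
]
thresholds = [
    Threshold("cpu_utilization_pct", 90.0),
    Threshold("free_disk_pct", 10.0, above=False),
]
alerts = evaluate(samples, thresholds)
```

Static limits like these are the commodity end of the spectrum; they say nothing about what is normal for a given resource at a given time of day.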

OK, so why did I organize and point all this out? So I can draw a line around where, from my perspective, most of the innovation is occurring. The above is, for the most part, a commodity these days in my eyes. Most companies have had collection/reporting/thresholding capabilities spanning multiple technology silos since pretty close to the start of enterprise networking. The reports continue to get fancier, the number of data sources a single product collects from continues to expand, etc.  Another sign of commoditization is the variety of business models through which these products are offered: open source, managed service providers, internet-distributed products, appliance deployment models and indirect sales forces, large enterprise direct sales forces, completely flexible frameworks for service providers to basically “build their own,” etc.

For the most part, the majority of technical innovation these days is occurring in the next layer above this data collection, reporting and alerting. Now let me say this: yes, there is still some great innovation occurring in the data collection realm (e.g., Xangati offering real-time Netflow down to a user level, Packet Design monitoring routing messages, NetQoS leveraging advanced TCP/IP theory to analyze where end-to-end bottlenecks are occurring). But for the most part these new data sources are being used to augment or replace currently deployed data sources, in an attempt to see things from either as many vantage points as possible or the best vantage points, so that companies avoid surprises within their unique enterprise IT environments.

So where is the serious innovation coming from? Stay tuned for part 2.


Jan 10 2008   6:12PM GMT

Came across a great BSM blog series in process



Posted by: Ryan Shopp
Firescope

During the holidays I came across this great blog by Mark Lynd over at Firescope here in Dallas, currently at part 3 of what should eventually be a 6-part series on BSM.  Since it’s on the same path and talks about concepts/ideals for IT Management, it provides insights into capabilities that also apply to automating your data center through software.  I’ve subscribed now and look forward to reading the next 3 parts.  Here are the first three parts.

Part 3 - History of BSM - a great little rundown of ITSM, ITIL, BS15000 and MOF and how they lead into the current state of BSM.

Part 2 - Intro to BSM Fundamental - defines the goal of BSM and five supporting points involved in accomplishing that goal.  The goal as defined by Mark is “…To Manage IT investments in alignment with business priorities in order to create competitive advantage.”

Part 1 - Defining BSM