Adventures in Data Center Automation:

CA

Apr 17 2008   9:58PM GMT

Performance and Availability Management vs. Analytics - Part 1 of ?



Posted by: Ryan Shopp
nimsoft, cittio, eg innovations, Alcatel-Lucent, Analytics, Apparent Networks, Brix Networks, Compuware, Entuity, Fluke Networks, Gomez, Groundwork, Hyperic, Indicative, Application monitoring, DCAB, Firescope, HP Software, IBM Tivoli, InfoVista, Integrien, NetScout, Netuitive, Solarwinds, Systems monitoring, BMC, Quest Software, NetIQ, Network monitoring, Packet Design, Performance management, CA, Keynote, Nagios, NetQoS, Network Instruments, OpenNMS, Opnet, Xangati, ZenOSS

I’ve had an opportunity to be briefed over the past couple months by a number of current Data Center Automation Blueprint’s Performance & Availability vendors (e.g., CITTIO, eG Innovations, InfoVista, Integrien, Nimsoft).  With that and some further research I think I’m ready to take another pass at this area of the blueprint.

First up, all these vendors use a variety of techniques to collect a variety of data from as many points of view as possible.

  • Their own server agents that collect data about systems, services, applications, databases, etc. and then aggregate it back to a centralized console
  • Agent-less centralized consoles that leverage infrastructure-standard communications protocols (e.g., SNMP, RPC, ODBC, WMI, SSH, TCP, UDP, HTTP) to query or connect remotely to collect data from networks, systems, services, applications, databases, etc.
  • Passive traffic flow collectors (which can be agents or appliances) that are either in-line with the traffic flows or receive an exact copy of all traffic flows traversing a network connection (e.g., switch port uplink) through hardware vendor capabilities (e.g., port spanning)

These data collection points can be statistics about a specific IT infrastructure resource: physical devices, virtual devices, physical connections, virtual connections or resources running on physical or virtual devices like services, processes, applications, databases, etc.

Or the data collection points can be traffic flows or end-to-end specifics including passive traffic flows, synthetic transactions or even something as simple as pinging from remote points.

The metrics that are captured typically revolve around throughput, errors, utilization, latency, up/down status, etc. (there are way too many to mention here).

After saying all this, there is a list a mile long of vendors (a number already noted on the DCAB) that capture these predominantly time-series oriented data points about performance, capacity and availability using any or all of these methods or vantage points (I know, passive traffic flows are not time-series data, but patterns, usage, performance, etc. can be determined from them).
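To make that parenthetical concrete, here is a minimal sketch of how raw flow records could be rolled up into a time series. The record layout (start time, source, destination, byte count) and the function name are my own assumptions for illustration, not any vendor's format.

```python
from collections import defaultdict

# Hypothetical flow records: (start_epoch_seconds, src, dst, bytes)
flows = [
    (1000, "10.0.0.1", "10.0.0.9", 1500),
    (1030, "10.0.0.2", "10.0.0.9", 500),
    (1065, "10.0.0.1", "10.0.0.9", 4000),
]

def bytes_per_interval(records, interval=60):
    """Sum flow bytes into fixed-width time buckets keyed by bucket start time."""
    totals = defaultdict(int)
    for start, _src, _dst, nbytes in records:
        bucket = start - (start % interval)  # align to the start of the interval
        totals[bucket] += nbytes
    return dict(sorted(totals.items()))

print(bytes_per_interval(flows))  # {960: 1500, 1020: 4500}
```

Once the flows are bucketed like this, the same reporting and thresholding used for polled metrics can be applied to them.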

So, with all that data, what most of these vendors offer are two primary types of functionality: 1) a variety of graphical reports and 2) metric thresholding capabilities that produce a list of outstanding issues/alerts/alarms/events/concerns (whatever you want to call them).
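As a rough illustration of the thresholding half, here is a minimal sketch of static thresholds turning collected samples into an alert list. The metric names, limits and host names are hypothetical, and real products layer scheduling, suppression and escalation on top of this.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    resource: str   # e.g. "web01" (hypothetical host name)
    metric: str     # e.g. "cpu_utilization_pct"
    value: float

# Hypothetical static thresholds: metric name -> (warning, critical)
THRESHOLDS = {
    "cpu_utilization_pct": (80.0, 95.0),
    "latency_ms": (200.0, 500.0),
}

def evaluate(samples):
    """Return (severity, sample) pairs for samples that breach a threshold."""
    alerts = []
    for s in samples:
        limits = THRESHOLDS.get(s.metric)
        if limits is None:
            continue  # no threshold configured for this metric
        warning, critical = limits
        if s.value >= critical:
            alerts.append(("critical", s))
        elif s.value >= warning:
            alerts.append(("warning", s))
    return alerts

samples = [
    Sample("web01", "cpu_utilization_pct", 97.0),
    Sample("web02", "cpu_utilization_pct", 42.0),
    Sample("db01", "latency_ms", 250.0),
]
for severity, s in evaluate(samples):
    print(f"{severity.upper()}: {s.resource} {s.metric}={s.value}")
```

The limits here are fixed numbers that someone has to pick and maintain per metric, which is part of why I consider this layer a commodity.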

Ok, so why did I organize and point all this out? So I can draw a line around where most of the innovation, from my perspective, is occurring. The above is, for the most part, a commodity in my eyes these days. Most companies have had collection/reporting/thresholding capabilities spanning multiple technology silos since pretty close to the start of enterprise networking. The reports continue to get fancier, the number of data sources a single product collects from continues to expand, etc.  Another sign of commoditization is the variety of business models for offering these products: open source, managed service providers, internet-distributed products, appliance deployment models and indirect sales forces, large enterprise direct sales forces, completely flexible frameworks for service providers to basically “build their own,” etc.

For the most part, the majority of technical innovation these days is occurring in the next layer above this data collection, reporting and alerting. Now let me say this: yes…there is some great innovation still occurring in the data collection realm (e.g., Xangati offering real-time Netflow down to a user level, PacketDesign monitoring routing messages, NetQoS leveraging advanced TCP/IP theory to analyze where end-to-end bottlenecks are occurring). But, for the most part, these new data sources are being used to augment or replace currently deployed data sources in an attempt to see things from either as many vantage points as possible or the best vantage points, to avoid surprises within a unique enterprise IT environment.

So where is the serious innovation coming from…stay tuned for part 2.

Mar 17 2008   1:22PM GMT

BMC makes the big move, buys BladeLogic for $800M



Posted by: Ryan Shopp
BladeLogic, HP Software, IBM Tivoli, RealOps, BMC, CA, EMC

So BMC is the one, not IBM or EMC, that decides to piece it all together.  Responding to HP acquiring Opsware (July ‘07), BMC, in less than a year, has acquired RealOps (July ‘07), Emprisa (Oct ‘07) and now BladeLogic, pulling together the critical components for their DCA strategy that all tie in nicely with Remedy, Atrium, etc.  Very impressive!  They have most of the pieces; now it’s about execution on the vision/strategy.

So HP & BMC have acquired the major pieces. IBM has many of the pieces too, but some are showing their age versus the newer products that were acquired by their competitors.  CA has been the quietest of all the players, so I would expect them to make some moves to shore things up ASAP (but most likely, at this point, having to pay premiums based on previous CCM valuations).  Meanwhile, EMC has been methodically building itself up in the hope of making a run at knocking off one of the big 4 in IT Infrastructure Management, but it still has some serious work to do given the recent moves of some of the current big 4.

Data Center Automation is about to hit the major growth curve now that multiple big guys have strong portfolios in the game.  As predicted, 2008 is going to be hot for Data Center Automation!


Mar 5 2008   7:59PM GMT

Top Enterprise Management Tools vs. Data Center Automation Blueprint



Posted by: Ryan Shopp
DataCenter, Analytics, Application monitoring, CMDB, DCAB, HP Software, IBM Tivoli, InfoVista, IT Process Automation, Netuitive, RBA, RealOps, Run Book Automation, Systems monitoring, BMC, Network configuration, Network monitoring, Networkingchannel, Performance management, CA, NetQoS, Opnet, Tideway

I was doing some “light” reading this morning and came upon this recent article:  Top 10 Enterprise Management Tools

It’s focused on Complete Enterprise Management, not specifically on the Data Center, so I thought I would summarize and then compare/contrast/discuss:

  • Network Fault & Performance: CA eHealth & Spectrum
  • Consolidated Event Management: IBM Tivoli Netcool
  • Service Impact Monitoring: IBM Tivoli Business Service Manager & Service Level Advisor
  • Application Discovery Mapping: Tideway Foundation
  • Business Intelligence: Cognos
  • ITSM Workflow, CMDB and Service Desk: BMC Remedy ITSM and Atrium
  • Network & Systems Configuration Management: HP Automation (formerly Opsware SAS & NAS)
  • Process Automation: BMC RunBook Automation

Since it isn’t data center centric, it’s light on automated management for applications & databases.  It also chooses to stay away from the very congested and sometimes confusing security/protection market.

Next up, I thought it would be fun to do a quick mapping to the Data Center Automation Blueprint.

  • Network Fault & Performance, Consolidated Event Management, Service Impact Monitoring = Availability & Performance
  • Application Discovery Mapping, CMDB = IT Resource Reconciliation
  • Business Intelligence = Analytics (maybe…Analytics is still a work in progress…need to figure out this vs. BSM etc)
  • ITSM Workflow, Service Desk = outside of DCAB listed as Manual Task Orchestration

I was surprised not to see an End-User Application Performance Monitoring category.  These products either do their duty from passive agents on the endpoint or from data center appliances using slick algorithms, TCP/IP theory, etc.  Maybe that could have indirectly been rolled under Network Fault & Performance, as CA acquired Wily, which offers that.  The other one missing was more towards Capacity Planning and Trending Analytics, either based off historical data like what Opnet offers or from real-time data patterns like what Netuitive offers.

Needless to say, I found it a really nice write-up and summary of those products/offerings.  The only thing I struggle with is that all of the big 4 (BMC, CA, HP, IBM) are represented in this mix, which means you will have 4 sales guys all continuously battling it out to grab more land.  This may be good from a cost competition standpoint, but it’s a real fiasco for making sure all parts are playing nicely with each other or simply managing those vendor relationships.  Bottom line, you’re always going to have at least one of the big 4 in there as they continue to snap up the innovative smaller companies/technologies to enhance their portfolios and offer differentiation.  So I’d typically recommend a strategy where you pick 2 of the big 4 and keep them in check versus each other while continually looking for those innovative start-ups to fill in the gaps.  Here is an example of how you could do this using the categories in the original article.

  • Network Fault & Performance: HP Network Node Manager, Operations Manager, Performance Insight
  • Consolidated Event Management: IBM Tivoli Netcool
  • Service Impact Monitoring: IBM Tivoli Business Service Manager & Service Level Advisor
  • Application Discovery Mapping: IBM Tivoli Application Dependency Discovery Manager
  • Business Intelligence: Cognos (which IBM recently acquired)
  • ITSM Workflow, CMDB and Service Desk: HP AssetCenter (formerly Peregrine)
  • Network & Systems Configuration Management: HP Data Center Automation (formerly Opsware SAS & NAS)
  • Process Automation: HP Operations Orchestration (formerly iConclude that Opsware acquired)

Or, if you want to completely rebel and go the non-big 4 route, take a look at the above mappings to the DCAB and look for a name that’s not big-4.  Example:  Network Fault & Performance: InfoVista or NetQoS


Jan 25 2008   9:00AM GMT

Couple recent notes on CMDB, aka Resource Reconciliation



Posted by: Ryan Shopp
DataCenter, CMDB, Opalis, Scalent, Symantec, BMC, NetIQ, CA

Another great post by Glenn O’Donnell: CMDB is the new integration mechanism. I’m looking forward to seeing his forthcoming book on the same topic!

2007 TechTarget Products of the Year - Data Center include (categories by DCAB functional categories):

Resource Reconciliation (category combined with Configuration & Change) solutions from CA, BMC and Scalent

A couple of other categories that map to the DCAB are:

Process Orchestration solutions from Symantec, Opalis and CA

Performance & Capacity solutions from NetIQ, BalancePoint and CiRBA

I find the CiRBA solution very intriguing after my read and post on Innovations in Performance Management yesterday.


Jan 2 2008   11:10PM GMT

Digging into the DCAB’s 6 functional areas: Configuration and Change



Posted by: Ryan Shopp
DataCenter, Ecora, BladeLogic, Cassatt, Configuresoft, HP Software, IBM Tivoli, mValent, Scalent, Solidcore, BMC, CA, EMC

There seem to be two key components or approaches to this functional area. Some vendors are focused on auditing & monitoring the configuration/state of a device while others are focused on that and the provisioning/deployment of configuration/software to a device. Typically, the vendors going across data center technology categories are audit-centric.

Vendors doing both Deployment & Auditing (listed alphabetically)

  • AlterPoint (for network devices)
  • BladeLogic (for applications, servers)
  • BMC (for applications, servers with Marimba acquisition and networks with Emprisa acquisition)
  • CA (for systems)
  • Cassatt (for systems, applications, networks)
  • Cisco (for network devices)
  • ConfigureSoft (for applications, servers)
  • Ecora (for servers, applications)
  • EMC (for network with Voyence acquisition, for storage with ControlCenter)
  • HP (former Opsware for applications, servers, networks, storage)
  • IBM Tivoli (for applications, servers)
  • mValent (for applications)
  • Phurnace (for applications)
  • Scalent Systems (for servers, applications)
  • Symantec (for servers, applications with Jareva, Altiris and storage with CommandCenter)

Vendors focused on Auditing

Vendors that do both, primarily for desktops, and extend that to provide some server configuration and change capabilities for the data center

Just as with my previous post on Performance & Capacity, I’m not done with this one. I started going through the laundry list of vendors in the “virtualization” space but simply ran out of my allocated time for today. So I’ll pick back up on it at a later time.


Dec 28 2007   11:31PM GMT

Digging into each of these 6 functional areas: Performance and Capacity



Posted by: Ryan Shopp
DataCenter, HP Software, IBM Tivoli, InfoVista, Integrien, Netuitive, Systems monitoring, OSS, BMC, Quest Software, NetIQ, Network monitoring, Performance management, CA, Zabbix, ZenOSS, OpenNMS, Nagios, Hyperic, Groundwork, Packet Design, Apparent Networks, Xangati, Gomez, Keynote, Brix Networks, Entuity, Opnet, Network Instruments, Fluke Networks, Alcatel-Lucent, Compuware, NetScout, NetQoS, Symantec, EMC

First things first, we have many of the same vendors from the Availability & Notification functional area of this Data Center Automation Blueprint in this category. Which probably begs the question: do we combine Availability & Notification with Performance & Capacity? I know in the OSS (not Open Source Software but telco-oriented Operational Support Systems) model they do this and call it “Service Assurance.” Another name could be Service Level Management, as the two monitoring-centric functions are about ensuring service levels are met…or do I simply call it Availability & Performance? I’ll come back to this at the end after I type up the players in this Performance & Capacity area:

But then, we have a slew of others that have been around for quite some time now…

And some innovative up-and-comers in some unique technology/approaches…

Real-Time Behavior/Pattern Analysis through Dynamic Thresholding

IP Traffic/Packet Flow Monitoring & Analysis

Open Source Software (OSS) vendors

Whew…that was more work than I expected to pull together and I’m not done yet…  Please throw into the comments who I’ve missed (I know there have to be a few).

The major challenge here is organizing and breaking down this functional area.  There are so many approaches to obtaining performance metrics from/for the data center.  Some of the techniques and perspectives include:

  • passive vs. active
  • agent vs. agent-less
  • in-line appliance vs. out-of-band appliance (e.g., span a port)
  • proprietary vs. leverage infrastructure mgmt. capabilities (e.g., Cisco Netflow)
  • outside the data center looking in vs. inside the data center itself.
  • Reactive troubleshooting vs. Proactive Predictive

I’m going to need to have a part two (and maybe more) for this functional category, breaking down the pros and cons of the various approaches, which vendors do what, etc.  I also need to revisit that question from the top: do we combine this into a single “availability & performance” functional category?  For now, this first pass will have to do…


Dec 24 2007   5:52PM GMT

So let’s start to dig into each of these 6 functional areas: Availability and Notification



Posted by: Ryan Shopp
DataCenter, HP Software, IBM Tivoli, BMC, Quest Software, NetIQ, CA, EMC

So it’s time to start refining the Data Center Automation Blueprint. One way I hope to do that is through these next 6 blog posts (one for each functional DCA category) that will:

1) create a list of vendors I know about that have some capabilities for the data center in the specified functional area

2) during this first pass, attempt to break down each function by some major capabilities.

*NOTE: Help me out if I miss some vendors, miss some products within vendor product lines, etc. Again, the focus is on the current/future complex data center, so I won’t be including tools like Ipswitch What’s Up Gold or products that are on their way out (end-of-life) at their vendor (e.g., NetView).

Event consolidation & root cause analysis

A new product segment has materialized that, for now, I’m going to place here: log management, where you maintain historical event/message/alert logs, report on them, and apply advanced indexing and search technology to quickly find the “needle in the haystack” problems. It also has applications beyond operational availability management of the data center, in the security space for compliance management.

Next up will be the current Data Center Automation Functional Area of Performance and Capacity.


Dec 14 2007   4:50PM GMT

Recent activities in Configuration Management, ’tis the season of webinars



Posted by: Ryan Shopp
DataCenter, Alterpoint, BladeLogic, mValent, Solidcore, Ecora, Configuresoft, BMC, CA, EMC, NCCM, Network configuration

December is a time when things typically “slow” down for the holidays.  Many data centers are under a freeze where no major changes can occur (or should occur), etc.  So I guess it’s a great time to do a little research for next year.  Bring on the webinars, which many vendors seem to be offering up this time of year:

BladeLogic had a very successful webinar, with over 400 people, where real customers talked about the real benefits of configuration management automation for their data centers.  The press releases on the survey results & the webinar sound like an infomercial (which they should, since it’s marketing).  I was hoping to take a watch, but their archived link doesn’t allow me to register and watch.  I enter my registration information and it says the event is full.  Oh well, another time.

ConfigureSoft also had a webinar, more process centric (PLAN-DO-CHECK-ACT: Closing the Loop on Change), but it’s archived and I was able to check that one out.

Tripwire, not wanting to be outdone, had 4 different webinars recently.  The one I checked out was The Five A’s of a Healthy Data Center, where the focus was on the 5-step process of monitoring your configurations in the data center (Assessing, Assuring, Auditing, Achieving, Automating).

Ecora back on the 11th had a webinar around surviving audits through monitoring your configurations.  Unfortunately, I couldn’t find it archived anywhere to check it out.

Solidcore didn’t have a new webinar to offer but did put out a press release highlighting how they can help with the upcoming PCI deadline on December 31st with monitoring configurations.

mValent, which focuses on the very specific challenges of application/middleware configuration management, had a very interesting press release with some hard ROI numbers:

  • The average application migration project takes 20+ man-weeks with an average labor cost of just over $72,000.
  • Total IT direct-headcount costs associated with application migration initiatives range from $500K to $800K.

AlterPoint, focused on the network side of the data center, announced that their analytics solution can now extend/complement a customer’s previous investment in CiscoWorks (if they are a predominantly Cisco-networked Data Center) without requiring replacement.

I also looked to see if there was anything new from HP (Opsware), EMC (Voyence), BMC, IBM or CA but didn’t see anything specific.  And I recently talked about configuration vendors that are focused on virtualization, so I didn’t rehash that.

I know I must have overlooked some vendor(s) out there. Throw your information in the comments section (if you’re the vendor), or if you’re an enterprise using another product, please tell us who you’re using and what you think.  I’ll take a look and update the post if appropriate.


Dec 3 2007   11:41PM GMT

Availability Management, so what’s been going on here?



Posted by: Ryan Shopp
DataCenter, Netuitive, HP Software, IBM Tivoli, Symantec, BMC, Microsoft Windows, CA, EMC, Quest Software, Integrien

As mentioned in my November 2007 round-up, I haven’t given any love to automation products that watch for outages, faults or other infrastructure availability events.

Part of the reason for this oversight is that these days most data centers are locked into a product from the “big 4” vendors, BMC (Performance, formerly Patrol), CA (formerly Aprisma), HP (NNM, Operations), IBM (NetView, formerly Micromuse), or the “upcoming 5” vendors, EMC, Oracle, Microsoft, Quest Software and Symantec, due to their overall IT infrastructure architecture and strategy.

But there are other innovative players in town to consider for replacing or complementing these bigger guys. Self-learning technologies are being advanced by companies like Netuitive and Integrien. These technologies are focused on monitoring real-time events and then leveraging mathematical algorithms to estimate baselines and set thresholds in an attempt to accurately predict system and service level degradation.
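To give a feel for the general idea (and only the general idea: this is not how Netuitive or Integrien actually do it, which I can’t speak to), here is a minimal sketch of a self-adjusting threshold. It assumes a simple rolling baseline of mean plus or minus k standard deviations over the previous few samples; the window size, k and the sample data are all made up for illustration.

```python
from statistics import mean, stdev

def dynamic_alerts(values, window=5, k=3.0):
    """Flag indices whose value falls outside mean +/- k*stdev of the previous `window` values."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]      # learn the baseline from recent samples only
        baseline = mean(history)
        spread = stdev(history)
        lower, upper = baseline - k * spread, baseline + k * spread
        if not (lower <= values[i] <= upper):
            flagged.append(i)
    return flagged

latency_ms = [100, 102, 98, 101, 99, 100, 180, 101]
print(dynamic_alerts(latency_ms))  # [6]
```

Nobody had to configure a fixed latency limit here; the band is derived from recent behavior. One known weakness of a baseline this naive is that an outlier widens the band for the samples that follow it, so treat it as a starting point only.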


Nov 2 2007   3:11PM GMT

Why not AlterPoint, NCCM continues to consolidate?



Posted by: Ryan Shopp
Network configuration, NCCM, Alterpoint, DataCenter, HP Software, IBM Tivoli, InfoVista, BMC, Microsoft Windows, EMC, BladeLogic, CA

Now let me be clear here. I’m very biased on this topic. Full disclosure: I spent almost 4 years of “blood, sweat and tears” at AlterPoint, from its version 1.0, no-revenue days through its last leadership transition. Back in the summer of 2006 we had a new leadership team come in with new blood/energy that really invigorated things. This was needed since the company, like Voyence, had been around since early 2000, and in the world of start-ups you work lots of 80-hour-plus weeks that can wear and tear on a person.

What I’m perplexed by is that over the past 30 days two other Network Configuration/Change Management vendors have been consolidated by major players: Voyence by EMC and Emprisa by BMC. So why not AlterPoint? That is what I’ve been pondering over the last couple of days.  Time to jump on my soapbox for a minute or two…

AlterPoint has a marquee customer list that includes Citigroup, HSBC, Microsoft, Yahoo, Hertz, TJX, Walgreen, Cingular (now AT&T Wireless) and numerous others, a list that, from my perspective and in my opinion, easily eclipses what Voyence or Emprisa had captured.

Additionally, AlterPoint is diversified in their offerings. They recently announced specific new applications that leverage the core NCCM technology for Compliance & Analytics. Finally, talk about being a good corporate citizen: they have led the way for a commercial IT management vendor by taking a portion of their revenue-producing product and productizing it for open source (called ziptie). So they have a thriving customer list, are not a “one trick pony” and are giving back/building a strong community behind their capabilities. What’s not to love :)

So if we take a quick look at the landscape, that leaves IBM, Symantec, maybe CA (they had an NCCM-type module included in the Aprisma acquisition) and maybe Microsoft (they recently OEM’ed InfoVista, which I discussed in my last posting) with a big hole! So in my opinion the best NCCM business/product is still out there on the market, so let the bidding begin. :) The longer any of those players wait, the further behind they will get in delivering end-to-end use cases for their customers that require the capabilities of NCCM.

Now my hat must go off to Opsware, who was the first to see and execute on the end-to-end configuration vision for data centers. They acquired Rendition back in late 2004, and once they brought things together their valuation continued to increase, which likely assisted with the recent acquisition of Opsware by HP.

Bottom line here: if you’re not currently leveraging an NCCM product, either commercial or open source, let me say they are amazing products that help save time, money and frustration for network engineering and operations. These automation tools are critical to the data center and beyond, and they complement similar automation tools on the applications/systems side (those offered by BladeLogic, Opsware, etc). More on those automation players in upcoming posts. I would also recommend taking time to subscribe to, or at least check out, the AlterPoint-sponsored blog highlighting key evolutions and perspectives in Network Management.

As noted in my personal about section, these are my own opinions, based on personal beliefs and public knowledge. I left AlterPoint back in September 2006 for some new opportunities but continue to be an avid fan and cheerleader of the NCCM space, all the vendors (competition is a good thing) and especially my friends still over at AlterPoint!