Adventures in Data Center Automation:

NetIQ

Apr 17 2008   9:58PM GMT

Performance and Availability Management vs. Analytics - Part 1 of ?



Posted by: Ryan Shopp
nimsoft, cittio, eg innovations, Alcatel-Lucent, Analytics, Apparent Networks, Brix Networks, Compuware, Entuity, Fluke Networks, Gomez, Groundwork, Hyperic, Indicative, Application monitoring, DCAB, Firescope, HP Software, IBM Tivoli, InfoVista, Integrien, NetScout, Netuitive, Solarwinds, Systems monitoring, BMC, Quest Software, NetIQ, Network monitoring, Packet Design, Performance management, CA, Keynote, Nagios, NetQoS, Network Instruments, OpenNMS, Opnet, Xangati, ZenOSS

Over the past couple of months I’ve had the opportunity to be briefed by a number of vendors in the Performance & Availability area of the Data Center Automation Blueprint (e.g., CITTIO, eG Innovations, InfoVista, Integrien, Nimsoft).  With that and some further research, I think I’m ready to take another pass at this area of the blueprint.

First up, all these vendors use a variety of techniques to collect a variety of data from as many points of view as possible.

  • Their own server agents that collect data about systems, services, applications, databases, etc., and then aggregate it back to a centralized console
  • Agent-less centralized consoles that leverage standard infrastructure communications protocols (e.g., SNMP, RPC, ODBC, WMI, SSH, TCP, UDP, HTTP) to query or connect remotely to collect data from networks, systems, services, applications, databases, etc.
  • Passive traffic flow collectors (which can be agents or appliances) that are either in-line with the traffic flows or receive an exact copy of all traffic flows traversing a network connection (e.g., switch port uplink) through hardware vendor capabilities (e.g., port spanning)

These data collection points can be statistics about a specific IT infrastructure resource: physical devices, virtual devices, physical connections, virtual connections, or resources running on physical or virtual devices like services, processes, applications, databases, etc.

Or the data collection points can be traffic flows or end-to-end specifics, including passive traffic flows, synthetic transactions or even something as simple as pinging from remote points.

The metrics captured typically revolve around throughput, errors, utilization, latency, up/down status, etc. (there are way too many to mention here).

Having said all this, there is a list a mile long of vendors (a number already noted on the DCAB) that capture these predominantly time-series oriented data points about performance, capacity and availability using any or all of these methods or vantage points (I know, passive traffic flows are not time-series data, but patterns/usage/performance etc. can be determined from them).

So, with all that data, what most of these vendors offer is two primary types of functionality: 1) a variety of graphical reports and 2) metric thresholding capabilities that produce a list of outstanding issues/alerts/alarms/events/concerns (whatever you want to call them).
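To make the thresholding idea concrete, here is a minimal Python sketch of static metric thresholding over collected samples. It is only an illustration, not how any of the vendors above actually implement it; the resource names, metric names and limits are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    resource: str    # hypothetical resource name, e.g. a server or switch
    metric: str      # hypothetical metric name
    timestamp: int   # seconds since the start of collection
    value: float


@dataclass
class Alert:
    resource: str
    metric: str
    timestamp: int
    value: float
    threshold: float


def check_thresholds(samples, thresholds):
    """Return one alert for every sample that exceeds its metric's static limit."""
    alerts = []
    for s in samples:
        limit = thresholds.get(s.metric)
        # Metrics without a configured limit are simply not checked.
        if limit is not None and s.value > limit:
            alerts.append(Alert(s.resource, s.metric, s.timestamp, s.value, limit))
    return alerts


# Illustrative time-series samples (all values invented for this sketch).
samples = [
    Sample("web01", "cpu_utilization_pct", 0, 42.0),
    Sample("web01", "cpu_utilization_pct", 60, 97.5),
    Sample("sw-core-1", "latency_ms", 60, 12.0),
    Sample("sw-core-1", "latency_ms", 120, 250.0),
]
thresholds = {"cpu_utilization_pct": 90.0, "latency_ms": 100.0}

alerts = check_thresholds(samples, thresholds)
for a in alerts:
    print(f"{a.resource} {a.metric}={a.value} exceeds {a.threshold} at t={a.timestamp}")
```

A fixed limit per metric is the simplest possible form of this; real products layer on severity levels, durations and suppression rules.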

Ok, so why did I organize and point all this out? So I can draw a line around where, from my perspective, most of the innovation is occurring. The above is, for the most part, a commodity these days in my eyes. Most companies have had collection/reporting/thresholding capabilities spanning multiple technology silos since pretty close to the start of enterprise networking. The reports continue to get fancier, the number of data sources a single product collects from continues to expand, etc.  Another sign of commoditization is the variety of business models through which these products are offered: open source, managed service providers, internet-distributed products, appliance deployment models with indirect sales forces, large enterprise direct sales forces, completely flexible frameworks for service providers to basically “build their own,” etc.

For the most part, the majority of technical innovation these days is occurring in the next layer above this data collection, reporting and alerting. Now let me say this: yes, there is some great innovation still occurring in the data collection realm (e.g., Xangati offering real-time NetFlow down to a user level, Packet Design monitoring routing messages, NetQoS leveraging advanced TCP/IP theory to analyze where end-to-end bottlenecks are occurring). But for the most part these new data sources are being used to augment or replace currently deployed data sources, as enterprises try to see things from either as many vantage points as possible or the best vantage points, to avoid surprises within their unique IT environments.

So where is the serious innovation coming from? Stay tuned for part 2.

Mar 11 2008   1:27PM GMT

EMC adds Service Desk to Data Center Management portfolio



Posted by: Ryan Shopp
BladeLogic, DCAB, HP Software, BMC, NetIQ, Performance management, Symantec, EMC, NetQoS, Packet Design, Xangati

EMC made a move yesterday that continued to show their intent and desire to compete against the Big 4 in IT Infrastructure Management (i.e., BMC, CA, HP, IBM).  All those other players have their own Service Desk offering, so it was time to join those ranks.

Infra Corporation was acquired by EMC’s Resource Management Software Business Unit for undisclosed financial terms.

Combined with their previous acquisitions:

SMARTS - Availability & Performance Management - Q1 2005
nLayers - IT Resource Reconciliation (e.g., CMDB) - Q3 2006
Voyence - Configuration & Change Management (for Network Devices) - Q4 2007

This acquisition shows that the pace of their acquisitions (within the software group) is slowly increasing.  That being said, looking at their portfolio, I would be surprised if we don’t see another one or maybe even two (depending on the size) before the year is out.  Areas they could benefit from (aka where we could see a deal) would be Configuration & Change Management (for Systems/Applications) or a move to strengthen their Availability & Performance Management offering, specifically to make it more application performance centric.

On the CCM front there are numerous virtual & physical system configuration vendors sprouting up these days, whereas before the primary game in town was BladeLogic (or Opsware before HP acquired them).  Meanwhile, on the Performance Management front they have a variety of options that could include grabbing a smaller application performance appliance vendor (e.g., Mazu, Xangati, Packet Design) or something bigger like maybe a NetQoS.  Or even bigger and more interesting (but convoluted) could be buying out NetIQ, who continues to innovate within Attachmate (e.g., the Aegis product), or the artist formerly known as Precise Software (and now again known by the same name after Symantec spun them back out).  Probably long shots, but just thoughts to ponder, as the EMC Resource Management Software portfolio could use expansion in either or both functional areas of the DCAB.

Bottom line from my outsider’s perspective: EMC is one or two moves away from changing conversations from the big 4 to maybe the big 5.


Jan 25 2008   9:00AM GMT

Couple recent notes on CMDB, aka Resource Reconciliation



Posted by: Ryan Shopp
DataCenter, CMDB, Opalis, Scalent, Symantec, BMC, NetIQ, CA

Another great post by Glenn O’Donnell; CMDB is the new integration mechanism. I’m looking forward to seeing his forthcoming book on the same topic!

2007 TechTarget Products of the Year - Data Center include (categories by DCAB functional categories):

Resource Reconciliation (category combined with Configuration & Change) solutions from CA, BMC and Scalent

A couple of other categories that map to the DCAB are:

Process Orchestration solutions from Symantec, Opalis and CA

Performance & Capacity solutions from NetIQ, BalancePoint and CiRBA

I find the CiRBA solution very intriguing after my read and post on Innovations in Performance Management yesterday.


Jan 21 2008   1:43PM GMT

Quick Monday Summary of events from late last week/weekend



Posted by: Ryan Shopp
Compuware, Symantec, BMC, Quest Software, NetIQ, Indicative, NetQoS, NetScout

Symantec to sell off Application Performance Monitoring group.  Looks like Precise Software is back and the Symantec Data Center group will focus in on the configuration and change management side of things.

BarcampESM took place over the weekend.  Here are some materials to take a look at: BSM by Doug, and discussions around open software and open standards, including the desire for an “open agent.”  From this point forward keep track of things via the Open Management Consortium discussions.

Application Performance Management (APM) rolling review continues at InformationWeek, which recently highlighted ProactiveNet (recently acquired by BMC).  Previous reviews include Quest Software Foglight (Dec 2007), Network General (Nov 2007), Nimsoft Nimbus (Oct 2007), Compuware Vantage (Oct 2007), NetIQ AppManager (Sept 2007), NetQoS SuperAgent (Sept 2007) and Indicative (Aug 2007).  As you can see this is a very congested space, pardon the pun, but Forrester sizes it at over $2B.

Now that we’ve run through all 6 functional areas of the Data Center Automation Blueprint, we plan to discuss the impact of virtualization over the next couple of posts.  Thanks in advance to those I’ve been talking with for their perspectives on this topic.


Jan 17 2008   7:14PM GMT

What are the most desired features in IT Process Orchestration (e.g. RBA)?



Posted by: Ryan Shopp
DataCenter, Enigmatec, HP Software, IBM Tivoli, IT Process Automation, Opalis, Optinuity, RBA, RealOps, Run Book Automation, Stratavia, BMC, LANDesk, NetIQ, OpTier, Scapa Technologies

Alright, looking for feedback on this one. After talking about the players in the IT Process Orchestration space, I’m wondering what are the primary capabilities people are looking for?

Here are my top five, please feel free to throw down yours in the comments below:

  1. Drag/Drop graphical interface for designing process workflows
  2. Common, normalized Data Model of common/primary attributes
  3. Library of pre-defined, re-usable actions/triggers/processes for usage out-of-the-box (bigger the better - even a community that shares is a plus)
  4. Policy/Desired-state engine driving things
  5. Sandbox, simulator to help test workflows without impacting actual resources/instances within the production enterprise.

Beyond these five core capabilities, depending on the processes you wish to automate you need to verify what interaction/communications protocols are supported (e.g., SNMP, WMI, JMX, ODBC, Telnet/SSH/FTP to CLI, XML/Web Services). Make sure they have what you need to communicate with.

Of course, it also goes without saying (just like with any commercial product) that table stakes include RBAC security, reporting, logging and appropriate hardware/software requirements.

Bottom line, I guarantee that if you’re a medium to large enterprise you have manual processes today that these products can automate for you! The benefits include reducing errors caused by the mundane nature of the task, freeing up the people currently doing the task for other projects, and the less tangible benefit that it’s simply faster, which provides better customer service depending on the process that is automated. Make this a priority in 2008 and get one of these vendors in there to help out!

Disclosure: I have no relationships with any of the vendors in this space. The comments are all made based on my personal experiences and perspectives.


Jan 14 2008   8:42PM GMT

Digging into the DCAB 6’s functional areas: Process Orchestration



Posted by: Ryan Shopp
DataCenter, HP Software, IBM Tivoli, IT Process Automation, Opalis, Optinuity, RBA, Run Book Automation, Stratavia, BMC, NetIQ, OpTier, Scapa Technologies, LANDesk, Enigmatec, GridApp Systems

Alright, back on track with our review of the 6 functional DCAB areas. We are now onto the hottest, fastest-growing areas! First up, Process Orchestration, or what Gartner has coined Run Book Automation.

These products offer the ability to define, build, orchestrate, manage, monitor and report on workflows that automate specific IT intra- or inter-domain processes (intra = between different products for the Windows Server team; inter = between the application and network teams). There are a ton of case studies and examples on most of the players’ websites.

A couple quick examples to get a flavor include:

A monitoring product identifies a specific condition (e.g., an outage), it then checks a configuration auditing product to see if a recent change was performed for that system.

A configuration auditing product that monitors whether a device is in or out of compliance notices a situation and then automatically opens a trouble ticket. Later, it notices that the situation has been resolved, adds the appropriate details to the ticket and automatically closes it out.
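For a rough feel of what those two workflows look like once automated, here is a minimal Python sketch. Everything in it is a hypothetical stand-in (the in-memory ticket desk, the change log format, the one-hour look-back window); a real product would talk to the actual monitoring, auditing and ticketing systems through their own connectors.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    device: str
    notes: list = field(default_factory=list)
    status: str = "open"


class TicketDesk:
    """In-memory stand-in for a trouble-ticket system (hypothetical)."""

    def __init__(self):
        self.tickets = {}

    def open_ticket(self, device, note):
        ticket = Ticket(device, [note])
        self.tickets[device] = ticket
        return ticket

    def close_ticket(self, device, note):
        ticket = self.tickets[device]
        ticket.notes.append(note)
        ticket.status = "closed"
        return ticket


def on_outage(device, outage_time, change_log, window=3600):
    """Workflow 1: list changes made to the failed device within the look-back window."""
    return [
        c for c in change_log
        if c["device"] == device and 0 <= outage_time - c["time"] <= window
    ]


def on_compliance_check(device, compliant, desk):
    """Workflow 2: open a ticket when out of compliance, close it once resolved."""
    ticket = desk.tickets.get(device)
    if not compliant and (ticket is None or ticket.status == "closed"):
        return desk.open_ticket(device, "device out of compliance")
    if compliant and ticket is not None and ticket.status == "open":
        return desk.close_ticket(device, "device back in compliance")
    return ticket


# Workflow 1: an outage on db01 at t=4200 is matched against recent changes.
change_log = [
    {"device": "db01", "time": 100, "change": "config backup"},
    {"device": "db01", "time": 1000, "change": "kernel patch"},
    {"device": "web01", "time": 4000, "change": "cert renewal"},
]
suspects = on_outage("db01", 4200, change_log)

# Workflow 2: rtr7 drifts out of compliance, then is fixed.
desk = TicketDesk()
on_compliance_check("rtr7", False, desk)
on_compliance_check("rtr7", True, desk)
```

The point of the sketch is the shape of the thing: each workflow is a small, re-usable action that reads from one tool and writes to another, which is exactly the glue these products let you draw instead of script.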

Here are the companies I know about (as always, in alphabetical order)

BMC (formerly RealOps)
Enigmatec
GridApp
HP (formerly Opsware, formerly iConclude)
IBM (formerly ThinkDynamics)
LANDesk (Process Manager product)
NetIQ (Aegis product)
OpTier
Opalis
Optinuity
Scapa Technologies
Stratavia
UC4 Software
xTigo

As always, who am I missing? What are the opinions out there from users or evaluators of each platform (please chime in down in the comments section)? I have personal product exposure and experience with only BMC and Stratavia. Some of the key things I learned from those products included the value of having a normalized, common data model and “action” abstraction capabilities so you can re-use previous process actions in new workflows.

Here are a couple good reviews and write-ups for further reading if desired.

Data Center Manager Primed for IT Process Automation
IT Process Automation Overview and review of some players


Dec 28 2007   11:31PM GMT

Digging into each of these 6 functional areas: Performance and Capacity



Posted by: Ryan Shopp
DataCenter, HP Software, IBM Tivoli, InfoVista, Integrien, Netuitive, Systems monitoring, OSS, BMC, Quest Software, NetIQ, Network monitoring, Performance management, CA, Zabbix, ZenOSS, OpenNMS, Nagios, Hyperic, Groundwork, Packet Design, Apparent Networks, Xangati, Gomez, Keynote, Brix Networks, Entuity, Opnet, Network Instruments, Fluke Networks, Alcatel-Lucent, Compuware, NetScout, NetQoS, Symantec, EMC

First things first, we have many of the same vendors from the Availability & Notification functional area of this Data Center Automation Blueprint in this category. Which probably begs the question: do we combine Availability & Notification with Performance & Capacity? I know in the OSS (not Open Source Software but telco-oriented Operational Support Systems) model they do this and call it “Service Assurance.” Another name could be Service Level Management, as the two monitoring-centric functions are about ensuring service levels are met. Or I could simply call it Availability & Performance. I’ll come back to this at the end, after I type up the players in this Performance & Capacity area:

But then, we have a slew of others that have been around for quite some time now…

And some innovative up-and-comers in some unique technology/approaches…

Real-Time Behavior/Pattern Analysis through Dynamic Thresholding

IP Traffic/Packet Flow Monitoring & Analysis

Open Source Software (OSS) vendors

Whew...that was more work than I expected to pull together and I’m not done yet…  Please throw into the comments who I’ve missed (I know there have to be a few).

The major challenge here is organizing and breaking down this functional area.  There are so many approaches to obtaining performance metrics from/for the data center.  Some of the techniques and perspectives include:

  • passive vs. active
  • agent vs. agent-less
  • in-line appliance vs. out-of-band appliance (e.g., span a port)
  • proprietary vs. leverage infrastructure mgmt. capabilities (e.g., Cisco Netflow)
  • outside the data center looking in vs. inside the data center itself.
  • Reactive troubleshooting vs. Proactive Predictive

I’m going to need a part two (and maybe more) for this functional category, breaking down the pros and cons of the various approaches, which vendors do what, etc.  I also need to revisit that question from the top: do we combine this into a single “availability & performance” functional category?  For now, this first pass will have to do…


Dec 24 2007   5:52PM GMT

So let’s start to dig into each of these 6 functional areas: Availability and Notification



Posted by: Ryan Shopp
DataCenter, HP Software, IBM Tivoli, BMC, Quest Software, NetIQ, CA, EMC

So it’s time to start refining the Data Center Automation Blueprint. One way I hope to do that is through these next 6 blog posts (one for each functional DCA category) that will:

1) create a list of vendors I know about that have some capabilities for the data center in the specified functional area

2) during this first pass, attempt to break down each function by some major capabilities.

*NOTE: Help me out if I miss some vendors, miss some products within vendor product lines, etc. Again, the focus is on current/future complex data centers, so I won’t be including tools like Ipswitch What’s Up Gold or products that their vendors are phasing out (end-of-life), e.g., NetView.

Event consolidation & root cause analysis

A new product segment that has materialized, and that for now I’m going to place here, is log management, where you maintain historical event/message/alert logs, report on them, and apply advanced indexing and searching technology to quickly find the “needle in the haystack” problems. It also has application beyond operational availability management of the data center, within the security space for compliance management.

Next up will be the current Data Center Automation Functional Area of Performance and Capacity.


Dec 4 2007   10:04PM GMT

What are the Six Functional Areas of Data Center Automation



Posted by: Ryan Shopp
DataCenter, Alterpoint, BladeLogic, Cassatt, Integrien, IT Process Automation, HP Software, IBM Tivoli, InfoVista, BMC, Microsoft Windows, NetIQ, Netuitive, Opalis, Optinuity, PlateSpin, RealOps, Scalent, Stratavia, Veeam, Vizioncore

Alright, here is my first pass at a graphic I’m attempting to build that will capture the spirit of my previous posts (this is still a work in progress, as previously mentioned):

I’m attempting to come up with a 30,000 foot reference model (functionality focused) for when you’re building out a data center’s software automation architecture.

The yellow areas are the 6 current areas I’ve functionally identified. The tricky part is based on the complexities of each category in the Data Center Infrastructure (e.g., Network vs. System), many of the functional areas require technical depth and audience-specific focus (e.g., network engineers vs. SAP administrators). The arrows are trying to capture that.

I know this still needs work but this is an evolution, and I only have a little time each week to currently work on it during these blog posts.

Below the graphic are some current vendors, by function, with product(s) in each function that I’ve mentioned in previous blog postings so far.

data-center-automation-reference-model-v1.jpg

  • Configuration & Change: BMC (Marimba), CA, EMC (Voyence), HP (Opsware), IBM, BladeLogic, Cassatt, AlterPoint, PlateSpin, Scalent, Veeam, Vizioncore
  • Security & Protection: Symantec, IBM, EMC, McAfee, nCircle, Lumension, ArcSight
  • Performance & Capacity: BMC, CA, EMC, HP, IBM, Quest, InfoVista
  • Availability & Notification: BMC, CA, EMC, HP, IBM, Microsoft, Quest, Integrien, Netuitive, NetIQ
  • Process Orchestration: BMC (RealOps), HP (iConclude), Opalis, Optinuity, NetIQ, Stratavia
  • Resource Reconciliation: Symantec, IBM, HP, BMC, EMC

I know I’ve missed many and also it would probably be helpful to not simply mention the company but also the product name but that will have to wait until another time.


Nov 28 2007   8:22PM GMT

IT Operations Process Automation - aka “Run Book” continues to mature!



Posted by: Ryan Shopp
DataCenter, BMC, RealOps, Optinuity, Opalis, Alterpoint, BladeLogic, HP Software, IT Process Automation, Run Book Automation, RBA, NetIQ, Stratavia

This is an area I haven’t hit on yet but will also need to fit into the reference model (one of these days I’ll get back on track with that).

Lots of action in what Gartner and others are calling Run Book Automation, or RBA!  So let’s summarize the latest.

Optinuity launched a new version of their product that has also been re-branded. They are attempting to elevate and differentiate themselves beyond the other RBA vendors by re-focusing their primary target audience (from IT Operations Executives to Enterprise Application Executives) and adding specific functionality to provide a self-contained (not reliant on IT Operations) closed-loop, automated process (e.g., application monitoring).  The goal, per talking with CEO Scott Stouffer, is to get as close to the enterprise applications themselves as possible (e.g., the teams that develop and/or perform the advanced support/administration for them).  One example discussed was a unique “locked account” scenario that was happening thousands of times a month and thus wasting hundreds, if not thousands, of man hours a month!

Opalis launched a new version of their product (version 5.4) which includes some intriguing enhancements in the areas of automating virtualization and the ability to run simulations of process automation workflows prior to deployment in the live environment. They also continue to sport a very impressive list of out-of-the-box, IT Operations centric connectors for products/companies that don’t have a process automation product, including BladeLogic, EMC, IBM, Microsoft and Symantec, along with support for various products from the other big 4 vendors that do have competing products (e.g., BMC, CA, HP).

HP announced their re-branded suite that includes the former iConclude product. HP has so many pieces for automating the data center (beyond the RBA capabilities)…the question now is whether they can execute on its organization (e.g., product bundling/branding), integration (e.g., focus on delivering the right use cases end-to-end) and deployments (e.g., making this all come together inside complex enterprises).

BMC made their move into this space back in the summer (July) with their acquisition of RealOps. They re-branded this product as BMC Run Book Automation and are using it to tighten up and automate the process flows between their other products: Remedy, Atrium, Marimba, etc. Of course you can still use the platform to integrate with non-BMC products, but they are going to focus on their own product line.

NetIQ recently threw their hat into the ring also. Now a subsidiary of Attachmate, they built their solution internally over the past couple of years (prior to BMC or HP joining in). Their focus appears to be, in my opinion, around helping ensure their product AppManager stays competitive with other System/Application monitoring vendors (e.g., BMC, HP, IBM, CA, Microsoft). The challenge will be that the service desks they would integrate with are part of companies that now also offer this Run Book Automation technology. So basically, if you’re a current NetIQ customer and happy, then you now won’t be as motivated to go to BMC or HP, who own all three components (i.e., system monitoring, process automation and service desk).  Smart strategy move to continue innovating and keep current customers happy.

Stratavia also announced their latest product release in October.  Originally more focused on automation tasks for databases, they continue to evolve their product to be competitive with the other non-database centric but more system/applications centric vendors.  This database automation functionality evolved from their original business model of being a managed service provider for remote database management (at that time they were called ExtraQuest).

To that point, it’s amazing how many of these RBA or IT Process Automation companies come out of operational businesses.  Stratavia was originally a managed services provider; RealOps came out of the consulting ranks from Windward Consulting.  This makes sense for various Data Center Automation functions…they are very complex and challenging tasks that are originally tackled with service-based approaches, only later to be automated with software.  Beyond this RBA sector, another couple of vendors that started from similar origins would be Opsware (originally a managed service provider) and BladeLogic (whose founders were previously responsible for operating the infrastructure for a managed service provider).

I also read a recent Forrester report by Jean-Pierre Garbani with the first market sizing forecast for the IT process automation software space: the market is about $50 million today and is forecast to grow to about $700 million by 2015.  Now that is some SERIOUS GROWTH!
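Just to put a rough number on that growth, here is a quick back-of-the-envelope calculation in Python. The two dollar figures are the ones quoted above; treating “today” as 2007 (so an 8-year horizon) is my own assumption, not something stated in the Forrester numbers.

```python
start_value = 50.0     # USD millions, "today" (assumed here to mean 2007)
end_value = 700.0      # USD millions, forecast for 2015
years = 2015 - 2007    # assumed 8-year horizon

growth_multiple = end_value / start_value
# Compound annual growth rate: the constant yearly rate that turns
# start_value into end_value over the given number of years.
cagr = growth_multiple ** (1 / years) - 1

print(f"{growth_multiple:.0f}x overall, roughly {cagr:.0%} per year")
```

Under that assumption the forecast works out to a 14x increase, or a compound rate in the high thirties of percent per year.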

One last item, I want to give credit where credit is due to a former boss, colleague and friend Dave Williams who is now at Gartner.  I remember him talking about this space looong before anyone else!  That is recognized in this write-up by internetnews.com. When he left AlterPoint back in February 2006 I remember talking about these products over lunch a number of times.  I had the chance to work closely with the RealOps executive team when AlterPoint built a partnership and integration with them.

So if you have a very, very complex IT Operations environment or are seeing skilled people doing very unskilled/mundane tasks over and over and over…it’s time to check out one or more of these vendors!

So what other “Run Book Automation” vendors are out there, and what have been your experiences so far with their products, the company itself and their partners?  Please chime in with your comments, as I know there are a ton of people evaluating and using these products these days!