Mar 24 2008 12:00PM GMT
Posted by: Ryan Shopp
DataCenter
The deeper we dig, the tougher it becomes to draw dividing lines. Originally we had two DCAB areas: Performance & Capacity as one and Availability as another. Then, after some further thought, we made an adjustment and bundled Performance & Availability together, with Capacity & Analytics as the other pairing. I still find myself questioning this when I look at things like yesterday’s announcements by Xangati (End-User Activity to Front-line Support) and NetQoS (adding Network Behavior Analysis).
The reason I question this is that many performance solutions retain their data, which allows for historical and capacity analysis. To say it another way, you will more often find a performance vendor also doing capacity management than doing availability monitoring (unless you’re talking about the big 4 or 5). So it’s time to step back, take a deeper look at approaches and functionality, and then figure out which vendors go where (aka a bottom-up perspective).
Looking back at the original post for the Data Center Automation Blueprint, called “Digging into these 6 functional areas: Performance & Capacity”, we notice a discussion that started talking through approaches:
- passive vs. active
- agent vs. agent-less
- in-line appliance vs. out-of-band appliance (e.g., span a port)
- proprietary vs. leveraging infrastructure mgmt. capabilities (e.g., Cisco NetFlow)
- outside the data center looking in vs. inside the data center itself
- reactive troubleshooting vs. proactive/predictive
So let’s start pondering these a little more.
Passive Monitoring - monitoring actual traffic flows passively to collect statistics; for example, an inline appliance, or a spanned port on a switch that mirrors a copy of all traffic flows to the appliance.
Active Monitoring - monitoring endpoints using different protocols to collect statistics; for example, creating a TCP packet and querying an application/service that should respond to it…e.g., SMTP on port 25.
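To make the active case concrete, here is a minimal sketch in Python of the kind of probe described above: open a TCP connection to a service port and record whether it answered and how long the connection took. The function name `tcp_probe` and the returned field names are my own choices for illustration, not any vendor's API, and a real product would go further (for example, speaking enough SMTP to check the greeting banner).

```python
import socket
import time


def tcp_probe(host, port, timeout=3.0):
    """Actively probe a TCP service (hypothetical helper for illustration).

    Returns a dict saying whether the port accepted a connection and,
    if it did, how many seconds the TCP connect took.
    """
    start = time.monotonic()
    try:
        # create_connection performs the TCP handshake; success means
        # something is listening and reachable on host:port.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
            return {"host": host, "port": port, "up": True,
                    "connect_seconds": elapsed}
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return {"host": host, "port": port, "up": False,
                "connect_seconds": None}
```

Calling something like `tcp_probe("mail.example.com", 25)` (a placeholder host name) on a schedule and shipping the results to a central collector is, in essence, what an agent-less active monitor does.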
Now, even starting with these two, you can start pulling things apart because of various hybrid approaches and positioning by vendors. The case can be made that passive is nothing more than collection of data that is then passed back to a centralized point…just like active, which requests the data and receives an immediate response to place within the centralized aggregation point. It gets further convoluted when some vendors allow you to use passive data as it’s gathered on the appliance, while others wait for it to be aggregated back to the central management point. So with all this confusion, where do we go from here…
I was reading another blog posting last night and thought it had an interesting way of talking about these two types of performance statistics. They called them rows & columns: rows meaning infrastructure and columns meaning flows.
So from here let’s step down one more level: what types of statistics do we want to capture?
How much (e.g., bitrate/throughput/activity)
How fast (e.g., latency/round-trip time/response time)
How ready (e.g., availability/response)
But there are also statistics in between that help us identify potential bottleneck points: processor, memory, etc.
Also, we need to be able to drill down to a specific endpoint for a specific application, or we may want to aggregate up to all traffic types for a specific data center.
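As a rough sketch of how the three questions above (how much, how fast, how ready) could be answered from the same raw data at different levels of aggregation, here is a small Python example. Everything in it is assumed for illustration: the sample records, the field names, and the simplification that all samples come from one measurement window of known length.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples gathered during a single 60-second window.
SAMPLES = [
    {"datacenter": "dc1", "endpoint": "10.0.0.5", "app": "smtp",
     "bytes": 600_000, "rtt_ms": 20.0, "ok": True},
    {"datacenter": "dc1", "endpoint": "10.0.0.5", "app": "smtp",
     "bytes": 300_000, "rtt_ms": 40.0, "ok": True},
    {"datacenter": "dc1", "endpoint": "10.0.0.6", "app": "http",
     "bytes": 1_500_000, "rtt_ms": 10.0, "ok": True},
    {"datacenter": "dc1", "endpoint": "10.0.0.6", "app": "http",
     "bytes": 0, "rtt_ms": None, "ok": False},
]


def summarize(samples, key_fields, window_seconds):
    """Group samples by key_fields and compute the three statistics."""
    groups = defaultdict(list)
    for s in samples:
        groups[tuple(s[f] for f in key_fields)].append(s)

    summary = {}
    for key, rows in groups.items():
        rtts = [r["rtt_ms"] for r in rows if r["rtt_ms"] is not None]
        summary[key] = {
            # How much: bits moved per second over the window.
            "bits_per_second": 8 * sum(r["bytes"] for r in rows) / window_seconds,
            # How fast: mean round-trip time of the samples that answered.
            "mean_rtt_ms": mean(rtts) if rtts else None,
            # How ready: fraction of samples that got a response.
            "availability": sum(1 for r in rows if r["ok"]) / len(rows),
        }
    return summary


# Drill down to one endpoint and application...
per_endpoint = summarize(SAMPLES, ["endpoint", "app"], window_seconds=60)
# ...or aggregate up to everything in a data center.
per_datacenter = summarize(SAMPLES, ["datacenter"], window_seconds=60)
```

The only thing that changes between the two views is the grouping key, which is one reason it matters that a tool keeps the raw per-endpoint data rather than only the rolled-up totals.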
So I’m going to push pause here. With all the Performance vendors out there, I know some of y’all must have found some good resources (whitepapers, etc.) that already attack this question of what statistics, from what vantage point, using what technology, etc. I’m going to do some more research and ponder how we can better articulate mapping things for Performance, Capacity & the Analytics of those details. Please feel free to share with me quality whitepapers that attempt to answer this as independently as possible.