Data Visualization archives - Software Quality Insights

May 3 2009   8:47PM GMT

Data visualization by example: Real-world uses



Posted by: Michael Kelly
Software testing, data visualization, software development

Data visualization technologies have advanced at warp speed. While Edward Tufte might be able to break data visualization down into its basic elements, for those of us who don’t live the topic, it can be hard to keep up and, more importantly, put these new technologies to work. Just to show it can be done and is being done, I’m briefly sharing some real-world use cases in this post.

Simply put, it’s all about the visuals, or every picture telling a story. My love of graphics is one reason why I’m a regular reader of the Wall Street Journal. I like WSJ’s front-page news crib sheet and financial news; but I’m really into their charts and graphics. I find that I’m constantly cutting out charts that I see and saving them as examples of a creative way to display complicated information. (Yes, I’m the one guy still reading the paper version.) This story by Steve Myers outlines how media outlets are trying to leverage visualization to better tell their stories.

For testers, data visualization does more than help us communicate our findings, such as performance test results; it can also help us recognize problems. For example, if you have a bunch of data (database transactions, source code, network traffic, log files, etc.) and you’re looking for trends or possible issues, figuring out different ways to graph that information can help you understand what you’re looking at.
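As a rough sketch of that idea, the Python below buckets error entries from a log by minute and prints a text bar chart, so a spike stands out at a glance. The log layout ("DATE TIME LEVEL message"), the sample lines, and the function names are all assumptions made for illustration; a real log would need its own parsing.

```python
from collections import Counter

# Hypothetical log lines. The "DATE TIME LEVEL message" layout is an
# assumption made for this sketch, not a format from the post.
SAMPLE_LOG = [
    "2009-05-03 20:47:01 INFO request served",
    "2009-05-03 20:47:15 ERROR timeout calling database",
    "2009-05-03 20:48:02 ERROR timeout calling database",
    "2009-05-03 20:48:09 ERROR connection reset",
    "2009-05-03 20:48:40 INFO request served",
    "2009-05-03 20:49:30 ERROR timeout calling database",
    "malformed line",
]


def errors_per_minute(lines):
    """Count ERROR entries per minute, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3:
            continue  # skip lines that do not fit the assumed layout
        date, time, level = parts[:3]
        if level == "ERROR":
            counts[f"{date} {time[:5]}"] += 1
    return counts


def text_chart(counts):
    """Return one row per key, sorted by key, with a '#' per counted entry."""
    return [f"{key} {'#' * counts[key]}" for key in sorted(counts)]


for row in text_chart(errors_per_minute(SAMPLE_LOG)):
    print(row)
```

The same counts could be handed to any charting tool; the text bars are only there to keep the sketch free of dependencies.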

Because it can be difficult to talk about hard and fast rules for when to visualize your data, I thought I might instead offer some examples I’ve seen of creative uses of visualization:

  • I’ve seen a development team create an animated view of their source code commits. You could run the animation over a repository, project, or branch for a period of time that you set. Why’s that a big deal? Because you could quickly see where all the development activity took place, who was working on it, and when. As a tester, this can give me insight into where to focus, because it tells me where there’s churn in the code.
  • I’ve seen an architect take runtime data, like log files, and use them to build graphs of customer transactions. You could then scroll through all the different images quickly and look for the outliers. Each time you stopped on an image that didn’t look roughly like the others, odds are you had found some sort of interesting edge condition or error that took place. As a tester, this can help you not only recognize when you (or an automated test) might have found an issue, but also identify and document test scenarios.
  • I’ve seen a test manager chart automated tests that ran off of production data to see what kind of coverage they were getting from their random samplings. Because the tests ran thousands of randomly selected scenarios, it was difficult to understand what was and wasn’t getting covered. Pie charts for each of the variables involved became a simple dashboard that let them better focus their sampling algorithm and manual test coverage.
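The coverage example in the last bullet can be sketched in a few lines of Python. This is not the test manager's actual tooling; the variables, their values, and the sampled scenarios below are invented for illustration. The sketch tallies how often each value of each variable was exercised, which is the data a per-variable pie chart would show, and lists the values no scenario touched.

```python
from collections import Counter

# Hypothetical variables and values, invented for this sketch.
DOMAINS = {
    "account_type": ["checking", "savings", "loan"],
    "channel": ["web", "phone", "branch"],
}

# Stand-in for the scenarios a randomized test run happened to select.
SAMPLED = [
    {"account_type": "checking", "channel": "web"},
    {"account_type": "checking", "channel": "phone"},
    {"account_type": "savings", "channel": "web"},
    {"account_type": "checking", "channel": "web"},
]


def coverage(scenarios, domains):
    """For each variable, count how many scenarios exercised each value."""
    report = {}
    for variable, values in domains.items():
        seen = Counter(s[variable] for s in scenarios)
        report[variable] = {value: seen[value] for value in values}
    return report


def uncovered(report):
    """List, per variable, the values that no scenario exercised."""
    gaps = {}
    for variable, counts in report.items():
        missing = [value for value, hits in counts.items() if hits == 0]
        if missing:
            gaps[variable] = missing
    return gaps


report = coverage(SAMPLED, DOMAINS)
for variable, counts in report.items():
    total = sum(counts.values())
    for value, hits in counts.items():
        share = 100 * hits / total if total else 0.0
        print(f"{variable:13} {value:9} {hits:3} {share:5.1f}%")
print("never covered:", uncovered(report))
```

The values that were never covered are the ones to feed back into the sampling algorithm or to pick up with manual tests.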

As you think about which data on your project might be worth visualizing, take some time to look at examples. If you don’t like looking to the media for examples, check out some of the examples on tools like verifiable.com or Many Eyes.

Feb 26 2009   1:58PM GMT

Better security through better visualization



Posted by: Michael Kelly
data visualization, software security

I’m always excited when I stumble across an area which is an intersection of two of my favorite topics. Recently, I started reading Applied Security Visualization by Raffael Marty. In the book, Marty introduces the concepts and techniques of network visualization and explains how you can use that information to identify patterns in the data and possible emerging vulnerabilities and attacks. It’s the perfect merger of data visualization (a topic fellow SearchSoftwareQuality.com expert Karen Johnson has me hooked on) and security.

This morning, I stumbled across an ITworld article Marty published earlier this month on getting started with security visualization. In the article, Marty provides three simple must-dos and three don’ts.

The three “must-dos” from the article:

  • Learn about visualization: It’s important for security people to understand the basics of visualization. Learn a bit about perception and good practices for generating effective graphs. Learn about which charts to use for which kinds of use cases and data. This is the minimum you should know about visualization.
  • Understand your data: Visualization is not a magic method that will explain the contents of a given data set. Without understanding the underlying data, you can’t generate a meaningful graph and you won’t be able to interpret the graphs generated.
  • Get to know your environment: I can be an expert in firewalls and know all there is to know about a specific firewall’s logs. However, if you give me a visualization of a firewall log, I won’t be able to tell you much or help you figure out what you should focus on. Context is important. You need to know the context in which the logs were generated. What are the roles of the machines on the network, what are some of the security policies, what type of traffic is normal, etc. You can use visualization to help understand the context, but there are things you have to know up front.

And the three “don’ts”:

  • Don’t get scared: The topic of security visualization is a big one. You have to know a lot of things from visualization to security. Start small. Start with some data that you know well. Start with some simple use cases and explore visualization slowly.
  • Don’t do it all at once: Start with a small data set. Maybe a few hundred log lines. Once you are happy with the results you get for a small data set, increase the size and see what that does to your visualization. Still happy? Increase the size some more until you end up with the complete data set.
  • Don’t do it yourself: If you’re in charge of data analysis and you aren’t the data owner (meaning that you don’t understand the application that generates the data intimately well) you should get help from the data owner. Have the application developers or other experts help you understand the data and create the visuals together with you.
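The "don’t do it all at once" advice can be sketched as a small Python generator that hands back ever larger slices of a log, ending with the complete data set. This is one possible reading of the advice, not code from Marty's article; the function name, the starting size of 200 lines, and the growth factor are assumptions chosen for illustration.

```python
def growing_samples(lines, start=200, factor=5):
    """Yield ever larger prefixes of lines, ending with the full list."""
    if start < 1 or factor < 2:
        raise ValueError("start must be >= 1 and factor must be >= 2")
    size = start
    while size < len(lines):
        yield lines[:size]
        size *= factor
    yield lines


# Hypothetical stand-in for a real log file.
log_lines = [f"entry {i}" for i in range(10_000)]

for sample in growing_samples(log_lines):
    # Replace this print with whatever graph you are evaluating at each size.
    print(f"charting {len(sample)} lines")
```

Stopping to check the graph at each size makes it easier to notice the point where a visualization that worked for a few hundred lines starts to break down.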

If you’d like to read more on the topic (and see some cool examples) check out Raffael Marty’s blog.