I found a great article earlier this week on static analysis tools by Mary Brandel. In the article, “How to choose and use source code analysis tools,” she cites some statistics on the static analysis market, including:
- “The entire software security market was worth about US $300 million in 2007”
- “The tools portion of that market doubled from 2006 to 2007 to about $180 million”
- “About half of that is attributable to static analysis tools, which amounted to about $91.9 million”
In the article, Brandel also offers some evaluation criteria for when you start looking at source code analysis tools. These include language support and integration, assessment accuracy, customization, and knowledge base. She also provides some dos and don’ts for source code analysis. I think the most valuable tidbits from that list include:
- DO consider using more than one tool: The article provides a good story about Lint vs. Coverity, and I’ve found that static analysis tools will find different issues as well. Each vendor will have its own specific focus on vulnerabilities and warnings.
- DO retain the human element: While I’ve yet to work with a team that thinks adding automated tools like this will allow you to remove people, there’s certainly the feeling from the marketing materials that the results are intuitive. That’s typically not the case. You often need to know what you’re looking at or you’ll miss the subtleties in the data. I agree with the “truly an art form” quote. This stuff is hard, and while tools make it easier, it’s still brain-engaged work.
- DO consider reporting flexibility: At some companies this is a big deal. When working with smaller software development organizations, it doesn’t matter what the reports look like. The only people looking at them are the people working in the code. However, at a larger company, a Fortune 500 for example, information like this normally needs to be summarized and reported up.
Today, the development community gets a first look at Coverity Prevent’s new Microsoft-friendly analysis tools. Yesterday, I talked with Coverity Inc. CTO Ben Chelf about how the new features will help software developers beat problems like deadlocks and race conditions and save time detecting defects. We also touched on Prevent’s role in cloud computing development. After a short bit of background, here’s a Q&A based on our conversation.
Coverity Prevent, Coverity’s flagship static analysis solution, can now give Microsoft developers better tools for finding and fixing defects. With the new features, developers get modeling for Win32 concurrency APIs, Microsoft Windows Vista support and integration with Microsoft Visual Studio.
In addition, Coverity has dropped in some quality and concurrency checkers for C#. On Jan. 19, Coverity introduced Prevent for C#, a tool for identifying critical source code defects in .NET applications.
What gaps in analysis functionality will be filled by Prevent’s new features?
Chelf: The new features add Microsoft-specific checks that have a deep understanding of the Microsoft platform directly to the developer desktop in the developer’s IDE. IT pros now can save money on traditional testing techniques, since many of the problems that were previously discovered in testing or post-release are now discovered as the developer is writing the code. Every IT professional wrestles with testing costs and the time it takes to get a software system out the door, and this technology accelerates that process.
While other companies have desktop plugins for general static analysis solutions, their checking is not Microsoft-specific, so those products tend to suffer from high false-positive rates, which can quickly turn off developers and leave the tool as shelfware.
In general, what software testing and quality assurance (QA) problems will this solve?
Chelf: The problem this solves is in some of the very difficult-to-reproduce defects. Especially when tracking down concurrency problems, the QA department has a very hard time putting together the exact test suite to make an application fail the way it would fail in production. These wasted cycles are now eliminated by finding the problems earlier in the development process.
So, for example, how would Prevent help with race conditions?
Chelf: That’s one concurrency problem that can happen when you’re developing in a multithreaded application and you have multiple things happening simultaneously. These threads in the application are all trying to access the same memory. If they access it at the same time without any kind of protection, the data can be corrupted. Without static analysis capability, the only way to track these things down is to find them in the testing environment. Since multithreaded problems are difficult to diagnose, because you are at the whim of how the different threads are scheduled, it can often take the developers days or weeks to reproduce a problem they encounter.
This new technology helps them find problems more quickly. As they are writing the code themselves, they’re sitting in the IDE and saving their files and checking code into their source code management system from time to time. The Prevent technology gives them another button in the IDE that says, “Analyze my source code.” Then, they get automated analysis of all the source code in the system, not only the source code they’re writing. They can do a kind of virtual simulation of the software system looking for these kinds of problems.
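To make the race condition Chelf describes a bit more concrete, here is a minimal Python sketch. It is not how Prevent works, and the names `Counter` and `run_workers` are invented for illustration. Several threads update one shared value, and the lock is the "protection" that keeps their read-modify-write steps from interleaving.

```python
import threading


class Counter:
    """Shared state updated by several threads (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is the critical section. Without the
        # lock, two threads can read the same old value and one update is lost.
        with self._lock:
            self.value += 1


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Start n_threads threads that each call increment() n_increments times."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(Counter()))
```

Remove the `with self._lock:` line and the total may or may not come up short on any given run, which is exactly why these defects are so hard to reproduce in a test lab.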
How can Prevent’s new features accelerate development in virtual cloud computing environments?
Chelf: And as it pertains to the cloud, many applications are moving more toward multithreaded design in order to take advantage of multiple cores on a machine as well as multiple machines in a cluster. However, distributing computation like this introduces a new class of potential coding defects that our technology helps address.
In the multicore era, there are going to be more and more multithreaded applications, and that introduces a host of problems that we’re trying to rid the world of, such as deadlocks and race conditions.
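Deadlocks are the other problem Chelf names. As a rough sketch (again, my own illustration with made-up names like `Account` and `transfer`, not anything from Coverity), the classic case is two threads that each hold one lock and wait for the other. One common defense is to always acquire locks in the same order:

```python
import threading


class Account:
    """Hypothetical account guarded by its own lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move amount from src to dst, locking both accounts in a fixed order."""
    # Always lock the account with the smaller id first. If each thread
    # locked its own source account first, two opposite transfers could each
    # hold one lock and wait forever for the other.
    first, second = sorted((src, dst), key=lambda acct: acct.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_opposite_transfers(rounds=1000):
    """Run transfers in both directions at once and return the final balances."""
    a, b = Account(1, 100), Account(2, 100)

    def a_to_b():
        for _ in range(rounds):
            transfer(a, b, 1)

    def b_to_a():
        for _ in range(rounds):
            transfer(b, a, 1)

    t1 = threading.Thread(target=a_to_b)
    t2 = threading.Thread(target=b_to_a)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a.balance, b.balance
```

The sketch assumes `src` and `dst` are different accounts; a transfer from an account to itself would need separate handling because `threading.Lock` is not reentrant.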
Earlier this month, the New York Times ran an article on a report criticizing the F.D.A. on device testing. The article seems to indicate that one of the leading causes for poor testing is manufacturer claims about new devices being like other existing devices already on the market.
The article also points out that the F.D.A. has failed to update its rules for Class III devices for a while now. As near as I can tell, the software (the part I care about) is in those Class III devices. For those not up on their F.D.A. history, the article has a great tidbit that I found very interesting.
Created in 1976, the F.D.A.’s process for approving devices divides the products into three classes and three levels of scrutiny. Tongue depressors, reading glasses, forceps and similar products are called Class I devices and are largely exempt from agency reviews. Mercury thermometers are Class II devices, and most get quick reviews. Class III devices include pacemakers and replacement heart valves.
Congress initially allowed many of the Class III products to receive perfunctory reviews if they were determined to be nearly identical to devices already on the market in 1976 when the rules were changed. But the original legislation and a companion law enacted in 1990 instructed the agency to write rules that would set firm deadlines for when all Class III devices would have to undergo rigorous testing before being approved.
The agency laid out a plan in 1995 to write those rules but never followed through, the accountability office found. The result is that most Class III devices are still approved with minimal testing.
I only found the article because I happened to see a letter to the editor from Stephen J. Ubl, the president and chief executive of the Advanced Medical Technology Association. The letter caught my eye because in May I’ll be facilitating the second Workshop on Regulated Software Testing. His comment about the “extensive review of specifications and performance-testing information” is exactly the type of stuff I want to see at the workshop.
Regulated device/software testing is a difficult thing to do. For those who want to focus on the testing, there’s a lot of process already and it can distract from doing the testing. For those who want to make sure the process is followed and that all the right testing is taking place, then your focus is on the process and evidence. Figuring out that balance is always hard, whether you’re the F.D.A. or the company developing the product.
Software developers make common and avoidable mistakes that create vulnerabilities and expose their software to ever-present security threats, according to field observations by Vic DeMarines.
Yesterday I spoke with Vic, VP of products at V.i. Laboratories Inc. (V.i. Labs) in Waltham, Mass. V.i. Labs’ products help software providers protect themselves against piracy and associated revenue losses. The company also provides antitampering solutions and products that prevent intellectual property theft. For example, its CodeArmor Intelligence antipiracy product enables software publishers to identify organizations that are using their software illegally. V.i. Labs’ customers include financial services software companies and online gaming providers.
I asked Vic to name some of the most common security mistakes he sees. He said there are three major security threats: piracy, code theft and tampering. “You can’t stop piracy, but you can be more resistant to it,” Vic said. “When developers integrate licensing into an application, they rarely consider making it resistant to reverse engineering or the threat of piracy.” There are basic tools and techniques to help vendors resist that — namely antitamper technologies, obfuscation, or tamper detection and reporting — and not using them is a common mistake. Vic said some of these can be accomplished in-house and others are available on the market.
Developers also frequently make security mistakes when coding new applications in Microsoft .NET. “Developers need to understand the risk in .NET,” Vic said. The “bad news” with this practice, according to Vic, is that when you compile, people who know where to look can view your source code using freeware tools. This mistake could be avoided without abandoning .NET — developers can put sensitive code in a different format. They can use obfuscation techniques or protection tools to prevent people from seeing sensitive code.
I also asked Vic for tips on producing high-quality, secure software in a down economy — how do you “do more with less” when it comes to software security? Vic advised developers to think ahead — if you’re about to design an app, “make security a priority and define how you’re going to test it,” he said. Enlisting an outside security testing team is expensive, so instead have someone in your group who is strong in security “think like a cracker” to determine vulnerabilities.
Thanks to a 7thSpace news post about an academic paper, An innovative approach for testing bioinformatics programs using metamorphic testing by Tsong Yueh Chen, Joshua WK Ho, Huai Liu and Xiaoyuan Xie, I was able to find a gem of a paper on software testing. Here’s the full article on metamorphic testing.
I’ve read several academic papers on software testing in the past, and as a general rule I really don’t like them. However, this one’s really worth reading. In the article the authors suggest the use of a new software testing technique called metamorphic testing to test bioinformatics programs. I’ve tested a couple of bioinformatics programs in the past and it’s some of the most difficult testing I’ve ever done.
In the paper, the authors begin by outlining the general problem:
In software testing, an oracle is a mechanism to decide if the output of the target program is correct given any possible input. When a test oracle exists, we can apply a large number and variety of test cases to test a program since the correctness of the output can be verified using the oracle. Without a tangible oracle, the choice of test cases is greatly limited to those special test cases where the expected outputs are known or there exists a way to easily verify the correctness of the testing results. In particular, an oracle problem is said to exist when: (1) “there does not exist an oracle” or (2) “it is theoretically possible, but practically too difficult to determine the correct output”.
(For those of you who might not be familiar with the oracle problem, I recommend Doug Hoffman’s work in test oracles as a primer.)
After some discussion about why traditional techniques aren’t enough, the authors describe their new method of metamorphic testing (MT):
Instead of using the traditional test oracle, MT uses some problem domain specific properties, namely metamorphic relations (MRs), to verify the testing outputs. The end users, together with the testers or program developers, first need to identify some properties of the software under test. Then, MRs can be derived according to these properties.
The article then reviews a couple of studies and talks about the limitations of the technique:
It should be noted that satisfying all test cases based on a set of MRs does not guarantee the correctness of the program under test. MRs are necessary properties, hence satisfying all of them is not sufficient to guarantee program correctness. This problem is, in fact, a limitation of all software testing methods. Nonetheless, the ability to systematically produce a large number of test cases should increase our chance of detecting a fault in the target program, and hence improve its quality.
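To give a feel for the idea, here is a small Python sketch of my own. It is not taken from the paper, and the function names are made up. It checks a sine function against the relation sin(x) = sin(π − x) without ever needing to know the "right" answer for any individual input.

```python
import math
import random


def violations(func, inputs, tol=1e-9):
    """Return the inputs for which func breaks the relation f(x) == f(pi - x)."""
    failed = []
    for x in inputs:
        source_output = func(x)               # original test case
        follow_up_output = func(math.pi - x)  # follow-up case derived from the MR
        if not math.isclose(source_output, follow_up_output, abs_tol=tol):
            failed.append(x)
    return failed


def buggy_sin(x):
    # Deliberately faulty "implementation": a truncated Taylor series that is
    # only accurate near zero.
    return x - x**3 / 6


rng = random.Random(0)  # fixed seed so the run is repeatable
random_inputs = [rng.uniform(-10.0, 10.0) for _ in range(1000)]

print(violations(math.sin, random_inputs))  # no violations expected
print(violations(buggy_sin, [0.5]))         # the relation exposes the fault
```

The limitation the authors mention shows up here too: a function that always returns 0 would satisfy this relation, so passing the check is necessary but not sufficient.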
Well worth the read.
In a recent CIO blog post on SOA Testing Best Practices, John Michelsen relates a story of an ERP Order Management system that went live and then dropped orders for three months in production before someone noticed. It is a sad story, but one I’ve seen played out a couple of times myself. The story makes me think of two things: end-to-end testing and production monitoring.
Having worked on several projects that have involved some sort of SOA, we’ve always segmented our testing into three phases: unit testing, integration testing, and end-to-end testing. For me, unit testing isn’t just developer unit testing; it’s also the testing team unit testing the service to ensure it fits the specification/mapping document requirements. It’s testers hand-coding XML or recording SOAP regression test beds. Once it’s been proven out at the unit level, then we start plugging other services/applications into it and looking at how they interact. When we think everything’s rock solid, we then do some end-to-end tests, where we try to simulate business scenarios from start to finish (UI to UI if possible).
In addition to testing, someone on the project needs to be thinking about operations and how we’ll know what the health of these services is at any given point. Who is monitoring queue depth and message aging? What alerts are thrown and when? When an alert is thrown, who is notified? Each of these scenarios might also be played out either in the end-to-end testing that is performed or via performance testing.
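As a rough illustration of the kind of check I mean, here is a Python sketch. The `QueueSnapshot` type, the `check_queue` function, and the thresholds are all hypothetical; a real project would pull these numbers from whatever messaging or monitoring product it uses.

```python
from dataclasses import dataclass


@dataclass
class QueueSnapshot:
    """One reading of a queue's health (hypothetical structure)."""
    name: str
    depth: int                   # messages currently waiting
    oldest_message_age_s: float  # age of the oldest waiting message, in seconds


def check_queue(snapshot, max_depth=1000, max_age_s=300.0):
    """Return a list of alert messages; an empty list means the queue looks healthy."""
    alerts = []
    if snapshot.depth > max_depth:
        alerts.append(
            f"{snapshot.name}: depth {snapshot.depth} exceeds limit {max_depth}"
        )
    if snapshot.oldest_message_age_s > max_age_s:
        alerts.append(
            f"{snapshot.name}: oldest message is {snapshot.oldest_message_age_s:.0f}s old "
            f"(limit {max_age_s:.0f}s)"
        )
    return alerts


if __name__ == "__main__":
    stuck = QueueSnapshot(name="orders", depth=12, oldest_message_age_s=86_400)
    for alert in check_queue(stuck):
        print(alert)
```

Note that the example queue is shallow but old. Message aging is the check that would catch orders quietly sitting unprocessed, which depth alone would miss.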
At a workshop last year, I heard Ken Ahrens from iTKO present briefly on the “Three C’s” that John Michelsen also references in the CIO posting. It’s a useful model for talking about how SOA testing is different from some more traditional manual testing contexts.
Over the last few months, I’ve been asked a lot about testing on mobile devices, both at my own company and at other companies in the Midwest (which isn’t exactly known as a hotbed of technology). I’ll plead mostly ignorance; most of what I know is from Julian Harty, whose work I follow both in print and at conferences. I think the topic is important enough that in a series of workshops I co-host, we’ve added a full day on the topic.
Yesterday, I read this story on new handsets and the issues that are encountered in the marketplace as vendors work to keep up with the cutting edge and market demand. What I really like about this article, aside from it being an interesting news story, is that I think it provides a nice snapshot of some of the common challenges of testing on mobile devices.
- Meeting launch targets: I’m not sure how unique this is to mobile devices, but it certainly seems like deadlines often coincide with hardware deadlines. For example, if you’re writing software for the iPhone, then to some extent your releases are tied to theirs, if for no other reason than support (see software updates below). If a new “iClone” comes out, there’s a chance you get to port your iPhone app over to that device.
- Handsets are getting more complex: My first Palm had a calendar, took notes, and had a calculator. My first cell phone only made phone calls. They claimed there were other applications on it, but no one was really using the early cell phone calculator. I mean, really? Now, I can read a spreadsheet on a phone, take a picture, stream media, look up directions in real-time, and likely just about anything else I can think of. And I can do that using a keypad, stylus, keyboard, gestures, or the soon to be invented chip implant. I’m sure it’s only two or three years out … really.
- Consumer expectation: Expectations for mobile devices are high. From performance and reliability to recoverability of data, the quality criteria used when testing on mobile devices are subtly different from those of traditional Web and desktop applications. The software I use on my phone, while it’s typically functionally simpler, operates in a much more hostile environment: there’s less processing power, less memory, less disk space, connectivity comes and goes, and my attention span is shorter because I can’t really multitask like I can on my desktop.
- Software updates: While this isn’t a new testing problem, for many programmers this problem has largely been solved. Think of Web browsers: If you want to hit the largest markets, you need to support two to four browsers. For each of those, you’ll make some sort of decision on what versions you’ll update, which service packs, and depending on your software, what hotfixes you’ll need to test with (and on what schedule). In addition, regression test automation for applications that run in a Web browser is largely trivial (compared to even five years ago). On mobile applications, there are way more than three main platforms, each one offering regular updates (often paired with hardware updates), and your ability to effectively automate much of that testing is largely not trivial given that it’s some mix of emulators and actual physical devices.
- Touch-screen sparks new problems: A subset of the complexity problem above, touch-screens (like graffiti before them) introduce a relatively new method of interaction with the product. This likely means increased usability testing and changes in thinking about design (as the article points out). It’s most likely just the latest step in the evolution of that platform.
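To put a number on the software updates point above, here is a tiny Python sketch. The platform names, versions, and targets are placeholders I made up, not real products. It just enumerates how quickly the test matrix grows once you multiply platforms by OS versions by emulator-versus-device.

```python
from itertools import product

# Hypothetical names and versions, for illustration only.
platforms = ["platform_a", "platform_b", "platform_c", "platform_d", "platform_e"]
os_versions = ["current", "previous"]
targets = ["emulator", "physical_device"]


def build_matrix(platforms, os_versions, targets):
    """Return every (platform, version, target) combination to be covered."""
    return list(product(platforms, os_versions, targets))


matrix = build_matrix(platforms, os_versions, targets)
print(len(matrix))  # 5 * 2 * 2 = 20 configurations
```

Every new platform or supported version multiplies the list, which is why deciding what not to test becomes part of the job.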
Software consultants, vendors and project managers are already seeing software project failures and slowdowns resulting from the new recession.
That finding, and the advice for software developers and project managers offered in this post, come from interviews I’ve recently conducted with 13 software industry experts and from the blogosphere.
“Companies are panicking due to the economy. They’re compressing projects and schedules. As a result, key projects are failing,” Lawrence Oliva, senior consultant/program manager with CH2M HILL, an Englewood, Colo.-based engineering and program management firm, told me in a recent conversation. “In part, this is the result of trying to get things done faster with less resources.”
Consultants are seeing reductions in software testing being used to cut costs. That tactic is a recipe for project failure, said consultant Karen Johnson in my recent post about why software projects fail.
My sources also spoke of seeing several large, current software projects in crisis mode because senior developers were laid off and the remaining, less-experienced developers didn’t have the know-how needed. Unfortunately, many development groups have been running lean for a long time and running leaner pretty much means stopping development.
“Lean computing is nothing new,” Oliva said. “Companies have been operating at skeletal levels since the economic downturn of 2000, when that downturn caused intense development staffing cuts. It really is sad to think about having to be even leaner.”
That’s the situation, but it isn’t hopeless. Gleaned from my interviews, here are some tips for maintaining software quality and project success during this recession:
Analysts and others advise development teams that lean computing and the agile model can help them do more with less.
Some companies have already made this move and feel ready for tough times. For example, Des Moines, Iowa-based Principal Financial Group, which provides software products for the financial industry, uses the agile model and has standardized project management, testing and quality assurance processes and tools used by all its development groups.
“Standardizing has reduced redundancies in tools, processes, documentation and more, running as lean as possible,” Principal’s senior IT system analyst Mark Ford told me. Principal uses HP Quality Center to manage development.
Project managers whose companies have not already made cuts to development staffing or budgets need to advise decision-making executives about the business benefits of their projects, industry insiders advised. Also, when cutbacks have taken place or are proposed, PMs need to explain the business impact and guide where the cuts will be made. IBM’s director of Rational offerings, David Locke, offers this advice:
“Always talk in the context of how the project will help the business. If I take this 10% off, I will have the least impact on the business. Give real information, not data. Remind managers that the company creates software to make our businesses more streamlined, more competitive and to deliver better ROI. Your company may need immediate cost savings, but eliminating software that could give the business the most cost-effective competitive edge is probably not the way to go. Talk about business impact, always.”
This is certainly an appropriate time for PMs to push for investments in software testing automation tools, like automated code checkers, experts told me. However, Oliva added:
“Don’t trust in those automated systems absolutely. You could get the wrong results from the use of automated systems. At some point human beings have to be involved. Delivering quality software is not a machine-to-machine process. Not having quality monitored and managed by humans is dangerous.”
Another suggested labor- and cost-saving move is adopting streamlined models of programming, like Extreme Programming or Structured Programming.
The most important strategy for weathering the recession is becoming more realistic about software needs and reducing complexity of software. “If companies do this, it could be the one good thing that comes from the recession,” Oliva said.
Continue the conversation: Please tell me what adjustments you and your development team are making to deal with the tight economy. You can respond by commenting below or writing to me at email@example.com.
Here are some resources for more advice and information:
- Find out what this software developer did when his project’s staffing went south in Survivor’s lessons in test management.
- There are many links to articles on the recession’s impact on software projects in this blog post which offers advice on succeeding during a recession or anytime.
- Ryan Martens writes about how to make good staff reduction decisions.
- In this article on project management, Lawrence Oliva discusses the economy’s impact on software projects and strategies for PMs.
Whether the CWE/SANS list of the 25 most dangerous programming errors will contribute to the creation of better software depends on whether managers, rather than developers, read it and take action, according to Jack Danahy, chief technology officer and co-founder of source code vulnerability analysis firm Ounce Labs Inc. I talked with Danahy today about the follow-up and follow-through that could make the list a valuable turning point in development, rather than a partially-remembered checklist.
“It’s one thing to come up with a list of 25 things that developers should consider, but we haven’t arrived at a point where anyone is meaningfully asking or requiring developers to consider these things,” Danahy said.
Project managers should support developers spending time to research these issues, in Danahy’s view. The best-case scenario would be that software development managers (the program manager, business unit manager, software auditor, etc.) would use this list in specifications, asking for metrics to make sure that those errors have been looked for and eliminated before the software rolls into production.
“Developers won’t remember this list off the top of their heads, but if it becomes codified as a requirement they will remember,” Danahy said. “A team could come up with a distilled list of 5 to 12 key design criteria that would provide the essence of keeping these errors from happening.”
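For readers who haven’t looked at the list yet, here is a sketch of one well-known entry, SQL injection (CWE-89), and the kind of coding rule a team might codify to prevent it. This is my own illustration using Python’s built-in `sqlite3` module, not something Danahy provided; the table and function names are made up.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text (CWE-89).
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized query: the driver passes name as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # matches every row
    print(len(find_user_safe(conn, payload)))    # matches nothing
```

A requirement like “all database access uses parameterized queries” is the sort of distilled design criterion that is easy to state in a specification and easy to check for in review.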
What will your organization do with this list? Will it have an impact or be quickly forgotten? Sound off in our comments below or by writing to firstname.lastname@example.org.
In a small cry of victory today, someone on the team found this article from the BBC detailing the “top 25 most dangerous programming errors.” I say small cry of victory, because he had recently logged a ticket in JIRA detailing one of those errors, but when the ticket came up for review it was ignored. It was acknowledged as an issue, but pushed to a later sprint for work.
While I agree with the reasons we used when we prioritized the ticket, for me this incident demonstrated a common pattern I see in the teams I’ve worked with. First, there seems to be an expectation that the testing team shouldn’t be looking for errors like this (that is, unless you’re a high-priced security tester). Second, when issues like this are found, they take a backseat to the more traditional functional defects.
I like research like this (both SANS and OWASP do great work in this area), because it gives me a way to structure the conversations that take place when these issues come up. I find that programmers typically respond well to links to catalogs of errors with descriptions. Because the catalogs are unrelated to the software they are working on, it makes the issue less personal, I think, and less close to home.
That said, these issues aren’t always burning, top-priority issues. Like I said, in the context of our current project where we found one of these, we have time to fix it. Given the current list of commitments, the issues we know we need to work, and the relative risk of this causing a problem — this specific issue can be sidelined for a couple of weeks until we get to it. That perspective is important as well, and it’s one that I sometimes forget. There’s always a business story to tell as well with issues like this — context is important.
For those interested in more details on the list of programming errors, you can find the full list of errors from SANS here.