Concurrency defects in multi-core or multi-threaded applications are probably the source of more troublesome problems than other defects in recent history, Coverity’s Mark Donsky told me recently. These defects are tricky because they are “virtually impossible to reproduce reliably,” he said, and “can take months of painstaking effort to reproduce and fix using traditional testing methodologies.”
Donsky filled me in on some of the most common concurrency defects and spotlighted the three that are currently causing the most problems: race conditions, deadlocks and thread blocks.
- Race conditions describe what happens when multiple threads access shared data without appropriate locks. “When a race condition occurs, one thread may inadvertently overwrite data used by another thread,” he said. “This results in data loss and corruption.”
- A deadlock can occur when two or more threads are each waiting for each other to release a resource. “Some of the most frustrating deadlocks involve three, four or even more threads in a circular dependency,” said Donsky. “As with race conditions, deadlocks are virtually impossible to reproduce and can delay product releases by months as engineering teams scramble to isolate the cause of a deadlock.”
- A thread block happens when a thread invokes a long-running operation while holding a lock that other threads are waiting for. Although the first thread will continue executing, all other threads are blocked until the long-running operation completes. Consequently, Donsky explained, “a thread block can bring many aspects of a multi-threaded application down to a grinding halt.”
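The three defect classes above can be sketched in a few lines of Python. This is an illustrative sketch of the patterns, not anything from Coverity’s analysis; the function and lock names are invented for the example:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def safe_increment(n):
    """Increment the shared counter n times, holding a lock for each update."""
    global counter
    for _ in range(n):
        # Without the lock, the read-modify-write below could interleave
        # with another thread's, silently losing updates (a race condition).
        with counter_lock:
            counter += 1

# Deadlock avoidance: always acquire multiple locks in one global order.
lock_a = threading.Lock()
lock_b = threading.Lock()

def move_item():
    # If thread 1 took lock_a then waited on lock_b while thread 2 took
    # lock_b then waited on lock_a, both would wait forever (a deadlock).
    # A fixed acquisition order (lock_a always before lock_b) breaks the cycle.
    with lock_a:
        with lock_b:
            pass  # ...update both protected structures here...
    # Note: doing slow work (I/O, a long computation) while holding a lock
    # is the "thread block" pattern -- every waiting thread stalls until
    # the long-running operation completes.

threads = [threading.Thread(target=safe_increment, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000: no updates were lost
```

The unlocked variant of `safe_increment` would only fail intermittently, which is exactly why Donsky calls these defects so hard to reproduce with traditional testing.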
Concurrency defects showed up very often in last year’s Coverity Open Source Report, an analysis of over 50 million lines of open source code, and continue to pose a big problem, said Donsky.
Other common, crash-causing defects include null pointer dereferencing, resource leaks, buffer overruns and unsafe use of returned null values, according to Donsky, director of project management for San Francisco-based Coverity, maker of software integrity products.
Coverity’s open source security Scan site has been catching some headline-making defects recently, including the 0day Local Linux root exploit. “As part of the Scan program, we reported this issue back to key Linux developers so that they could respond to this vulnerability,” said Donsky.
Catching software defects before they go into production, said Donsky, is the best way for your software not to make the wrong kind of front-page news.
Steve Souders, author and respected authority on page performance, issued a call to arms for software users, developers and testers to improve the performance and power of web sites. At The Ajax Experience 2009 in Boston yesterday, he gave a state-of-page-performance overview and tips on boosting performance in a session called “Even Faster Web Sites.”
“We are really in the infancy of page performance,” said Souders, who encouraged new perspectives on the development and maintenance of sites.
Focusing on back-end web site performance alone is an immature practice that’s not productive, Souders said. “The back end only makes up about 20% of the total load time, and still the largest and arguably most important component, the front end, goes largely ignored,” he said.
Souders recommended the use of tools available online, such as YSlow, a program that Souders created, as well as Google’s Page Speed. Both can help accelerate load times and overall performance.
Addressing load times is particularly important, he said, noting the adverse effects of slow load speeds on views, traffic, and revenue.
The key to online success, said Souders, lies in attention to detail, user experience, efficiency, speed and, of course, the front end.
Stay tuned in for more from The Ajax Experience 2009.
In college, I knew two guys who tested DVD players. Their job was to watch movies. Even better, they got to pick the movies! What a sweet deal right? Who wouldn’t want that testing gig?
Well, there’s a catch. There’s always a catch. While they could pick any movie, they had to watch the movie hundreds of times. Repeatedly. Again and again. I suspect that the two of them could recite every line of Gladiator in reverse order. Apparently, after you’ve seen a movie, any movie, a couple hundred times, you don’t like it any longer.
Why, you might ask, would they need to watch the same movie over and over? Well, they were testing different aspects of the audio and video drivers in the DVD player, along with the consumer-facing software that ran the device. The only way to do that is to actually watch a movie. And the only way to detect a small, subtle issue with the video rendering is to know the movie by heart and recognize even the smallest detail being out of place.
Talking with these fellows opened my eyes to what testers refer to as the oracle problem. An oracle is a mechanism by which a tester recognizes a problem with the software. When you have an oracle — an expected result, a requirements document, a previous version of the product, etc. — you can determine when things are working and when they aren’t. For most of us, that’s text or a picture in a document that either does or doesn’t match what we see on the screen while we’re testing. For these guys, it was their memory of how the movie should sound, look, and “feel.”
The oracle problem is that all oracles are fallible. For example, requirements specifications are incomplete, contain conflicting information, or are ambiguous. Expected results in a test case detail only a small portion of what’s expected — the tiny slice of the application and functionality the test is designed to expose. Oracles are hard to find, require work to use effectively and always leave a tester wanting more. That’s the problem.
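In automated testing, the oracle is often nothing more than a table of expected results. The sketch below makes the fallibility concrete: the function under test and its oracle table are both invented for illustration, and the oracle only “knows” the three cases someone thought to write down.

```python
# Hypothetical system under test: a function that formats a price in cents.
def format_price(cents):
    return f"${cents // 100}.{cents % 100:02d}"

# The oracle: a table of expected results. Like all oracles, it is partial --
# it says nothing about negative amounts, huge values, or non-integer input.
ORACLE = {
    0: "$0.00",
    105: "$1.05",
    99999: "$999.99",
}

def run_tests():
    """Compare actual output against the oracle; return any mismatches."""
    failures = []
    for cents, expected in ORACLE.items():
        actual = format_price(cents)
        if actual != expected:
            failures.append((cents, expected, actual))
    return failures

print(run_tests())  # [] -- an empty list means every oracle check passed
```

An empty failure list means only that the software agrees with the oracle, not that the software is correct — which is exactly the gap the DVD testers filled with their memory of the movie.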
For the DVD testers, the test case was the movie, and the oracle was their memory of it. Make a configuration change, watch the movie. Get a new version of a driver, watch the movie. Use a feature of the consumer-facing software, keep watching the movie. They were the oracle. They identified the problem. This context helps highlight the role we as testers play in interpreting and exercising the various oracles that we apply.
Take a second to think about the test oracles you use on a daily basis. Where do they come from? What rules do you use to interpret them? When there are ambiguities, or gaps in information, where do you go for disambiguation? What role do you play in interpreting how the oracles you use are applied, or in determining to what degree a test passes or fails?
And, if you’re not up for reflective questions about the work you do every day, instead just think about what movie you’d choose to watch over and over again — day in and day out. It’s a difficult question, because whatever movie you choose, you’ll never want to see it outside of work again.
Virtualization and virtual lab management systems can cut application testing and QA times significantly, thus speeding development, GlassHouse consultant Rob Zylowski said in our interview at VMworld 2009 in San Francisco. Yet he estimates that only about 10 percent of application development teams are using virtual lab managers like VMware Lab Manager and VMLogix LabManager 3.8.
Most adopters of virtual lab management software are using it in the data center for system troubleshooting. “That’s a good use, but it’s not as powerful as taking virtual lab managers fully into the application development, test and QA departments,” Zylowski said.
The learning curve and developers’ resistance to giving up their in-department servers are two barriers to adoption, Zylowski said. Those barriers are insignificant when compared with the savings in development and testing time and the reduction of team conflicts and repetitive work enabled by virtual lab managers.
A key value of virtual lab managers is the ability to take snapshots as developers code and quality assurance (QA) testing is done. “Systems like VMware Lab Manager give incredible power to developers to troubleshoot,” said Zylowski, director of virtualization services for Framingham, Mass.-based GlassHouse Technologies Inc.
Zylowski talks about more issues related to and uses for tools like VMware Lab Manager in this video excerpt from our interview.
At VMworld 2009 in San Francisco this week, I saw and videotaped a demo of the new CA Wily Application Performance Management (APM) software’s visual mapper. Announced on April 28, CA Wily APM provides transaction monitoring across distributed systems.
In this demo, CA senior consultant and Wily product manager Brett Hodges shows how transactions and processes can be visually tracked and tweaked with CA Wily APM. Along the way, he describes uses and benefits for software test and QA teams. He also demonstrates SOA management tools that help pinpoint potentially failing services.
For more news, views and videos from VMworld 2009, check out this blog and SearchServerVirtualization’s VMworld roundup.
Chris Wolf, virtualization expert and Burton Group senior analyst, revealed lesser-known facts about software licensing in a crowded session at VMworld. He described terms, conditions and fine-print details that should not be overlooked, especially when dealing with Microsoft licenses.
Wolf began by commending IBM for making great progress with its licensing contracts in the last year, but he was quick to add that substantially more work would be necessary.
Wolf cracked jokes while guiding his audience through a minefield of licensing gotchas. Here are some of the more interesting points from his popular VMworld session:
- License tracking by physical resources is complex.
- When dealing with Microsoft licensing, read the fine print. Many times you cannot transfer licenses from one server or machine to another without paying; the fees add up quickly.
- Watch out, many software licenses are bound to physical hardware.
- Assigned licenses generally cannot be reassigned, though there are exceptions.
- Microsoft is within its legal rights to charge for additional licenses, transfers, migrations, etc. If you run into problems, remember that Microsoft is not a monopoly and that there are other choices of equal functionality.
SearchServerVirtualization.com offers more VMworld coverage.
The VMworld 2009 opening keynote began like a rock concert with pulsing music and a light show as the morning’s keynote speaker, Tod Nielsen, took the stage. Nielsen, VMware’s Chief Operating Officer, introduced the conference venue with exciting news for the company’s multiple products and services.
None of it measured up to the audience’s response when Nielsen was joined on stage by Steve Herrod, VMware’s CTO and vice president of R&D. Herrod reviewed VMware’s plans for SpringSource, which he’d already blogged about in August.
SpringSource, a commercial open source company, started by creating tools that competed with Java EE, but in the last few years it has released a more diverse set of developer tools. Combined with the VMware infrastructure, Herrod said, SpringSource’s software will bring advanced development and storage capabilities to data centers.
“I am very excited about this,” said Rob Zylowski, director of virtualization services for GlassHouse Technologies Inc. “With VMware and SpringSource working as one, they will be formidable competition for Apple’s cloud-powered initiative.”
Chris Wolf, Burton Group senior analyst, is another industry expert who voiced enthusiasm, Twittering and blogging about the SpringSource-VMware combination throughout the opening day of VMworld. On his blog, he wrote:
“Rod Johnson did a tremendous job with the SpringSource demo. Giving application owners an interface to provision an app locally, or to an internal or external cloud was spot-on. IT service delivery requires IT operations to give application owners and individual business units interfaces that they understand… this is a technology VMware shops should begin working with in their labs.”
Stay tuned for additional blogs, videos and impressions from VMworld.
Revision-controlled documents such as procedures and work-instructions are a beautiful thing. Sure, the work required to establish the documentation up front can be significant. But the reward can pay off with smooth transitions between staff members or departments as well as consistency across systems.
A practice I have found useful for ensuring that procedures and work instructions are correct is to engage another person to literally pick them up and run with them. This can be a new hire, a temporary employee or even an existing IT staff member who has developed strengths in other areas. We frequently strive to develop procedures and work instructions so that “even a monkey could do it.” But how frequently do we actually test that?
The value of a hand-off for procedures and work instructions can be measured by how effectively a new person, assigned to work in the technology areas the documentation covers, can follow it. This will identify issues such as:
- Out-of-date versions of software titles
- Procedures that have since changed
- Unclear wording in the steps
- Steps that have been omitted
- Unforeseen prerequisites (such as permissions)
Validating the correctness of procedures and work instructions brings other benefits as well. Their effectiveness is truly measured once a person with no expectations or prior knowledge of the technologies in question is assigned to perform the procedure or work instruction.
“Software development teams work with a wide range of tools, and their biggest challenge is making all the tools work together in a way that’s effective for their software delivery process,” Scott Bosworth, Open Services for Lifecycle Collaboration (OSLC) program manager, IBM Rational, told me yesterday.
Today, IBM addressed that problem, announcing new change management interfaces for three IBM Rational products available now: IBM Rational Team Concert, IBM Rational Quality Manager and IBM Rational ClearQuest. OSLC spec support for IBM Rational Change is due in September.
The new IBM Rational interfaces are the first released based on the OSLC change management specification published this summer. A 20-member software industry group founded by IBM, OSLC aims to increase tool data interchange via widespread adoption of industry standards. A similar IBM initiative achieved standards adoption for the Eclipse IDE.
“The promise is that in any part of the life cycle in which you need to see a change management interface, you could now integrate with any system that supports OSLC,” Bosworth said.
Bosworth explained that the OSLC change management specification and new IBM interfaces target common problems quality assurance (QA) and testing teams face in the software development process.
QA analysts will be able to use their tools at every step of the development and application life cycle, he noted. Tools are often used only for specific roles in the life cycle, Bosworth said, and they typically have their own ways of storing and presenting data. “This change management specification was driven by the need for quality management tools to be able to locate defects stored in a change management system.”
Software testers will be able to upgrade their tool choices as more tools provide the interface and become more pluggable, Bosworth said. “They get more choice and easier-to-maintain integrations.”
These kinds of integrations would be applicable to any type of change management systems, according to Bosworth.
“People have existing change management systems for different reasons, like for a project with added complexity, such as the technical side of acquiring a company. They need to have a common way of integrating change management systems with other tools.”
As an example of such integration, Bosworth mentioned recent work on IBM Rational Quality Manager.
“We set out last year to have Rational Quality Manager and Rational Team Concert integrated. We could have done that in a one-off fashion, which is traditionally how these things are done. Instead, we used the OSLC approach, in which we defined a common set of resource descriptions and a common services interface that would interact with any change management system in a consistent way.”
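The idea behind such an approach can be sketched in a few lines: each change request is a linked, self-describing resource that any consuming tool can fetch and interpret. The payload and field names below are invented for illustration; they are not the OSLC specification’s actual vocabulary or IBM Rational’s API.

```python
import json

# Illustrative change-request resource only -- the field names and URL are
# assumptions for this sketch, not the real OSLC resource shape.
sample = """{
  "identifier": "DEFECT-1234",
  "title": "Login page throws an exception",
  "status": "open",
  "relatedTestCase": "http://qm.example.com/testcases/42"
}"""

def summarize_change_request(doc):
    """Parse a change-request resource and produce a one-line summary
    that any consuming tool (QA dashboard, IDE plug-in) could display."""
    cr = json.loads(doc)
    return f'{cr["identifier"]}: {cr["title"]} [{cr["status"]}]'

print(summarize_change_request(sample))
# DEFECT-1234: Login page throws an exception [open]
```

Because the consuming tool depends only on the common resource description, not on any one vendor’s storage format, the same summary code could front any change management system that exposes the shared interface.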
Over time, OSLC plans to move beyond the change management area to requirements and quality management, software estimation, reporting, software configuration management and more domains, Bosworth said.
Recently, I spoke with Alex Adamopoulos, CEO and founder of Emergn, about his company’s new agile development transition consultancy program, AgilePMO. In these remarks from our interview, Adamopoulos offers advice on agile development process adoption and his views on agile.
Emergn is a new company, but Adamopoulos’ experience in the software service field is extensive. He is a 20-year veteran and an active blogger.
What is your agile philosophy?
Adamopoulos: A transformation program. If I think about the guiding principles of an agile engagement, they’re the same fundamental principles of a well-run global company.
What are some common problems within PMOs (Project Management Offices)?
Adamopoulos: Even when I was embedded in the outsourcing community, I thought that large enterprises had a methodical process for how they’d select a vendor, manage a project, etc. I discovered that not only did a lot of them not have one, but the ones that do typically organize it by line of business.
Could you offer a hypothetical example of a company with a PMO problem?
Adamopoulos: A good example would be a top bank, in the top three. Their investment banking side, which drives more than half of the revenue, has a PMO, and that PMO is operated by only three people. It’s fragmented across two geographies. Then, if you go to the asset management side, you discover that they have one-person shops or half-person shops. That is common for eight out of 10 of our clients.
Usually, they have no metrics or measurements in place. The metrics that exist are rudimentary project metrics that do not even translate into economic numbers or business value that a CIO can sit with his boss and say, “Here’s why we are making these decisions and how they are affecting our company.”
So, it would make sense for them to explore a way to drive it more efficiently. Right?
Adamopoulos: Clearly the largest problem we see is that there is no single project or program governance in place. There is no methodology for how programs should be governed. There is a lot of waste. We see morale being affected.
What are common snags that occur in transitions to agile?
Adamopoulos: Typically, it becomes a land grab. It is very difficult for some organizations to change their existing behavior and their business psychology. Asking them to collaborate and communicate, and be more dependent upon the business in several areas [is a big deal].
The biggest risk is the psychological impact that agile can have on an organization. Right or wrong, many have already settled into their comfort zones. Agile is a very disruptive methodology, not just at the software level but at the cultural level as well. The larger risks are people asking, “How are you going to impact my job, and why? What does it mean to me in terms of the responsibilities I might have?” There needs to be a lot of coaching in the transitioning people out of their current working mindsets and into something new.
Who are emergn’s target customers?
Adamopoulos: Today, the traditional customer for us is in the application development areas of IT, but we are starting to branch out with the AgilePMO product. Our primary target is the enterprise client, meaning the tier-one enterprise, the $1 billion-plus players. That is where the majority of our business is today. Is it likely that we’ll do things below that? Probably, but it would have to be very specific, because agile enablement reshapes a company’s sourcing strategy. Those are pretty important programs, ones that aren’t taken lightly, and we’ve found that the larger companies are more ready to do those than the smaller players.
The economy has been a help for us rather than a hurt; the whole drive toward saving money, reorganizing and efficiency has supported our model. So, organizations that have very fragmented sourcing programs are the primary focus for us.
How long do your customers need Emergn’s consulting services?
Adamopoulos: I am pretty sensitive to the consulting side. I have been a customer. I don’t believe in having people from the B-team sit there for one or two years, billing against my company.
Maybe I sound old-fashioned, but we definitely want to drive value. For some companies that may take a year, or even half that. We are currently doing one large-scale agile transfer program for one of the UK’s largest utilities, with a 24-month roadmap, but that is something we defined up front.
British Airways is a great example. We did an entire agile transformation for them. Since they are an airline, they have a gazillion projects going on. We have begun applying a number of initial successes to other parts of the business. How long will it take? I don’t know, but in their case they want to see their entire organization become as agile as possible.