Uncharted Waters

Aug 23 2017   3:19PM GMT

What is Good Software Testing? BRR I TV

Profile: Matt Heusser

Tags:
Continuous delivery
Continuous deployment
Continuous integration
DevOps
QA
quality
Software
Software testing
Testing

Last week I was at CAST, the Conference of the Association for Software Testing, known for challenging Q&A sessions referred to as “open season.”

One of the questions was “What does good software testing look like?”, which I found fascinating.

Take two teams. One has strong tooling that runs on every commit, scaling out to a grid of 256 simultaneous servers and giving feedback in five minutes. The other has a component architecture; they make small changes and deploy them independently. The second team has high confidence that a change only impacts that one component, along with automated “contracts” to check correctness, but no end-to-end GUI checks. Instead, team two does the rollout, checks a few flows in production with a test account, and relies on incremental rollout and intense monitoring to find problems.
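
To make that concrete, here is a minimal sketch of the kind of post-deploy smoke check team two might run against a test account. The endpoint, token, and expected fields are hypothetical, not anyone's real pipeline; the point is simply "hit one critical flow, verify the contract, and stop the rollout if it fails."

```python
# Minimal post-deploy smoke check sketch. URL, token, and fields are hypothetical.
import json
import sys
import urllib.request

BASE_URL = "https://example.com/api"            # hypothetical service URL
TEST_ACCOUNT_TOKEN = "replace-with-test-token"  # test account, not real data

def check_order_flow() -> bool:
    """Exercise one critical flow against a test account and verify the response 'contract'."""
    req = urllib.request.Request(
        f"{BASE_URL}/orders",
        headers={"Authorization": f"Bearer {TEST_ACCOUNT_TOKEN}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            if resp.status != 200:
                return False
            body = json.load(resp)
    except OSError:
        return False
    # "Contract": the response must contain these fields with these types.
    return isinstance(body.get("orders"), list) and "page" in body

if __name__ == "__main__":
    ok = check_order_flow()
    print("smoke check passed" if ok else "smoke check FAILED")
    sys.exit(0 if ok else 1)  # a non-zero exit can halt an incremental rollout
```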

Both of these could be excellent software testing in context. For that matter, good testing for a pacemaker or avionics controls could also look very different. I want to describe a model, a sort of checklist, to see if testing is “good”, and what gaps exist.

Between the question and lightning talks later that day, I came up with BRR I TV.

What is good software testing?

Here’s how those break down.

Backwash. If the information from test to development is coming back as unable to reproduce, or needs more information, it probably isn’t good testing. Good testing has low backwash.

Relevant. If the information from test is relevant to the decisions the team needs to make, it’s probably good testing. If there is a concern that testing is finding “the wrong bugs”, spending time on “the wrong things”, or the bugs come back as WONTFIX or NOTABUG, then there is a real risk that testing is irrelevant. Likewise if test produces large documents and reports that no one really reads – that’s not a good sign.

Reaction of extended team. People will observe testers in action. If the concern is that testers are wasting time, spending a lot of time documenting things that don’t matter or simply “translating” from the language of specification to the language of testing documents, then there is a problem.

Importance. Do the testers find defects that matter? If the important, show-stopper bugs are consistently found in production, then the group might not be doing good testing. Note: where relevance is about wasted time on non-issues, importance is about missed issues.

Time to get feedback. What is the time period from build available to test to information flowing back to development? If it’s measured in minutes, the team is probably fine. If programmers fiddle for days or weeks waiting for feedback, even if they can move forward on other projects, there may be a problem.

Time to release. Assuming no show stoppers, how long does it take to go from “build ready to deploy” to “approved to deploy”? Shorter is better.
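
Both of these time measures are easy to track from pipeline timestamps. Here is a small sketch, with invented event names and times, of computing feedback lag and release lag:

```python
# Sketch of tracking the two cycle times above. Event names and times are made up.
from datetime import datetime

events = {
    "build_available":  datetime(2017, 8, 23, 9, 0),
    "first_feedback":   datetime(2017, 8, 23, 9, 12),   # first test result back to dev
    "release_approved": datetime(2017, 8, 23, 11, 30),  # approved to deploy
}

feedback_lag = events["first_feedback"] - events["build_available"]
release_lag = events["release_approved"] - events["build_available"]

print(f"time to feedback: {feedback_lag}")  # minutes is healthy; days is a smell
print(f"time to release:  {release_lag}")
```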

Volume of feedback. This is the most context-sensitive item. Sometimes, just knowing there are 3 bugs found, two of which are blockers, is all the product owner wants or needs. Generally, though, more information is better. Testers that make information available about usability, user experience, performance, and performance trends over time will be more valuable. As an example, let’s say performance is still within benchmarks, but has been trending slower. If the trend continues for three more sprints, performance will hit a ‘yellow’, or moderately unsatisfactory, state. Or perhaps international characters are out of scope, but the application increasingly degrades when they are used. As long as the customer wants this type of information, it is likely much more powerful than a straight bugs list. Likewise, the team could add a plain-text characterization of the software, known risks, and areas not covered in testing.
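
As a rough illustration of that trend idea, here is a sketch with invented numbers that projects when response time would cross a hypothetical ‘yellow’ threshold:

```python
# Sketch of the performance-trend idea: still inside the benchmark today, but a simple
# linear projection says it goes "yellow" in a few sprints. Numbers are invented.
response_ms = [210, 225, 238, 251]  # average response time per sprint
YELLOW_MS = 300                     # moderately unsatisfactory threshold

# Average change per sprint, projected forward.
slope = (response_ms[-1] - response_ms[0]) / (len(response_ms) - 1)
sprints_to_yellow = (YELLOW_MS - response_ms[-1]) / slope if slope > 0 else None

if sprints_to_yellow is not None:
    print(f"still green, but on trend to hit yellow in ~{sprints_to_yellow:.1f} sprints")
else:
    print("no upward trend detected")
```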

Putting It All Together

BRR I TV is a heuristic; it is a guideline for analyzing your testing. If your team can describe its test process hitting all those bases, then you are in a much better position than a team with “holes.”

Then again, those holes are an opportunity to improve, so you can use BRR I TV as an assessment model.

And, of course, there is the reality that I just made this up recently. It is likely missing entire dimensions of performance. It’s a start: a collection of things I tend to see in high-performance teams and fail to see in the low ones.

What am I missing?
