Thursday, June 16, 2016

AUTOMATION 8 - why automation can check, but not test

We've previously been looking at performing some checks on a simple dice rolling function.

I love using random numbers - they're very simple to understand, but hide very complex behaviour.

The examples we used flushed out some huge problems - automation works best when there's an indisputable condition to check.  If X happens, it's a pass; if Y, it's a fail.

This is how many see testing ... but in truth this is really a check.



When test automation tools started appearing at the end of the last century, people knew they would be useful, but also that they had limitations.  It was Michael Bolton and James Bach's writing on testing vs checking in 2009 that first put this distinction into a model for me.

In this work, they looked into the core of what we do when we perform testing.  And the truth is manual testers typically go way beyond "just checking".

For instance, a good tester doing their job never really says "fail" - it's almost always more "well that's odd ... THAT shouldn't happen".  Rather than just failing a step and pressing on, there is almost always some investigation to understand and clarify the behaviour seen.  Likewise, even when an action works as written and expected, the tester might notice something odd outside the script or the original scope of testing.

Testing is an exploratory and investigative act.  This is how testers dig deep to find and investigate oddities they encounter - the process looks much more like this ...


[By the way - thanks to Michael Bolton and John Stevenson for feedback on this diagram]

Okay, the process of checking is still at the core of this activity.  However, automation does not replicate the whole detailed process in red - it cannot go beyond the parameters it's scripted with.

We saw that with the random dice test previously.  Take a moment to think about the ways we suggested to test it.  In reality, as a human tester you'd probably keep rolling a dice until you'd seen all the numbers 1-6.  How many dice rolls would be too many for that?  As we said, you wouldn't have a hard rule - you'd just raise an issue if you felt you'd rolled too many.  An automation script has to be given a limit.
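As a rough sketch of what "being given a limit" means in code - the rollDice helper here is just a stand-in for the dice function from the earlier posts, and the cap of 100 rolls is an arbitrary figure I've picked for illustration:

import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class AllFacesCheck {
    // The script must be given a hard cap - a human tester wouldn't need one.
    private static final int MAX_ROLLS = 100;
    private static final Random RANDOM = new Random();

    // Stand-in for the dice rolling function under test.
    static int rollDice() {
        return RANDOM.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        Set<Integer> seen = new HashSet<>();
        int rolls = 0;
        while (seen.size() < 6 && rolls < MAX_ROLLS) {
            seen.add(rollDice());
            rolls++;
        }
        if (seen.size() == 6) {
            System.out.println("PASS: all faces seen within " + rolls + " rolls");
        } else {
            // The check can only report fail - it can't say "that's odd" and dig deeper.
            System.out.println("FAIL: only saw " + seen + " in " + MAX_ROLLS + " rolls");
        }
    }
}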

As a human tester following the testing process above, you start not knowing what number is too many, but as you go along you might flag an issue if you think it's too many: "I've rolled the dice 20 times, and not seen a 6 yet".

Likewise, if I asked you to roll the dice and tabulate the results for 60 rolls, we'd expect to see 10 occurrences of each number.  The reality is that's very unlikely - what will you do if a number comes up only 9 times ... or 8?  Deciding whether to flag an issue, or try another test, is an application of human judgement.
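Here's a minimal sketch of that tabulation (rollDice is again a stand-in for the real function) - run it a few times and you'll rarely see a clean 10 of each:

import java.util.Random;

public class SixtyRollTally {
    private static final Random RANDOM = new Random();

    // Stand-in for the dice rolling function under test.
    static int rollDice() {
        return RANDOM.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        int[] counts = new int[7]; // indices 1..6 used
        for (int i = 0; i < 60; i++) {
            counts[rollDice()]++;
        }
        // We "expect" 10 of each, but most runs won't deliver exactly that.
        for (int face = 1; face <= 6; face++) {
            System.out.println("Face " + face + ": " + counts[face]);
        }
    }
}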

In my initial attempt to check the random distribution of my Java function, notice how I tested it - I ran 6 million dice rolls, expecting a clean million of each number.  I didn't get that, ergo FAIL.  So I tried increasing to 600 million rolls.  Then I kept re-running the test - some numbers came out above the magic 100 million figure, some below.  What I did over multiple runs was check that the same numbers weren't consistently coming out above or below the expected figure.
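Boiled down, that first attempt was the equivalent of this sketch - an exact-equality check which is doomed to fail on almost every run (the rollDice stand-in and the figures are illustrative):

import java.util.Random;

public class NaiveDistributionCheck {
    private static final Random RANDOM = new Random();

    // Stand-in for the dice rolling function under test.
    static int rollDice() {
        return RANDOM.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        int[] counts = new int[7];
        for (int i = 0; i < 6_000_000; i++) {
            counts[rollDice()]++;
        }
        boolean pass = true;
        for (int face = 1; face <= 6; face++) {
            // Demanding a clean million of each - this almost never holds.
            if (counts[face] != 1_000_000) {
                pass = false;
            }
            System.out.println("Face " + face + ": " + counts[face]);
        }
        System.out.println(pass ? "PASS" : "FAIL");
    }
}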

The problem was, though, that this wasn't an automated check - I was using the automation to run a mass of rolls, and applying manual judgement to the results.  Whilst that's useful, it's not the purpose of a unit test, which needs to reduce to a pass or a fail.

In the end it comes down to alarm thresholds - I run six million dice rolls and expect a million of each, give or take a fraction.  If I set the tolerance too tight, the check will always fail, and people will start ignoring it after a while.  If I set it too loose, it's really unlikely to ever fail, and hence it's not checking for anything useful, so is a waste of a check (worse still, it might give me false confidence).
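One way to encode such a threshold is a tolerance check like the sketch below - the 0.5% tolerance is purely an arbitrary choice of mine for illustration, and picking it well is exactly the fine-tuning problem discussed next:

import java.util.Random;

public class ThresholdDistributionCheck {
    private static final Random RANDOM = new Random();
    private static final int ROLLS = 6_000_000;
    private static final int EXPECTED = ROLLS / 6;
    // Too tight: the check always fails and gets ignored.
    // Too loose: it never fails and checks nothing useful.
    private static final double TOLERANCE = 0.005; // 0.5%

    // Stand-in for the dice rolling function under test.
    static int rollDice() {
        return RANDOM.nextInt(6) + 1;
    }

    public static void main(String[] args) {
        int[] counts = new int[7];
        for (int i = 0; i < ROLLS; i++) {
            counts[rollDice()]++;
        }
        boolean pass = true;
        for (int face = 1; face <= 6; face++) {
            double deviation = Math.abs(counts[face] - EXPECTED) / (double) EXPECTED;
            if (deviation > TOLERANCE) {
                pass = false;
                System.out.println("Face " + face + " is off by " + (deviation * 100) + "%");
            }
        }
        System.out.println(pass ? "PASS" : "FAIL");
    }
}

Against a fair random source this particular tolerance will almost never fire; tighten it and it fires constantly.  Judging where to set it is something the script itself can't do.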

Designing a good automation test is thus itself an exploratory act, as you find out whether it's suitable.  It even requires a certain level of testing skill to refine and fine-tune.  But in the end,

  • The final artifact can only check.
  • It can do so, as we saw previously, at amazing speed and with a good level of accuracy.
  • But it will never find problems beyond what it's programmed to check for.


Some people see that as a threat to testing as a career and livelihood.  I don't - it's actually an incredibly liberating thing.  It means my time as a tester is best spent working exploratively and imaginatively, as previously discussed, rather than running through dull checklists of items I seem to check off every two weeks, some of which I almost tick off on autopilot.

It allows us to push for testing to be a challenging, stimulating, intelligent career.  Anyone can check.  Automation can check.  I am a tester, and I'm so much more.

3 comments:

  1. "Some people see that as a threat to testing as a career and livelihood..."

    We must conclude that this group is composed of two subgroups: (1) the sort of person who is a panderer, selling sugar-coated ideas to under-informed IT executives, or (2) the rank-and-file member of the testing community who has been seduced by such ideas and really believes them, a zealot for the cause as it were (the latter group is as poisonous as the former, since he unwittingly spreads lies among his fellow testers, thinking he is spreading the truth).

  2. The idea that unit tests are simply checks, although true, misses the fact that TDD is to unit testing as exploratory testing is to checking. No TDD'er looks at a test failure and simply "says fail". They note the failure with satisfaction, then move on to implement the code to make the test pass, before coming up with the next test that will inform their path forward. As they implement the new code, they will likely make a few existing checks fail, to which they respond by investigating the failure and addressing it.

    There is a difference in motivation; a tester is simply seeking information about the system under test, while the TDD'er is actually seeking to come out with a working system. Perhaps it might be interesting to explore the difference between TDD and exploratory testing, in light of this difference in motivation?

  3. Excellent piece with some good points here to give some sobering thoughts. I once was told by a developer 'I am surprised there is a need for Testers still' - to which I pointed out that developers still do not have the testing mindset and will always, unknowingly, program bugs into their code that we can find - finding them all is the challenge we face every day.
