Testing and Bad Smells: When to Investigate Potential Bugs

This guest blog post is part of an Atlassian blog series raising awareness about testing innovation within the QA community. You can find the other posts in this series under the QA Innovation tag.

A condensed version of this post was published on January 10, 2012. We’re now reposting the complete article.

Something Smells Fishy

You’re testing a feature, and along the way you spot something that seems unrelated and trivial. Maybe it’s a warning message in the logs that wasn’t there before. Or some page elements are out of place by a pixel or two. Or some odd behaviour that you saw once, can’t reproduce, and probably imagined. Otherwise, the functionality you tested works as expected, everything looks OK, and you really need to resolve the issue and get onto testing the next feature.

You’ve encountered a bad smell.

There are many arguments for dismissing it and moving on. You haven’t seen anything actually wrong, the feature works as expected, and the bad smell is most likely just an unrelated distraction. In most cases, you’ll add more value to the product by spending time testing the next feature rather than taking time to investigate the smell.

Alternatively, if you stop and investigate the smell, most of the time it’ll turn out to be just as it appears – a trivial issue or a non-issue. If it’s a trivial issue, it probably won’t get fixed – fixing it isn’t an efficient use of resources, and comes with a fair chance of breaking something more important. If it’s a non-issue, it’s often something specific to the testing environment that doesn’t apply to production.

However, once in a while, you investigate and find something more. The bad smell is merely a symptom of a larger issue that was otherwise unnoticeable… or, at least, unnoticed. By investigating the smell, you’ve prevented a much bigger issue from shipping – possibly even one relating to security or data loss.

A diligent tester could stop and investigate every bad smell he or she encounters, just to be sure. However, in most software development companies there just isn’t the time or the resources to investigate everything to this level of detail – particularly in environments where the dev:test ratio is significantly larger than 1:1. A tester, like everyone else, has a responsibility to work efficiently and make the best use of their time.

How, then, does a tester tell the difference between a bad smell worth investigating and a time-consuming distraction?

I feel that the answer lies in domain knowledge. A tester who understands the domain is going to be much better equipped to identify which smells are symptoms of larger problems. This includes:

  • Product knowledge – how it works, how it’s implemented, what bug patterns have been observed previously, and what changes have been made recently.
  • Technology knowledge – the language, the libraries, how they work and how they can break.
  • Environment knowledge – the operating system, the database, the web browser, the message passing system, and any other external components that the product relies on.

So, with that in mind, here are some classes of bad smells that I consider worth spending my time following up:

Smells like an Old Friend

Smells like…: These are the familiar smells – you’ve smelt them before and you know what they mean. Most of the times this smell has turned up before, it was a symptom of the same class of bug. The nature of familiar smells will differ for each product, team, technology, and environment, but a skilled tester knows them well and recognises them immediately.

Example smell: While testing a webapp, a tester notices that her user display name, Penny Wyatt <Testing>, appears as Penny Wyatt on the page.

Underlying issue: In this example, the smell is an indication that the HTML special characters < and > aren’t being encoded as &lt; and &gt;, and hence there’s a good chance of a cross-site scripting (XSS) security vulnerability.

Implications: The existence of familiar smells is a concern. If a class of serious bug occurs frequently, yet its only symptom is a bad smell that may not be noticed, then there is a significant chance of missing these bugs in future testing. Not only that, it becomes one more piece of product knowledge that needs to be taught to new hires. Testers and developers should work to improve automated and manual testing to make this class of bug more visible.
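
One way to make this particular smell visible is a small automated check on the rendered output. The sketch below is illustrative rather than definitive: render_profile_page is a hypothetical test helper that returns the page’s HTML as a string, and the assertions simply confirm that the angle brackets in the display name arrive escaped.

```python
import html
import unittest

# Hypothetical helper: renders the profile page for a user with the given
# display name and returns the resulting HTML as a string.
from myapp.testing import render_profile_page


class DisplayNameEncodingTest(unittest.TestCase):
    def test_angle_brackets_are_escaped(self):
        name = "Penny Wyatt <Testing>"
        page = render_profile_page(display_name=name)

        # The raw, unescaped name must not appear in the page source...
        self.assertNotIn(name, page)
        # ...but the HTML-escaped form should.
        self.assertIn(html.escape(name), page)


if __name__ == "__main__":
    unittest.main()
```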

Smells like a False Equivalence

A crucial aspect of testing is knowledge of the underlying application, and the use of that knowledge in equivalence partitioning. That is, the ability to accurately say, “If I test A, I have also tested B and C because they are equivalent”. Often this applies to code reuse – if the same code is reused in multiple places, it should only need to be comprehensively tested in one of them.

Smells like…: Two areas of the application that the tester believed to be the same behave differently in some minor way.

Example smell: A tester is testing the user picker UI control in Jira. This control exists on the Create Issue and Assign pages. He believes the code for the control to be shared between the two, and he only performs comprehensive testing of one of them. During unrelated testing, he notices that the user picker on the Create Issue page responds to the Esc key, but the one on the Assign page does not.

Underlying issue: The different behaviour of the control on the Assign page indicates that the code for the control, or the way it is called, is not identical for all instances where it is used. This is an indication that they are not equivalent.

Implications: If the tester has made testing decisions based on a false equivalence, these decisions will need to be revisited. In the best-case scenario, the divergence is confined to the minor difference found by the tester – the code is shared but was called with different arguments, producing the different behaviour. In the worst-case scenario, the code for the control in the two places is entirely separate. If this is the case, the two controls are independent features and require separate comprehensive testing. Any bugs fixed in one will probably need to be independently fixed in the other.
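
One way to stop depending on an assumed equivalence is to run the same checks against every instance of the control, so any divergence shows up as a test failure rather than a chance observation. Below is a minimal sketch along those lines; the page objects and their methods (open, user_picker, press_escape and so on) are hypothetical stand-ins, not real Jira test APIs.

```python
import unittest

# Hypothetical page objects: each opens its page and exposes the user picker.
from myapp.testing.pages import AssignPage, CreateIssuePage


class UserPickerTest(unittest.TestCase):
    # Run the same checks against every page that embeds the control,
    # rather than assuming the instances are equivalent.
    PAGES = [CreateIssuePage, AssignPage]

    def test_escape_key_closes_picker(self):
        for page_class in self.PAGES:
            with self.subTest(page=page_class.__name__):
                page = page_class.open()
                picker = page.user_picker()
                picker.open_dropdown()
                picker.press_escape()
                self.assertFalse(picker.dropdown_visible())
```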

Smells like a Bad Claim

Smells like…: The application behaves in a way that’s correct, but disproves a claim made by somebody else.

Example smell: A developer performs a significant rearrangement of the way the CSS files are stored on disk. He tells the tester that it does not require testing because the files are still being batched into one file at compile time and sent to the browser as a single file, and that file hasn’t changed. However, after he checks in his changes, the next run of automated tests picks up that the font used in the footer has changed slightly.

Underlying issue: The unexpected change to the font is an indication that the rearranged files are not, in fact, equivalent to the pre-rearranged files. This indicates that the CSS changes could have caused other, more serious effects elsewhere in the application.

Implications: In normal conversation, pointing out a minor issue like this to disprove a claim is considered nitpicking. However, when the claim has been used as the basis for risk assessments – by testers, devs, dev managers and/or product managers – it is critical to recognise when the claim is incorrect and reassess accordingly.
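
Claims like this one are often cheap to check directly rather than argue about. As a rough sketch (assuming the batched stylesheet can be built or fetched to a file before and after the change), comparing hashes of the compiled bundle settles the “that file hasn’t changed” claim in seconds; the file names here are placeholders.

```python
import hashlib
import sys


def file_digest(path):
    """Return the SHA-256 digest of a file's contents."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            sha.update(chunk)
    return sha.hexdigest()


if __name__ == "__main__":
    # Usage: python check_bundle.py batch-before.css batch-after.css
    before, after = sys.argv[1], sys.argv[2]
    if file_digest(before) == file_digest(after):
        print("Compiled CSS bundle is byte-identical: the claim holds.")
    else:
        print("Compiled CSS bundle differs: the claim is wrong; retest.")
```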

Bug: Smell of Smoke Coming From Computer. Resolution: Fixed, Removed Nose

One easy mistake to make is for the tester to stop at the first stage, and raise a bug about the smell itself. Unless the development team is familiar with the concept of bad smells, the bug is likely to be treated as a report about the trivial symptom, not an indicator of a larger problem. Hence, one of two things may happen:

  1. The issue is placed at the bottom of the bug backlog or marked as Won’t Fix. This is reasonable – without sufficient information, the triage team has little choice but to judge the bug report at face value.
  2. A developer picks up the bug and fixes the symptom but does not investigate the potential underlying problem. For example, in the case of the CSS rearrangement (see “Smells like a Bad Claim”, above), the developer may change the footer font back, but not work out why the CSS change with no expected side-effects changed the font in the first place. This can exacerbate the issue by making it harder to detect the smell in future.

If the tester feels that the smell is likely to be a symptom of a larger problem but lacks sufficient knowledge of the product to investigate fully, it is worth discussing the smell with others before raising a bug report. Useful sources of information include other testers, developers, support engineers, designers and product managers. If, after this stage, the tester still does not know the underlying issue but raises a bug on the smell itself, it should be worded very precisely to make it clear that the symptom is not, in itself, the reason for the report. It is also a good idea to attend the bug triage meeting to explain further and ensure that the bug is handled appropriately.

Turning Bad Smells into Flashing Lights

After identifying a bad smell and tracing it to a serious issue, the tester and developers involved have a responsibility to ensure that the issue is easier to notice in future. After all, if the only symptom of the issue was the bad smell, it could have been missed entirely if it had been tested by a different person, or by the same person on a busier day.

Ideally, automated tests should be written to identify the smell and fail in future if it is detected. In practice this may not always be possible or practical, and so manual testing practices and tools should also be modified to make bad smells easier to notice in future.

The process for making bad smells more noticeable will vary by product, language and environment, but some examples from my personal experience are:

Webapps

  • Using test strings that behave differently when incorrectly encoded, such as setting the display name to Penny '"><script>alert("name")</script> instead of Penny. This makes issues such as XSS vulnerabilities, incorrect URL encoding and double-escaping extremely visible when manually testing.
  • Having the logs open on another monitor and watching them out of the corner of your eye while using the application. Often an action that appears fine in the UI will generate a stack trace in the logs (a minimal log-watching sketch follows this list).
  • Using automated visual regression tests that notify the tester when any UI change occurs – intentional or unintentional.
  • Turning on alert popups for JavaScript errors.
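
For the log-watching idea above, even a tiny script lowers the cost of noticing the smell: follow the application log during a manual session and flag anything that looks like a stack trace. This is a minimal sketch with assumed details – the log path and the trace-detection patterns will differ per product.

```python
import re
import time

LOG_PATH = "application.log"  # assumed location of the app log
# Illustrative markers for Python tracebacks, Java stack frames, and exceptions.
TRACE_MARKERS = re.compile(
    r"^(Traceback \(most recent call last\)|.*\bException\b|\tat )"
)


def watch(path):
    """Follow the log file and print an alert whenever a stack trace appears."""
    with open(path, "r", errors="replace") as log:
        log.seek(0, 2)  # start at the end of the file; only new lines matter
        while True:
            line = log.readline()
            if not line:
                time.sleep(0.5)
                continue
            if TRACE_MARKERS.search(line):
                print(f"*** Possible stack trace in log: {line.rstrip()}")


if __name__ == "__main__":
    watch(LOG_PATH)
```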

Desktop Apps and Webapps

  • Using a computer-generated language pack (pseudolocalisation) by default instead of English, to make it immediately clear when English strings have been hard-coded into the UI (a rough sketch of the transformation follows this list).
  • Monitoring system resources, such as CPU usage and memory use, while using the application.
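
For the pseudolocalisation idea above, the core transformation is simple enough to sketch. The mapping below is deliberately tiny and illustrative – a real pseudolocale would cover the whole alphabet and pad strings to expose truncation as well – but it shows why hard-coded English strings become impossible to miss.

```python
# Illustrative accented look-alikes for a handful of letters; a real
# pseudolocale would cover every letter and expand string lengths.
ACCENTED = str.maketrans({
    "a": "á", "e": "é", "i": "í", "o": "ó", "u": "ú",
    "A": "Å", "E": "È", "I": "Ì", "O": "Ö", "U": "Ü",
})


def pseudolocalise(text: str) -> str:
    """Wrap a UI string in markers and accent its vowels.

    Any string that still appears in plain English in the running app
    was hard-coded rather than pulled from the language pack.
    """
    return f"[!!{text.translate(ACCENTED)}!!]"


if __name__ == "__main__":
    print(pseudolocalise("Create Issue"))  # -> [!!Créáté Ìssúé!!]
```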

C++

  • In debug mode, wrapping all object allocations and deallocations with a class that tracks them, then throwing an assertion failure whenever a memory leak is detected. This turns subtle memory leak smells – which are hard to pin down to the source – into clear errors (a rough analogue in Python is sketched after this list).
  • Running with AppVerifier, which turns subtle memory corruption issues – usually detectable as odd, unpredictable app behaviour – into crashes at the time the incorrect memory access occurred.
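
The allocation-tracking wrapper above is a C++ technique; purely as a rough analogue in the same language used for the other sketches in this post, Python’s tracemalloc can compare allocation snapshots across repeated runs of an action and point at the lines whose allocations keep growing – turning a vague “memory seems to creep up” smell into a named location.

```python
import tracemalloc


def report_leaks(action, iterations=100, top=5):
    """Run an action repeatedly and report where allocations keep growing.

    This only flags growth between snapshots rather than proving a leak,
    so treat the output as a pointer for investigation, not a verdict.
    """
    tracemalloc.start()
    action()  # warm-up pass so one-off caches are not reported
    before = tracemalloc.take_snapshot()

    for _ in range(iterations):
        action()

    after = tracemalloc.take_snapshot()
    for stat in after.compare_to(before, "lineno")[:top]:
        if stat.size_diff > 0:
            print(f"+{stat.size_diff} bytes, +{stat.count_diff} blocks: "
                  f"{stat.traceback.format()[-1].strip()}")
    tracemalloc.stop()
```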

Encouraging Smell Investigation

If software quality is a goal of the team, it is important to foster a team culture that encourages testers to hone their sense of smell. At first it may be inefficient – it takes time and some trial-and-error for a tester to gain a working sense of which smells are relevant and which are merely distractions. However, if the testers are willing to learn, this is soon repaid in more efficient testing and more serious bugs caught before shipping.

There are many ways in which the investigation of bad smells can be inhibited in a testing environment – frequently unintentionally. Here are some examples:

  • When the progress of testing is tracked entirely through fixed checklists. If there is no scope for a tester to investigate a smell that does not correspond to a checklist item, then the software is guaranteed to ship with any bugs that could have been caught by that investigation. This includes manual testing (only performing manual steps that are listed on a checklist) and automated testing (using a green build as the sole criterion of a feature being shippable).
  • When testers are pressured to sign off on a feature in order to ship as soon as possible. A good tester will make it clear to stakeholders when they are not yet confident about a feature because of a bad smell, even though testing is otherwise complete. However, deciding when a smell is worth investigating is a judgement call, often with little evidence (after all, if there were solid evidence, it wouldn’t be a smell any more – it would already be a bug!). If the stakeholders push back hard enough, a tester can be pressured into doubting their own judgement and signing off regardless. In the worst cases, the tester is later blamed for not finding the bug.
  • When testers are given the scope to investigate smells, but not the time. If a tester has 5 days’ worth of feature testing scheduled for the week, and every feature they test contains bugs, it is always going to be more efficient to move on to the next feature than to follow up on a bad smell.

In many cases, these issues arise when decision makers without testing experience or knowledge attempt to micro-manage the inner workings of the testing process based on how they assume testing occurs.

Take Time to Smell the Bugs

In conclusion, I encourage you to get to know your product’s smells better. Keep track of them, get to know them, investigate them, share the useful ones with your colleagues, and do what you can to make the familiar smells more obvious.

Your nose, and your customers, will thank you.
