The Limitations of Automated Testing

September 3, 2020
Stephen Margheim

There are few things more important when selling software than the quality of the software itself. If your software doesn't do what users expect it to do, your business can only go so far. This means it is essential that you can [1] know what your users expect the software to do, and [2] know that your software does this. Lately, I have been thinking about the limitations of automated testing in light of these two foundational requirements of healthy software product development.

Benefits of Automated Testing

Before thinking about the limitations, let's first consider how automated testing does help us meet user expectations. Your development team is, in many ways, your first and most important users. Automated tests allow these "users" to very clearly describe how they expect the software to behave. Indeed, when you can describe how the software should behave with such precision that a computer can check, you can then check that the software behaves in the described ways quickly and repeatedly. These are massive benefits of automated testing, and these benefits have driven the commercial software industry to take automated testing seriously over the last few decades.

Limitations of Automated Testing

However, there are several limitations to automated tests, even when it comes to simply confirming that the software behaves as expected. The first limit stems from the expectations encoded in the test suite. The large majority of the time, the developer who implemented a feature is the one writing its tests. It is difficult for that developer to consider scenarios other than those they had in mind when building the feature, which means possible edge cases may never be considered, let alone tested.
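As a minimal sketch of this limit, consider a hypothetical price parser (the function and tests below are illustrative, not taken from any real codebase). The tests mirror only the inputs the developer had in mind while implementing it:

```python
# Hypothetical example: a developer implements a price parser, then writes
# tests that encode only the scenarios they considered while building it.
def parse_price(text: str) -> float:
    """Parse a user-entered price like "$19.99" into a float."""
    return float(text.strip().lstrip("$"))

# The tests encode the developer's own expectations...
def test_plain_number():
    assert parse_price("19.99") == 19.99

def test_dollar_sign():
    assert parse_price("$5") == 5.0

# ...but say nothing about inputs never considered: parse_price("1,299.00"),
# parse_price("€10"), and parse_price("") all raise ValueError, and no test
# documents whether that is the intended behavior.
```

The suite passes, yet the first real user who types a thousands separator hits an unhandled exception that no test ever described.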

There is also this reality: any software of sufficient value that others will pay for is essentially, by definition, too complicated for any human mind to keep track of its every moving part. So, it is impossible to fully articulate this complexity in the form of an automated test suite. Even with a robust automated test suite, users will find bugs because no development team can anticipate every possible combination of states or conditions users will find themselves in. Moreover, no development team can foresee the myriad (often different) expectations users will have for how the software will and should behave.

Testing Single Page Applications

These limitations are then exacerbated when building a "single page application" (SPA). In a traditional web application, a large percentage of the computation occurs on the backend server. The results (in the form of HTML, CSS, and JavaScript) are then returned to the user and handled by the browser. When the software's core complexity is centralized on a server, it is possible to test the software in a highly similar environment (on a continuous integration server, for example). Single-page applications, however, offload the core computation to the user's browser. While a development team can be expected to know, with a high degree of precision, the details of its own server environment, it is difficult to anticipate the details of your various users' browsers. They may be using one of several browsers, running one of a few browser engines, on top of one of a handful of operating systems, with some combination of browser extensions, configured in numerous possible ways, on one of a multitude of devices.

While all web software runs in some particular combination of these conditions (among others), SPAs add complexity by having the primary logic computed in one particular environment out of literally millions of possibilities, which sharply limits the efficacy of automated tests. Your team can either run automated tests within one particular environment or attempt to run them against multiple environments. The latter option, however, grows the time cost of your test suite linearly with each environment added, while also increasing the suite's complexity, since it must now be generic enough to run against each of your chosen environments.
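Some back-of-envelope arithmetic makes both points concrete. Every count below is an assumption chosen for the sketch, not survey data, but even conservative figures put the environment matrix into the millions while each extra CI environment adds a full suite run:

```python
# Illustrative arithmetic for the environment matrix a SPA can land in.
# All counts are assumptions for the sake of the sketch.
browsers = 5             # e.g. Chrome, Firefox, Safari, Edge, Samsung Internet
operating_systems = 5    # e.g. Windows, macOS, Linux, iOS, Android
extension_profiles = 50  # assumed distinct browser-extension combinations
config_profiles = 20     # assumed distinct browser settings states
devices = 100            # assumed distinct hardware/viewport profiles

environments = (browsers * operating_systems * extension_profiles
                * config_profiles * devices)
print(f"{environments:,} possible environments")  # 2,500,000

# A CI suite run against N of these environments pays roughly N times the
# wall-clock cost of a single-environment run.
suite_minutes = 12       # assumed duration of one full suite run
for n in (1, 3, 5):
    print(f"{n} environments -> ~{n * suite_minutes} minutes per run")
```

Exhaustive coverage is plainly infeasible, and even testing a handful of environments multiplies the feedback-loop cost the suite was meant to keep short.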

Reconciling Developer and User Expectations

It is ultimately impossible to confirm that your software does what users expect without having actual users use your software and confirm that it behaves as they expect. This is, of course, what providing software to users is. You give them your software, they run it in whatever particular environment they have, and then they either have their expectations met or not. 

Indeed, in their simplest form, bugs are little more than unmet expectations. Sometimes, those expectations were considered by the development team building the software, but the implementation was faulty under certain conditions. Sometimes, however, the expectation was never considered by the development team. From the point of view of the customer, however, there is no difference. They expected the software to behave one way, to be structured one way, to provide certain information in certain places, and it didn't. 

Development teams often organize this feedback based on the original intent. If the team initially wanted the software to behave in the way the customer expected, the customer-reported problem is a bug. If the team didn't, the customer-reported problem is a feature request. In either situation, though, the developers and product managers need the same core information:

  • what the user expected to be the case
  • what was actually the case
  • and why that unmet expectation was a problem
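A minimal sketch of these three pieces of information as a record a team might collect (the class and field names here are hypothetical, not a real schema):

```python
# Hypothetical record capturing the three pieces of information every
# bug report or feature request ultimately needs.
from dataclasses import dataclass

@dataclass
class FeedbackReport:
    expected: str   # what the user expected to be the case
    actual: str     # what was actually the case
    impact: str     # why the unmet expectation was a problem

report = FeedbackReport(
    expected="Clicking 'Save' keeps me on the form",
    actual="Clicking 'Save' redirected me to the dashboard",
    impact="I lost the context of the record I was editing",
)
print(report)
```

Whether the team files this as a bug or a feature request is a triage decision; the record itself is the same either way.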

Of course, as every software development team knows, users simply won't (or can't) provide such clear feedback. The vast majority of users never even report their unmet expectations, and those who do often offer scant details and unfocused opinions.

Getting Useful Feedback

There is simply no better feedback than well-structured feedback from real people who are really using your software, and there is no better way to gather it than crowdtesting. It is not sufficient to put your software in front of real users who will run it in a real environment; those users must also report back to you, in useful detail, how the software behaved, what expectations they had, and why, where, and under what conditions.

Automated tests provide your development team useful details whenever they fail, but they can never supply the expectations themselves; your team has to encode those as tests. Your real users are always "testing" your software in a real environment, but they rarely provide useful details about what expectations they had, in what particular situations, and how the software behaved differently. On its own, neither is enough.

This is the power of a crowd: real people who will really run and use your software, and who are trained to provide clear, useful details about what they did, what they expected, and how the software behaved. Automated testing is essential to building and maintaining quality software that users are happy to pay for; after years of discussion and growth, essentially every commercial software company knows this. It is time now to acknowledge that, while essential, automated testing is necessarily limited. Luckily, where it is weak, crowdtesting is strong. Few tools in a company's toolbox are as well attuned to revealing how your users expect the software to behave, and whether it indeed behaves that way, across the vast sea of possible environments in which your software will be used.

If you'd like to try crowdtesting out for yourself, please reach out here.
