At Jana, we have invested time and effort into our Continuous Deployment infrastructure, enabling us to deploy our server-side changes more than 10 times a day. Continuous Deployment requires several key features: automatic deployment to staging, running our integration tests against staging, automatic deployment to production, and using feature switches to give us an additional security blanket as we roll out new features or major code refactors. Each of those areas could be its own blog post, and the faster we can accomplish each of those tasks, the more often we can deploy to production. Today I want to talk about integration testing.
Our integration tests come in two major flavors: browser testing and API testing. Integration testing is inherently more complex than unit testing simply because there are more moving pieces. Most developers will quickly find themselves writing an integration test that passes on their local development machine but fails on another developer's machine, or on staging. This is often caused by a timing issue due to environment differences.
Imagine this simple flow for testing an API endpoint:
- Make an API request
- The API request triggers some kind of background task or async processing
- Make another API request to validate the change
Or for browser testing:
- Interact with the browser, submitting a form, or clicking a link that requests data
- The form submission triggers an AJAX request
- Assert new content appears in the browser when the AJAX request completes
In both of those examples, the async nature of step 2 adds a variable amount of time before you can validate the expected change. The natural first response is to add a sleep, tuning it down to the minimum amount of time that still passes locally. This is problematic because the processing time of the async work differs across environments. On a faster environment, you waste time with the explicit sleep. On a slower environment, it's very likely you won't sleep long enough, and the assertion will fail.
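To make the trade-off concrete, here is a tiny simulation of the fixed-sleep approach; the threading.Timer stands in for whatever real background task step 2 kicks off:

```python
import threading
import time

done = threading.Event()
# Stand-in for the async processing in step 2: it finishes after 0.2 seconds.
threading.Timer(0.2, done.set).start()

time.sleep(1.0)          # fixed sleep: pays the full second even on a fast machine
assert done.is_set()     # and still fails anywhere processing takes longer than 1s
```

Here the work finished after 0.2 seconds, so 0.8 of the 1-second sleep was pure waste; on a loaded CI box the same work might take 1.5 seconds and the assertion would fail outright.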
Why does this matter? If you have 1 test that sleeps for 1 second, practically speaking, it doesn’t matter. If you have dozens or hundreds (or more!) of integration tests, and each of them has some sleeping, you may be adding many minutes to each test run. Speed matters. It matters for your CD pipeline. It matters to you as a developer so you can run them all often.
Here are two alternatives to using sleep that we use regularly.
In the browser
Most mature browser-based testing frameworks have the concept of a poll-and-wait (http://selenium-python.readthedocs.org/waits.html). The idea behind this wait is that your test will keep checking the DOM for your data, up to a timeout, rather than checking once and failing.
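As a sketch of what that looks like with Selenium's Python bindings (the element id "result", the expected text, and the 10-second timeout are all illustrative assumptions, not something from our actual suite):

```python
def wait_for_result_text(driver, expected_text, timeout=10):
    """Re-check the DOM until the #result element contains expected_text,
    raising a timeout error if it never appears within `timeout` seconds."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # WebDriverWait re-evaluates the condition every 0.5 seconds by default,
    # returning as soon as it is truthy instead of sleeping a fixed amount.
    return WebDriverWait(driver, timeout).until(
        EC.text_to_be_present_in_element((By.ID, "result"), expected_text)
    )
```

On a fast environment this returns almost as soon as the AJAX request completes; on a slow one it keeps polling right up to the timeout before failing.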
In the API
Our API integration tests use the Python requests library to make requests to the API. Here we are still going to use sleep, but in a loop that tries to minimize the total amount of sleeping. Let's say we are using the API testing example from above:
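A minimal sketch of that polling loop, assuming a hypothetical `get_state` callable that makes the validation API request (for example, `lambda: requests.get(status_url).json()["status"]`) and a `"done"` value we expect it to reach:

```python
import time

def wait_for_api_state(get_state, expected, timeout=30.0, interval=0.5):
    """Poll get_state() until it returns `expected`, sleeping only `interval`
    seconds between attempts instead of one long fixed sleep."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == expected:
            return state
        if time.monotonic() >= deadline:
            raise AssertionError(
                f"state never became {expected!r} within {timeout}s (last: {state!r})"
            )
        time.sleep(interval)
```

With a 30-second timeout and a 0.5-second interval, a change that lands after 3 seconds is detected roughly 3 seconds in, not 30.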
Yes, the polling approach still uses a sleep statement, but it minimizes the time spent sleeping. In this extreme example, if the async processing sometimes finishes in 3 seconds, we don't have to wait the full 30 seconds, saving us 27 seconds!
Multiplying this gain across a number of tests yields us several minutes of time saved on our test runs. This keeps developers happy, makes our CD pipeline faster, and means we can deploy changes for our users more frequently!
Our current continuous deployment pipeline, from the time a pull request gets merged to production deploy, takes roughly 30-40 minutes. Are you struggling to get to continuous deployment? Have any tips that shaved significant time off yours? Let us know in the comments!