At Jana, we like to measure things. From high-level business KPIs to low-level operational metrics, our goal is to instrument it all.
One of the most important things for us to measure is API performance. When your architecture decomposes frontend API calls into more granular calls to other APIs, the downstream implications of a slow service call can be enormous: everything from a slow user experience waiting for operations to complete to failed recharge transactions due to service timeouts.
At a recent hackathon, I decided it’d be fun to measure our API performance by capturing how long a request takes to service once it has reached our Python HTTP servers (that is, excluding any frontend load balancing, intermediate proxies, etc.). Since we already had Graphite set up in our infrastructure, statsd seemed like a natural fit for this. Once we had that set up, the question became: how do we instrument our API code to measure timings in a minimally intrusive fashion but with maximum coverage?
This is where a WSGI middleware can help. Briefly: a middleware lets you “play both sides” of a request; that is, the server and the app. With a middleware, we’re able to intercept calls incoming to our Flask app, start a timer, dispatch the request, and emit data to statsd once the request has been completed.
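For readers who have not used statsd before, here is a rough sketch of what the "emit data to statsd" step can look like. It is not our production code: the metric name, the address, and the emit_timing_data helper are all assumptions made for illustration. The only fixed part is the statsd timer format, name:value|ms, sent as a UDP datagram.

```python
import socket

# Assumption: a statsd daemon listening on its conventional UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_timing(metric, seconds):
    """Build a statsd timer line: <metric>:<milliseconds>|ms"""
    return "{}:{:.3f}|ms".format(metric, seconds * 1000.0).encode("ascii")


def emit_timing_data(seconds, metric="api.request", addr=STATSD_ADDR):
    """Fire-and-forget a single timing to statsd over UDP (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_timing(metric, seconds), addr)
    finally:
        sock.close()
```

Because the transport is UDP, the sender does not wait for a reply or acknowledgment, which is part of what makes this kind of instrumentation cheap to add.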
However, one non-obvious wrinkle for those new to WSGI is buried in PEP 3333:
“When called by the server, the application object must return an iterable yielding zero or more bytestrings.”
At first blush, you may be tempted to write code like the following handwave-y implementation to perform the timing:
    def __call__(environ, start_response):
        start = time.time()
        result = wrapped_app(environ, start_response)
        end = time.time()
        emit_timing_data(end - start)
        return result  # hand the app's iterable back to the server
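To make the problem with this approach concrete, here is a small sketch that applies the same start/stop timing to an app written as a generator. The streaming app, its sleep-based delay, and the stub start_response are made up for illustration.

```python
import time


def streaming_app(environ, start_response):
    """Hypothetical WSGI app that streams its body from a generator."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for chunk in (b"a", b"b", b"c"):
        time.sleep(0.05)  # stand-in for slow work per chunk
        yield chunk


def fake_start_response(status, headers, exc_info=None):
    """Stub for the callable a real server would pass in."""


start = time.time()
# Calling a generator function only creates the generator; no app code has run yet.
result = streaming_app({}, fake_start_response)
naive_elapsed = time.time() - start

# The "server" iterates over the result; the app does its real work here.
body = b"".join(result)
full_elapsed = time.time() - start

print("naive: %.4fs, full: %.4fs" % (naive_elapsed, full_elapsed))
```

The naive measurement stops before any of the sleeps have happened, so it reports close to zero while the request as a whole takes at least as long as the three delays combined.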
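The issue is that an underlying WSGI-compliant app is permitted to return any iterable: this can be a concrete list of bytestrings, or it can be a generator (if the app wants to stream data back to a client, for example). In the generator case, calling the app returns almost immediately, and the real work happens only as the server iterates over the result, after the timer above has already stopped. Additionally, the PEP states: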
“If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request”
Since we are “playing both sides”, we should be sure to call close() on the iterable if it is present. Here is a better solution:
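    def __call__(environ, start_response):
        start = time.time()
        iterable = wrapped_app(environ, start_response)
        try:
            # Pass each chunk through so streaming responses still stream
            for result in iterable:
                yield result
        finally:
            # Runs on normal completion, on errors, and on early disconnects
            if hasattr(iterable, 'close') and callable(iterable.close):
                iterable.close()
            emit_timing_data(time.time() - start)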
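As a way of checking that behavior end to end, here is a class-based sketch of the same idea along with a tiny harness. The TimingMiddleware and ClosableBody names, the callback-style emit_timing_data argument, and the stub start_response are all assumptions for the sake of the example, not the implementation linked below.

```python
import time


class TimingMiddleware(object):
    """Wraps a WSGI app and reports how long each request took to service."""

    def __init__(self, wrapped_app, emit_timing_data):
        self.wrapped_app = wrapped_app
        # Hypothetical callback that receives the elapsed time in seconds.
        self.emit_timing_data = emit_timing_data

    def __call__(self, environ, start_response):
        start = time.time()
        iterable = self.wrapped_app(environ, start_response)
        try:
            for chunk in iterable:
                yield chunk
        finally:
            # Honor the PEP 3333 close() contract, then record the timing.
            close = getattr(iterable, "close", None)
            if callable(close):
                close()
            self.emit_timing_data(time.time() - start)


class ClosableBody(object):
    """Hypothetical response body that records whether close() was called."""

    def __init__(self):
        self.closed = False

    def __iter__(self):
        time.sleep(0.02)  # stand-in for work done while streaming
        yield b"hello"

    def close(self):
        self.closed = True


body = ClosableBody()


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return body


timings = []
wrapped = TimingMiddleware(app, timings.append)
chunks = list(wrapped({}, lambda status, headers, exc_info=None: None))
```

After the response has been consumed, chunks holds the body, body.closed is True, and timings contains a single measurement that includes the time spent streaming. With Flask, a middleware like this is typically installed by wrapping the app's wsgi_app attribute, for example app.wsgi_app = TimingMiddleware(app.wsgi_app, some_callback).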
A complete middleware implementation can be found here.