Implementation Details

Request Id Management

Propagating incoming request details through the system is at the very heart of any observability attempt. The simplest case is copying a header from the request to the response and making the same identifier available in emitted log messages. It is a simple concept and has a profound impact on your ability to trace a message through the internals of a service and through the entire system. The implementation is spread throughout a few classes:

Simply making the request ID available throughout the application is a bit more work than one would expect. The process relies on an under-documented Tornado feature called transformers. A transformer is a function that is called immediately after creating the request handler instance. It is an injection point that can extract information from an incoming request or even modify the request object before it is processed. The transformer function returns an object instance that is called immediately before sending the HTTP response line and again on each subsequent outgoing chunk. The simplest implementation is a class since the class itself is callable and returns an instance. The following class is a simple no-op implementation of a Tornado transformer.

class NoOpTransformer(object):

   def __init__(self, request):
      pass

   def transform_first_chunk(self, status_code, headers, chunk,
                             include_footers):
      return status_code, headers, chunk

   def transform_chunk(self, chunk, include_footers):
      return chunk

The application installs the transformer by calling add_transform(). Then it is called on each request. The detailed sequence of events during request processing is:

  1. The tornado.web.HTTPServerRequest instance is created.
  2. The HTTP start line and headers are read and stored in the request object
  3. the tornado.web.RequestHandler sub-class is determined by the application routing rules.
  4. A new instance of the request handler is instantiated with the request instance. The initialize() method is called at this time.
  5. The transformer functions are invoked with the request object. In the simple case, new transformer instances are created.
  6. The prepare() method is called on the request handler instance.
  7. The request handler’s “action method” (e.g., get, post) is invoked.
  8. The request is “finished” if necessary.

A few very important details about this sequence of events:

  • When the request handler instance is created in step 4, the request has not passed through the transform functions so:

    • The request handler cannot rely on any information that is injected by a transformer inside of initialize.
    • The request handler has a chance to modify the request instance before it is transformed.
  • Request handler SHOULD NOT call finish() from within initialize. Doing so wreaks havoc with any installed transforms because they will be created and initialized after the request is already finished and are never called again.

  • Since prepare() is permitted to be asynchronous, transformers need to be per-request instances or completely stateless. If prepare yields, then there could be parallel active requests so the transformed SHOULD NOT contain state.

There is a rather glaring omission in the sequence of events. When are the transform methods invoked? transform_first_chunk is called BEFORE the HTTP response line and headers are written so they can influence the response headers and status code after the request handler is done with them. transform_chunk is called for each body chunk from within write().

The extra wrinkle for the HeaderRelayTransformer is that the name of the header to relay is configurable so we cannot simply let Tornado create a class instance for us – we need to get the header name into the transformer. Instead, RequestIdPropagator implements a factory method that returns a new HeaderRelayTransformer instance and installs the factory method as a transformer function. The HeaderRelayTransformer instance inserts the divak_request_id property from the request into the response headers if it is set.

Logging

The Recorder mix-in (over tornado.web.Application) ensures that it is safe to refer to the %(divak_request_id)s in any log formatter. During initialization, the Recorder instance installs the DivakLogger class as the “logger class” by calling logging.setLoggerClass(). Future calls to logging.getLogger() will return instances of DivakLogger which is a sub-class of logging.Logger. This makes the request ID available for logger instances created after the application is initialized but it doesn’t address existing logger instances.

A DivakRequestIdFilter instance is inserted into existing logging.Handler instances so that any existing loggers can safely refer to the request ID in their messages as well. This is done during application initialization after setting the logging class. Existing loggers are visited via the logging.Manager global instance. If the assigned handlers do not have a DivakRequestIdFilter, then one is added to the filter chain.

That makes it possible to refer to %(divak_request_id)s in log formats but it doesn’t actually get the value into the log messages. This is where the divak.api.Logger mix-in comes into play. It’s raison d’êtra is to add self.logger to a RequestHandler instance but it does one other thing for you – it inserts the self.request.divak_request_id into the logging context for you. This is what makes the request id available in log messages from your request handlers.

The next place that we want the request ID to show up is in the Tornado access log lines. Recorder re-implements tornado.web.Application.log_request() so that it passes the current request ID into the Tornado logger. What it does not do is add the request ID to the log format – you are required to do that if you wish. See Request Logging for the details.

Implementation Details

DivakLogger

class divak.internals.DivakLogger(name, level=0)

Extends class:logging.Logger to ensure that divak_request_id is set.

This class is installed via logging.setLoggerClass() to ensure that the divak_request_id attribute is set on all records.

DivakRequestIdFilter

class divak.internals.DivakRequestIdFilter(name='')

Logging filter that sets the divak_request_id attribute on records.

This makes it possible to use divak_request_id in log formats without having to do additional work. You shouldn’t need to tinker with this yourself since divak.api.Recorder does it for you when a instance is created.

HeaderRelayTransformer

class divak.api.HeaderRelayTransformer(header_name, request)

Tornado transformer that relays a header from request to response.

Parameters:
  • header_name (str) – the name of the header to relay from request to response
  • value_factory – callable that generates a new value

Setting value_factory to None disables the generation of response header values when the header is missing from the request.

This class implements the under-documented Tornado transform interface. Transforms are called when the application starts processing a request. Per-request instances of this class are returned from RequestIdPropagator.handle_request() with the appropriate header name and value to insert when the first chunk is processed.

transform_chunk(chunk, include_footers)

Called to transform subsequent chunks.

Parameters:
  • chunk – the data chunk to transform
  • include_footers (bool) – should footers be included?
Returns:

the possibly transformed chunk

This implementation returns chunk as-is.

transform_first_chunk(status_code, headers, chunk, include_footers)

Called to process the first chunk.

Parameters:
  • status_code (int) – status code that is going to be returned in the response
  • headers (tornado.httputil.HTTPHeaders) – response headers
  • chunk – the data chunk to transform
  • include_footers (bool) – should footers be included?
Returns:

the status code, headers, and chunk to use as a tuple

This method will inject request.divak_request_id into headers if the value is not None. The remaining parameters are passed through as-is.

initialize_logging

divak.internals.initialize_logging()

Ensure that LogRecords have a divak_request_id defined.

This is called during initialization to ensure that logging.LogRecord instances will always have a divak_request_id property. This makes it possible to refer to the request id in your log formats.

Firstly, it installs a new logging.Logger class by calling logging.setLoggerClass() that sets the attribute in it’s logging.Logger.makeRecord() method if it not already present. Then it spins through the existing loggers and adds DivakRequestIdFilter to any handler that doesn’t already have one. This process ensures that new and existing loggers can refer to the request id in log formats.