Implementation Details¶
Request Id Management¶
Propagating incoming request details through the system is at the very heart of any observability attempt. The simplest case is copying a header from the request to the response and making the same identifier available in emitted log messages. It is a simple concept and has a profound impact on your ability to trace a message through the internals of a service and through the entire system. The implementation is spread throughout a few classes:
RequestIdPropagator
ensures that a named request header is stored as a first-class property on the requestHeaderRelayTransformer
then ensures that the property value is added as a response header of the same nameRecorder
is atornado.web.Application
mix-in that installs instances of theHeaderRelayTransformer
when your application callsadd_divak_propagator()
Logger
is atornado.web.RequestHandler
mix-in that adds alogging.LoggerAdapter
instance and inserts the request ID property from the request into the logging context – this makes%(divak_request_id)s
work in log messages
Simply making the request ID available throughout the application is a bit more work than one would expect. The process relies on an under-documented Tornado feature called transformers. A transformer is a function that is called immediately after creating the request handler instance. It is an injection point that can extract information from an incoming request or even modify the request object before it is processed. The transformer function returns an object instance that is called immediately before sending the HTTP response line and again on each subsequent outgoing chunk. The simplest implementation is a class since the class itself is callable and returns an instance. The following class is a simple no-op implementation of a Tornado transformer.
class NoOpTransformer(object):
def __init__(self, request):
pass
def transform_first_chunk(self, status_code, headers, chunk,
include_footers):
return status_code, headers, chunk
def transform_chunk(self, chunk, include_footers):
return chunk
The application installs the transformer by calling
add_transform()
. Then it is called on each
request. The detailed sequence of events during request processing is:
- The
tornado.web.HTTPServerRequest
instance is created. - The HTTP start line and headers are read and stored in the request object
- the
tornado.web.RequestHandler
sub-class is determined by the application routing rules. - A new instance of the request handler is instantiated with the request
instance. The
initialize()
method is called at this time. - The transformer functions are invoked with the request object. In the simple case, new transformer instances are created.
- The
prepare()
method is called on the request handler instance. - The request handler’s “action method” (e.g.,
get
,post
) is invoked. - The request is “finished” if necessary.
A few very important details about this sequence of events:
When the request handler instance is created in step 4, the request has not passed through the transform functions so:
- The request handler cannot rely on any information that is injected by a
transformer inside of
initialize
. - The request handler has a chance to modify the request instance before it is transformed.
- The request handler cannot rely on any information that is injected by a
transformer inside of
Request handler SHOULD NOT call
finish()
from withininitialize
. Doing so wreaks havoc with any installed transforms because they will be created and initialized after the request is already finished and are never called again.Since
prepare()
is permitted to be asynchronous, transformers need to be per-request instances or completely stateless. Ifprepare
yields, then there could be parallel active requests so the transformed SHOULD NOT contain state.
There is a rather glaring omission in the sequence of events. When are the
transform methods invoked? transform_first_chunk
is called BEFORE
the HTTP response line and headers are written so they can influence the
response headers and status code after the request handler is done with
them. transform_chunk
is called for each body chunk from within
write()
.
The extra wrinkle for the HeaderRelayTransformer
is that the name of
the header to relay is configurable so we cannot simply let Tornado create
a class instance for us – we need to get the header name into the transformer.
Instead, RequestIdPropagator
implements a factory method that
returns a new HeaderRelayTransformer
instance and installs the
factory method as a transformer function. The HeaderRelayTransformer
instance inserts the divak_request_id property from the request into the
response headers if it is set.
Logging¶
The Recorder
mix-in (over tornado.web.Application
) ensures
that it is safe to refer to the %(divak_request_id)s
in any log formatter.
During initialization, the Recorder
instance installs the
DivakLogger
class as the “logger class” by calling
logging.setLoggerClass()
. Future calls to logging.getLogger()
will return instances of DivakLogger
which is a
sub-class of logging.Logger
. This makes the request ID available for
logger instances created after the application is initialized but it doesn’t
address existing logger instances.
A DivakRequestIdFilter
instance is inserted into
existing logging.Handler
instances so that any existing loggers can
safely refer to the request ID in their messages as well. This is done during
application initialization after setting the logging class. Existing loggers
are visited via the logging.Manager
global instance. If the assigned
handlers do not have a DivakRequestIdFilter
, then
one is added to the filter chain.
That makes it possible to refer to %(divak_request_id)s
in log formats but
it doesn’t actually get the value into the log messages. This is where the
divak.api.Logger
mix-in comes into play. It’s raison d’êtra is to
add self.logger
to a RequestHandler
instance but it
does one other thing for you – it inserts the
self.request.divak_request_id
into the logging context for you. This is
what makes the request id available in log messages from your request
handlers.
The next place that we want the request ID to show up is in the Tornado access
log lines. Recorder
re-implements
tornado.web.Application.log_request()
so that it passes the current
request ID into the Tornado logger. What it does not do is add the request ID
to the log format – you are required to do that if you wish. See
Request Logging for the details.
Implementation Details¶
DivakLogger¶
-
class
divak.internals.
DivakLogger
(name, level=0)¶ Extends class:logging.Logger to ensure that divak_request_id is set.
This class is installed via
logging.setLoggerClass()
to ensure that thedivak_request_id
attribute is set on all records.
DivakRequestIdFilter¶
-
class
divak.internals.
DivakRequestIdFilter
(name='')¶ Logging filter that sets the divak_request_id attribute on records.
This makes it possible to use
divak_request_id
in log formats without having to do additional work. You shouldn’t need to tinker with this yourself sincedivak.api.Recorder
does it for you when a instance is created.
HeaderRelayTransformer¶
-
class
divak.api.
HeaderRelayTransformer
(header_name, request)¶ Tornado transformer that relays a header from request to response.
Parameters: - header_name (str) – the name of the header to relay from request to response
- value_factory – callable that generates a new value
Setting value_factory to
None
disables the generation of response header values when the header is missing from the request.This class implements the under-documented Tornado transform interface. Transforms are called when the application starts processing a request. Per-request instances of this class are returned from
RequestIdPropagator.handle_request()
with the appropriate header name and value to insert when the first chunk is processed.-
transform_chunk
(chunk, include_footers)¶ Called to transform subsequent chunks.
Parameters: - chunk – the data chunk to transform
- include_footers (bool) – should footers be included?
Returns: the possibly transformed chunk
This implementation returns chunk as-is.
-
transform_first_chunk
(status_code, headers, chunk, include_footers)¶ Called to process the first chunk.
Parameters: - status_code (int) – status code that is going to be returned in the response
- headers (tornado.httputil.HTTPHeaders) – response headers
- chunk – the data chunk to transform
- include_footers (bool) – should footers be included?
Returns: the status code, headers, and chunk to use as a tuple
This method will inject
request.divak_request_id
into headers if the value is notNone
. The remaining parameters are passed through as-is.
initialize_logging¶
-
divak.internals.
initialize_logging
()¶ Ensure that LogRecords have a divak_request_id defined.
This is called during initialization to ensure that
logging.LogRecord
instances will always have adivak_request_id
property. This makes it possible to refer to the request id in your log formats.Firstly, it installs a new
logging.Logger
class by callinglogging.setLoggerClass()
that sets the attribute in it’slogging.Logger.makeRecord()
method if it not already present. Then it spins through the existing loggers and addsDivakRequestIdFilter
to any handler that doesn’t already have one. This process ensures that new and existing loggers can refer to the request id in log formats.