This is an episode transcript.
The 12 factor app states that applications should not concern themselves with storing their log stream. Simply log to standard out or standard error. This works in development because developers can see logs in their terminal. It also works in production because tooling can redirect logs to files or capture and process the streams independently of the application.
My stance on the 12 factor app is that it’s a great starting point but requires amendments. Just logging to standard out or standard error is not enough to build robust continuous delivery pipelines. We need to layer logging practices onto the original recommendations.
So, the 12.1 factor app does three things:
- Supports a LOG_LEVEL configuration option
- Uses a machine readable format, like JSON, in production
- Generates time series telemetry from logs
Let’s consider each point.
The first point relates to the config factor. More on that in the previous episode at https://smallbatches.fm/6. Applications must support log level configuration instead of hard coding it. Use a low log level like debug in development and info or higher in non-development environments.
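As a rough illustration of the first point, here is a minimal Python sketch that reads the level from a LOG_LEVEL environment variable instead of hard coding it. The helper names (resolve_log_level, build_logger), the accepted spellings, and the INFO fallback are assumptions made for this example, not something the episode prescribes.

```python
import logging
import os
import sys

# Accepted LOG_LEVEL values (an assumption for this sketch), case-insensitive.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "FATAL": logging.CRITICAL,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env=None, default="INFO"):
    """Return the numeric log level named by LOG_LEVEL, or the default."""
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", default).strip().upper()
    if name not in LEVELS:
        raise ValueError(f"unsupported LOG_LEVEL: {name!r}")
    return LEVELS[name]


def build_logger(name="app", env=None):
    """Create a logger that writes to standard out at the configured level."""
    logger = logging.getLogger(name)
    logger.setLevel(resolve_log_level(env))
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(sys.stdout))
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.debug("only visible when LOG_LEVEL=debug")
    log.info("visible at info and below")
```

Running the same program with LOG_LEVEL=debug in development and LOG_LEVEL=info elsewhere changes verbosity without a code change. Failing fast on an unknown value is a design choice here; falling back to the default would be just as reasonable.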
Second, logs must be produced in a machine readable format such as JSON. Oh, and no multiline logs. Multiline entries are effectively syntax errors in a log stream. Just avoid them. Using a machine readable format enables new use cases. Error logs may contain stack traces. Contextual information–such as user IDs–may be added to all log entries. Log entries can generate time series telemetry. Log entries may be parsed and routed to different storage systems. warn and error logs may generate alerts. fatal logs may page someone. The list goes on and on.
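One way to picture the second point is a small formatter for Python's standard logging module that writes each entry as a single JSON object. This is a sketch under assumptions: the class name JsonFormatter and the field names (time, level, message, stack_trace) are made up for the example, and the user_id context value is hypothetical.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object on one line."""

    def __init__(self, context=None):
        super().__init__()
        # Contextual fields (for example a user ID) added to every entry.
        self.context = dict(context or {})

    def format(self, record):
        entry = dict(self.context)
        entry.update({
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        })
        if record.exc_info:
            # json.dumps escapes the newlines in the stack trace,
            # so the entry still occupies a single line in the stream.
            entry["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(entry)


if __name__ == "__main__":
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter({"user_id": 42}))
    log = logging.getLogger("json-demo")
    log.setLevel(logging.INFO)
    log.addHandler(handler)
    try:
        1 / 0
    except ZeroDivisionError:
        log.error("calculation failed", exc_info=True)
```

Note that Python names its levels warning and critical rather than warn and fatal; a downstream router would need to map whichever names the application emits.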
The point regarding time series telemetry warrants extra attention. Consider nginx or apache. Both output the well known “access log” format. The format includes response codes and other information like the origin IP, and both servers can be configured to add latencies. This single log line contains wonderfully useful telemetry! Parsing the log can generate a histogram on response latencies, incoming request count, percentage of satisfied requests, internal server errors, backend errors, a leaderboard on response codes, and more. That’s enough to understand how the HTTP service is operating.
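To make the access log idea concrete, here is a hedged Python sketch that turns access log lines into a few of those numbers. It assumes a combined-style line with the request time in seconds appended as the last field (the stock formats do not include latency, so this is a configuration assumption), and the function name, bucket bounds, and 0.5 second "satisfied" threshold are all invented for the example.

```python
import bisect
import re
from collections import Counter

# Combined-style line with a trailing request time in seconds (assumed format).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"[^"]*" "[^"]*" (?P<latency>\d+(?:\.\d+)?)$'
)

BUCKET_BOUNDS = [0.1, 0.5, 1.0]  # upper bounds in seconds, illustrative only
BUCKET_LABELS = ["<=0.1", "<=0.5", "<=1.0", "+Inf"]


def summarize(lines, satisfied_threshold=0.5):
    """Derive simple telemetry from access log lines; unparseable lines are skipped."""
    statuses = Counter()
    histogram = {label: 0 for label in BUCKET_LABELS}
    satisfied = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        latency = float(match["latency"])
        statuses[int(match["status"])] += 1
        histogram[BUCKET_LABELS[bisect.bisect_left(BUCKET_BOUNDS, latency)]] += 1
        if latency <= satisfied_threshold:
            satisfied += 1
    total = sum(statuses.values())
    return {
        "request_count": total,
        "status_codes": dict(statuses),
        "server_errors": sum(n for code, n in statuses.items() if code >= 500),
        "satisfied_percent": 100.0 * satisfied / total if total else 0.0,
        "latency_histogram": histogram,
    }


SAMPLE = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /users HTTP/1.1" 200 512 "-" "curl/8.0" 0.120',
    '203.0.113.8 - - [10/Oct/2023:13:55:37 +0000] "GET /health HTTP/1.1" 200 2 "-" "curl/8.0" 0.030',
    '203.0.113.9 - - [10/Oct/2023:13:55:38 +0000] "POST /orders HTTP/1.1" 500 87 "-" "curl/8.0" 1.500',
    '203.0.113.7 - - [10/Oct/2023:13:55:39 +0000] "GET /missing HTTP/1.1" 404 19 "-" "curl/8.0" 0.450',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

In practice this work usually happens in the log pipeline rather than in a script, but the shape is the same: one structured line in, several time series out.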
The same approach applies to internal telemetry. Applications can output time series data to standard out for consumption by downstream systems. This eliminates the need for third party libraries and external metric collection services in favor of infrastructure level log storage and metric generation. You can see this in action with products like DataDog and NewRelic. Both offer centralized log storage, searching, and metric generation. Once metrics are generated, then you have access to the full suite of tools around them such as graphing, monitoring, and alerting.
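For internal telemetry, the application side can be as small as writing one JSON line per measurement to standard out. The sketch below uses only the Python standard library; the emit_metric helper and its field names (type, metric, value, unit, tags) are hypothetical and do not follow any particular vendor's schema.

```python
import json
import sys
import time


def emit_metric(name, value, unit="count", tags=None, stream=None, now=None):
    """Write a single-line JSON metric event to the stream (standard out by default)."""
    stream = sys.stdout if stream is None else stream
    entry = {
        "time": time.time() if now is None else now,
        "type": "metric",  # lets a downstream system tell metrics from plain logs
        "metric": name,
        "value": value,
        "unit": unit,
        "tags": dict(tags or {}),
    }
    stream.write(json.dumps(entry) + "\n")
    stream.flush()
    return entry


if __name__ == "__main__":
    emit_metric("orders.created", 1, tags={"region": "eu"})
    emit_metric("checkout.duration", 0.342, unit="seconds")
```

The application stays free of metric client libraries, and whatever collects the log stream decides how to aggregate, graph, and alert on these events.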
Alright. Let’s recap:
- Support log level configuration
- Log in a machine readable format such as JSON
- Treat log streams as a telemetry source
These practices are especially useful in growing distributed systems since they shift responsibility out of applications and onto horizontal support layers.
That’s all for this one. Go forth and log.