Watchtower: Python CloudWatch Logging

Watchtower is a log handler for Amazon Web Services CloudWatch Logs.

CloudWatch Logs is a log management service built into AWS. It is conceptually similar to services like Splunk and Loggly, but is more lightweight, cheaper, and tightly integrated with the rest of AWS.

Watchtower, in turn, is a lightweight adapter between the Python logging system and CloudWatch Logs. It uses the boto3 AWS SDK, and lets you plug your application logging directly into CloudWatch without the need to install a system-wide log collector like awscli-cwlogs and round-trip your logs through the instance’s syslog. It aggregates logs into batches to avoid sending an API request per log message, while guaranteeing a delivery deadline (60 seconds by default).
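
Batching can be tuned through the handler's constructor parameters (documented in the API documentation below). As a minimal sketch, assuming your AWS credentials are already configured, a handler that flushes more aggressively might look like this:

import logging

import watchtower

# The values here are illustrative; the defaults are 60 seconds and
# 10,000 messages. Flush queued messages every 10 seconds, or sooner
# if 500 messages accumulate.
handler = watchtower.CloudWatchLogHandler(send_interval=10, max_batch_count=500)

logger = logging.getLogger(__name__)
logger.addHandler(handler)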

Installation

pip install watchtower

Synopsis

Install awscli and set your AWS credentials (run aws configure).

import watchtower, logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
logger.addHandler(watchtower.CloudWatchLogHandler())
logger.info("Hi")
logger.info(dict(foo="bar", details={}))

After running the example, you can see the log output in your AWS console. Note that the second message is a dict; Watchtower serializes dict messages as JSON (see the json_serialize_default parameter in the API documentation below).

Example: Flask logging with Watchtower

import watchtower, flask, logging

logging.basicConfig(level=logging.INFO)
app = flask.Flask("loggable")
handler = watchtower.CloudWatchLogHandler()
app.logger.addHandler(handler)
logging.getLogger("werkzeug").addHandler(handler)

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    app.run()

(See also http://flask.pocoo.org/docs/errorhandling/.)

Example: Django logging with Watchtower

This is an example of Watchtower integration with Django. In your Django project, add the following to settings.py:

import logging

from boto3.session import Session

AWS_ACCESS_KEY_ID = 'your access key'
AWS_SECRET_ACCESS_KEY = 'your secret access key'
AWS_REGION_NAME = 'your region'

boto3_session = Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                        region_name=AWS_REGION_NAME)

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'root': {
        'level': logging.ERROR,
        'handlers': ['console'],
    },
    'formatters': {
        'simple': {
            'format': "%(asctime)s [%(levelname)-8s] %(message)s",
            'datefmt': "%Y-%m-%d %H:%M:%S"
        },
        'aws': {
            # you can add a specific format for AWS here
            'format': "%(asctime)s [%(levelname)-8s] %(message)s",
            'datefmt': "%Y-%m-%d %H:%M:%S"
        },
    },
    'handlers': {
        'watchtower': {
            'level': 'DEBUG',
            'class': 'watchtower.CloudWatchLogHandler',
            'boto3_session': boto3_session,
            'log_group': 'MyLogGroupName',
            'stream_name': 'MyStreamName',
            'formatter': 'aws',
        },
    },
    'loggers': {
        'django': {
            'level': 'INFO',
            'handlers': ['watchtower'],
            'propagate': False,
        },
        # add your other loggers here...
    },
}

Using this configuration, every log statement from Django will be sent to CloudWatch in the log group MyLogGroupName under the stream name MyStreamName. Instead of setting credentials via AWS_ACCESS_KEY_ID and the other variables, you can also assign an IAM role to your instance and omit those parameters, prompting boto3 to read credentials from the instance metadata, as sketched below.
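
A minimal sketch of that role-based variant in settings.py, assuming the instance's IAM role grants CloudWatch Logs permissions (the region name is illustrative):

from boto3.session import Session

# No explicit keys: boto3 falls back to the instance metadata service
# for credentials. Only the region still needs to be named.
boto3_session = Session(region_name='us-east-1')

# The LOGGING dict is unchanged from the example above.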

(See also the Django logging documentation).

Examples: Querying CloudWatch logs

This section is not specific to Watchtower. It demonstrates the use of awscli and jq to read and search CloudWatch logs on the command line.

For the Flask example above, you can retrieve your application logs with the following two commands:

aws logs get-log-events --log-group-name watchtower --log-stream-name loggable | jq '.events[].message'
aws logs get-log-events --log-group-name watchtower --log-stream-name werkzeug | jq '.events[].message'

CloudWatch Logs supports alerting and dashboards based on metric filters, which are pattern rules that extract information from your logs and feed it to alarms and dashboard graphs. The following example shows logging structured JSON data using Watchtower, setting up a metric filter to extract data from the log stream, a dashboard to visualize it, and an alarm that sends an email:

TODO
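
In the meantime, here is a minimal sketch of the metric filter step alone, using boto3 (the filter name, pattern, and metric names are hypothetical placeholders):

import boto3

logs = boto3.client("logs")

# Count log events whose JSON body has foo == "bar", feeding a custom
# metric that an alarm or dashboard widget can then reference. All names
# below are illustrative placeholders.
logs.put_metric_filter(
    logGroupName="watchtower",
    filterName="foo-bar-events",
    filterPattern='{ $.foo = "bar" }',
    metricTransformations=[
        {
            "metricName": "FooBarEvents",
            "metricNamespace": "MyApp",
            "metricValue": "1",
        }
    ],
)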

Examples: Python Logging Config

Python can load a configuration file to keep the logging configuration separate from the code. Historically, this was done with logging.config.fileConfig, but that function does not support keyword arguments. Python 2.7 introduced the more robust logging.config.dictConfig, which supports more advanced filters and, more importantly, keyword arguments, allowing the logging.config machinery to instantiate Watchtower directly, as the sketch below shows.
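
As a minimal sketch of what this enables, here is the same mechanism in plain Python (the log group name is illustrative): the '()' key names the factory to call, and the remaining keys are passed to it as keyword arguments.

import logging.config

logging.config.dictConfig({
    "version": 1,
    "handlers": {
        "watchtower": {
            # "()" tells dictConfig to instantiate this factory, passing
            # the remaining keys as keyword arguments.
            "()": "watchtower.CloudWatchLogHandler",
            "log_group": "MyLogGroupName",  # illustrative name
        },
    },
    "root": {
        "level": "INFO",
        "handlers": ["watchtower"],
    },
})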

The following are two example YAML configuration files that can be loaded using PyYAML. The resulting dict object can then be passed to logging.config.dictConfig. The first is a basic example that relies on the default configuration provided by boto3:

# Default AWS Config
version: 1
formatters:
    json:
        format: "[%(asctime)s] %(process)d %(levelname)s %(name)s:%(funcName)s:%(lineno)s - %(message)s"
    plaintext:
        format: "[%(asctime)s] %(process)d %(levelname)s %(name)s:%(funcName)s:%(lineno)s - %(message)s"
handlers:
    console:
        (): logging.StreamHandler
        level: DEBUG
        formatter: plaintext
        stream: ext://sys.stdout
    watchtower:
        formatter: json
        level: DEBUG
        (): watchtower.CloudWatchLogHandler
        log_group: logger
        stream_name: loggable
        send_interval: 1
        create_log_group: False
loggers:
    boto:
        handlers: [console]
    boto3:
        handlers: [console]
    botocore:
        handlers: [console]
    requests:
        handlers: [console]
root:
    handlers: [console, watchtower]

The above works well if you can use the default configuration or rely on environment variables. Sometimes, however, you may want to use different credentials for logging than for the rest of your application; in that case, you can pass the boto3_profile_name option to Watchtower to name a credential profile:

# AWS Config Profile
version: 1
formatters:
    json:
        format: "[%(asctime)s] %(process)d %(levelname)s %(name)s:%(funcName)s:%(lineno)s - %(message)s"
    plaintext:
        format: "[%(asctime)s] %(process)d %(levelname)s %(name)s:%(funcName)s:%(lineno)s - %(message)s"
handlers:
    console:
        (): logging.StreamHandler
        level: DEBUG
        formatter: plaintext
        stream: ext://sys.stdout
    watchtower:
        formatter: json
        level: DEBUG
        (): watchtower.CloudWatchLogHandler
        log_group: logger
        stream_name: loggable
        boto3_profile_name: watchtowerlogger
        send_interval: 1
        create_log_group: False
loggers:
    boto:
        handlers: [console]
    boto3:
        handlers: [console]
    botocore:
        handlers: [console]
    requests:
        handlers: [console]
root:
    handlers: [console, watchtower]

For this more advanced configuration, the following profile, placed in your AWS configuration file (~/.aws/config), provides the matching credentials for the watchtowerlogger profile:

[profile watchtowerlogger]
aws_access_key_id=MyAwsAccessKey
aws_secret_access_key=MyAwsSecretAccessKey
region=us-east-1

Finally, the following shows how to load the configuration into the working application:

import logging.config

import flask
import yaml

app = flask.Flask("loggable")

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    with open('logging.yml', 'r') as log_config:
        config_yml = log_config.read()
        config_dict = yaml.safe_load(config_yml)
        logging.config.dictConfig(config_dict)
        app.run()

Authors

  • Andrey Kislyuk

Bugs

Please report bugs, issues, feature requests, etc. on GitHub.

License

Licensed under the terms of the Apache License, Version 2.0.


API documentation

class watchtower.CloudWatchLogHandler(log_group='watchtower', stream_name=None, use_queues=True, send_interval=60, max_batch_size=1048576, max_batch_count=10000, boto3_session=None, boto3_profile_name=None, create_log_group=True, log_group_retention_days=None, create_log_stream=True, json_serialize_default=None, max_message_size=262144, endpoint_url=None, *args, **kwargs)

Create a new CloudWatch log handler object. This is the main entry point to the functionality of the module. See http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatchLogs.html for more information.

Parameters
  • log_group (String) – Name of the CloudWatch log group to write logs to. By default, the name of this module is used.

  • stream_name (String) – Name of the CloudWatch log stream to write logs to. By default, the name of the logger that processed the message is used. Accepts a format string parameter of {logger_name}, as well as {strftime:%m-%d-%y}, where any strftime string can be used to include the current UTC datetime in the stream name (see the usage sketch after this parameter list).

  • use_queues – If True, logs will be queued on a per-stream basis and sent in batches. To manage the queues, a queue handler thread will be spawned.

  • send_interval (Integer) – Maximum time (in seconds, or a timedelta) to hold messages in queue before sending a batch.

  • max_batch_size (Integer) – Maximum size (in bytes) of the queue before sending a batch. From CloudWatch Logs documentation: The maximum batch size is 1,048,576 bytes, and this size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event.

  • max_batch_count (Integer) – Maximum number of messages in the queue before sending a batch. From CloudWatch Logs documentation: The maximum number of log events in a batch is 10,000.

  • boto3_session (boto3.session.Session) – Session object used to create boto3 logs clients. Accepts AWS credentials, a profile_name, and a region_name via its constructor.

  • create_log_group (Boolean) – Create CloudWatch Logs log group if it does not exist. True by default.

  • log_group_retention_days (Integer) – Sets the retention policy of the log group in days. None by default.

  • create_log_stream (Boolean) – Create CloudWatch Logs log stream if it does not exist. True by default.

  • json_serialize_default (Function) – The ‘default’ function to use when serializing dictionaries as JSON. Refer to the Python standard library documentation on ‘json’ for more explanation about the ‘default’ parameter. https://docs.python.org/3/library/json.html#json.dump https://docs.python.org/2/library/json.html#json.dump

  • max_message_size (Integer) – Maximum size (in bytes) of a single message.

  • endpoint_url (String) – The complete URL to use for the constructed client. Normally, botocore will automatically construct the appropriate URL to use when communicating with a service. You can specify a complete URL (including the “http/https” scheme) to override this behavior.
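
A short usage sketch combining several of these parameters (the group name, stream name pattern, and serializer below are illustrative choices, not defaults):

import logging

import watchtower

handler = watchtower.CloudWatchLogHandler(
    log_group="my-app",                               # illustrative name
    stream_name="{logger_name}-{strftime:%Y-%m-%d}",  # one stream per logger per day
    send_interval=15,                                 # flush every 15 seconds
    json_serialize_default=str,                       # fall back to str() for non-JSON-serializable objects
)

logging.getLogger("my-app").addHandler(handler)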

close()

Send any messages remaining in the queues to CloudWatch Logs, stop the queue worker threads, and tidy up any resources used by the handler.

emit(message)

Log the specified logging record. If use_queues is True (the default), the formatted message is placed on a per-stream queue and sent in the next batch; otherwise it is submitted to CloudWatch Logs immediately.

flush()

Ensure all logging output has been flushed: send any messages still held in the queues to CloudWatch Logs before returning.

exception watchtower.WatchtowerWarning
