Using Logstash and ElasticSearch to Process Eliot Logs

ElasticSearch is a search and analytics engine which can be used to store Eliot logging output. The logs can then be browsed by humans using the Kibana web UI, or on the command-line using the logstash-cli tool. Automated systems can access the logs using the ElasticSearch query API. Logstash is a log processing tool that can be used to load Eliot log files into ElasticSearch. The combination of ElasticSearch, Logstash, and Kibana is sometimes referred to as ELK.

Example Logstash Configuration

Each Eliot message must be written out as a JSON object on its own line, which is exactly what eliot.to_file() and eliot.logwriter.ThreadedFileWriter produce. For example, a minimal sketch (the log file name is an arbitrary choice):
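
from eliot import Message, to_file

# Each message is serialized as one JSON object per line; Eliot adds the
# timestamp, task_uuid and task_level fields the configuration below uses:
to_file(open("example-eliot.log", "w"))

Message.log(message_type="example", key="value")

The following Logstash configuration will then load these log messages into an in-process ElasticSearch database: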

logstash_standalone.conf

input {
  stdin {
    codec => json_lines {
      charset => "UTF-8"
    }
  }
}

filter {
  date {
    # Parse the Eliot timestamp field into the special @timestamp field
    # Logstash expects:
    match => [ "timestamp", "UNIX" ]
    target => "@timestamp"
  }
}

output {
  # Stdout output for debugging:
  stdout {
    codec => rubydebug
  }

  elasticsearch {
    # Documents in ElasticSearch are identified by tuples of (index, mapping
    # type, document_id).
    # References:
    # - http://logstash.net/docs/1.3.2/outputs/elasticsearch
    # - http://stackoverflow.com/questions/15025876/what-is-an-index-in-elasticsearch

    # We make the document id unique (for a specific index/mapping type pair) by
    # using the relevant Eliot fields. This means replaying messages will not
    # result in duplicates, as long as the replayed messages end up in the same
    # index (see below).
    document_id => "%{task_uuid}_%{task_level}"

    # By default logstash sets the index to include the current date. When we
    # get to the point of replaying log files on startup for crash recovery we
    # might want to use the last modified date of the file instead of the
    # current date; otherwise documents will end up in the wrong index.

    #index => "logstash-%{+YYYY.MM.dd}"

    index_type => "Eliot"

    # In a centralized ElasticSearch setup we'd be specifying host/port
    # or some such. In this setup we run it ourselves:
    embedded => true
  }
}
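
To make the deduplication comment above concrete, here is a sketch of how the document id is derived from an Eliot message. It assumes Logstash's %{} interpolation joins list-valued fields such as task_level with commas; the sample message is abbreviated, real ones carry more fields:

import json

def document_id(message):
    # Mirror the "%{task_uuid}_%{task_level}" template from the config:
    task_level = message["task_level"]
    if isinstance(task_level, list):
        # Assumption: Logstash renders an array field as comma-joined values.
        task_level = ",".join(str(level) for level in task_level)
    return "%s_%s" % (message["task_uuid"], task_level)

line = '{"task_uuid": "abc-123", "task_level": [2, 1], "timestamp": 1400000000.0}'
print(document_id(json.loads(line)))  # prints: abc-123_2,1

Because the id is deterministic, replaying the same messages overwrites existing documents instead of duplicating them.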

We can then pipe JSON messages from Eliot into ElasticSearch using Logstash:

$ python examples/stdout.py | logstash web -- agent --config logstash_standalone.conf

You can then use the Kibana UI to search and browse the logs by visiting http://localhost:9292/.
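
Automated systems can query the same data through the standard ElasticSearch HTTP API. A sketch, assuming the embedded ElasticSearch instance listens on the default port 9200 and the requests library is installed (the task_uuid value is a placeholder):

import json
import requests

TASK_UUID = "4c05e316-1327-4021-b2a4-2b6d7e3f0cfa"  # placeholder; use a real one

# Fetch all messages belonging to one Eliot task, oldest first:
query = {
    "query": {"match": {"task_uuid": TASK_UUID}},
    "sort": [{"@timestamp": {"order": "asc"}}],
}
response = requests.get("http://localhost:9200/_search", data=json.dumps(query))
for hit in response.json()["hits"]["hits"]:
    print(hit["_source"])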