11 November 2019

Grokking PHP monolog context into Elastic

An indexed and searchable centralized log is one of those tools that once you've had it you'll wonder how you managed without it. 
I've experienced a couple of advantages to using a central log - debugging, monitoring performance, and catching unknown problems.


Debugging becomes easier because instead of poking around grepping text logs on servers you're able to use a GUI to contrast and compare values between different time ranges.

A ticket will often include sparse information about the problem and observed error, but if you know more or less when a problem occurred then you can check the logs of all your systems at that time.

Problem behaviour in your application can occur as a result of the services you depend on.  A database fault will produce errors in your application, for example.

If you log your database errors and your application errors in the same central platform then it's much more convenient to compare behaviour between the systems.

I've often found that knowing the application context makes it a lot easier to debug a problem.  I can also more easily reason about the path my code would have taken.  If I'm feeling energetic I can make an attempt to reproduce it and play it back in slow motion with my debugger.

Monitoring performance

Indexing your logs lets you use a query language to search them.  All of the popular logging platforms include a DSL that lets you perform aggregations and other statistical functions on values contained in your logs.

This makes it a lot easier for your team to ask and answer questions about historical behaviour.

You can very easily ask "find me a list of all endpoints that take longer than 0.5 seconds to respond", or "find me all the requests that took longer than two standard deviations above mean to respond to".

Central logging doesn't enable you to answer questions, you could already do that manually.  Central logging makes it easy to answer questions, which encourages you to ask more of them.

Being able to query patterns makes it easier to identify client use patterns that are harmful to your server.  This lets you mitigate them on either the server or the client.

You could notice, for example, that a client is not respecting your cache control headers and is performing a repeat call to the API when it should be relying on a local cache.  This lets you engage the client developers with alternatives and SDK's to help them take advantage of your caching strategy.

Managing verbosity and unknowns

Anomaly detection is a powerful addition to central logging.  Logging is great for analyzing historical performance, but is too verbose to be useful for monitoring in any level of detail. 

Machine learning algorithms can analyze your logs and find outliers in message frequency or values.  They can do this quite cheaply in real time after an initial learning process.  Your central logs can become an early warning system for problems that your customers haven't reported yet.

I'm a fan of Unomaly, but there are lots of other options out there, Splunk and Elastic both include machine learning options for example.

A PHP implementation

So now that you're convinced that central logging will improve your debugging, allow you to identify performance bottlenecks, and make auditing more convenient what is the next step in your PHP project?

Monolog is the de facto standard PHP logging library. Part of its fancy features include the ability to log a contextual array of information, which is provided as a second parameter, like this:

$log->info("Example of something being logged", $visitDetails);

This results in the array being output as a JSON string at the end of your log message and can be very useful indeed.

Some examples of the useful information you can put in the context could be a request id that lets you group log entries from any particular request.  You'd assign a request a unique id when you first start processing it, and then consistently include this id in your logs.

Monolog lets you push logs straight into Elastic, but if we push our logs to a local logstash instance then we save ourselves some network I/O and allow for some buffering in the event that Elastic is down.

Logstash also lets you grok your logs and massage them into a form that Elastic can understand.

Actually writing the grok pattern is a lot like writing a regex pattern, but luckily there are some tools that will greatly speed this up for you.

Take a look at:
  1. https://grokdebug.herokuapp.com/
  2. https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
  3. https://github.com/kkos/oniguruma/blob/master/doc/RE
Oniguruma lets you use regex in the Grok pattern, like this: (?<field_name>the pattern here)

For my out of the box installation of Monolog the logstash configuration looked like this:

The aim here is to use grok to tokenize your log line into fields. You can then use the json filter to further expand any json fields that you've grokked.

In my example the field "context" is grokked from the context that can be added as "extra data". See the Monolog manual for details.

I was able to get Monolog to push logs to Logstash by using a SockerHandler, like this:

When you mix this up with AWS managed Elasticsearch (and their plugin) you can very quickly get up and running with Kibana.

Example repository is at https://github.com/andybeak/monolog-elk-demo