Hopefully, everyone agrees that you should have some form of logging in your production systems. I would also hope that nowadays everyone agrees that using some kind of centralised logging system that you can access without SSHing onto production boxes is the right way to do things.
But what should we log?
Typically we log errors and exceptions because we want to see when things go wrong. With this logging in place, when we see a 500 error on the server we can simply check the logs to see what the problem is.
We can also monitor our logs and detect an increase in errors - this can alert us to problems before they have a large impact on users.
To me, this is the bare minimum of logging - errors and exceptions.
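As a rough illustration of that bare minimum, here is a minimal Python sketch using the standard `logging` module. The logger name `orders` and the `handle_request` wrapper are hypothetical, invented for this example; a real web framework would supply its own request handling.

```python
import logging

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("orders")


def handle_request(handler, request_id):
    """Run a handler; on failure, log the exception and return a 500."""
    try:
        return 200, handler()
    except Exception:
        # logger.exception logs at ERROR level and includes the traceback,
        # so the log entry shows where the failure happened.
        logger.exception("Unhandled error while handling request %s", request_id)
        return 500, None
```

Including an identifier such as the request id in the message makes it easier to match a reported 500 error to the log entry that explains it.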
What we also need are logs that follow the happy path of the code: we should be able to see that the code is flowing in the way we expect it to.
Why do we need this? What do we do when a user reports a problem?
We go and check the logs to see if something is broken, but there are no errors or exceptions. What do we do now?
We need to be able to follow the code through its path of execution and see where it deviates from what we expect:
The user says: “emails aren’t being sent”.
We check the logs: “I can see the system was told to send an email, I can see it generated the template, but I don’t see it sending the email - there’s something wrong in the code after template generation.”
This is an invaluable tool in our debugging arsenal, saving us a huge amount of troubleshooting time.
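The email example could be instrumented along these lines. This is only a sketch in Python: `render_template`, `deliver`, and `send_welcome_email` are hypothetical stand-ins rather than any real library's API, and the point is simply that each step of the happy path leaves its own log line.

```python
import logging

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("email")


def render_template(template_name, context):
    # Stand-in for a real template engine.
    return f"[{template_name}] Hello {context['name']}"


def deliver(recipient, body):
    # Stand-in for a real SMTP or email API call.
    return True


def send_welcome_email(recipient, name):
    """Send a welcome email, logging each step of the happy path."""
    logger.info("Email requested: template=welcome recipient=%s", recipient)
    body = render_template("welcome", {"name": name})
    logger.info("Template generated: template=welcome length=%d", len(body))
    deliver(recipient, body)
    logger.info("Email sent: recipient=%s", recipient)
    return body


if __name__ == "__main__":
    # INFO must be enabled, otherwise the happy-path lines are dropped.
    logging.basicConfig(level=logging.INFO)
    send_welcome_email("user@example.com", "Ada")
```

If the "Template generated" line appears in the logs but "Email sent" does not, the search narrows to the delivery step without anyone having to reproduce the problem first.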