Python's logging library has been part of the standard library for a very long time. In many respects, this logging framework bears similarities to the log4j libraries in Java. In this blog post, I will go over some good practices to follow when using Python's logging module.
Avoid using root logger
Avoiding root logger provides the following benefits:
- More descriptive names showing where a log originated, which helps when searching logs
- Better per-module log customization
It is tempting to start logging with the module-level logging.info/logging.error functions. However, calling any of these functions on the logging module invokes the root logger. Take a look at the examples below involving two modules: logger_usage.py and user_record.py.
logger_usage.py file:
import logging
import user_record
logging.basicConfig(level=logging.DEBUG)
logging.info("Hello World from root logger")
logging.error("failed to process\n")
user_record.process(-100)
user_record.process(100)
user_record.py file:
import logging

def process(user_id):
    logging.info("processing user record = %s", user_id)
    if user_id < 0:
        logging.error("Invalid user id = %s", user_id)
After invoking the logger_usage.py script, we get the following output:
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
INFO:root:processing user record = -100
ERROR:root:Invalid user id = -100
INFO:root:processing user record = 100
If you observe the output carefully, the logs from both modules contain the entry root (which is the root logger's name).
If you change the user_record.py to the following:
import logging
# You could use any name; however, it is good convention to use the module name.
# Here __name__ is the module name (as this is not invoked as the startup script).
# https://docs.python.org/3/reference/import.html?highlight=__name__#name__
logger = logging.getLogger(__name__)

def process(user_id):
    logger.info("processing user record = %s", user_id)
    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)
After making this change, notice the difference in the output: instead of root, logs originating from the user_record module now show user_record.
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
INFO:user_record:processing user record = -100
ERROR:user_record:Invalid user id = -100
INFO:user_record:processing user record = 100
Now, let's say in the future you realize that there are too many "processing user record" logs, and you only want error or higher-priority logs from the user_record module. You can achieve this by customizing the logger in user_record.py:
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)
def process(user_id):
    logger.info("processing user record = %s", user_id)
    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)
Running the script again will output the following result. Note how the info logs are missing from user_record module.
python logger_usage.py
INFO:root:Hello World from root logger
ERROR:root:failed to process
ERROR:user_record:Invalid user id = -100
This is just the tip of the iceberg of what we can achieve with a custom logger. Although this is a basic example of customizing logging, in most cases one would use a centralized logging config.
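To illustrate what a centralized config might look like, here is a minimal sketch using the standard library's logging.config.dictConfig. The logger name "user_record" and the chosen levels mirror the example above; in a real application the dictionary would typically live in a settings module or a YAML/JSON file.

```python
import logging
import logging.config

# Minimal illustrative config: one console handler on the root logger,
# plus a per-module override that raises user_record's threshold to ERROR.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {"format": "%(levelname)s:%(name)s:%(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "default",
        },
    },
    "root": {"level": "INFO", "handlers": ["console"]},
    "loggers": {
        # Silence the chatty module in one central place.
        "user_record": {"level": "ERROR"},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)

logging.getLogger("user_record").info("suppressed")  # below ERROR, not emitted
logging.getLogger("user_record").error("shown")      # emitted
```

The advantage over calling setLevel inside each module is that all level decisions live in one place and can be changed without touching application code.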
Use %s string formatting instead of constructing a string
Using %s string formatting is more efficient when the log is never emitted, since building the string eagerly creates unnecessary objects that put pressure on the garbage collector.
Consider the following different ways to format a log string containing a user_id:
logger.info("processing user record id = %s", user_id) - No extra string is created, as this uses delayed evaluation: the final string is only constructed if the corresponding level is enabled.
logger.info(f"processing user record id = {user_id}")- An extra string is constructed and then discarded.
logger.info("processing user record id={user_id}".format(user_id=user_id)) - Same as above: an extra string is constructed and then discarded.
Use isEnabledFor method to avoid expensive computations
Consider a scenario where you want to output some debugging information, and computing or fetching that information takes a lot of time.
prev_config = fetch_previous_state(new_config.id) # Makes a db call
new_formatted_config = format_config_as_json(new_config) # formats as JSON (CPU-bound)
prev_formatted_config = format_config_as_json(prev_config)
logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)
In the above scenario, the log line is never emitted if the logging level is above DEBUG. In that case, the time spent on the call to fetch_previous_state and the two calls to format_config_as_json is wasted.
A better approach is to use isEnabledFor:
if logger.isEnabledFor(logging.DEBUG):
    prev_config = fetch_previous_state(new_config.id)
    new_formatted_config = format_config_as_json(new_config)
    prev_formatted_config = format_config_as_json(prev_config)
    logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)
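An alternative to the isEnabledFor guard is to combine it with the %s technique from the previous section: wrap the expensive work in an object whose __str__ performs the computation on demand. The sketch below uses a stub fetch_previous_state standing in for the db call from the example (the counter and class name are invented for this demonstration).

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("config_demo")

calls = {"fetch": 0}

def fetch_previous_state(config_id):
    # Stand-in for the expensive db call in the example above.
    calls["fetch"] += 1
    return {"id": config_id, "retries": 3}

class LazyPrevConfig:
    """Defers the expensive fetch until the record is actually formatted."""
    def __init__(self, config_id):
        self.config_id = config_id

    def __str__(self):
        return json.dumps(fetch_previous_state(self.config_id))

# DEBUG is disabled, so __str__ never runs and no db call is made.
logger.debug("old_config=%s", LazyPrevConfig(7))
assert calls["fetch"] == 0
```

This keeps the call site a single line while still skipping all of the expensive work when DEBUG is off; the explicit isEnabledFor block remains clearer when several intermediate variables are involved.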