
Python Logging Good Practices

Python's logging library has been part of the standard library for a very long time. In many respects, this framework bears similarities to the log4j library in Java. In this blog post, I will go over some good practices for using Python's logging module.

Avoid using root logger

Avoiding root logger provides the following benefits:

  1. More descriptive names indicating where a log originated, which helps when searching logs
  2. Finer-grained log customization per module

It is tempting to start logging with the module-level logging.info/logging.error functions. However, calling any of these functions on the logging module itself invokes the root logger. Take a look at the examples below involving two modules: logger_usage.py and user_record.py.

logger_usage.py file:

import logging
import user_record

logging.basicConfig(level=logging.DEBUG)

logging.info("Hello World from root logger")

logging.error("failed to process\n")

user_record.process(-100)
user_record.process(100)

user_record.py file:

import logging

def process(user_id):
    logging.info("processing user record = %s", user_id)

    if user_id < 0:
        logging.error("Invalid user id = %s", user_id)

After running the logger_usage.py script, we get the following output:

python logger_usage.py 
INFO:root:Hello World from root logger
ERROR:root:failed to process

INFO:root:processing user record = -100
ERROR:root:Invalid user id = -100
INFO:root:processing user record = 100

If you observe the output carefully, the logs from both modules contain the entry root, which refers to the root logger's name.

If you change the user_record.py to the following:

import logging

# You could use any name, but it is good convention to use the module name.
# Here __name__ is the module name (as this file is not the startup script).
# https://docs.python.org/3/reference/import.html?highlight=__name__#name__
logger = logging.getLogger(__name__)

def process(user_id):
    logger.info("processing user record = %s", user_id)

    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)

After making this change, notice the difference in the output: instead of root, we get user_record in the logs originating from the user_record module.

python logger_usage.py 
INFO:root:Hello World from root logger
ERROR:root:failed to process

INFO:user_record:processing user record = -100
ERROR:user_record:Invalid user id = -100
INFO:user_record:processing user record = 100

Now, let’s say in the future you realize that there are too many “processing user record” logs, and you only want error or higher-priority logs from the user_record module. You can achieve this by customizing the logger in user_record.py:

import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)

def process(user_id):
    logger.info("processing user record = %s", user_id)

    if user_id < 0:
        logger.error("Invalid user id = %s", user_id)

Running the script again will output the following result. Note how the info logs from the user_record module are missing.

python logger_usage.py 
INFO:root:Hello World from root logger
ERROR:root:failed to process

ERROR:user_record:Invalid user id = -100

This is just the tip of the iceberg of what we can achieve with a custom logger. Although this is a basic example of logger customization, in most cases one would use a centralized logging config.
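As a sketch of what such a centralized config might look like, logging.config.dictConfig lets you set levels and formats for named loggers in one place instead of scattering setLevel calls across modules. The handler and formatter names below are illustrative, not from the post:

```python
import logging
import logging.config

# A minimal centralized config: everything at INFO, except the
# user_record logger, which only emits ERROR and above.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {"format": "%(levelname)s:%(name)s:%(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "default",
        },
    },
    "loggers": {
        "user_record": {"level": "ERROR"},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING_CONFIG)
```

With this in the startup script, user_record.py only needs `logger = logging.getLogger(__name__)`; the level policy lives in one place.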

Use %s string formatting instead of constructing a string

%s-style formatting is more efficient when the log record is never emitted, because the final string is only built if it is needed. Constructing strings that are immediately discarded puts unnecessary pressure on the garbage collector.

Consider the following different ways to format a log string containing a user_id:

  1. logger.info("processing user record id =%s", user_id)
    • No extra string is created: evaluation is delayed, and the final string is only constructed if the corresponding level is enabled.
  2. logger.info(f"processing user record id = {user_id}")
    • An extra string is constructed and then discarded.
  3. logger.info("processing user record id={user_id}".format(user_id=user_id))
    • Same here: an extra string is constructed and then discarded.

Use isEnabledFor method to avoid expensive computations

Consider a scenario where you want to output some debugging information, and computing or fetching that information takes a lot of time.

prev_config = fetch_previous_state(new_config.id)  # makes a DB call
new_formatted_config = format_config_as_json(new_config)  # formats as JSON (CPU-bound)
prev_formatted_config = format_config_as_json(prev_config)

logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)

In the above scenario, the log line is never emitted if the logging level is above DEBUG. In that case, the time spent on the call to fetch_previous_state and the two calls to format_config_as_json is wasted.

A better way to handle this is to use isEnabledFor:

if logger.isEnabledFor(logging.DEBUG):
    prev_config = fetch_previous_state(new_config.id)
    new_formatted_config = format_config_as_json(new_config)
    prev_formatted_config = format_config_as_json(prev_config)
    logger.debug("new_config=%s old_config=%s", new_formatted_config, prev_formatted_config)
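To see that the guard really skips the expensive work, here is a self-contained sketch with a counting stand-in for the DB call (this zero-argument fetch_previous_state is a hypothetical simplification of the function above):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("demo")

calls = {"fetch": 0}

def fetch_previous_state():
    # Stand-in for an expensive DB call; counts its invocations.
    calls["fetch"] += 1
    return {"id": 1}

# The effective level is INFO, so the guard is False and the
# body (including the DB call) is never executed.
if logger.isEnabledFor(logging.DEBUG):
    prev_config = fetch_previous_state()
    logger.debug("old_config=%s", prev_config)

assert calls["fetch"] == 0
```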

Real Example in GitHub

This post is licensed under CC BY 4.0 by the author.