Python Guide to Debugging and Profiling
Debugging and profiling are essential components of software development. They allow you to identify and correct errors and optimise code performance. In this guide, we will explore techniques and tools for debugging and profiling Python programs.
What is the difference between debugging and profiling?
Debugging is locating and correcting errors or bugs in a program’s source code, while profiling focuses on analysing program performance to identify bottlenecks and areas requiring optimisation.
The main aspects analysed during profiling are execution time (which functions consume the most CPU), memory usage, how often each function is called, and the overhead introduced by inefficient constructs.

A typical example of overhead is the excessive use of nested loops to perform operations that could be optimised:
# List of numbers
numbers = [1, 2, 3, 4, 5]

# Calculate the sum of the squares of numbers greater than 2
sum_squares = 0
for number in numbers:
    if number > 2:
        sum_squares += number ** 2

print(sum_squares)
Although this example is simple, this type of operation can generate significant overhead in cases involving more extensive lists or more complex conditions. We can use a generator expression with the sum() function to optimise the code and eliminate this overhead. This allows us to calculate the sum of the squares of numbers greater than 2 more efficiently:
numbers = [1, 2, 3, 4, 5]

# Calculate the sum of the squares of numbers greater than 2
sum_squares = sum(number ** 2 for number in numbers if number > 2)
print(sum_squares)
Generator expression: the expression (number ** 2 for number in numbers if number > 2) generates the squares of the numbers in the list numbers greater than 2. Unlike a list comprehension, the generator expression does not create a list in memory. Instead, it generates the values on the fly, which improves efficiency in terms of memory usage.
sum() function: calculates the sum of the values produced by the generator expression. Since sum() is a built-in implemented in C, its execution is much faster and lighter than an equivalent Python-level loop.
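To check the claim on your own machine, the standard timeit module can time both versions. This is a minimal sketch; the list is enlarged to 1,000 elements (an assumption, to make the difference measurable), and the absolute numbers will vary:

import timeit

setup = "numbers = list(range(1000))"

loop_version = """
sum_squares = 0
for number in numbers:
    if number > 2:
        sum_squares += number ** 2
"""

gen_version = "sum_squares = sum(number ** 2 for number in numbers if number > 2)"

# Time 10,000 runs of each version with the same input list
print(timeit.timeit(loop_version, setup=setup, number=10_000))
print(timeit.timeit(gen_version, setup=setup, number=10_000))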
A memory leak is a software error that occurs when a program reserves blocks of memory for use but fails to release them after they are no longer needed. As a result, the allocated memory remains occupied and cannot be reused by the system, leading to a gradual depletion of available memory.
Efficient memory management is essential for application performance and stability. Although Python has a garbage collector that automatically handles memory release, memory leaks can still occur if objects are not released properly.

(Figure: heap-usage graph over time; the leak begins several minutes after the program starts, and memory consumption climbs steadily from that point.)
Python's reference implementation, CPython, is written in C, so Python inherits the memory organisation typical of C programs. When a C program starts, its memory is organised into several sections, each with a specific purpose:

Text (code) segment: the compiled instructions of the program.
Data segment: global and static variables initialised at compile time.
BSS segment: global and static variables that start out uninitialised or zeroed.
Heap: dynamically allocated memory, which typically grows upwards.
Stack: function call frames and local variables, which typically grows downwards, towards the heap.
(Figure: block diagram of the memory layout described above.)
With this in mind, we can see that the heap and stack could collide if either grows too large. This is a real risk in embedded systems (microcontrollers such as Arduino and ESP32), where memory is scarce and there is no operating system to keep the two regions apart.
In modern languages running on current operating systems, attempting to expand the stack beyond its capacity results in a stack overflow error, while exhausting the heap causes memory allocation functions to fail. However, not all software is modern, so it is crucial to analyse the possible failure modes.
It’s important to consider that in multithreaded systems, there is one stack per thread, and the entire responsibility for memory management shouldn’t be offloaded to the guard pages. Programming practices that control memory usage and prevent overflows are essential rather than relying solely on the protections provided by the operating system.
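In CPython specifically, runaway recursion is caught by the interpreter's own recursion limit long before the C stack is exhausted, so the failure mode is a catchable RecursionError rather than a hard crash. A minimal sketch:

import sys

def recurse(n):
    # Recurses forever; CPython aborts it at the recursion limit
    return recurse(n + 1)

print(sys.getrecursionlimit())  # Typically 1000
try:
    recurse(0)
except RecursionError as exc:
    print(f'Stack protection kicked in: {exc}')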
Example of a Memory Leak:
Suppose we’re developing an application that manages user sessions. Each session maintains references to user objects, and these user objects, in turn, keep references to their respective sessions. We also use a global list to store all active sessions.
class User:
    def __init__(self, name):
        self.name = name
        self.session = None

class Session:
    def __init__(self, user):
        self.user = user
        user.session = self

# Global list storing all active sessions
# (Disclaimer: use of globals is generally discouraged)
active_sessions = []

def log_in(username):
    user = User(username)
    session = Session(user)
    active_sessions.append(session)

def log_out(username):
    global active_sessions
    active_sessions = [session for session in active_sessions
                       if session.user.name != username]
Let's analyse the code:

Circular references: each Session holds a strong reference to its User, and each User holds a strong reference back to its Session, forming a reference cycle that plain reference counting cannot free.

Global list: active_sessions keeps a strong reference to every session that logs in, so sessions remain reachable until they are explicitly removed.
Performance Impact: over time, if numerous sessions are created and not properly managed, the memory used by circular references and sessions stored in the global list will not be freed, causing a constant increase in memory consumption and potentially leading to performance degradation or application failure.
The proposed solutions would be:
Break circular references: before deleting a session, it is advisable to explicitly break circular references by setting the references to None:
def log_out(username):
    global active_sessions
    for session in active_sessions:
        if session.user is not None and session.user.name == username:
            # Break the circular reference before discarding the session
            session.user.session = None
            session.user = None
    active_sessions = [session for session in active_sessions
                       if session.user is not None]
Use weakref for weak references: the weakref module allows you to create weak references that do not prevent the garbage collector from deleting objects. By using weak references, you can avoid problematic circular references:
import weakref

class User:
    def __init__(self, name):
        self.name = name
        self.session = None

class Session:
    def __init__(self, user):
        # Weak references do not keep their target alive,
        # so the cycle no longer blocks garbage collection
        self.user = weakref.ref(user)
        user.session = weakref.ref(self)
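Note that a weakref.ref must be called to obtain its referent, and it returns None once the object has been garbage-collected. A minimal, self-contained sketch using the weakref-based classes above (the name 'alice' is illustrative):

import gc

user = User('alice')
session = Session(user)

print(session.user())   # The User object, while a strong reference exists
del user                # Drop the only strong reference to the User
gc.collect()            # Not strictly needed in CPython here, but explicit
print(session.user())   # None: the weak reference no longer resolves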
Properly manage the global list: ensure that closed sessions are entirely removed from active_sessions and that no lingering references remain to objects that are no longer needed.
The pdb (Python Debugger) module is the standard Python debugger that allows developers to run programs interactively to identify and fix bugs. You can use it in several ways:
From the command line: by running the script with the -m pdb argument, which starts the program under pdb control from the beginning:
python -m pdb my_script.py
By inserting pdb.set_trace() into your code: including import pdb; pdb.set_trace() at the point where you want to start debugging stops the program at that line and opens an interactive pdb session:
import pdb

def function():
    variable = 'value'
    pdb.set_trace()  # Start debugging here
    # Code to debug
Using the breakpoint() function: starting with Python 3.7, the built-in function breakpoint() was introduced, which acts as an alias for pdb.set_trace(), making it easy to insert breakpoints without having to import pdb explicitly:
def function():
    variable = 'value'
    breakpoint()  # Start debugging here
    # Code to debug
Running the program and reaching the breakpoint will open the pdb prompt, indicated by (Pdb), from where you can issue commands to control execution and analyse the program's status. Some of the most commonly used commands include:

l (list): show the source code around the current line.
n (next): execute the current line without stepping into function calls.
s (step): execute the current line, stepping into function calls.
c (continue): resume execution until the next breakpoint.
b (break): set a breakpoint at a given line or function.
p expression: evaluate and print an expression in the current context.
w (where): print the current stack trace.
q (quit): abort the program and exit the debugger.
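pdb can also be attached after the fact: post-mortem debugging opens the debugger on the stack of an exception that has already been raised. A minimal sketch:

import pdb

def buggy():
    return 1 / 0

try:
    buggy()
except ZeroDivisionError:
    pdb.post_mortem()  # Inspect the frame where the division failed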
cProfile is a built-in module that provides deterministic program profiling, allowing you to analyse the execution time of each function and how often it is called. By instrumenting your code, cProfile collects statistics on the number of calls and time consumed by each function, making it easier to identify areas that require optimisation.
To profile an entire script using cProfile, you can run the script from the command line as follows:
python -m cProfile my_script.py
This command will run my_script.py under cProfile control and display a summary of the profiling statistics to standard output.
If you want to save the results to a file for later analysis, you can use the -o option:
python -m cProfile -o results.prof my_script.py
This will generate a file named results.prof containing the profiling statistics.
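A saved profile can then be explored with the standard pstats module. A minimal sketch, assuming the results.prof file generated by the previous command:

import pstats
from pstats import SortKey

stats = pstats.Stats('results.prof')
stats.strip_dirs()                    # Drop long directory prefixes
stats.sort_stats(SortKey.CUMULATIVE)  # Sort by cumulative time
stats.print_stats(10)                 # Print the 10 most expensive entries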
Running cProfile generates a table with several columns that provide information about the performance of each function:

ncalls: the number of calls made to the function.
tottime: the total time spent in the function itself, excluding sub-calls.
percall: tottime divided by ncalls.
cumtime: the cumulative time spent in the function and everything it calls.
percall: cumtime divided by the number of primitive calls.
filename:lineno(function): where the function is defined.
These metrics help you identify which functions take the most time and could be candidates for optimisation. To profile a specific section of code instead of the entire script, you can use cProfile within the code itself:
import cProfile

def function_to_profile():
    # Code to analyze
    pass

if __name__ == '__main__':
    profiler = cProfile.Profile()
    profiler.enable()
    function_to_profile()
    profiler.disable()
    profiler.print_stats()
This approach allows you to focus the analysis on specific functions or blocks of code, facilitating more targeted optimisation.
Visualisation tools such as SnakeViz, gprof2dot, or kcachegrind can help you interpret cProfile results more intuitively.
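For example, SnakeViz (assuming it is installed) serves an interactive browser view of the results.prof file generated earlier:

pip install snakeviz
snakeviz results.prof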
The tracemalloc module, introduced in Python 3.4, is a built-in tool for tracing memory allocations made by a Python program. It allows you to take “snapshots” of the memory state at different points in time and compare them to analyse how memory usage changes over time.
To use tracemalloc, follow these steps:
import tracemalloc

# Start the memory trace
tracemalloc.start()

# Code whose memory allocation you want to trace
...

# Take a snapshot of the current memory state
snapshot = tracemalloc.take_snapshot()

# Analyze the snapshot statistics
top_stats = snapshot.statistics('lineno')

print("[Top 10 Memory-Using Lines]")
for stat in top_stats[:10]:
    print(stat)

# Stop the memory trace (optional)
tracemalloc.stop()
By analysing the statistics provided by tracemalloc, you can gain insight into which lines of code are consuming the most memory:
[Top 10 lines that consume the most memory]
<frozen importlib._bootstrap>:716: size=4855 KiB, count=39328, average=126 B
<frozen importlib._bootstrap>:284: size=521 KiB, count=3199, average=167 B
/usr/lib/python3.4/collections/__init__.py:368: size=244 KiB, count=2315, average=108 B
/usr/lib/python3.4/unittest/case.py:381: size=185 KiB, count=779, average=243 B
/usr/lib/python3.4/unittest/case.py:402: size=154 KiB, count=378, average=416 B
/usr/lib/python3.4/abc.py:133: size=88.7 KiB, count=347, average=262 B
<frozen importlib._bootstrap>:1446: size=70.4 KiB, count=911, average=79 B
<frozen importlib._bootstrap>:1454: size=52.0 KiB, count=25, average=2131 B
<string>:5: size=49.7 KiB, count=148, average=344 B
/usr/lib/python3.4/sysconfig.py:411: size=48.0 KiB, count=1, average=48.0 KiB
Each entry in the statistics includes:

the file name and line number where the allocations were made;
size: the total memory currently allocated from that line;
count: the number of memory blocks allocated;
average: the average size of those blocks.
A vital feature of tracemalloc is comparing two memory snapshots to identify allocation differences. This is useful for detecting memory leaks by observing which objects remain in memory between two points in time. Example:
import tracemalloc

tracemalloc.start()

# Code before the possible memory leak
snapshot1 = tracemalloc.take_snapshot()

# Code that could be causing a memory leak
...

# Code after the possible memory leak
snapshot2 = tracemalloc.take_snapshot()

# Compare the two snapshots
top_stats = snapshot2.compare_to(snapshot1, 'lineno')

print("[Main differences in memory usage]")
for stat in top_stats[:10]:
    print(stat)
Although memory_profiler is no longer actively maintained, it still works well, and there's no reason not to use it. It allows you to monitor and analyse a program's memory usage, providing detailed information line by line. This makes it easier to identify bottlenecks and optimise memory usage. To install memory_profiler, use pip:
pip install memory-profiler
It’s also recommended to install matplotlib if you want to visualise the memory usage graphically:
pip install matplotlib
memory_profiler allows you to analyse specific functions using the @profile decorator. When applied to a function, you can obtain a detailed report of memory usage line by line. Example:
from memory_profiler import profile

@profile
def process_data():
    data = [1] * (10**6)
    result = [x * 2 for x in data]
    del data
    return result

if __name__ == '__main__':
    process_data()
To run the script and get the memory profile:
python -m memory_profiler my_script.py
The output will show the memory usage before and after each line of the decorated function, indicating increases and decreases in consumption.
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     3     18.9 MiB     18.9 MiB            1   @profile
     4                                          def process_data():
     5     26.6 MiB      7.6 MiB            1       data = [1] * (10**6)
     6     34.1 MiB      7.5 MiB      1000003       result = [x * 2 for x in data]
     7     26.5 MiB     -7.6 MiB            1       del data
     8     26.5 MiB      0.0 MiB            1       return result
memory_profiler includes the mprof tool, which allows you to log and view memory usage from a script over time. Run the script with mprof to record memory usage:
mprof run my_script.py
Generate a graph of memory usage:
mprof plot
This will produce a graph showing how memory usage varies during the script's execution, making it easier to identify potential memory leaks or sections of code that require optimisation.
memory_profiler offers the memory_usage function to measure the memory usage of specific functions, allowing for a more focused analysis:
from memory_profiler import memory_usage

def my_function():
    data = [1] * (10**6)
    return data

# Avoid naming the result 'memory_usage': that would shadow the import
usage = memory_usage(my_function)
print(f'Memory usage: {usage} MiB')
This approach is helpful for quickly measuring the memory usage of individual tasks without having to decorate or modify the original code.
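memory_usage can also take a (function, args, kwargs) tuple plus a sampling interval, which is handy when the target function needs arguments. A minimal sketch, where build_list and its size parameter are hypothetical:

from memory_profiler import memory_usage

def build_list(size):
    return [0] * size

# Sample memory roughly every 0.1 s while build_list(10**7) runs
samples = memory_usage((build_list, (10**7,), {}), interval=0.1)
print(f'Peak memory usage: {max(samples):.1f} MiB')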
Pympler is a development tool that allows you to measure, monitor, and analyse the memory behaviour of objects in a running Python program. Its main objective is to provide a detailed view of the size and lifetime of objects.
To install Pympler, you can use pip:
pip install pympler
Pympler is made up of several modules that offer different memory analysis features:
asizeof: provides information about the size of Python objects, including their references. It allows you to investigate how much memory space particular objects take up. Unlike sys.getsizeof, asizeof measures objects recursively, including all referenced objects. Example:
from pympler import asizeof

obj = [1, 2, (3, 4), 'text']
print(asizeof.asizeof(obj))
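For comparison, sys.getsizeof only reports the size of the outer container, which is why asizeof gives a better estimate of an object's real footprint. A minimal sketch:

import sys
from pympler import asizeof

obj = [1, 2, (3, 4), 'text']
print(sys.getsizeof(obj))    # Size of the list object itself only
print(asizeof.asizeof(obj))  # Size including the tuple, string, and ints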
muppy: allows online monitoring by identifying objects that consume memory and potential leaks. It enables you to track memory usage during execution and detect improperly freed objects. Example:
from pympler import muppy, summary

all_objects = muppy.get_objects()
sum1 = summary.summarize(all_objects)
summary.print_(sum1)
                       types |   # objects |   total size
============================ | =========== | ============
                         str |       12960 |      2.20 MB
                        dict |        4568 |      1.64 MB
                        code |        4046 |    699.92 KB
                        type |         796 |    624.21 KB
                       tuple |        5164 |    320.80 KB
          wrapper_descriptor |        2225 |    156.45 KB
  builtin_function_or_method |        1276 |     89.72 KB
                         set |         109 |     86.49 KB
           method_descriptor |        1099 |     77.27 KB
       weakref.ReferenceType |        1092 |     76.78 KB
                        list |         378 |     69.54 KB
                 abc.ABCMeta |          67 |     65.05 KB
                   frozenset |         113 |     50.09 KB
           getset_descriptor |         759 |     47.44 KB
                         int |        1529 |     46.21 KB
classtracker: allows you to track the lifetime of objects of specific classes, providing insight into instantiation patterns and how they contribute to memory consumption over time. Example:
from pympler import classtracker

class MyClass:
    pass

tr = classtracker.ClassTracker()
tr.track_class(MyClass)
tr.create_snapshot()

# Create instances of MyClass
...

tr.create_snapshot()
tr.stats.print_summary()
---- SUMMARY ---------------------------------------------------------------
                  active      0 B      average   pct
                  active      0 B      average   pct
-----------------------------------------------------------------------------
Memray is a memory profiler for Python developed by Bloomberg that allows you to trace and report memory allocations in both Python code and compiled extension modules. Its ability to deeply analyse memory usage makes it an essential tool for identifying memory leaks, analysing allocations, and optimising application performance. Memray's key features include tracing allocations in both Python and native (C/C++) code, several report formats (flame graphs, tables, and summaries), a live mode for observing running processes, and pytest integration.
Memray requires Python 3.7 or higher and can be easily installed from PyPI using:
python3 -m pip install memray
For Debian-based systems, you may need to install additional dependencies:
sudo apt-get install python3-dev libunwind-dev liblz4-dev
Then, you can proceed with installing Memray:
python3 -m pip install memray
Memray is commonly used from the command line to run and profile a Python script:
memray run my_script.py
This command runs my_script.py and traces its memory allocations, generating an output file. To generate a flame graph from the collected data:
memray flamegraph memray-my_script.py.bin
This produces an interactive HTML file that displays memory usage as a flame graph, making it easier to identify bottlenecks and areas of high memory consumption.
In addition to flame graphs, Memray offers several other report types:
Summary Report: an overview of memory usage, highlighting the most memory-intensive functions.
memray summary memray-my_script.py.bin
Table Report: generates a detailed table with all allocations recorded during execution.
memray table memray-my_script.py.bin
Memray can also be used with pytest via the pytest-memray plugin, allowing you to monitor memory usage during unit tests and prevent memory-related regressions. To install the module:
pip install pytest-memray
Then, when running tests with pytest, add the --memray option to enable memory profiling:
pytest --memray
This generates detailed memory usage reports for each test, making it easier to detect tests that consume more memory than expected.
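Beyond per-test reports, pytest-memray also ships markers that turn memory usage into a test assertion; for example, limit_memory fails any test that allocates more than a given amount. A minimal sketch, run with pytest --memray:

import pytest

@pytest.mark.limit_memory("24 MB")
def test_allocations_stay_bounded():
    # The test fails if it allocates more than 24 MB in total
    data = [0] * (10**6)
    assert len(data) == 10**6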
Considerations when using Memray: it runs only on Linux and macOS, and tracing every allocation adds some overhead, so it is best applied to representative workloads during development rather than left permanently enabled in production.
pyinstrument is a statistical profiling tool compatible with Python 3.8 and higher. It allows you to analyse program performance by identifying the parts of the code that consume the most time during execution. Unlike traditional profilers that trace every function call, pyinstrument periodically samples the state of the call stack, reducing overhead and providing a clear view of the program's behaviour.
To install pyinstrument, you can use pip:
pip install pyinstrument
To profile an entire script from the command line, use:
pyinstrument my_script.py
Upon completion, pyinstrument will display a summary in the terminal indicating where most of the time was spent.
To generate an interactive HTML report:
pyinstrument -r html my_script.py
This command will generate a detailed HTML report that allows for deeper exploration of the performance profile.
pyinstrument can also be integrated directly into the code to profile specific sections:
from pyinstrument import Profiler

profiler = Profiler()
profiler.start()

# Code to profile
result = function_to_analyze()

profiler.stop()
profiler.print()
Alternatively, using a context manager:
from pyinstrument import Profiler

with Profiler() as profiler:
    # Code to profile
    result = function_to_analyze()

print(profiler.output_text(unicode=True, color=True))
After profiling, pyinstrument presents a tree structure showing the time spent on each function and its subcalls. The most time-consuming functions are highlighted, making identifying critical areas that could benefit from optimisation easier.
Unlike cProfile, which is a deterministic profiler and can introduce significant overhead, pyinstrument is a statistical profiler that takes periodic samples, minimising interference with program performance. Furthermore, while cProfile focuses on CPU time, pyinstrument measures actual elapsed time, including waits and I/O operations, providing a more complete view of application performance.
pyinstrument offers middleware for frameworks such as Django, Flask, and FastAPI, allowing you to profile specific web requests. For example, in Django, adding pyinstrument.middleware.ProfilerMiddleware to the middleware configuration and accessing a URL with the ?profile parameter provides a detailed performance report for that request.
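As a minimal sketch of the Django setup just described (the rest of the MIDDLEWARE list is illustrative):

# settings.py
MIDDLEWARE = [
    # ... your existing middleware ...
    "pyinstrument.middleware.ProfilerMiddleware",
]

# Then append ?profile to any URL to get a performance report
# for that request, e.g. http://localhost:8000/some-view/?profile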
py-spy is a sampling profiler for Python programs that lets you visualise which parts of your code take the most time without restarting the program or modifying its source code. Written in Rust, py-spy offers minimal overhead and is safe for production environments. It runs on Linux, macOS, Windows, and FreeBSD, and is compatible with CPython versions 2.3 through 3.13.
To install py-spy, you can use pip:
pip install py-spy
For Rust users, you can install it using cargo:
cargo install py-spy
py-spy is used from the command line and offers three main subcommands: record, top, and dump.
record: records profiles and generates flame graphs. This command generates an interactive SVG file showing which functions take the most time. Example:
py-spy record -o profile.svg --pid 12345
top: displays the most time-consuming tasks in real time, similar to the Unix top command. This command provides a live view of CPU usage per function in your Python program. Example:
py-spy top --pid 12345
dump: displays the current call stack for each thread in your program. This command helps diagnose why a program is hanging or identifies bottlenecks. Example:
py-spy dump --pid 12345
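py-spy can also launch the target program itself instead of attaching to a running PID, which is convenient during development (a sketch reusing the my_script.py example from earlier):

py-spy record -o profile.svg -- python my_script.py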
line_profiler is a tool that allows you to analyse the execution time of each line of a function, facilitating the precise identification of bottlenecks in the code. To install line_profiler, you can use pip:
pip install line_profiler
To profile specific functions, use the @profile decorator. It’s important to note that this decorator is automatically recognised when running the script with kernprof, without explicitly importing it. Example:
@profile
def slow_function():
    result = 0
    for i in range(10000):
        result += i
    return result

slow_function()
To run the profiling:
kernprof -l -v my_script.py
The kernprof -l command runs the script and generates a .lprof profiling file, while the -v option displays the results directly in the terminal:
Wrote profile results to my_script.py.lprof
Timer unit: 1e-06 s

Total time: 0.139928 s
File: test.py
Function: process_data at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           def process_data():
     5         1       5628.9   5628.9      4.0      data = [1] * (10**6)
     6         1     132350.9 132350.9     94.6      result = [x * 2 for x in data]
     7         1       1947.2   1947.2      1.4      del data
     8         1          1.0      1.0      0.0      return result
The profiler output presents a table with the following information:

Line #: the line number in the file.
Hits: how many times the line was executed.
Time: the total time spent on the line (in timer units).
Per Hit: the average time per execution of the line.
% Time: the percentage of the function's total time spent on the line.
Line Contents: the source code of the line.
Scalene is a high-performance profiler for Python that offers detailed analysis of CPU, GPU, and memory usage. It allows developers to identify and optimise bottlenecks in their applications. Unlike other profilers, Scalene distinguishes between execution time in Python and native code, providing a more accurate view of program performance.
Scalene's main features: line-level profiling of CPU, GPU, and memory usage; separation of time spent in Python code from time spent in native code; low profiling overhead; and support for AI-powered optimisation suggestions.
Scalene can be easily installed using pip:
pip install scalene
To profile a Python script called my_script.py, use the following command:
scalene my_script.py
This command will run the script and generate a detailed report in the terminal about CPU and memory usage.
To generate a report in HTML format:
scalene --html --outfile profile.html my_script.py
This will create a profile.html file that can be viewed in a web browser for more interactive exploration.
Additionally, the web-based report includes support for AI-powered optimisation suggestions.
Scalene presents the results in a table that includes, for each line of code, the share of time spent in Python code, in native code, and in system calls, together with the memory it allocates and its peak usage.
Scalene offers several command-line options to customise profiling, such as restricting the analysis to CPU, GPU, or memory, and choosing the output format and file.
As with any profiler, Scalene has some considerations and limitations to bear in mind; for example, its GPU profiling currently targets NVIDIA hardware.
In this journey through code debugging and profiling in Python, it's clear that the ecosystem is replete with tools for fine-tuning this task. A few of them, in my opinion, deserve a place on the podium for their completeness and ease of use.
Whatever tool we use, we must remember that implementing profiling and debugging practices and continuous performance monitoring is essential in software development to identify and correct bottlenecks, optimise resource usage, and ensure a satisfactory user experience. These strategies improve efficiency, reduce costs, and strengthen system security and stability. By integrating analysis tools and fostering a culture of continuous improvement, companies can anticipate potential problems, adapt to changes, and maintain a competitive advantage in today’s technological environment.