Python Guide to Debugging and Profiling

Juanmi Taboada
Debugging and profiling are essential components of software development: they allow you to identify and correct errors and to optimise code performance. In this guide, we will explore techniques and tools for debugging and profiling Python programs.

What is the difference between debugging and profiling?

Debugging is locating and correcting errors or bugs in a program’s source code, while profiling focuses on analysing program performance to identify bottlenecks and areas requiring optimisation.

The main aspects analysed during profiling are:

  • Execution time: a measurement of each function or method’s execution time.
  • Call frequency: a count of how often each function or method is called.
  • Memory usage: an assessment of the amount of memory the program uses during execution.

A typical example of avoidable overhead is the use of explicit loops to perform operations that could be expressed more efficiently:

# List of numbers
numbers = [1, 2, 3, 4, 5]

# Calculate the sum of the squares of numbers greater than 2
sum_squares = 0
for number in numbers:
    if number > 2:
        sum_squares += number ** 2

print(sum_squares)

Although this example is simple, this type of operation can generate significant overhead with larger lists or more complex conditions. We can use a generator expression with the sum() function to optimise the code and eliminate this overhead, calculating the sum of the squares of numbers greater than 2 more efficiently:

numbers = [1, 2, 3, 4, 5]

# Calculate the sum of the squares of numbers greater than 2
sum_squares = sum(number ** 2 for number in numbers if number > 2)

print(sum_squares)

Generator expression: the expression (number ** 2 for number in numbers if number > 2) generates the squares of the numbers in the list numbers that are greater than 2. Unlike a list comprehension, the generator expression does not create a list in memory. Instead, it generates the values on the fly, which improves efficiency in terms of memory usage.


sum() function: calculates the sum of the values generated by the generator expression. Because sum() is a built-in implemented in C, its execution is faster and lighter than an equivalent Python-level loop.
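To check whether the rewrite actually pays off on your machine, you can time both versions with the standard timeit module. This is a quick sketch; the list size and repetition count are arbitrary choices for illustration:

```python
import timeit

numbers = list(range(1, 1001))

def loop_version():
    # Explicit loop with an accumulator
    sum_squares = 0
    for number in numbers:
        if number > 2:
            sum_squares += number ** 2
    return sum_squares

def generator_version():
    # Generator expression consumed by the C-level sum()
    return sum(number ** 2 for number in numbers if number > 2)

# Both versions must agree on the result
assert loop_version() == generator_version()

loop_time = timeit.timeit(loop_version, number=2000)
gen_time = timeit.timeit(generator_version, number=2000)
print(f"loop: {loop_time:.3f}s  generator+sum: {gen_time:.3f}s")
```

Exact timings depend on the interpreter and hardware, which is precisely why measuring beats guessing.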

Memory Leak

A memory leak is a software error that occurs when a program reserves blocks of memory for use but fails to release them after they are no longer needed. As a result, the allocated memory remains occupied and cannot be reused by the system, leading to a gradual depletion of available memory.

Efficient memory management is essential for application performance and stability. Although Python has a garbage collector that automatically handles memory releases, memory leaks can still occur if objects are not released properly. In the following graph, we can see a leak that started happening several minutes after the program began:

In this graph, we can see details of the Heap.

Low-level memory organisation (Heap and Stack)

Python's main implementation, CPython, is written in C, so Python inherits the memory organisation typical of C programs. When a C program starts, its memory is organised into several sections, each with a specific purpose:

  • Code Segment (Text Segment): stores executable program instructions. It is a read-only area to prevent accidental code modifications during execution. The program’s functions reside in this segment.
  • Data Segment: contains global and static variables. Internally, it is divided into two areas:
    • BSS Segment (Block Started by Symbol): global and static variables not explicitly initialised.
    • Initialised Data Segment: global and static variables with defined initial values.
  • Heap: is the area for dynamic memory allocation during program execution. It generally grows toward higher memory addresses, i.e., in the opposite direction to the stack.
  • Stack: stores local variables, function parameters, and return addresses. It operates as a LIFO (Last In, First Out) structure. It typically grows toward lower memory addresses, opposite to the heap. This area is used, for example, when a function is called: a new frame is created on the stack for its local variables and parameters.

The following block diagram shows how memory is organised:

With this in mind, we can see that the heap and stack could collide if they grow too large. This is a real concern in embedded systems (microcontrollers such as Arduino and ESP32), which often lack memory protection.

In modern languages ​​running on current operating systems, attempting to expand the stack beyond its capacity often results in a stack overflow error. On the other hand, increasing the heap size causes memory allocation functions to fail. However, not all software is modern, so it is crucial to analyse the possible failure modes:

  • If the stack grows, invading the heap space, the program will silently begin overwriting the heap’s data structures. On a modern operating system, Virtual Memory Guard Pages prevent the stack from growing indefinitely. As long as the amount of memory in these guard pages is at least the size of the activation record of the growing procedure, the operating system will guarantee a segfault. If you’re on a DOS system running on a machine without a Memory Management Unit (MMU), you’ll likely face a relatively serious set of undetermined problems.
  • If the heap grows into the stack, the operating system should know the situation, and some system calls will fail. Implementing the memory allocation function will detect the failure and return NULL. What happens next depends on the library or program’s error handling.

It’s important to consider that in multithreaded systems, there is one stack per thread, and the entire responsibility for memory management shouldn’t be offloaded to the guard pages. Programming practices that control memory usage and prevent overflows are essential rather than relying solely on the protections provided by the operating system.
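At the Python level, unbounded stack growth usually surfaces as a RecursionError rather than a raw segfault, because CPython enforces its own recursion limit well before the C stack is exhausted. A minimal sketch:

```python
import sys

def recurse(depth=0):
    # Each call adds a new frame to the stack
    return recurse(depth + 1)

print("recursion limit:", sys.getrecursionlimit())

try:
    recurse()
except RecursionError as exc:
    # CPython stops us before the C stack can collide with the heap
    print("stack growth stopped by CPython:", exc)
```

Raising the limit with sys.setrecursionlimit() pushes the failure closer to a real C-stack overflow, which is exactly the scenario the guard pages are there to catch.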

Example of a Memory Leak:

Suppose we’re developing an application that manages user sessions. Each session maintains references to user objects, and these user objects, in turn, keep references to their respective sessions. We also use a global list to store all active sessions.

class User:
    def __init__(self, name):
        self.name = name
        self.session = None

class Session:
    def __init__(self, user):
        self.user = user
        user.session = self

# Global list storing all active sessions (Disclaimer: use of globals is generally discouraged)
active_sessions = []

def login(username):
    user = User(username)
    session = Session(user)
    active_sessions.append(session)

def logout(username):
    global active_sessions
    active_sessions = [session for session in active_sessions if session.user.name != username]

Let’s analyse the code:

  1. Circular references:
    • When a new session is created with login(), User and Session objects are instantiated and refer to each other:
      • The Session object references the User object through self.user.
      • The User object references the Session object through self.session.
    • This bidirectional relationship creates a circular reference. In CPython, reference counting alone cannot free such objects; they can only be reclaimed by the cyclic garbage collector, and only once nothing else references them.
  2. Using a global list:
    • The global list active_sessions stores all active sessions. Even if references to User and Session objects are removed elsewhere in the code, as long as they exist in active_sessions, they will not be garbage collected.
  3. Problem in the logout() function:
    • When attempting to log out a session, the function filters active_sessions to exclude the session of the specified user. However, if other parts of the code still reference the User or Session objects, circular references prevent the garbage collector from freeing the associated memory.

Performance Impact: over time, if numerous sessions are created and not properly managed, the memory used by circular references and sessions stored in the global list will not be freed, causing a constant increase in memory consumption and potentially leading to performance degradation or application failure.
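In CPython, reference counting can never reclaim a cycle on its own; the cyclic garbage collector can, but only when nothing else (such as a global list) still reaches the objects. A small standalone sketch with the standard gc module illustrates both situations (the Node class is illustrative, not part of the session example):

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

def make_cycle(registry=None):
    a, b = Node(), Node()
    a.partner = b            # a -> b
    b.partner = a            # b -> a: circular reference
    if registry is not None:
        registry.append(a)   # an external reference keeps the cycle alive

# Cycle with no external references: the cyclic collector can reclaim it
make_cycle()
print("unreferenced cycle objects collected:", gc.collect() >= 2)

# Cycle still referenced from a "global" list: it survives collection
registry = []
make_cycle(registry)
gc.collect()
print("registered cycle still alive:", registry[0].partner is not None)
```

This is why the global active_sessions list is the decisive factor in the leak: the collector can handle the cycle, but not a cycle that the list still points to.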

The proposed solutions would be:

Break circular references: before deleting a session, it is advisable to explicitly break circular references by setting the references to None:

def logout(username):
    global active_sessions
    for session in active_sessions:
        if session.user.name == username:
            session.user.session = None
            session.user = None
    active_sessions = [session for session in active_sessions if session.user is not None]

Use weakref for weak references: the weakref module allows you to create weak references that do not prevent the garbage collector from deleting objects. By using weak references, you can avoid problematic circular references:

import weakref

class User:
    def __init__(self, name):
        self.name = name
        self.session = None

class Session:
    def __init__(self, user):
        self.user = weakref.ref(user)
        user.session = weakref.ref(self)
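A weak reference is dereferenced by calling it; once the target has been garbage-collected, the call returns None, confirming that the weak reference did not keep the object alive. A minimal standalone sketch (Resource is an illustrative class, not part of the session example):

```python
import gc
import weakref

class Resource:
    pass

obj = Resource()
ref = weakref.ref(obj)

print(ref() is obj)   # True: the target is still alive

del obj               # drop the only strong reference
gc.collect()          # make collection deterministic for the example
print(ref())          # None: the weak reference did not keep the object alive
```

Code that dereferences a weak reference must therefore always handle the None case.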

Properly manage the global list: ensure that closed sessions are entirely removed from active_sessions and that no lingering references remain to objects that are no longer needed.

Debugging and profiling tools

pdb: Interactive Python Debugger

The pdb (Python Debugger) module is the standard Python debugger that allows developers to run programs interactively to identify and fix bugs. You can use it in several ways:

From the command line: by running the script with the -m pdb argument, which starts the program under pdb control from the beginning:

python -m pdb my_script.py

By inserting pdb.set_trace() into your code: including import pdb; pdb.set_trace() at the point where you want to start debugging will stop the program at that line and open an interactive pdb session:

import pdb

def function():
    variable = 'value'
    pdb.set_trace() # Start debugging here
    # Code to debug

Using the breakpoint() function: starting with Python 3.7, the built-in function breakpoint() was introduced, which acts as an alias for pdb.set_trace(), making it easy to insert breakpoints without having to import pdb explicitly:

def function():
    variable = 'value'
    breakpoint() # Start debugging here
    # Code to debug

Running the program and reaching the breakpoint will open the pdb prompt, indicated by (Pdb), from where you can issue commands to control execution and analyse the program’s status. Some of the most commonly used include:

  • n (next): executes the following line of code in the current function and stops, without entering any functions called on that line.
  • s (step): executes the following line of code and, if it includes a function call, enters it, stopping on its first line.
  • c (continue): continues program execution until a breakpoint is encountered or until the program terminates.
  • l (list): displays the source code around the current line or a specific line if a number is supplied as an argument.
  • p (print): evaluates and displays the value of an expression or variable.
  • q (quit): exits the debugging session and terminates the program.

cProfile: Performance Profiler

cProfile is a built-in module that provides deterministic program profiling, allowing you to analyse the execution time of each function and how often it is called. By instrumenting your code, cProfile collects statistics on the number of calls and time consumed by each function, making it easier to identify areas that require optimisation.

To profile an entire script using cProfile, you can run the script from the command line as follows:

python -m cProfile my_script.py

This command will run my_script.py under cProfile control and display a summary of the profiling statistics to standard output:

If you want to save the results to a file for later analysis, you can use the -o option:

python -m cProfile -o results.prof my_script.py

This will generate a file named results.prof containing the profiling statistics.
Running cProfile generates a table with several columns that provide information about the performance of each function:

  • ncalls: number of times the function was called.
  • tottime: total time spent in the function, excluding calls to subfunctions.
  • percall: average time per call of the function (tottime divided by ncalls).
  • cumtime: cumulative time, including time in the function and all subfunctions it calls.
  • percall: average time per call for the cumulative time (cumtime divided by ncalls).
  • filename:lineno(function): location in the source code and function name.

These metrics help you identify which functions take the most time and could be candidates for optimisation. To profile a specific section of code instead of the entire script, you can use cProfile within the code itself:

import cProfile

def function_to_profile():
    # Code to analyse
    pass

if __name__ == '__main__':
    profiler = cProfile.Profile()
    profiler.enable()
    function_to_profile()
    profiler.disable()
    profiler.print_stats()

This approach allows you to focus the analysis on specific functions or blocks of code, facilitating more targeted optimisation.
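The collected statistics can also be post-processed with the standard pstats module, sorting by whichever metric you care about. A short sketch (busy_work is a throwaway function used only to generate something to profile):

```python
import cProfile
import io
import pstats

def busy_work():
    # Illustrative workload to have something worth profiling
    return sum(i * i for i in range(50_000))

profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

# Sort by cumulative time and print only the top 5 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

The same pstats API can load a file produced with the -o option (pstats.Stats('results.prof')), which is handy for analysing a profile long after the run.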

Visualisation tools such as SnakeViz, gprof2dot, or KCachegrind can help you interpret cProfile results more intuitively.


tracemalloc: Tracking memory allocations

The tracemalloc module, introduced in Python 3.4, is a built-in tool for tracing memory allocations made by a Python program. It allows you to take “snapshots” of the memory state at different points in time and compare them to analyse how memory usage changes over time.

To use tracemalloc, follow these steps:

import tracemalloc

# Start the memory trace
tracemalloc.start()

# Code whose memory allocation you want to trace
...

# Take a snapshot of the current memory state
snapshot = tracemalloc.take_snapshot()

# Analyze the snapshot statistics:
top_stats = snapshot.statistics('lineno')
print("[Top 10 Memory-Using Lines]")
for stat in top_stats[:10]:
    print(stat)

# Stop the memory trace (optional)
tracemalloc.stop()

By analysing the statistics provided by tracemalloc, you can gain insight into which lines of code are consuming the most memory:

[Top 10 Memory-Using Lines]
<frozen importlib._bootstrap>:716: size=4855 KiB, count=39328, average=126 B
<frozen importlib._bootstrap>:284: size=521 KiB, count=3199, average=167 B
/usr/lib/python3.4/collections/__init__.py:368: size=244 KiB, count=2315, average=108 B
/usr/lib/python3.4/unittest/case.py:381: size=185 KiB, count=779, average=243 B
/usr/lib/python3.4/unittest/case.py:402: size=154 KiB, count=378, average=416 B
/usr/lib/python3.4/abc.py:133: size=88.7 KiB, count=347, average=262 B
<frozen importlib._bootstrap>:1446: size=70.4 KiB, count=911, average=79 B
<frozen importlib._bootstrap>:1454: size=52.0 KiB, count=25, average=2131 B
<string>:5: size=49.7 KiB, count=148, average=344 B
/usr/lib/python3.4/sysconfig.py:411: size=48.0 KiB, count=1, average=48.0 KiB

Each entry in the statistics includes:

  • File and line number: indicates where the memory allocation was made.
  • Total allocated memory size: the memory allocated on that specific line.
  • Number of memory blocks: number of memory blocks allocated.
  • Average size per block: average amount of memory allocated per block.

A vital feature of tracemalloc is comparing two memory snapshots to identify allocation differences. This is useful for detecting memory leaks by observing which objects remain in memory between two points in time. Example:

import tracemalloc

tracemalloc.start()

# Code before the possible memory leak
snapshot1 = tracemalloc.take_snapshot()

# Code that could be causing a memory leak
…

# Code after the possible memory leak
snapshot2 = tracemalloc.take_snapshot()

# Compare the two snapshots
top_stats = snapshot2.compare_to(snapshot1, 'lineno')

print("[Main differences in memory usage]")
for stat in top_stats[:10]:
    print(stat)
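When a line-level statistic is not enough, the same snapshots can be grouped by full allocation traceback, showing the chain of calls that led to each allocation. A short sketch (the allocating function and list size are arbitrary):

```python
import tracemalloc

def allocate():
    # Deliberately allocate a noticeable amount of memory
    return [b'x' * 100 for _ in range(10_000)]

tracemalloc.start(25)          # keep up to 25 frames per allocation
data = allocate()              # keep a reference so the blocks stay alive
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group statistics by traceback instead of by line
top_stats = snapshot.statistics('traceback')
stat = top_stats[0]
print(f"{stat.count} blocks, {stat.size / 1024:.1f} KiB")
for line in stat.traceback.format():
    print(line)
```

The frame-depth argument to tracemalloc.start() matters here: with the default of 1, the traceback view degenerates into the per-line view.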

memory_profiler: Line-by-line analysis of memory usage

Although memory_profiler is no longer actively maintained, it still works well and remains useful. It allows you to monitor and analyse a program’s memory usage, providing detailed information line by line. This makes it easier to identify bottlenecks and optimise memory usage. To install memory_profiler, use pip:

pip install memory-profiler

It’s also recommended to install matplotlib if you want to visualise the memory usage graphically:

pip install matplotlib

memory_profiler allows you to analyse specific functions using the @profile decorator. When applied to a function, you can obtain a detailed report of memory usage line by line. Example:

from memory_profiler import profile

@profile
def process_data():
    data = [1] * (10**6)
    result = [x * 2 for x in data]
    del data
    return result

if __name__ == '__main__':
    process_data()

To run the script and get the memory profile:

python -m memory_profiler my_script.py

The output will show the memory usage before and after each line of the decorated function, indicating increases and decreases in consumption.

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
     3     18.9 MiB     18.9 MiB           1   @profile
     4                                         def process_data():
     5     26.6 MiB      7.6 MiB           1       data = [1] * (10**6)
     6     34.1 MiB      7.5 MiB     1000003       result = [x * 2 for x in data]
     7     26.5 MiB     -7.6 MiB           1       del data
     8     26.5 MiB      0.0 MiB           1       return result

memory_profiler includes the mprof tool, which allows you to log and view memory usage from a script over time. Run the script with mprof to record memory usage:

mprof run my_script.py

Generate a graph of memory usage:

mprof plot

This will produce a graph showing how memory usage varies during the script’s execution, making it easier to identify potential memory leaks or sections of code that require optimisation:

memory_profiler offers the memory_usage function to measure the memory usage of specific functions, allowing for a more focused analysis:

from memory_profiler import memory_usage

def my_function():
    data = [1] * (10**6)
    return data

usage = memory_usage(my_function)
print(f'Memory usage: {usage} MiB')

This approach is helpful for quickly measuring the memory usage of individual tasks without having to decorate or modify the original code.

pympler: Monitoring and analysing memory behaviour

Pympler is a development tool that allows you to measure, monitor, and analyse the memory behaviour of objects in a running program. Its main objective is to provide a detailed view of objects’ sizes and lifetimes.

To install Pympler, you can use pip:

pip install pympler

Pympler is made up of several modules that offer different memory analysis features:

asizeof: provides information about the size of Python objects, including their references. It allows you to investigate how much memory space particular objects take up. Unlike sys.getsizeof, asizeof measures objects recursively, including all referenced objects. Example:

from pympler import asizeof

obj = [1, 2, (3, 4), 'text']
print(asizeof.asizeof(obj))
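The difference between shallow and recursive measurement is easy to see with the standard sys.getsizeof, which is the behaviour asizeof improves upon. A quick sketch (exact byte counts vary by Python version and platform):

```python
import sys

obj = [1, 2, (3, 4), 'text']

# Shallow size: only the list object itself (header plus item pointers)
shallow = sys.getsizeof(obj)

# Naive one-level-deep size: add the directly referenced elements
deep = shallow + sum(sys.getsizeof(item) for item in obj)

print(f"shallow: {shallow} B, one level deep: {deep} B")
# asizeof goes further, following references recursively
# (e.g. into the nested tuple's own elements)
```

This is why sys.getsizeof alone systematically under-reports the footprint of container objects.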

muppy: allows online monitoring by identifying objects that consume memory and potential leaks, enabling you to track memory usage during execution and detect improperly freed objects. Example:

from pympler import muppy, summary

all_objects = muppy.get_objects()
sum1 = summary.summarize(all_objects)
summary.print_(sum1)
                       types |   # objects |   total size
============================ | =========== | ============
                         str |       12960 |      2.20 MB
                        dict |        4568 |      1.64 MB
                        code |        4046 |    699.92 KB
                        type |         796 |    624.21 KB
                       tuple |        5164 |    320.80 KB
          wrapper_descriptor |        2225 |    156.45 KB
  builtin_function_or_method |        1276 |     89.72 KB
                         set |         109 |     86.49 KB
           method_descriptor |        1099 |     77.27 KB
       weakref.ReferenceType |        1092 |     76.78 KB
                        list |         378 |     69.54 KB
                 abc.ABCMeta |          67 |     65.05 KB
                   frozenset |         113 |     50.09 KB
           getset_descriptor |         759 |     47.44 KB
                         int |        1529 |     46.21 KB

classtracker: allows you to track the lifetime of objects of specific classes, providing insight into instantiation patterns and how they contribute to memory consumption over time. Example:

from pympler import classtracker

class MyClass:
    pass

tr = classtracker.ClassTracker()
tr.track_class(MyClass)
tr.create_snapshot()
# Create instances of MyClass
tr.create_snapshot()
tr.stats.print_summary()
---- SUMMARY ---------------------------------------------------------------
                              active      0     B      average   pct
                              active      0     B      average   pct
----------------------------------------------------------------------------

memray: Bloomberg’s Memory Profiler

Memray is a memory profiler for Python developed by Bloomberg that allows you to trace and report memory allocations in both Python code and compiled extension modules. Its ability to deeply analyse memory usage makes it an essential tool for identifying memory leaks, analysing allocations, and optimising application performance. Memray Key Features:

  • Full Allocation Tracing: unlike sampling profilers, Memray traces every function call, providing an accurate representation of the call stack, including calls to native C/C++ libraries.
  • Detailed Reporting: generates various reports, such as flame graphs, that facilitate the visualisation and analysis of captured memory usage data.
  • Low Performance Impact: profiling with Memray minimally slows down the application, even when tracing native code, allowing for use in production environments with acceptable overhead.
  • Threading Support: works efficiently with Python and native threads, enabling analysis of concurrent applications.

Memray requires Python 3.7 or higher and can be easily installed from PyPI using:

python3 -m pip install memray

For Debian-based systems, you may need to install additional dependencies:

sudo apt-get install python3-dev libunwind-dev liblz4-dev

Then, you can proceed with installing Memray:

python3 -m pip install memray

Memray is commonly used from the command line to run and profile a Python script:

memray run my_script.py

This command runs my_script.py and traces its memory allocations, generating an output file (named like memray-my_script.py.<pid>.bin). To generate a flame graph from the collected data:

memray flamegraph memray-my_script.py.<pid>.bin

This produces an interactive HTML file that displays memory usage as a flame graph, making it easier to identify bottlenecks and areas of high memory consumption:

In addition to flame graphs, Memray offers several other report types:

Summary Report: an overview of memory usage, highlighting the most memory-intensive functions.

memray summary memray-my_script.py.<pid>.bin

Table Report: generates a detailed table with all allocations recorded during execution.

memray table memray-my_script.py.<pid>.bin

Memray can be started with pytest using the pytest-memray module, allowing you to monitor memory usage during unit tests and prevent memory-related regressions. To install the module:

pip install pytest-memray

Then, when running tests with pytest, add the --memray option to enable memory profiling:

pytest --memray

This generates detailed memory usage reports for each test, making it easier to detect tests that consume more memory than expected.

Considerations when using Memray:

  • Supported environments: memray is supported on Linux and macOS systems. Operation on other operating systems is not guaranteed.
  • Performance overhead: although Memray is designed to minimise overhead, tracing memory allocations can affect application performance. This overhead should be evaluated in development environments before being used in production.
  • Native code analysis: to trace allocations in native code (C/C++), Memray must be run with the --native option, which can increase the overhead and the amount of data collected.

pyinstrument: Lightweight performance profiler

It is a statistical profiling tool compatible with Python 3.8 and higher. It allows you to analyse program performance by identifying the parts of the code that consume the most time during execution. Unlike traditional profilers that trace every function call, pyinstrument periodically samples the state of the call stack, reducing overhead and providing a clear view of the program’s behaviour.

To install pyinstrument, you can use pip:

pip install pyinstrument

To profile an entire script from the command line, use:

pyinstrument my_script.py

Upon completion, pyinstrument will display a summary in the terminal indicating where most of the time was spent.

To generate an interactive HTML report:

pyinstrument -r html my_script.py

This command will generate a detailed HTML report that allows for deeper exploration of the performance profile:

pyinstrument can also be integrated directly into the code to profile specific sections:

from pyinstrument import Profiler

profiler = Profiler()
profiler.start()

# Code to profile
result = function_to_analyze()

profiler.stop()
profiler.print()

Alternatively, using a context:

from pyinstrument import Profiler

with Profiler() as profiler:
    # Code to profile
    result = function_to_analyze()

print(profiler.output_text(unicode=True, color=True))

After profiling, pyinstrument presents a tree structure showing the time spent on each function and its subcalls. The most time-consuming functions are highlighted, making identifying critical areas that could benefit from optimisation easier.

Unlike cProfile, which is a deterministic profiler and can introduce significant overhead, pyinstrument is a statistical profiler that takes periodic samples, minimising interference with program performance. Furthermore, while cProfile focuses on CPU time, pyinstrument measures actual elapsed time, including waits and I/O operations, providing a more complete view of application performance.

pyinstrument offers middleware for frameworks such as Django, Flask, and FastAPI, allowing you to profile specific web requests. For example, in Django, adding pyinstrument.middleware.ProfilerMiddleware to the middleware configuration and accessing a URL with the ?profile parameter provides a detailed performance report for that request.
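In a Django project, enabling that middleware is a one-line change in settings.py. A configuration sketch (the surrounding middleware entries are placeholders; consult the pyinstrument documentation for the options your version supports):

```python
# settings.py (fragment)
MIDDLEWARE = [
    # ... your existing middleware ...
    "pyinstrument.middleware.ProfilerMiddleware",
]
```

With this in place, appending ?profile to a URL returns the pyinstrument report for that request instead of the normal response.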

py-spy: Sampling Profiler for Python

It’s a sampling profiler for Python programs that lets you visualise which parts of your code take the most time without restarting the program or modifying its source code. Written in Rust, py-spy offers minimal overhead and is safe for production environments. It runs on Linux, macOS, Windows, and FreeBSD, and is compatible with CPython versions 2.3 through 3.13.

To install py-spy, you can use pip:

pip install py-spy

For Rust users, you can install it using cargo:

cargo install py-spy

py-spy is used from the command line and offers three main subcommands: record, top, and dump.

record: records profiles and generates flame graphs. This command generates an interactive SVG file showing which functions take the most time. Example:

py-spy record -o profile.svg --pid 12345

top: displays the most time-consuming tasks in real time, similar to the Unix top command. This command provides a live view of CPU usage per function in your Python program. Example:

py-spy top --pid 12345

dump: displays the current call stack for each thread in your program. This command helps diagnose why a program is hanging or identifies bottlenecks. Example:

py-spy dump --pid 12345

line_profiler: Line-by-line analysis of performance

It is a tool that allows you to analyse the execution time of each line within a function, facilitating the precise identification of bottlenecks in the code. To install line_profiler, you can use pip:

pip install line_profiler

To profile specific functions, use the @profile decorator. It’s important to note that this decorator is automatically recognised when running the script with kernprof, without explicitly importing it. Example:

@profile
def slow_function():
    result = 0
    for i in range(10000):
        result += i
    return result

slow_function()

To run the profiling:

kernprof -l -v my_script.py

The kernprof -l command runs the script and generates a .lprof profiling file, while the -v option displays the results directly in the terminal:

Wrote profile results to my_script.py.lprof
Timer unit: 1e-06 s

Total time: 0.139928 s
File: test.py
Function: process_data at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           @profile
     4                                           def process_data():
     5         1       5628.9   5628.9      4.0      data = [1] * (10**6)
     6         1     132350.9 132350.9     94.6      result = [x * 2 for x in data]
     7         1       1947.2   1947.2      1.4      del data
     8         1          1.0      1.0      0.0      return result

The profiler output presents a table with the following information:

  • Line #: line number in the source code.
  • Hits: number of times the line was executed.
  • Time: total time consumed by the line.
  • Per Hit: average time per execution.
  • % Time: percentage of the function’s total time spent on this line.
  • Line Contents: contents of the line of code.

Scalene: CPU, Memory, and GPU Profiler

Scalene is a high-performance profiler for Python that offers detailed analysis of CPU, GPU, and memory usage. It allows developers to identify and optimise bottlenecks in their applications. Unlike other profilers, Scalene distinguishes between execution time in Python and native code, providing a more accurate view of program performance.

Scalene’s main features:

  • Multi-level analysis: provides detailed line- and function-level information on CPU, GPU, and memory usage.
  • Code differentiation: distinguishes between time spent in Python code and native code (e.g., libraries written in C/C++), making it easier to identify specific areas for optimisation.
  • Low overhead: Uses sampling techniques to minimise performance overhead, typically between 10% and 20%.
  • Memory profiling: identifies specific lines responsible for memory allocations and potential leaks, helping to reduce memory consumption and improve efficiency.
  • Copy analysis: detects and reports the volume of data copies between Python code and native libraries, which can be crucial for optimising performance in applications that handle large volumes of data.

Scalene can be easily installed using pip:

pip install scalene

To profile a Python script called my_script.py, use the following command:

scalene my_script.py

This command will run the script and generate a detailed report in the terminal about CPU and memory usage:

To generate a report in HTML format:

scalene --html --outfile profile.html my_script.py

This will create a profile.html file that can be viewed in a web browser for more interactive exploration.

Additionally, in the web environment, you can see support for AI-based optimisers:

Scalene presents the results in a table that includes:

  • % CPU Time (Python): percentage of CPU time Python code consumes.
  • % CPU Time (Native): percentage of CPU time consumed by native code.
  • % CPU Time (System): percentage of CPU time consumed by system calls.
  • % Memory (Python): percentage of memory allocated by Python code.
  • % Memory (Native): percentage of memory allocated by native code.
  • Copy Volume (MB/s): data copy rate between Python and native code.

Scalene offers several options to customise profiling:

  • GPU Profiling: use the --gpu option to include GPU analysis.
  • Profiling Interval: the sampling interval can be adjusted with --profile-interval, specifying the number of seconds between samples.
  • CPU Threshold: to focus on functions that consume more than a certain percentage of CPU, use --cpu-percent-threshold.

Considerations and limitations when using Scalene:

  • Compatibility: Scalene is compatible with Python 3.5 and higher.
  • Operating Systems: it runs on macOS and Linux. Functionality on Windows is limited and may require environments such as WSL2.
  • Interaction with Other Libraries: some libraries, such as recent versions of PyTorch, may conflict with Scalene. It is recommended that compatibility be checked and specific versions be considered if necessary.

Conclusions 🤔

In this journey through code debugging and profiling in Python, it’s clear that the ecosystem is replete with tools for fine-tuning this task. However, some, in my opinion, deserve a place on the pedestal for their completeness and ease of use. These are:

  • Memray for memory analysis
  • Pyinstrument for performance analysis

Whatever tool we use, we must remember that implementing profiling and debugging practices and continuous performance monitoring is essential in software development to identify and correct bottlenecks, optimise resource usage, and ensure a satisfactory user experience. These strategies improve efficiency, reduce costs, and strengthen system security and stability. By integrating analysis tools and fostering a culture of continuous improvement, companies can anticipate potential problems, adapt to changes, and maintain a competitive advantage in today’s technological environment.
