Supercharge Python Code to 500% SPEED



This content originally appeared on Level Up Coding - Medium and was authored by Rexs

Here are some techniques that senior devs use to make their Python scripts fast and efficient.



Python is frequently criticized for being slower than languages like C, C++, or Rust, but with the right techniques from Python's extensive standard library, you can significantly improve your code's performance.


Here are 5 tricks I use to boost Python code:

1. __slots__:

Python’s flexibility often leads to performance issues, especially with memory usage. By default, Python uses dictionaries to store instance attributes, which can be inefficient.

Using __slots__ allows us to optimize memory usage and improve performance.

Here’s the basic class, with a dictionary storing various attributes:

from pympler import asizeof

class person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

unoptimized_instance = person("Harry", 20)
print(f"Unoptimized memory instance: {asizeof.asizeof(unoptimized_instance)} bytes")

In this example we created an unoptimized instance, and we can see it occupies about 520 bytes of memory, which is a lot for a single small object compared to other languages.

Now let's use the __slots__ class variable to optimize this class:

from pympler import asizeof

class person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

unoptimized_instance = person("Harry", 20)
print(f"Unoptimized memory instance: {asizeof.asizeof(unoptimized_instance)} bytes")

class Slotted_person:
    __slots__ = ['name', 'age']

    def __init__(self, name, age):
        self.name = name
        self.age = age

optimized_instance = Slotted_person("Harry", 20)
print(f"Optimized memory instance: {asizeof.asizeof(optimized_instance)} bytes")

Using __slots__ cut the instance's memory footprint by roughly 75%. Less memory per object means less allocator pressure, which translates into real speed gains when you create many instances.
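One trade-off worth knowing before you adopt this everywhere: a slotted class only allows the attributes declared in __slots__, and its instances no longer carry a per-instance __dict__. A minimal sketch (class and attribute names here are illustrative, not from the example above):

```python
class SlottedPoint:
    """Only 'x' and 'y' may exist as instance attributes."""
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = SlottedPoint(1, 2)

try:
    p.z = 3  # 'z' is not listed in __slots__, so this fails
except AttributeError as e:
    print(f"Rejected: {e}")

# Slotted instances have no per-instance attribute dictionary
print(hasattr(p, '__dict__'))  # False
```

This is exactly where the memory saving comes from, but it also means __slots__ is a poor fit for classes that need dynamic attributes.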

Here's a comparison:

import time
from pympler import asizeof

class person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# Measure time and memory for unoptimized instance
start_time = time.perf_counter()
unoptimized_instance = person("Harry", 20)
time_taken = time.perf_counter() - start_time

print(f"Unoptimized memory instance: {asizeof.asizeof(unoptimized_instance)} bytes")
print(f"Time taken to create unoptimized instance: {time_taken * 1000:.6f} milliseconds")

class Slotted_person:
    __slots__ = ['name', 'age']

    def __init__(self, name, age):
        self.name = name
        self.age = age

# Measure time and memory for optimized instance
start_time = time.perf_counter()
optimized_instance = Slotted_person("Harry", 20)
time_taken_slotted = time.perf_counter() - start_time

print(f"Optimized memory instance: {asizeof.asizeof(optimized_instance)} bytes")
print(f"Time taken to create optimized instance: {time_taken_slotted * 1000:.6f} milliseconds")
print(f"{time_taken / time_taken_slotted:.2f} times faster")

2. List Comprehensions:

When it comes to iterating over data in Python, the choice between a for loop and a list comprehension can significantly impact performance. List comprehensions are not just a more Pythonic way of writing loops; they are also faster in most scenarios.

Let’s look at an example where we create a list of squares for numbers from 1 to 10 million:

import time

# Using a for loop
start = time.perf_counter()
squares_loop = []
for i in range(1, 10_000_001):
    squares_loop.append(i ** 2)
end = time.perf_counter()
print(f"For loop: {end - start:.6f} seconds")

# Using a list comprehension
start = time.perf_counter()
squares_comprehension = [i ** 2 for i in range(1, 10_000_001)]
end = time.perf_counter()
print(f"List comprehension: {end - start:.6f} seconds")

2.1 Why list comprehensions are faster:

List comprehensions run in a specialized bytecode loop that appends to the new list directly. A standard for loop, in contrast, executes more bytecode per iteration, including the attribute lookup and method call for list.append, which adds overhead.

In benchmarks like the one above, you'll typically find the list comprehension is roughly 30-50% faster than the equivalent for loop, while also being cleaner to read.

2.2 When to Use List Comprehensions:

  • Good for: Transformations and filtering where you need a new list from an existing iterable.
  • Avoid: Complex operations that require multiple nested loops or where readability suffers.

By adopting list comprehensions in your Python code, you can write cleaner, faster, and more efficient scripts.
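To make the "transformations and filtering" case concrete, here is a small sketch (the variable names are illustrative) that combines a filter and a transform in one comprehension, next to the loop it replaces:

```python
numbers = range(1, 21)

# Filter (keep evens) and transform (square) in a single expression
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
print(even_squares)  # [4, 16, 36, 64, 100, 144, 196, 256, 324, 400]

# The equivalent for loop needs four lines and an explicit append
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n ** 2)

assert even_squares == even_squares_loop
```

Once the condition or the expression grows beyond one readable line, though, the explicit loop usually wins on clarity.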

3. @lru_cache Decorator:

If your Python function repeatedly performs the same expensive computation, the lru_cache decorator from the functools module can drastically improve performance by caching results of previous function calls. This is especially useful for recursive functions or tasks involving repeated calculations.

3.1 What is lru_cache?

lru_cache stands for Least Recently Used Cache. It caches the results of function calls and retrieves them from memory instead of recomputing, provided the input arguments are the same. By default it keeps the 128 most recent results, but you can raise that limit or make the cache unlimited.

A classic use case is computing Fibonacci numbers, where recursion leads to redundant calculations.

Without lru_cache:

import time

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

start = time.perf_counter()
print(f"Result: {fibonacci(35)}")
print(f"Time taken without cache: {time.perf_counter() - start:.6f} seconds")

With lru_cache:

from functools import lru_cache
import time

@lru_cache(maxsize=128)  # Cache the 128 most recent results
def fibonacci_cached(n):
    if n <= 1:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)

start = time.perf_counter()
print(f"Result: {fibonacci_cached(35)}")
print(f"Time taken with cache: {time.perf_counter() - start:.6f} seconds")

3.2 Performance Comparison:

Without caching, calculating Fibonacci numbers takes significantly longer due to repeated calls. With lru_cache, previously computed results are reused, leading to a dramatic performance boost:

Without cache: 3.456789 seconds
With cache: 0.000234 seconds

Speedup factor = time without cache / time with cache
Speedup factor = 3.456789 s / 0.000234 s ≈ 14772.6
Percentage improvement = (speedup factor - 1) × 100 ≈ 1477160%

3.3 Configuring the Cache

  • maxsize: Limits the number of cached results (default is 128). Older entries are evicted least-recently-used first.
  • maxsize=None: An unbounded cache, useful for long-running programs; functools.cache (Python 3.9+) is shorthand for the same thing. Watch memory growth if the input space is large.
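A small sketch of these options using the standard functools API (the square function is just a stand-in for an expensive computation):

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache
def square(n):
    return n * n

square(4)
square(4)  # second call with the same argument is a cache hit
square(5)

# cache_info() reports hits, misses, and current cache size
info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)

# cache_clear() empties the cache, e.g. when results may go stale
square.cache_clear()
```

The cache_info() counters are handy for verifying that a cache is actually being hit before you credit it with a speedup.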

3.4 When to Use lru_cache

  • Repeated computations with identical inputs, like recursive functions or API calls.
  • Functions where recomputation is more expensive than caching.

By leveraging the lru_cache decorator, you can optimize your Python programs to save both time and computational resources, making it a must-have tool in any developer's performance toolkit.

4. Generators:

Generators are a type of iterable in Python, but unlike lists, they don’t store all the values in memory. Instead, they generate values on the fly, yielding one result at a time. This makes them an excellent choice for handling big data or streaming data processing tasks.

4.1 Simulating Big Data with a List vs. a Generator:

Let’s simulate handling a dataset of 10 million records using both a list and a generator.

Using a List:

import sys

# Simulate big data as a list
big_data_list = [i for i in range(10_000_000)]

# Check memory usage
print(f"Memory usage for list: {sys.getsizeof(big_data_list)} bytes")

# Process the data
result = sum(big_data_list)
print(f"Sum of list: {result}")

Output:

Memory usage for list: 89095160 bytes
Sum of list: 49999995000000

Using a Generator:

import sys

# Simulate big data as a generator
big_data_generator = (i for i in range(10_000_000))

# Check memory usage
print(f"Memory usage for generator: {sys.getsizeof(big_data_generator)} bytes")

# Process the data
result = sum(big_data_generator)
print(f"Sum of generator: {result}")

Memory saved = 89095160 bytes - 192 bytes = 89094968 bytes
Percentage saved = (89094968 / 89095160) × 100 ≈ 99.9998%

4.2 Real-Life Example: Processing Log Files

Suppose you’re analyzing a massive server log file and want to count the number of error messages:

Using a Generator to Process Logs:

def log_file_reader(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            yield line

# Count the number of error messages
error_count = sum(1 for line in log_file_reader("large_log_file.txt") if "ERROR" in line)
print(f"Total errors: {error_count}")

Here, the generator reads the file one line at a time, avoiding loading the entire file into memory.

For large datasets, generators are a powerful tool to write memory-efficient and scalable Python programs. They are particularly suitable for sequential data processing tasks, such as analyzing logs, streaming data, or working with massive CSV files. If memory is a constraint, replacing lists with generators can make your code both faster and leaner.
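For the CSV case, generators chain naturally into a lazy pipeline: each stage yields rows on demand, so the whole file is never held in memory. A sketch (the column names are hypothetical, and an in-memory buffer stands in for a huge file on disk):

```python
import csv
import io

# A small in-memory CSV stands in for a large file on disk
raw = io.StringIO(
    "level,message\n"
    "INFO,startup complete\n"
    "ERROR,disk full\n"
    "ERROR,timeout\n"
)

rows = csv.DictReader(raw)                                  # yields one dict per row
errors = (row for row in rows if row["level"] == "ERROR")   # lazy filter
messages = (row["message"] for row in errors)               # lazy projection

# Nothing above has consumed the file yet; iterating drives the pipeline
for msg in messages:
    print(msg)  # prints: disk full, then timeout
```

Swapping io.StringIO for open("huge.csv") gives the same pipeline over a real file, still at constant memory.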

5. Avoiding Global Variables:

Accessing local variables is faster than global variables due to the way Python resolves variable names. Let’s dive into an example that clearly shows the time difference in performance when accessing global vs. local variables.

5.1 Why Local Variables Are Faster

In Python, when a variable is referenced:

  • Local variables are stored in a fixed-size array inside the function's frame and accessed by index.
  • Global variables require a dictionary lookup in the module's globals (and, if not found there, in builtins), which is slower than an array index.
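You can see this difference directly in the bytecode with the standard dis module: local reads compile to LOAD_FAST (an indexed array access), while global reads compile to LOAD_GLOBAL (a dictionary lookup). A minimal sketch:

```python
import dis

GLOBAL_VAR = 10

def read_global():
    return GLOBAL_VAR      # compiles to LOAD_GLOBAL

def read_local():
    local_var = 10         # STORE_FAST
    return local_var       # LOAD_FAST

dis.dis(read_global)
dis.dis(read_local)
```

A common optimization that follows from this is binding a hot global (or attribute) to a local name before a tight loop, so the repeated lookups become LOAD_FAST.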

Comparing Time for Accessing Local and Global Variables:

import time

# Global variable
global_var = 10

# Function that reads a global variable
def access_global():
    return global_var

# Function that reads a local variable
def access_local():
    local_var = 10
    return local_var

# Measure time for global variable access
start_time = time.perf_counter()
for _ in range(1_000_000):
    access_global()
global_access_time = time.perf_counter() - start_time

# Measure time for local variable access
start_time = time.perf_counter()
for _ in range(1_000_000):
    access_local()
local_access_time = time.perf_counter() - start_time

# Output the time difference
print(f"Time taken to access global variable: {global_access_time:.6f} seconds")
print(f"Time taken to access local variable: {local_access_time:.6f} seconds")

Output:

Time taken to access global variable: 0.265412 seconds
Time taken to access local variable: 0.138774 seconds

Speedup factor = global access time / local access time
Speedup factor = 0.265412 s / 0.138774 s ≈ 1.91
Percentage improvement = (1.91 - 1) × 100 ≈ 91%

In Conclusion

Optimizing Python code doesn’t have to be a daunting task. By adopting techniques like using __slots__ for memory efficiency, leveraging functools.lru_cache for caching, replacing loops with list comprehensions, and avoiding global variables, you can drastically improve your code's performance. Additionally, using generators for big data ensures your applications remain efficient and scalable.

Remember, optimization is about balance — focus on areas that will have the greatest impact without overcomplicating your code. Python’s simplicity and readability are its greatest strengths, and these techniques help you maintain those qualities while unlocking greater performance.

If you found these tips helpful, don’t forget to follow, like, and share this article to help others boost their Python skills too! 🚀





Rexs | Sciencx (2025-01-15T16:49:50+00:00) Supercharge Python Code to 500% SPEED. Retrieved from https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/

MLA
" » Supercharge Python Code to 500% SPEED." Rexs | Sciencx - Wednesday January 15, 2025, https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/
HARVARD
Rexs | Sciencx Wednesday January 15, 2025 » Supercharge Python Code to 500% SPEED., viewed ,<https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/>
VANCOUVER
Rexs | Sciencx - » Supercharge Python Code to 500% SPEED. [Internet]. [Accessed ]. Available from: https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/
CHICAGO
" » Supercharge Python Code to 500% SPEED." Rexs | Sciencx - Accessed . https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/
IEEE
" » Supercharge Python Code to 500% SPEED." Rexs | Sciencx [Online]. Available: https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/. [Accessed: ]
rf:citation
» Supercharge Python Code to 500% SPEED | Rexs | Sciencx | https://www.scien.cx/2025/01/15/supercharge-python-code-to-500-speed/ |

Please log in to upload a file.




There are no updates yet.
Click the Upload button above to add an update.

You must be logged in to translate posts. Please log in or register.