Multithreading & Multiprocessing: Threads vs. Processes, threading, multiprocessing

Python Concurrency: Threads vs. Processes

Speed up your programs with parallel execution!

1. Threads vs. Processes

Feature | Threads | Processes
--- | --- | ---
Memory | Share the same memory space | Separate memory space
Overhead | Lightweight (faster to create) | Heavyweight (slower to create)
GIL Impact | Bound by Global Interpreter Lock (GIL) | Bypass GIL (true parallelism)
Use Case | I/O-bound tasks (e.g., web requests, files) | CPU-bound tasks (e.g., math, data crunching)

2. Multithreading (threading Module)

Best for I/O-bound tasks where waiting is involved (e.g., APIs, file ops).

Example: Download Simulator


import threading  
import time  

def download_file(url):  
    print(f"Downloading {url}...")  
    time.sleep(2)  # Simulate I/O wait  
    print(f"Finished {url}")  

urls = ["https://example.com/file1", "https://example.com/file2"]  

# Create threads  
threads = []  
for url in urls:  
    thread = threading.Thread(target=download_file, args=(url,))  
    thread.start()  
    threads.append(thread)  

# Wait for all threads to finish  
for thread in threads:  
    thread.join()  

print("All downloads complete! 🚀")  

            

Output:


Downloading https://example.com/file1...  
Downloading https://example.com/file2...  
[After 2 seconds]  
Finished https://example.com/file1  
Finished https://example.com/file2  
All downloads complete! 🚀  
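
The download simulator above starts and joins each thread by hand. As a rough sketch of the same idea, the standard library's concurrent.futures.ThreadPoolExecutor can manage the threads instead. The download_all helper and the shorter sleep are illustrative assumptions, not part of the original example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def download_file(url):
    """Pretend to download a URL; the sleep stands in for network wait."""
    time.sleep(0.2)
    return f"Finished {url}"

def download_all(urls, max_workers=4):
    # The pool creates the threads and waits for them when the with-block exits.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(download_file, urls))

if __name__ == "__main__":
    urls = ["https://example.com/file1", "https://example.com/file2"]
    for line in download_all(urls):
        print(line)
```

Because map() preserves input order, the returned list is predictable even though the threads may finish in any order.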

            

Key Notes:

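  • In CPython, threads run concurrently but not in parallel: the GIL lets only one thread execute Python bytecode at a time. The GIL is released during blocking I/O, which is why I/O-bound threads still overlap.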
  • Use Lock to prevent race conditions:

import threading

lock = threading.Lock()
with lock:
    # Access shared resource (only one thread at a time runs this block)
    pass
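
To make the Lock note more concrete, here is a minimal sketch in which several threads increment one shared counter. The function name count_with_lock and its parameters are hypothetical, chosen only for illustration.

```python
import threading

def count_with_lock(num_threads=4, increments=10_000):
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # "counter += 1" is a read-modify-write sequence; the lock keeps
            # two threads from interleaving in the middle of it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(count_with_lock())  # prints 40000
```

Without the lock, updates can be lost when two threads read the same old value, so the final count may come out lower than expected.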

            

3. Multiprocessing (multiprocessing Module)

Best for CPU-bound tasks that need true parallelism.

Example: Number Cruncher


import multiprocessing
import time

def calculate_square(numbers):
    for n in numbers:
        time.sleep(0.2)  # Stand-in for CPU work (a real task would compute here)
        print(f"Square: {n*n}")

# The guard is required on platforms that start child processes by
# re-importing this module (e.g., Windows and macOS).
if __name__ == "__main__":
    numbers = [1, 2, 3, 4]

    # Create processes
    processes = []
    for i in range(2):
        # Split work between processes: [1, 2] and [3, 4]
        p = multiprocessing.Process(target=calculate_square, args=(numbers[i*2 : (i+1)*2],))
        p.start()
        processes.append(p)

    # Wait for all processes
    for p in processes:
        p.join()

    print("All calculations done! 🔢")

            

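Output (a typical run; the two processes work at the same time, so the order of the "Square" lines can vary):


Square: 1
Square: 9
Square: 4
Square: 16
All calculations done! 🔢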

            

Key Notes:

  • Processes do not share memory by default (use Queue or Pipe for communication).
  • Avoid excessive processes (overhead vs. benefit).
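
As a sketch of the Queue approach mentioned in the notes above, each worker process below sends its results back to the parent through a multiprocessing.Queue. The names square_worker and squares_via_queue are assumptions made for this example.

```python
import multiprocessing

def square_worker(numbers, results):
    """Put an (n, n*n) pair on the queue for each input number."""
    for n in numbers:
        results.put((n, n * n))

def squares_via_queue(numbers, num_workers=2):
    results = multiprocessing.Queue()
    # Deal the numbers out round-robin, one chunk per worker.
    chunks = [numbers[i::num_workers] for i in range(num_workers)]
    workers = [
        multiprocessing.Process(target=square_worker, args=(chunk, results))
        for chunk in chunks
    ]
    for w in workers:
        w.start()
    # Read one result per input before joining, so no worker is left
    # blocked while flushing its data into the queue.
    collected = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return collected

if __name__ == "__main__":
    print(squares_via_queue([1, 2, 3, 4]))
```

Results arrive in whatever order the workers produce them, which is why each one is sent as an (input, square) pair rather than as a bare value.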

4. Real-World Use Cases

Threads:

  • Web scraping (multiple URLs at once).
  • GUI apps (keep UI responsive during long tasks).
  • Handling multiple client connections (e.g., servers).

Processes:

  • Data processing (e.g., Pandas operations).
  • Image/video rendering.
  • Machine learning model training.

5. Common Mistakes

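  • Threads for CPU-bound tasks: No speedup in CPython, because the GIL lets only one thread run Python bytecode at a time.
  • Race conditions: Shared data modified by multiple threads without coordination → Use locks.
  • Too many processes: Each one costs memory and startup time; going beyond the number of CPU cores rarely helps CPU-bound work.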

6. Best Practices

  • I/O-bound? → Use threads.
  • CPU-bound? → Use processes.
  • Shared data in threads? → Use threading.Lock.
  • Inter-process communication? → Use multiprocessing.Queue.

Performance Comparison

Task Type | Threads | Processes
--- | --- | ---
I/O-bound | ✅ Faster | ❌ Slower (overhead)
CPU-bound | ❌ No speedup (GIL) | ✅ Faster

Fun Activity: Build a Speed Test

Compare thread vs. process performance:


import time
import threading
import multiprocessing

def task():
    time.sleep(1)  # Simulate waiting (I/O-style work; uses almost no CPU)

# The guard is needed where child processes re-import this module
# (e.g., Windows and macOS).
if __name__ == "__main__":
    # Threads
    start = time.time()
    threads = [threading.Thread(target=task) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"Threads: {time.time() - start:.2f}s")

    # Processes
    start = time.time()
    processes = [multiprocessing.Process(target=task) for _ in range(10)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f"Processes: {time.time() - start:.2f}s")
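
The speed test above only sleeps, so threads and processes should both finish in about one second. To see the CPU-bound side of the comparison, a variant along these lines may help. It is only a sketch: cpu_task, timed, and the job sizes are assumptions, and actual timings depend on the machine and Python build.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_task(n):
    """Sum of squares below n: pure-Python arithmetic that holds the GIL."""
    return sum(i * i for i in range(n))

def timed(executor_cls, jobs, workers=4):
    """Run cpu_task over jobs with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(cpu_task, jobs))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    jobs = [2_000_000] * 4
    _, thread_secs = timed(ThreadPoolExecutor, jobs)
    _, process_secs = timed(ProcessPoolExecutor, jobs)
    print(f"Threads:   {thread_secs:.2f}s")
    print(f"Processes: {process_secs:.2f}s")
```

On a multi-core machine running standard CPython, the process version would be expected to finish sooner, since each process has its own interpreter and GIL.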

            

Key Takeaways

  • ✅ Threads: Share memory, good for I/O tasks, limited by GIL.
  • ✅ Processes: Isolated memory, true parallelism, ideal for CPU work.
  • ✅ Choose wisely: Match the tool (threads/processes) to the task type.

What’s Next?

Learn asyncio for modern asynchronous I/O operations!
