What is the `multiprocessing` module?

The `multiprocessing` module is a standard-library package for spawning processes through an API similar to that of the `threading` module. Because it runs work in separate processes rather than threads, it sidesteps the Global Interpreter Lock (GIL) that limits multi-threaded programs and lets a program use multiple processors on a given machine.

What is the `multiprocessing` module?

The `multiprocessing` module is designed for parallel execution of Python code. Unlike `threading`, which creates multiple threads inside a single process, `multiprocessing` creates separate processes. Each process has its own Python interpreter, its own GIL, and its own memory space, so one process never waits on another's GIL and CPU-bound work can run on multiple cores at the same time.

Key Features and Components

  • Process class: The fundamental building block for creating new processes.
  • Pool class: Provides a convenient way to parallelize the execution of a function across multiple input values, distributing the inputs among worker processes.
  • Queue and Pipe: Used for inter-process communication (IPC) to exchange data between processes.
  • Lock, Semaphore, Event: Synchronization primitives similar to those in the threading module, adapted for inter-process synchronization.
  • Value and Array: `ctypes` objects allocated in shared memory, letting processes share a single simple value or a fixed-type array without copying it.
  • Manager: Starts a server process that holds shared objects; other processes work with them through proxies, which makes it possible to share more complex data structures such as dictionaries or lists.
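A few of the components listed above can be sketched together. This is an illustrative example under stated assumptions rather than a recommended design: the helper names `produce` and `add_to_total`, and the `None` sentinel convention, are hypothetical. It uses a `Queue` to send results back to the parent, and a `Value` guarded by a `Lock` as a shared counter.

```python
import multiprocessing


def produce(queue, n):
    """Put the squares of 0..n-1 on the queue, then a None sentinel."""
    for i in range(n):
        queue.put(i * i)
    queue.put(None)  # sentinel: tells the consumer there is no more data


def add_to_total(total, lock, amount):
    """Add `amount` to a shared integer, holding the lock during the update."""
    with lock:
        total.value += amount


if __name__ == "__main__":
    # Queue: pass results from a child process back to the parent.
    queue = multiprocessing.Queue()
    producer = multiprocessing.Process(target=produce, args=(queue, 5))
    producer.start()
    received = list(iter(queue.get, None))  # read until the sentinel arrives
    producer.join()
    print(received)  # [0, 1, 4, 9, 16]

    # Value + Lock: several processes update one shared counter.
    total = multiprocessing.Value("i", 0)  # "i" is a C signed int
    lock = multiprocessing.Lock()
    workers = [
        multiprocessing.Process(target=add_to_total, args=(total, lock, k))
        for k in (1, 2, 3)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(total.value)  # 6
```

The lock matters because `total.value += amount` is a read followed by a write; without it, two processes could read the same old value and one update would be lost.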

Why use `multiprocessing`?

  • True Parallelism: Bypasses the GIL, allowing CPU-bound tasks to run on multiple CPU cores concurrently.
  • Improved Performance: Can significantly speed up computations that can be divided into independent parts.
  • Isolation: Each process has its own memory space, leading to greater stability as errors in one process are less likely to affect others.
  • Resource Utilization: Effectively uses multi-core processors, which are standard in modern computers.
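The kind of work that benefits is a CPU-bound function applied to independent inputs. The sketch below assumes a deliberately slow, hypothetical `count_primes` helper and arbitrary input sizes; it hands the inputs to a `Pool`, which by default starts about one worker per available CPU.

```python
import multiprocessing


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [10_000, 20_000, 30_000, 40_000]  # independent chunks of work
    with multiprocessing.Pool() as pool:
        # map() splits the inputs among the workers and keeps result order.
        results = pool.map(count_primes, limits)
    for limit, primes in zip(limits, results):
        print(f"primes below {limit}: {primes}")
```

Whether this beats a plain loop depends on the workload: starting processes and pickling arguments and results has a cost, so very small tasks can end up slower in a pool.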

Basic Example

Here's a simple example demonstrating how to create and run processes using the Process class.

```python
import multiprocessing
import os

def worker_function(name):
    print(f"Worker {name}: My process ID is {os.getpid()}")

if __name__ == '__main__':
    print(f"Main process ID: {os.getpid()}")

    # Create two Process objects
    p1 = multiprocessing.Process(target=worker_function, args=('Alice',))
    p2 = multiprocessing.Process(target=worker_function, args=('Bob',))

    # Start the processes
    p1.start()
    p2.start()

    # Wait for both processes to complete
    p1.join()
    p2.join()

    print("All worker processes finished.")
```

Multiprocessing vs. Multithreading

| Feature | `multiprocessing` | `threading` |
| --- | --- | --- |
| Parallelism | True parallelism (each process has its own GIL) | Concurrency, but not true parallelism for CPU-bound tasks (limited by the GIL) |
| Memory | Each process has its own memory space | Threads share the same memory space |
| Overhead | Higher overhead (process creation, IPC) | Lower overhead (thread creation, context switching) |
| IPC | Requires explicit IPC mechanisms (Queues, Pipes, Managers) | Easier data sharing, but requires careful synchronization |
| Best for | CPU-bound tasks (e.g., heavy computations, data processing) | I/O-bound tasks (e.g., network requests, file operations) |
| Robustness | More robust; a crash in one process typically doesn't affect the others | Less isolated; a fatal error in one thread (e.g., a crash in native code) takes down the entire process |
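The "Best for" row can be illustrated with two pools that expose the same `map` interface. This is a sketch only: `simulated_io` and `cpu_work` are hypothetical stand-ins for real workloads, and the pool sizes are arbitrary. `ThreadPool` is the thread-backed counterpart of `Pool` that ships in `multiprocessing.pool`.

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool


def simulated_io(seconds):
    """Stand-in for a network or disk wait; sleeping releases the GIL."""
    time.sleep(seconds)
    return seconds


def cpu_work(n):
    """Pure-Python arithmetic that holds the GIL while it runs."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # I/O-bound: threads spend most of their time waiting, so they overlap well.
    with ThreadPool(4) as threads:
        print(threads.map(simulated_io, [0.1, 0.1, 0.1, 0.1]))

    # CPU-bound: separate processes can run on separate cores.
    with multiprocessing.Pool(4) as processes:
        print(processes.map(cpu_work, [100_000] * 4))
```

Because both pools share one API, switching a workload from threads to processes is often a one-line change, provided the function and its arguments can be pickled.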