What is the difference between deep copy and shallow copy?
In Python, assigning one variable to another creates a reference, not a new object. To create independent copies of mutable objects, especially collections, we use the concepts of shallow copy and deep copy. Understanding the distinction is crucial for managing data integrity and avoiding unintended side effects, particularly when dealing with nested data structures.
Understanding Object Copying
When you assign a variable to another in Python (e.g., 'b = a'), both variables refer to the same object in memory. If the object is mutable (like a list or dictionary), changes made through one variable will be visible through the other. Copying allows us to create new, independent objects, but the level of independence depends on whether it's a shallow or deep copy.
Shallow Copy
A shallow copy creates a new compound object but does not create copies of the objects contained within the original. Instead, it populates the new compound object with references to the same objects found in the original. This means that if you have mutable nested objects (like lists within a list), changes to these nested objects in the copy will affect the original, and vice-versa.
import copy
original_list = [1, [2, 3], 4]
shallow_copied_list = original_list.copy() # or list(original_list) or copy.copy(original_list)
print(f"Original: {original_list}")
print(f"Shallow Copy: {shallow_copied_list}")
# Modify a top-level element
shallow_copied_list[0] = 100
print(f"\nAfter modifying top-level element in copy:")
print(f"Original: {original_list}") # Original is unchanged
print(f"Shallow Copy: {shallow_copied_list}")
# Modify a nested mutable element
shallow_copied_list[1][0] = 200
print(f"\nAfter modifying nested element in copy:")
print(f"Original: {original_list}") # Original's nested element IS changed
print(f"Shallow Copy: {shallow_copied_list}")
Common ways to perform a shallow copy include using the 'copy()' method available on some collections (like lists, dictionaries, sets), slicing for lists (e.g., 'original_list[:]'), or using 'copy.copy()' from the 'copy' module. The key takeaway is that while the top-level container is new, its elements (if they are mutable objects) are still shared references.
Deep Copy
A deep copy creates a completely independent new compound object, and recursively copies all objects found in the original. This means it creates new copies of all nested objects as well. Changes made to a deep copy will not affect the original object, regardless of how deeply nested the elements are. This requires the 'copy' module.
import copy
original_list = [1, [2, 3], 4]
deep_copied_list = copy.deepcopy(original_list)
print(f"Original: {original_list}")
print(f"Deep Copy: {deep_copied_list}")
# Modify a top-level element
deep_copied_list[0] = 100
print(f"\nAfter modifying top-level element in copy:")
print(f"Original: {original_list}")
print(f"Deep Copy: {deep_copied_list}")
# Modify a nested mutable element
deep_copied_list[1][0] = 200
print(f"\nAfter modifying nested element in copy:")
print(f"Original: {original_list}") # Original's nested element is NOT changed
print(f"Deep Copy: {deep_copied_list}")
To perform a deep copy, you must use the 'deepcopy()' function from Python's built-in 'copy' module. This is essential when you need to ensure that the new copy is entirely separate from the original, preventing any unexpected side effects from modifications.
Key Differences Summarized
| Feature | Shallow Copy | Deep Copy |
|---|---|---|
| Top-level Object | New object created | New object created |
| Nested Objects | References original nested objects | New copies of nested objects created |
| Mutability Impact | Changes to nested objects in copy affect original (and vice versa) | Changes to nested objects in copy do NOT affect original |
| Method/Function | `list.copy()`, `dict.copy()`, `copy.copy()`, slicing `[:]` | `copy.deepcopy()` (from 'copy' module) |
| Independence | Partial (top-level only) | Complete (all levels) |
| Use Case | When nested mutable objects don't exist or you want to share them. | When you need a fully independent duplicate, including all nested mutable objects. |
In summary, choose a shallow copy when your object contains only immutable elements or when you explicitly want the copy and original to share nested mutable objects. Opt for a deep copy when you need complete independence between the original and the new object, especially when dealing with complex, nested mutable data structures.