Explain pessimistic vs optimistic locking.
Database locking is a crucial mechanism for maintaining data consistency and integrity in a multi-user environment where multiple transactions might try to read or modify the same data concurrently. It prevents lost updates, dirty reads, and other concurrency issues. There are two primary strategies for database locking: pessimistic and optimistic.
Pessimistic Locking
Pessimistic locking, as the name suggests, assumes that conflicts are likely to occur. It prevents conflicts by locking data before a transaction attempts to modify it. Once a resource (row, table, etc.) is locked by one transaction, other transactions are blocked from accessing or modifying that resource until the lock is released. This approach ensures maximum data integrity but can severely limit concurrency.
This type of locking is typically implemented using explicit database locks, such as shared locks (for reading) and exclusive locks (for writing). In SQL, this often involves using clauses like FOR UPDATE or WITH (HOLDLOCK) to acquire an exclusive lock on selected rows.
Characteristics of Pessimistic Locking:
- Prevents conflicts proactively: Locks data upfront.
- High data integrity: Guarantees that data won't change unexpectedly during a transaction.
- Reduced concurrency: Other transactions must wait for locks to be released.
- Potential for deadlocks: If transactions lock resources in different orders, they can enter a deadlock state.
- Higher overhead: Managing and acquiring locks adds overhead.
Example (PostgreSQL/MySQL):
BEGIN;
SELECT balance FROM accounts WHERE id = 123 FOR UPDATE;
-- Logic to update balance
UPDATE accounts SET balance = balance - 100 WHERE id = 123;
COMMIT;
Optimistic Locking
Optimistic locking assumes that conflicts are rare. Instead of locking data preemptively, transactions proceed without acquiring locks. It checks for conflicts only when the transaction attempts to commit its changes. If a conflict is detected (i.e., the data has been modified by another transaction since it was read), the current transaction's changes are rejected, and it typically needs to be retried.
This approach usually involves adding a version column (e.g., version, timestamp) to the table. When a record is read, its version is also read. When the record is updated, the update statement includes a WHERE clause that checks if the current version in the database matches the version read initially. If they don't match, another transaction has modified the record, and the update fails.
Characteristics of Optimistic Locking:
- Detects conflicts reactively: Checks for conflicts at commit time.
- High concurrency: No locks are held for the duration of the transaction, allowing more parallel operations.
- Lower overhead (initially): No explicit lock management required.
- Requires retry logic: Applications must be designed to handle failed updates and potentially retry the transaction.
- Suitable for low-contention environments: Works best when conflicts are infrequent.
Example:
-- Initial Read
SELECT id, name, balance, version FROM accounts WHERE id = 123;
-- (Assume application retrieves balance=500, version=1)
-- Update attempt
UPDATE accounts SET balance = 400, version = version + 1 WHERE id = 123 AND version = 1;
If version = 1 no longer holds (meaning another transaction updated it to version = 2), the UPDATE statement will affect 0 rows, signaling a conflict to the application.
Key Differences Summary
| Feature | Pessimistic Locking | Optimistic Locking |
|---|---|---|
| Approach | Acquires locks before data access/modification. | Checks for conflicts at commit time using versioning. |
| Conflict Detection | Proactive (prevents conflicts). | Reactive (detects conflicts after they occur). |
| Concurrency | Lower (transactions wait for locks). | Higher (no shared locks held during transaction). |
| Data Integrity | Ensured by locks, no rollbacks due to concurrent writes. | Requires application-level retry logic on conflict. |
| Overhead | Higher lock management overhead. | Lower initial overhead, but retries add complexity. |
| Use Cases | High contention environments, critical data, short transactions. | Low contention environments, long-running transactions, web applications. |
| Deadlocks | Possible, requires careful handling. | Not directly susceptible (conflicts lead to retries). |
When to Use Which?
Choose Pessimistic Locking when:
- Data consistency is paramount, and even brief inconsistencies are unacceptable.
- Transaction duration is very short, minimizing lock holding time.
- Contention for specific resources is expected to be high.
- You want the database to handle conflict resolution directly via blocking.
Choose Optimistic Locking when:
- High concurrency is a primary requirement.
- Contention for resources is expected to be low.
- Transactions are long-running (e.g., user editing a document for minutes).
- You can implement retry logic in the application easily.
The choice between pessimistic and optimistic locking depends heavily on the specific application's requirements, expected contention levels, and performance goals. Often, a hybrid approach might be used, applying pessimistic locking for critical, short operations and optimistic locking for less critical, longer ones.