How SAT Solvers Actually Work: From DPLL to CDCL

Practical Boolean satisfiability solvers are built on backtracking search with constraint propagation. This chunk traces the lineage from unit propagation and pure-literal elimination through the 1962 DPLL algorithm to conflict-driven clause learning (CDCL, 1996), explaining why modern solvers handle millions of variables despite exponential worst cases.

Although 3-SAT is NP-complete, real-world SAT instances are routinely solved with millions of variables. The reason is decades of refinement to backtracking search. (See The 2-SAT / 3-SAT Boundary: Where Tractability Ends for why the worst case stays exponential, and P vs NP Problem: Fundamentals Explained.) ## Cheap propagation first Before any guessing, solvers simplify: - **Unit propagation**: a clause with one remaining literal forces that literal, and the consequences cascade. - **Pure literal elimination**: a variable appearing only positively (or only negatively) can be set immediately and removed — it has no conflicting constraint. Difficulty lives entirely in *contested* variables that appear both positively and negatively, creating interactions between clauses. ## DPLL (1962) The DPLL algorithm — Davis, Putnam, Logemann, Loveland, 1962, refining the 1960 Davis–Putnam procedure — is the foundation: ``` Pick an unassigned variable A set A = true, propagate; if no contradiction, recurse else set A = false, propagate; if no contradiction, recurse if both fail, this branch is unsatisfiable (backtrack) ``` It is exponential in the worst case because every unresolved choice branches, producing a search tree with up to 2^n leaves. ## CDCL (1996) Conflict-driven clause learning (CDCL), introduced by the GRASP solver in 1996, extends DPLL with one decisive idea: when search hits a contradiction, analyse *why* it occurred and add a new **learned clause** that forbids repeating the same mistake from any other path. It also enables **non-chronological backtracking** (backjumping) past irrelevant decisions. The analogy: on hitting a dead end in a maze, do not just step back — build a permanent wall earlier that cuts off every path leading to that dead end. CDCL is why state-of-the-art solvers scale to industrial problems despite the exponential worst case — they *learn the structure* of the formula as they go. ## Physics-inspired and message-passing approaches Other families exist. Survey propagation and belief propagation are message-passing methods that excel near the 4.27 hardness threshold. Physical approaches map SAT onto the Ising model — variables as magnets, clauses as energy penalties — and let the system relax to a low-energy state; quantum annealing hardware (e.g. D-Wave) implements this. None escapes the wall: energy landscapes have local minima, and complexity theory already accounts for physical models of computation. Whether physical computation can beat the classical hierarchy remains formally open but is widely believed to be 'no'.

How SAT Solvers Actually Work: From DPLL to CDCL

Related Knowledge

Boolean Satisfiability Problem (SAT)

Have insights to add?