Constraint Satisfaction (Sudoku Solver): From Logic to Lightning-Fast Solvers

9–13 minutes

Sudoku Game → Automated Puzzle-Solving Apps

Why CSP for Sudoku?

Constraint Satisfaction Problems (CSPs) let you describe what must be true (constraints) and let a general solver figure out how to satisfy them. Sudoku maps perfectly: each empty cell is a variable, digits 1–9 are possible values, and rules (“no repeats per row, column, or 3×3 box”) are constraints.

With a few classic CSP tricks—backtracking search, smart heuristics (MRV, Degree, LCV), and constraint propagation (Forward Checking, AC-3)—you can solve difficult puzzles in milliseconds and reuse the same solver ideas for timetabling, scheduling, routing, and more.

Theory Deep Dive

1. Formal Definition

A CSP is a triple \langle X, D, C \rangle, where:

  • X = {X_1, \dots, X_n} are variables.
  • D = {D_1, \dots, D_n} with D_i the finite domain of values for X_i.
  • C = {C_1, \dots, C_m} are constraints restricting the allowable combinations of values.

A solution is an assignment {X_i = v_i} with v_i \in D_i satisfying all constraints.

2. Sudoku as a CSP

  • Variables: X_{r,c} for row r \in {1,\dots,9} and column c \in {1,\dots,9}.
  • Domains: D_{r,c} = {1,\dots,9} for blanks; given clues have singleton domains.
  • Constraints (AllDifferent):
    • Rows: each row’s 9 variables must be all-different.
    • Columns: each column’s 9 variables must be all-different.
    • Boxes: each 3×3 subgrid’s 9 variables must be all-different.
      Equivalently, you can encode pairwise “\neq” between peers. Each cell has 20 unique peers (8 row + 8 column + 4 box that aren’t in row/col overlap).

You can formalize with the global constraint:
\text{AllDifferent}(X_{r,1}, \dots, X_{r,9}) for each row r, and similarly for columns and boxes.

3. Constraint graphs & consistency

Represent variables as nodes; add an edge if two variables share a constraint.

  • Node consistency: each domain respects unary constraints.
  • Arc consistency (AC): for every pair (X_i, X_j) with constraint X_i \neq X_j, every value in D_i has a supporting value in D_j.
  • AC-3 algorithm enforces arc consistency by repeatedly revise-ing arcs until no more domain values can be removed.

AC-3 complexity (pairwise “\neq”): O(E \cdot d^3) worst case, where E is number of arcs, d the max domain size (here 9).

4. Search with heuristics

When propagation alone can’t finish the job, use backtracking search:

  • MRV (Minimum Remaining Values): choose the variable with smallest domain size first:
    \operatorname{MRV}(X) = \arg\min_{X \text{ unassigned}} |D_X|.
  • Degree heuristic (tie-breaker): among MRV ties, pick the variable with the most constraints on unassigned neighbors.
  • LCV (Least Constraining Value): order candidate values by how few domain eliminations they cause in neighbors.

5. Propagation during search

Forward Checking: after assigning X=v, remove v from each neighbor’s domain; fail if any domain becomes empty.

Maintaining Arc Consistency (MAC/AC-3 in search): stronger than forward checking; run AC-3 after each assignment for more pruning.

6. Correctness & completeness

Backtracking with propagation is complete (finds a solution if one exists). Heuristics don’t change correctness, only efficiency. For puzzles with a unique solution, the solver will converge to that assignment; for multiple solutions, you can stop at the first or enumerate all.

Toy Problem – Sudoku 9×9

Data Snapshot (one classic puzzle)

We’ll use 0 for blanks:

530070000
600195000
098000060
800060003
400803001
700020006
060000280
000419005
000080079

As a single string (row-major, 81 chars):

"530070000600195000098000060800060003400803001700020006060000280000419005000080079"

Step 1: Board & helper structures

Create variable keys, initial domains, and peer sets.

from collections import defaultdict, deque
from typing import Dict, Set, Tuple, Optional, List

ROWS = range(9)
COLS = range(9)
DIGITS = set(range(1, 10))

def box_index(r: int, c: int) -> Tuple[int, int]:
    return (r // 3, c // 3)

# Precompute peers: for each (r,c), set of coords sharing row/col/box
PEERS: Dict[Tuple[int,int], Set[Tuple[int,int]]] = {}
for r in ROWS:
    for c in COLS:
        peers = set((r, cc) for cc in COLS if cc != c)
        peers |= set((rr, c) for rr in ROWS if rr != r)
        br, bc = box_index(r, c)
        for rr in range(br*3, br*3+3):
            for cc in range(bc*3, bc*3+3):
                if (rr, cc) != (r, c):
                    peers.add((rr, cc))
        PEERS[(r, c)] = peers

def parse_grid(grid_str: str) -> Dict[Tuple[int,int], Set[int]]:
    """Return domains: dict from (r,c) -> set of possible digits."""
    assert len(grid_str) == 81
    domains = {}
    for idx, ch in enumerate(grid_str):
        r, c = divmod(idx, 9)
        if ch in '0.':
            domains[(r, c)] = set(DIGITS)
        else:
            domains[(r, c)] = {int(ch)}
    return domains

Explanation: Build peer relationships once. Parse an 81-char grid string into per-cell domains (singletons for clues, 1–9 for blanks).

Step 2: Consistency checks & printing

Explanation: Helpers to check completion, choose the next variable (MRV + Degree), and a pretty printer.

def is_complete(domains: Dict[Tuple[int,int], Set[int]]) -> bool:
    return all(len(domains[pos]) == 1 for pos in domains)

def select_unassigned_variable(domains: Dict[Tuple[int,int], Set[int]]) -> Tuple[int,int]:
    # MRV; tie-break by Degree
    unassigned = [(pos, d) for pos, d in domains.items() if len(d) > 1]
    # MRV first
    min_size = min(len(d) for _, d in unassigned)
    candidates = [pos for pos, d in unassigned if len(domains[pos]) == min_size]
    # Degree heuristic
    def degree(pos): 
        return sum(1 for p in PEERS[pos] if len(domains[p]) > 1)
    return max(candidates, key=degree)

def format_board(domains: Dict[Tuple[int,int], Set[int]]) -> str:
    lines = []
    for r in ROWS:
        row = []
        for c in COLS:
            v = next(iter(domains[(r, c)])) if len(domains[(r, c)]) == 1 else 0
            row.append(str(v))
        lines.append(' '.join(row))
    return '\n'.join(lines)

Step 3: AC-3 (Arc Consistency)

Explanation: Classic AC-3 maintains a queue of arcs (X_i, X_j); if revising X_i removes values, enqueue neighbors again. Fail if any domain empties.

def revise(domains, xi, xj) -> bool:
    """Make Xi arc-consistent w.r.t Xj. Return True if domain changed."""
    removed = False
    to_remove = set()
    for vi in domains[xi]:
        # Support exists if Xj has some vj != vi
        if all(vi == vj for vj in domains[xj]) and len(domains[xj]) == 1:
            to_remove.add(vi)
        elif domains[xj] == {vi}:  # only same value -> violates AllDifferent
            to_remove.add(vi)
    if to_remove:
        domains[xi] -= to_remove
        removed = True
    return removed

def ac3(domains) -> bool:
    """Enforce arc consistency; return False on domain wipeout."""
    queue = deque()
    for xi in domains:
        for xj in PEERS[xi]:
            queue.append((xi, xj))
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj):
            if len(domains[xi]) == 0:
                return False
            for xk in PEERS[xi] - {xj}:
                queue.append((xk, xi))
    return True

4. Forward checking & LCV ordering

Explanation: Forward checking prunes neighbors immediately. LCV prefers values that eliminate fewer options in peers.

def forward_check(domains, pos, val) -> Optional[List[Tuple[Tuple[int,int], int]]]:
    """After assigning pos=val, prune neighbors; return list of pruned (neighbor, value) for undo, or None on wipeout."""
    pruned = []
    for nbr in PEERS[pos]:
        if val in domains[nbr] and len(domains[nbr]) > 1:
            domains[nbr].remove(val)
            pruned.append((nbr, val))
            if len(domains[nbr]) == 0:
                return None
    return pruned

def order_values_lcv(domains, pos) -> List[int]:
    """Least Constraining Value (values that rule out fewest options in neighbors)."""
    scores = []
    for v in sorted(domains[pos]):
        impact = 0
        for nbr in PEERS[pos]:
            if v in domains[nbr]:
                impact += 1
        scores.append((impact, v))
    scores.sort()  # fewer impacts first
    return [v for _, v in scores]

5. Backtracking search with propagation

Explanation: Depth-first search assigns a value, runs propagation (Forward Checking + AC-3), and recurses. Succeeds when all domains are singletons.

def backtrack(domains) -> Optional[Dict[Tuple[int,int], Set[int]]]:
    if is_complete(domains):
        return domains

    pos = select_unassigned_variable(domains)
    for v in order_values_lcv(domains, pos):
        # Copy-on-write: lightweight copy of domains
        new_domains = {p: set(d) for p, d in domains.items()}
        new_domains[pos] = {v}

        # Propagate: forward check then AC-3 (MAC)
        pruned = forward_check(new_domains, pos, v)
        if pruned is None:
            continue
        if not ac3(new_domains):
            continue

        result = backtrack(new_domains)
        if result is not None:
            return result
    return None

def solve_sudoku(grid_str: str) -> Optional[Dict[Tuple[int,int], Set[int]]]:
    domains = parse_grid(grid_str)
    if not ac3(domains):  # initial propagation
        return None
    return backtrack(domains)

6. Run the solver

Explanation: Try the sample puzzle and print the solved grid.

if __name__ == "__main__":
    puzzle = "530070000600195000098000060800060003400803001700020006060000280000419005000080079"
    solution = solve_sudoku(puzzle)
    if solution:
        print(format_board(solution))
    else:
        print("No solution found.")

Quick Reference: Full Code

from collections import deque
ROWS, COLS = range(9), range(9)
DIGITS = set(range(1,10))
def box_index(r,c): return (r//3, c//3)
PEERS={}
for r in ROWS:
    for c in COLS:
        peers=set((r,cc) for cc in COLS if cc!=c)|set((rr,c) for rr in ROWS if rr!=r)
        br,bc=box_index(r,c)
        for rr in range(br*3,br*3+3):
            for cc in range(bc*3,bc*3+3):
                if (rr,cc)!=(r,c): peers.add((rr,cc))
        PEERS[(r,c)]=peers
def parse_grid(s):
    assert len(s)==81
    d={}
    for i,ch in enumerate(s):
        r,c=divmod(i,9)
        d[(r,c)]={int(ch)} if ch not in '0.' else set(DIGITS)
    return d
def is_complete(d): return all(len(d[p])==1 for p in d)
def select_unassigned_variable(d):
    U=[(p, d[p]) for p in d if len(d[p])>1]; m=min(len(x[1]) for x in U)
    C=[p for p,_ in U if len(d[p])==m]
    return max(C, key=lambda p: sum(1 for q in PEERS[p] if len(d[q])>1))
def revise(d, xi, xj):
    rem=False; rm=set()
    for vi in d[xi]:
        if (d[xj]=={vi}) or (len(d[xj])==1 and all(vi==vj for vj in d[xj])):
            rm.add(vi)
    if rm: d[xi]-=rm; rem=True
    return rem
def ac3(d):
    q=deque((xi,xj) for xi in d for xj in PEERS[xi])
    while q:
        xi,xj=q.popleft()
        if revise(d,xi,xj):
            if len(d[xi])==0: return False
            for xk in PEERS[xi]-{xj}: q.append((xk,xi))
    return True
def forward_check(d,pos,val):
    pr=[]
    for nb in PEERS[pos]:
        if val in d[nb] and len(d[nb])>1:
            d[nb].remove(val); pr.append((nb,val))
            if len(d[nb])==0: return None
    return pr
def order_values_lcv(d,pos):
    return [v for _,v in sorted((sum(1 for nb in PEERS[pos] if v in d[nb]), v) for v in sorted(d[pos]))]
def backtrack(d):
    if is_complete(d): return d
    pos=select_unassigned_variable(d)
    for v in order_values_lcv(d,pos):
        nd={p:set(x) for p,x in d.items()}; nd[pos]={v}
        if forward_check(nd,pos,v) is None: continue
        if not ac3(nd): continue
        r=backtrack(nd)
        if r is not None: return r
    return None
def solve_sudoku(s):
    d=parse_grid(s)
    if not ac3(d): return None
    return backtrack(d)
def format_board(d):
    return '\n'.join(' '.join(str(next(iter(d[(r,c)])) if len(d[(r,c)])==1 else 0) for c in COLS) for r in ROWS)
if __name__=="__main__":
    p="530070000600195000098000060800060003400803001700020006060000280000419005000080079"
    sol=solve_sudoku(p)
    print(format_board(sol) if sol else "No solution")

Real‑World Application — A Minimal Web API for Puzzle-Solving Apps

We’ll wrap the solver in a FastAPI service so a mobile or web app can POST a puzzle and get back a solution JSON.

Step 1: Project layout

Explanation: Keep solver logic independent so other apps (CLI, batch, UI) can reuse it.

csp_sudoku_api/
  ├─ solver.py        # move the toy solver here (solve_sudoku, format_board)
  └─ app.py           # FastAPI app exposing /solve

Step 2: solver.py (reuse from toy section)Neighbor graph (everyone potentially conflicts with everyone)

Explanation: Export the two functions we need.

# solver.py
# (Paste the "Quick Reference — Full Solver" here, but expose two functions:)
#   - solve_sudoku(grid_str: str) -> Optional[Dict[(int,int), Set[int]]]
#   - format_board(domains) -> str
# Keep imports (collections, typing) as shown above.

Step 3: app.py — FastAPI service

Explanation: Define a POST /solve endpoint that accepts an 81-char grid string and returns a solved grid string or an informative message.

# app.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, constr
from solver import solve_sudoku

app = FastAPI(title="Sudoku CSP Solver API", version="1.0")

class SolveRequest(BaseModel):
    # 81 characters, digits 0-9, where 0 means blank
    grid: constr(regex=r"^[0-9\.]{81}$")

class SolveResponse(BaseModel):
    solved: bool
    grid: str  # 81-char string of the solution if solved, else original
    message: str = ""

def domains_to_grid81(domains) -> str:
    out = []
    for r in range(9):
        for c in range(9):
            vset = domains[(r, c)]
            out.append(str(next(iter(vset)) if len(vset) == 1 else 0))
    return "".join(out)

@app.post("/solve", response_model=SolveResponse)
def solve(req: SolveRequest):
    try:
        sol = solve_sudoku(req.grid.replace('.', '0'))
        if sol is None:
            return SolveResponse(solved=False, grid=req.grid, message="No solution found (or invalid puzzle).")
        return SolveResponse(solved=True, grid=domains_to_grid81(sol), message="OK")
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

Step 4: Run the API (local)

Explanation: Start the dev server on port 8000.

pip install fastapi uvicorn pydantic
uvicorn app:app --reload --port 8000

Step 5: Try it (cURL)

Explanation: You’ll receive JSON with solved: true and an 81-char solved grid, ready for your mobile/web client.

curl -X POST http://localhost:8000/solve \
  -H "Content-Type: application/json" \
  -d '{"grid":"530070000600195000098000060800060003400803001700020006060000280000419005000080079"}'

Quick Reference: Full API (app.py)

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, constr
from solver import solve_sudoku

app = FastAPI(title="Sudoku CSP Solver API", version="1.0")

class SolveRequest(BaseModel):
    grid: constr(regex=r"^[0-9\.]{81}$")

class SolveResponse(BaseModel):
    solved: bool
    grid: str
    message: str = ""

def domains_to_grid81(domains):
    return "".join(str(next(iter(domains[(r,c)])) if len(domains[(r,c)])==1 else 0)
                   for r in range(9) for c in range(9))

@app.post("/solve", response_model=SolveResponse)
def solve(req: SolveRequest):
    try:
        sol = solve_sudoku(req.grid.replace('.', '0'))
        if sol is None:
            return SolveResponse(solved=False, grid=req.grid, message="No solution found (or invalid puzzle).")
        return SolveResponse(solved=True, grid=domains_to_grid81(sol), message="OK")
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

Production pointer: For a consumer app, add input validation, request logging, rate limiting, and optionally CORS + HTTPS. If puzzles come as images, plug in OCR (e.g., Tesseract) and a small vision model to detect digits, then feed the recognized grid to the API.

Strengths & Limitations

Strengths

  • Declarative modeling: You describe constraints; the solver handles search and pruning.
  • Generalizable: Same engine adapts to timetables, maps, resource allocation, etc.
  • Powerful pruning: MRV/LCV + AC-3 drastically reduces search space, enabling fast solves.

Limitations

  • Scalability with naive encodings: Pairwise constraints can still explode; global constraints (e.g., AllDifferent) or specialized propagators are better.
  • Heuristic sensitivity: Poor variable/value ordering leads to large search.
  • Difficult hybrid constraints: For numeric or optimization-heavy tasks, dedicated CP-SAT/MIP or hybrid methods may outperform plain CSP.

Final Notes

You modeled Sudoku as a CSP \langle X, D, C \rangle, implemented a solver with MRV, Degree, LCV, Forward Checking, and AC-3, and exposed it as a web API.

You can now reuse the same ideas to build robust solvers for scheduling, timetables, and resource assignment!

Next Steps for You:

Upgrade the model: Use a global \text{AllDifferent} propagator or adopt OR-Tools CP-SAT to compare performance and modeling convenience.

From photo to solution: Add OCR and digit localization, then auto-solve from camera input; include confidence checks and manual correction UI.

References

[1] A. K. Mackworth, “Consistency in networks of relations,” Artificial Intelligence, vol. 8, no. 1, pp. 99–118, 1977.
[2] R. Dechter, Constraint Processing. San Francisco, CA, USA: Morgan Kaufmann, 2003.
[3] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 4th ed. Pearson, 2020.
[4] H. Simonis, “Sudoku as a Constraint Problem,” in Proc. 4th Int. Workshop on Modelling and Reformulating Constraint Satisfaction Problems, 2005.
[5] P. Norvig, “Solving Every Sudoku Puzzle,” 2012 (online article).
[6] C. Bessiere, “Constraint Propagation,” in Handbook of Constraint Programming, F. Rossi, P. van Beek, T. Walsh, Eds. Elsevier, 2006, pp. 29–83.

Leave a comment