Dancing Links: Solving Sudoku


Introduction

Dancing links is a method derived from a simple idea that both removing something and inserting it back in from a linked list can be done in constant time. Dancing links extends this idea to a 2D circular doubly linked list data structure which can significantly increase the speed of the brute force backtracking depth-first search algorithm to solve Exact Cover. Many popular puzzles such as the widely known Sudoku puzzle can be reduced to Exact Cover.

The Exact Cover problem may be described as follows:
Given an array of M*N bits, either 0 or 1, find a set of rows in the array such that every column has
exactly one “1”. For example, in the following array, an answer would be rows 1, 4 and 5:

0 0 1 0 1 1 0
1 0 0 1 0 0 1
0 1 1 0 0 1 0
1 0 0 1 0 0 0
0 1 0 0 0 0 1
0 0 0 1 1 0 1

Each row may either be picked or unpicked; each picked row solves the columns for which it has a “1”. Each column is either solved exactly once or unsolved. Exact Cover is an NP-complete problem, meaning that no one currently knows of a fast (polynomial time) algorithm to solve it.

This article will first show how Exact Cover may be solved using DLX, and then show how Sudoku may be mapped efficiently to Exact Cover, and then solved using DLX. Example code in C++ is provided below.

The Data Structure

The simple idea behind it all is within a doubly linked list, a node may be easily removed (covered), and a node can also easily be put back in (uncovered), as constant time O(1) operations:

To cover a node in a linked list:

  Node.Right.Left = Node.Left;
  Node.Left.Right = Node.Right;

And to put it back in:

  Node.Right.Left = Node;
  Node.Left.Right = Node;

We can then extend this idea from a single circular linked list, to a 2D circular linked array (matrix). Each node stores pointers to the node to its Up, Down, Left and Right. An arbitrary Header row may be defined, marking the beginning of every column. The last node of every column circularly links back to the Header, and vice versa. The last node of every row links back to the first node, and vice versa. The Header row is given a special root node
which marks the entry of the whole data structure. One way to visualise the whole thing is as a torus, a donut, hanging from a string, the string
being the root node.

Graphical representation of the DLX 2D circularly linked matrix.

Graphical representation of the DLX 2D circularly linked matrix.

How to build such a data structure is up to personal preference, but my way to implement the construction is to input an array of 0’s and 1’s, and then loop through each 1 node to find its left, right, up and down neighbours, linking the corresponding pointers as I go. For larger data sets you may need a different approach to save memory.

The Algorithm

The Dancing Links algorithm is simply an optimisation of the naive backtracking brute force solution to Exact Cover, using the data structure mentioned above. With this optimisation, even a straight out brute force seems to solve things unbelievably fast.

Lets start with looking at a straightforward brute force solution to Exact Cover:

  Find an unfilled column C - if theres no such column, we have a solution.
  For every row R that fills this column C:
      Try using row R in our solution - which means:
	      1. Removing R; so we don't consider it again
	      2. Removing every column that R fills; we don't need to fill those again
	      3. For every column that R fills, remove the other rows that also fill the column; we don't need any other row that 
	         fill what this row has already filled.
      Recurse with the new state.
      "Undo" all the things removed while trying row R; restore to previous state.
  Next

Now, without the dancing links data structure, one could probably use an array of booleans which they loop through lots of times. On the above example; this won’t seem so bad. But on sparse graphs, having to loop through the entire column in order to find one that is unfilled could take heaps of time, let alone setting entire rows/columns to and from being ‘seen’ all the time.

This is where the above DLX data structure comes in!

The DLX data structure keeps the exact list of things we need to search. When we choose a row, we hide the row, and we also hide the columns which the given row fills. This means finding an unfilled column is simply an O(1) operation, the next column in the header row. Finding a row to solve this column is also a O(1) operation, the next row in column linked list. Removing a column once its been filled is also an O(1) operation. In fact we perform no seen-array checks at all, every item we need in our brute force search is right there!

Implementation

The nodes can easily be represented in C as structures. The structure may contain:

  • A Header pointer – I like to have every node point to the “Header” node of its column, for easy and fast access.
  • A Left pointer.
  • A Right pointer.
  • A Up Pointer.
  • A Down pointer.
  • Some quick form of ID – Being able to name nodes is nice when you’re debugging or testing, for outputting, so you don’t have to look at pointer addresses all day and try to decode them.

The matrix may be built from a 2-D data array of either 0 or 1. The data in this array can, in the above example case, but just hardcoded. The data array could also be generated by code, in the case of the Sudoku solver.

I like to keep things relatively tidy, and I made my “Cover” function take a column, and make that column “solved”, i.e. covering the column, and covering every row that solves this column. I also made a similar “UnCover” function that does the reverse, for backtracking.

The actual search algorithm is just a simple recursive brute force depth-first search that loops through a bunch of linked lists, to search through every possible solution.

Changing Sudoku to Exact Cover

Ok, now we have an extremely fast solution to the Exact Cover problem, now let’s solve Sudoku with it! How?

We bring Sudoku down and turn it into an Exact Cover problem.

Lets say, you’re doing your exam for your piano grade. Lets think of the columns not as columns, but as a precise list of tricks you must show to the examiner that you are able to do before he/she can give you the pass. The examiner is also very easily bored, and will immediately fail you if you do the same trick twice.
Now, think of the rows not as rows, but as an exact song list you can choose from to play to the examiner.
Each song shows the examiner different tricks, and she/he can marks the shown tricks off one by one until you pass.

You must pick a set of songs such that every required trick on the examiner’s list is fulfilled exactly once (and you pass!).

Lets carry over this analogy of rows and columns of the matrix over to Sudoku.
The columns is a list of “requirements” that must be met. There are 4 types of requirements in Sudoku:

  1. Every square must have a number in it – [A Number has to be in 0,0] [A Number has to be in 0,1]…etc…[A Number as to be in 8,8]; 9*9 = 81 requirements.
  2. Every row must have numbers 1-9 in it – [Row 0 must have a 1] [Row 0 must have a 2] …etc… [Row 1 must have a 1] [Row 2 must have a 2]…etc…[Row 8 must have a 1]….etc….[Row 8 must have a 9]; 9*9 = 81 requirements.
  3. Every column must have numbers 1-9 in it – [Column 0 must have a 1]……etc…..[Column 8 must have a 9]; 9*9 = 81 requirements.
  4. Every box must have numbers 1-9 in it – [Box 0 must have a 1] [Box 0 must have a 2] …etc…[Box 8 must have a 9]; 9*9 = 81 requirements.

Therefore, we have 81 * 4 = 324 requirements which need to be met. If they’re all met exactly once, we have a solution! This translates to 324 columns in our matrix.

In our piano example, rows are our list of songs we can pick. In Sudoku, rows are our options, our list of things that we can do. Our options are that we can place a number [1-9] in any square ( 0 to 8, 0 to 8 ). This gives us 9*9*9=729 things we can do, or “songs to choose from” in the piano example. Add the 1 extra row for the “header” row, that gives us 729 + 1 = 730 rows in our matrix. Row thus represents “a number placement”.

In our piano example, each song shows the examiner a different set of tricks. In Sudoku, each row, or “number placement”, solves a set of requirements. For instance, “Placing a 3 in square (1,1)” solves the requirements [A Number has to be in 1,1], [Row 1 must have a 3], [Col 1 must have a 3], and [Box 0 must have a 3].

When we choose out “right set of songs” to pass our piano exam, the same applies where: we choose the right set of “number placements” to satisfy all our requirements for a complete Sudoku.

Example C Source Code

The source code boils down to:

#include <stdio.h>

#define MAX_COL 750
#define MAX_ROW 750

#define SQ_OFFSET 0
#define RW_OFFSET 81
#define CL_OFFSET 162
#define BX_OFFSET 243

struct str_node {
    
    struct str_node * Header;
    
    struct str_node * Left;
    struct str_node * Right;
    struct str_node * Up;
    struct str_node * Down;
    
    char IDName;
    int  IDNum;
} ;

int nCol;
int nRow;
struct str_node Matrix[MAX_COL][MAX_ROW];
struct str_node Root;
struct str_node *RootNode = &Root;
struct str_node *RowHeader[MAX_ROW];
char Data[MAX_COL][MAX_ROW];
int Result[MAX_ROW]; int nResult = 0;
char Finished;
int GlobalProgressUpdate;
int MaxK;

// --> Initialisation functions
inline int dataLeft(int i) { return i-1<0?nCol-1:i-1; }
inline int dataRight(int i) { return (i+1)%nCol; }
inline int dataUp(int i) { return i-1<0?nRow-1:i-1; }
inline int dataDown(int i) { return (i+1)%nRow; }

void CreateMatrix(void) {
    int a,b, i, j;
    //Build toroidal linklist matrix according to data bitmap
    for(a=0;a<nCol;a++) {
        for(b=0;b<nRow;b++) {
            if(Data[a][b]!=0) {
                // Left pointer
                i = a; j = b; do { i = dataLeft(i); } while (Data[i][j]==0);
                Matrix[a][b].Left = &Matrix[i][j]; 
                // Right pointer
                i = a; j = b; do { i = dataRight(i); } while (Data[i][j]==0);
                Matrix[a][b].Right = &Matrix[i][j];
                // Up pointer
                i = a; j = b; do { j = dataUp(j); } while (Data[i][j]==0);
                Matrix[a][b].Up = &Matrix[i][j];
                // Down pointer
                i = a; j = b; do { j = dataDown(j); } while (Data[i][j]==0);
                Matrix[a][b].Down = &Matrix[i][j]; 
                // Header pointer
                Matrix[a][b].Header = &Matrix[a][nRow-1];
                Matrix[a][b].IDNum = b;
                //Row Header
                RowHeader[b] = &Matrix[a][b];
            }
        }
    }
    for(a=0;a<nCol;a++) {
        Matrix[a][nRow-1].IDName = 'C';  
        Matrix[a][nRow-1].IDNum = a;
    }
    //Insert root
    Root.IDName = 'R'; 
    Root.Left = &Matrix[nCol-1][nRow-1]; Root.Right = &Matrix[0][nRow-1];
    Matrix[nCol-1][nRow-1].Right = &Root; Matrix[0][nRow-1].Left = &Root;
}

// --> DLX Algorithm functions
struct str_node *ChooseColumn(void) {
    return RootNode->Right;
}

void Cover(struct str_node *ColNode) {
    struct str_node *RowNode, *RightNode;
    ColNode->Right->Left = ColNode->Left;
    ColNode->Left->Right = ColNode->Right;
    for(RowNode = ColNode->Down; RowNode!=ColNode; RowNode = RowNode->Down) {
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            RightNode->Up->Down = RightNode->Down;
            RightNode->Down->Up = RightNode->Up;
        }
    }
}

void UnCover(struct str_node *ColNode) {
    struct str_node *RowNode, *LeftNode;
    for(RowNode = ColNode->Up; RowNode!=ColNode; RowNode = RowNode->Up) {
        for(LeftNode = RowNode->Left; LeftNode!=RowNode; LeftNode = LeftNode->Left) {
            LeftNode->Up->Down = LeftNode;
            LeftNode->Down->Up = LeftNode;
        }
    }
    ColNode->Right->Left = ColNode;
    ColNode->Left->Right = ColNode;
}

void SolutionRow(struct str_node *RowNode) {
    Cover(RowNode->Header);
    struct str_node *RightNode;
    for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) { Cover(RightNode->Header); }
}

void PrintSolution(void);

void Search(int k) {
    /*if(GlobalProgressUpdate < k) {
        printf("== Search(%d)\n", k);
        PrintSolution();
        GlobalProgressUpdate = k;
    }*/
    if((RootNode->Left == RootNode && RootNode->Right==RootNode) || k == (81-MaxK)) {
        //Valid solution!
        printf("----------- SOLUTION FOUND -----------\n");
        PrintSolution();
        Finished = 1;
        return;
    }
    struct str_node *Column = ChooseColumn();
    Cover(Column);
    
    struct str_node *RowNode;
    struct str_node *RightNode;
    for(RowNode = Column->Down; RowNode!=Column && !Finished; RowNode = RowNode->Down) {
        // Try this row node on!
        Result[nResult++] = RowNode->IDNum;
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            Cover(RightNode->Header);
        }
        Search(k+1);
        // Ok, that node didn't quite work
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            UnCover(RightNode->Header);
        }
        Result[--nResult] = 0;
    }
    
    UnCover(Column);
}

// --> Sudoku to Exact Cover conversion

// Functions that extract data from a given 3-digit integer index number in the format [N] [R] [C].
inline int retNb(int N) { return N/81; }
inline int retRw(int N) { return (N/9)%9; }
inline int retCl(int N) { return N%9; }
inline int retBx(int N) { return ((retRw(N)/3)*3) + (retCl(N)/3); }
inline int retSq(int N) { return retRw(N)*9 + retCl(N); }
inline int retRn(int N) { return retNb(N)*9 + retRw(N); }
inline int retCn(int N) { return retNb(N)*9 + retCl(N); }
inline int retBn(int N) { return retNb(N)*9 + retBx(N); }
// Function that get 3-digit integer index from given info
inline int getIn(int Nb, int Rw, int Cl) { return Nb*81 + Rw*9 + Cl; }

void PrintSolution(void) {
    int a,b,c;
    int Sudoku[9][9] = {};
    for(a=0;a<9;a++) for(b=0;b<9;b++) Sudoku[a][b] = -1;
    for(a=0;a<nResult;a++) {
        Sudoku[retRw(Result[a])][retCl(Result[a])] = retNb(Result[a]);
    }    
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            if(a>0&&a%3==0&&b==0) { for(c=0;c<9;c++) printf("--"); printf("\n"); } //horizontal lines
            if(Sudoku[a][b]>=0) printf("%d%c", Sudoku[a][b]+1, b%3==2?'|':' ' );
            else printf(". ");
        } printf("\n");
    }
}

void BuildData(void) {
    int a,b,c;
    int Index;
    nCol = 324; nRow = 729 + 1;
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            for(c=0;c<9;c++) {
                Index = getIn(c, a, b);
                Data[SQ_OFFSET + retSq(Index)][Index] = 1; //Constraint 1: Only 1 per square
                Data[RW_OFFSET + retRn(Index)][Index] = 1; //Constraint 2: Only 1 of per number per Row
                Data[CL_OFFSET + retCn(Index)][Index] = 1; //Constraint 3: Only 1 of per number per Column
                Data[BX_OFFSET + retBn(Index)][Index] = 1; //Constraint 4: Only 1 of per number per Box
            }
        }
    }
    for(a=0;a<nCol;a++) { Data[a][nRow-1] = 2; }
    CreateMatrix();
    for(a=0;a<RW_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'S'; Matrix[a][nRow-1].IDNum = a; }
    for(a=RW_OFFSET;a<CL_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'R'; Matrix[a][nRow-1].IDNum = a-RW_OFFSET; }
    for(a=CL_OFFSET;a<BX_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'C'; Matrix[a][nRow-1].IDNum = a-CL_OFFSET; }
    for(a=BX_OFFSET;a<nCol;a++) { Matrix[a][nRow-1].IDName = 'B'; Matrix[a][nRow-1].IDNum = a-BX_OFFSET; }
}

inline void AddNumber(int N, int R, int C) {
    SolutionRow(RowHeader[getIn(N, R, C)]);
    MaxK++;
    Result[nResult++] = getIn(N, R, C);
}

void LoadPuzzle(char *FileName) {
    FILE * a_file = fopen(FileName, "r");
    if(a_file==NULL) { printf("File load fail!\n"); return; } 
    int a,b; char temp;
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            if(!fscanf(a_file, "%c\n", &temp)) {printf("File loading error!\n"); return; }
            if(temp>='1'&&temp<='9') { AddNumber((temp-'0')-1, a, b);}
        }
    }
}

int main(void) {
    BuildData();
    LoadPuzzle("puzzle.txt");
    Search(0);
    return 0;
}


Download here:
dlx.c
dlx.c (mirror)

The code above has seen a reasonable amount of use and abuse, so should hopefully be reasonably bug free.

References

“Dancing Links” – Dr. Donald Knuth, Stanford University – 15 Nov. 2000
http://www.ocf.berkeley.edu/~jchu/publicportal/sudoku/sudoku.paper.html
http://www.sudopedia.org/wiki/Dancing_Links

Xi (Ma) Chen
hypernewbie (at) gmail [dot] com.

Share Button

Leave a Reply