Knuth's Algorithm X (Dancing Links; Exact Cover)

Question

I'm trying to write a Matlab version of Knuth's "Algorithm X." This algorithm makes use of the Dancing Links routine and is a recursive, nondeterministic, depth-first, backtracking algorithm. I am using this algorithm to solve a simple application of the exact cover problem. For more information:

Knuth's Algorithm X: https://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X

Exact cover problem: https://en.wikipedia.org/wiki/Exact_cover

Here is pseudocode for Algorithm X:

 1. If matrix A is empty, the problem is solved; terminate successfully.
 2. Otherwise: choose a column, c, of A with the least number of 1's.
 3. Choose a row, r, that contains a 1 in column c (e.g., A(r,c) = 1).
 4. Add r to the partial solution.
 5. For each column j such that A(r,j) = 1:
         for each row i such that A(i,j) = 1:
              delete row i from matrix A
         delete column j from matrix A
 6. Repeat this algorithm recursively on the reduced matrix A.

Following the Wiki-page, here is a working example:

A = zeros(1,7); B = A; C = B; D = C; E = D; F = E;
A([1 4 7]) = 1; B([1 4]) = 1; C([4 5 7]) = 1; D([3 5 6]) = 1; E([2 3 6 7]) = 1; F([2 7]) = 1; 
M = [A; B; C; D; E; F]; % the binary 'A' matrix

Knuth's algorithm should return the following exact cover of M: {B, D, F}; i.e., rows 2, 4, and 6 of the original (unreduced) matrix collectively contribute a 1 in each column of M (exactly once).
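
As a quick sanity check of the claim above, the following sketch rebuilds M literally (rather than from the vectors A through F) and confirms that rows 2, 4, and 6 contribute exactly one 1 to every column. The variable names cover and isExactCover are illustrative only.

```matlab
% Rebuild the 6-by-7 example matrix M, one row per set A..F.
M = [1 0 0 1 0 0 1;   % A
     1 0 0 1 0 0 0;   % B
     0 0 0 1 1 0 1;   % C
     0 0 1 0 1 1 0;   % D
     0 1 1 0 0 1 1;   % E
     0 1 0 0 0 0 1];  % F
cover = [2 4 6];                               % candidate rows {B, D, F}
isExactCover = all(sum(M(cover, :), 1) == 1);  % every column hit exactly once
disp(isExactCover)                             % displays 1 (true)
```

The same one-line test can be reused to validate whatever a solver returns.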

Here is what I have so far:

function sol = dlx(X,sol)
% if isempty(X)
%     return
% end
% Find the first column of X having the lowest number of 1s in any position
c = find(sum(X,1) == min(sum(X,1)), 1);
% If column c is entirely zero, terminate unsuccessfully
% if (sum(X(:,c)) == 0)
%     sol = [];
%     return
% end
% Find the rows that have a 1 in column c
r = find(X(:,c) == 1);
% if isempty(r)
%     sol = [];
%     return
% end
% if isempty(r)
%     sol = sol(1:end-1);
%     return
% end
% Loop over each row that has a 1 in column c
for rr = 1:length(r)
      % include row r in the partial solution
      sol = [sol r(rr)];
      % find the columns in which A(r,j) = 1
      J = find(X(r(rr),:) == 1);
      % initialize the reduced matrix
      Xred = X;
      I = [];
      for jj = 1:length(J)
          I = [I; find(Xred(:,J(jj)) == 1)];
      end
      I = unique(I);
      Xred(I,:) = []; % delete the I rows
      Xred(:,J) = []; % delete the J columns
      % repeat this algorithm recursively on the reduced matrix Xred.
      sol = dlx(Xred, sol);
end
end

For the above example,

sol = dlx(M, [])
sol =
       1     2     1     1

Except that my algorithm should return sol = [2 4 6].

Here is another simple example that also bugs my code:

A = [1 1 0; 0 1 1];
sol = dlx(A, [])

sol =

1

In this case, there is no exact cover of the matrix A, and so my algorithm should return an empty solution (e.g., sol = []).

As you can see, I am very close. I believe that I have successfully written the recursion of reducing the matrix, but I am seeking help on the following:

 1. Looking at the pseudocode, should the "Choose a row, r, that contains a 1 in column c" be implemented as a for-loop in my code? That is, should I loop over all such rows?
 2. I'm having trouble formulating the final solution from the partial solution (e.g., so that my function returns sol = [2 4 6]). During the recursion, the partial solution should 'backtrack' when the partial solution turns out not to be valid.
 3. Where should I place the 'if A is empty' condition from the pseudocode? You'll see that I've tried putting this condition at the beginning of the function, and also at the 'isempty(r)' line.
 4. I'm also open to suggestions for making my code more efficient; i.e., should/can I replace the 'find' or 'for-loops' with better code?

Can anyone offer suggestions on these questions? Thanks in advance!

3 Comments

Thomas Patterson on 4 Feb 2018

The main problem with your code seems to be that when rows/columns are deleted to form the reduced matrix Xred the row/column indexing is changed. Thus if columns 3 and 4 are deleted subsequent reference to the remaining columns will be numbered 1,2,3,... and connection with the original columns is lost. It is also very memory zapping to have potentially a large matrix continually "saved" in the recursions. Instead one needs to maintain an index of the rows and columns. Thus if the row index is R=[2 4 7 9] then referring to row 2 actually corresponds to row R(2)=4 of the original matrix. The indices rather than the matrix elements are deleted. This preserves the sequencing and also saves a lot of memory. Additionally the partial solutions have to be properly managed. A "flag" needs to be included in the output to signal success or failure with the latest partial being removed after a failure.
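
The index-tracking idea described above can be sketched as follows. This is a minimal illustration on a made-up 4-by-4 matrix, not part of the amended code; the names sub, k, and origRow are hypothetical.

```matlab
X  = [1 0 0 1; 0 1 1 0; 1 1 0 0; 0 0 1 1];  % hypothetical 4-by-4 example
RR = 1:size(X, 1);      % active rows, stored as original row numbers
CC = 1:size(X, 2);      % active columns, stored as original column numbers

% "Delete" row 1 and columns 1 and 4 by editing the index vectors only.
RR(RR == 1) = [];
CC(ismember(CC, [1 4])) = [];

sub = X(RR, CC);        % reduced view for this step; X itself is unchanged
k = 2;                  % a position within the reduced view ...
origRow = RR(k);        % ... maps back to original row 3
```

Because RR and CC always hold original numbers, any row picked from the reduced view can be reported in terms of the original matrix.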

I have amended your code to take account of these comments and that is appended. I haven't tested it exhaustively but it works on any examples I have tried.

    function [sol, OK] = newdlx(X,sol,RR,CC)
    % Solution of covering matrix
    % X is input matrix
    % sol is solution, initially [], (row vector)
    % RR is row selection index, initially all rows of X (row vector)
    % CC is column selection index, initially all columns of X (row vector)
    % OK is flag, true if successful, false if not (logical)
    OK=true;
    % Check for successful result
    if isempty(RR) & isempty(CC), return; end
    % Find the first column of X having the lowest number of 1s in any position
    c = find(sum(X(RR,CC),1) == min(sum(X(RR,CC),1)), 1);
    % Check for unsuccessful termination
    f1=sum(X(RR,CC(c))) == 0;
    f2=isempty(RR) & ~isempty(CC);
    f3=isempty(CC) & ~isempty(RR);
    if f1 | f2 | f3
        OK=false;
        return
    end
    % Find the rows that have a 1 in column c
    r = find(X(RR,CC(c)) == 1);
    % Loop over each row that has a 1 in column c
    for rr=r'
        % Include row r in the partial solution
        sol = [sol RR(rr)];
        % Find the columns in which A(r,j) = 1
        J = find(X(RR(rr),CC) == 1);
        rows=[];
        for iii=J
            rows=[rows find(X(RR,CC(iii)) == 1)'];
        end
        rows=unique(rows); % Remove duplicates
        % Remove appropriate rows and columns
        RR1=RR;CC1=CC; RR1(rows)=[]; CC1(J)=[];
        % Repeat the algorithm recursively
        [sol,OK ] = newdlx(X, sol, RR1,CC1);
        % Remove last "guess" if unsuccessful
        if ~OK, sol=sol(1:end-1); end
    end
    sol=sort(sol); % the author's follow-up answer asks for this line to be deleted
    end

Guillaume on 4 Feb 2018

@Thomas, you should put your solution as an answer.

Matthew on 9 Mar 2018

Edited: Matthew on 9 Mar 2018

@Thomas, thanks very much for posting (I've been out of town recently). Your code is very, very good, and I appreciate your narrative beforehand. I like how you address the memory issue of dealing with rather large matrices, too.

One test case that your code does not properly consider is if there are multiple solutions; i.e., if the matrix admits more than one exact cover. For instance, the matrix

A = [0 0 1 1; 1 1 0 0; 1 1 1 0; 1 0 0 1; 0 1 0 0; 1 0 1 0; 0 0 1 0];

can be exactly covered by rows [1 2] and [4 5 7]. Your code returns [1 2 4 5 7]. To accommodate, I suspect that the code ought to return a cell array since each exact cover solution may vary in length.

Other than that, I really like your code. It's certainly given me some direction as to what I want to do. I will soon post my adaptation, which will simply be an amalgamation of your code and Guillaume's (see below). Thanks again!
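
To illustrate the point about multiple covers, here is a small sketch that stores the two covers mentioned above in a cell array and checks each one. The names sols, isCover, and merged are illustrative.

```matlab
A = [0 0 1 1; 1 1 0 0; 1 1 1 0; 1 0 0 1; 0 1 0 0; 1 0 1 0; 0 0 1 0];
sols = {[1 2]; [4 5 7]};   % one cell per cover, so lengths may differ

% Each cover must contribute exactly one 1 to every column.
isCover = cellfun(@(rows) all(sum(A(rows, :), 1) == 1), sols);

% The merged vector [1 2 4 5 7] hits every column twice, so it is not a cover.
merged = all(sum(A([1 2 4 5 7], :), 1) == 1);
```

Here isCover is true for both cells and merged is false, which is why a single concatenated vector cannot represent both solutions.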

Answer 1

Guillaume on 4 Feb 2018

Ok, just read the description of the algorithm on wikipedia. A few comments:

The test for emptiness is the first step of the algorithm, so it should be the first thing the function does. However, if you require that the initial matrix is never empty, you could move that test to just before the recursive call, since you know that the recursion on an empty matrix will only return success. I'm not sure that would make the code any cleaner.

As it is at the moment, your code doesn't check for successful or unsuccessful recursion. You also need to be able to distinguish between success and failure so you need to either add an extra output or find a way to return a different sol in each case.

Furthermore, the algorithm should be able to return multiple solutions if they exist (for example for A = [1 0 0; 0 1 1; 0 1 0; 1 0 1; 0 0 1; 1 1 0] there are 4 solutions). So you can't just return a vector. Since different solutions may have different numbers of rows, you'd have to return a cell array.

As Thomas says in his comment, you also fail to track the actual row numbers after you've deleted rows.

In terms of loops, you have too many. The deletions can be done without loops (and even without find).
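
The four-solution count quoted above for A = [1 0 0; 0 1 1; 0 1 0; 1 0 1; 0 0 1; 1 1 0] can be checked independently of Algorithm X. The following is a brute-force sketch that simply tries every non-empty subset of rows; it is only practical for very small matrices, and the names mask and covers are illustrative.

```matlab
A = [1 0 0; 0 1 1; 0 1 0; 1 0 1; 0 0 1; 1 1 0];
nRows = size(A, 1);
covers = {};                                  % one cell per exact cover found
for mask = 1:(2^nRows - 1)                    % every non-empty subset of rows
    rows = find(bitget(mask, 1:nRows));       % rows selected by this bit mask
    if all(sum(A(rows, :), 1) == 1)           % each column covered exactly once
        covers{end+1, 1} = rows; %#ok<AGROW>
    end
end
disp(numel(covers))                           % displays 4
```

The covers found are rows [1 2], [3 4], [1 3 5], and [5 6], which have different lengths and therefore fit naturally in a cell array.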

Here is how I'd implement it. I choose to move the recursion into its own function because of the extra argument needed to track the row indices. The distinction between success and failure is done by the sol return value. If it's an empty cell array it's a failure, otherwise it's a success. If the matrix was empty the cell array contains an empty matrix otherwise it contains the possible row combinations so far.

function sol = dlx(A)
    sol = recursion(A, 1:size(A, 1));
end
function [sol] = recursion(A, rownum)
    sol = {};   %empty cell array indicates failure   
    %step 1
    if isempty(A)
        %success
        sol = {[]};
        return;
    end
    %step 2
    [~, c] = min(sum(A, 1));
    if ~any(A(:, c))
        %failure
        return;
    end
    %step 3
    possiblerows = find(A(:, c));
    for tryrow = possiblerows.'
        %step 4 for later, only if success
        %step 5
        deletecols = A(tryrow, :) == 1;
        deleterows = any(A(:, deletecols) == 1, 2);
        reducedA = A(:,~deletecols);
        reducedA = reducedA(~deleterows, :);
        reducedrows = rownum(~deleterows);
        %step 6
        [subsol] = recursion(reducedA, reducedrows);
        if ~isempty(subsol)
            %step 4 
            %add current row to the all the row combinations returned by the recursion
            sol = [sol; cellfun(@(ss) [rownum(tryrow), ss], subsol, 'UniformOutput', false)]; %#ok<AGROW>
        end
    end
    %if all the tryrow ended in failure then sol is still an empty cell array which indicates failure on return
end

6 Comments

Matthew on 9 Mar 2018

@Guillaume, great work. Your code builds from @Thomas's and is very clean. I also like how you handle tracking the row numbers of the matrix, rather than the entire matrix itself.

As I've been adapting my solution according to yours, I've encountered an edge case that your code does not properly address. The matrix,

A = [1 1 0; 0 1 1];

does not have an exact cover. Your code returns sol{:} = 1. It was hard for me to determine why, until I found this post on Stack Overflow:

Algorithm X to Solve the Exact Cover: Fat Matrices.

As one commenter points out, the confusion originates from the pseudocode of the original Algorithm X and what is meant by an empty matrix. An empty matrix, according to Donald Knuth's algorithm, isn't one having no entries, but one in which all of the row and column headings have been deleted. Evidently, Knuth was treating the rows/cols of a matrix as doubly-linked lists. The commenter goes on to suggest a workaround: simply force in a row of all zeros.

Thus, I propose amending the main function of your dlx routine as follows:

function sol = dlx(A)
% check to see if the matrix does not already have a row of all zeros
% in its first row
noRowOfZeros = 0; % flag for later; initialized so that it is always defined
if (sum(A(1,:)) > 0)
    noRowOfZeros = 1;
    A = [zeros(1,size(A,2)); A]; % force-in a row of all zeros
end
sol = recursion(A, 1:size(A, 1));
% if we forced-in a row of all zeros, this would shift all of the rows up 
% by one index, so now we subtract one from each row number in each
% solution
if (noRowOfZeros == 1)
    nSols = size(sol,1);
    for kk = 1:nSols
        sol{kk} = bsxfun(@minus, sol{kk}, 1); % subtract one
    end
end
% sort each solution by ascending row numbers
sol = cellfun(@sort, sol, 'UniformOutput', false);  
end

The rest of your code (e.g., the recursive recursion(A, rownum) routine) seems to work very well. Thanks again for your help and direction!
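
The edge case above can be reproduced in a few lines. This sketch replays the first reduction step on A = [1 1 0; 0 1 1] and shows that MATLAB's isempty is true for a matrix with no rows but one remaining column. As an assumption not taken from the thread, testing size(A, 2) == 0 instead ("no columns left") might serve as an alternative to padding with a zero row; it has not been tested beyond small examples like this one.

```matlab
A = [1 1 0; 0 1 1];                          % no exact cover exists
deletecols = A(1, :) == 1;                   % choosing row 1 covers columns 1 and 2
deleterows = any(A(:, deletecols) == 1, 2);  % both rows clash with row 1
reducedA = A(~deleterows, ~deletecols);      % 0-by-1: no rows, one column left

emptyByIsempty = isempty(reducedA);          % true, so step 1 reports success
allColsCovered = size(reducedA, 2) == 0;     % false: column 3 is still uncovered
```

The two tests disagree exactly when rows run out before columns do, which is the situation the zero-row workaround is designed to catch.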

Guillaume on 26 Jun 2019

It was so long ago that I wrote this answer that I don't remember much about it.

The whole purpose of the algorithm is to find all the possible covers. While you could short-circuit the algorithm so that it stops as soon as the top-level recursion has found a cover, you would still have to store all the intermediate partial covers in each recursion, so I don't think that you would save much memory or processing time in general.

TONG DAVID on 27 Jun 2019

@Guillaume: Thanks for your quick response; I appreciate it. It seems that short-circuiting at the top-level recursion once a cover has been found is the only way to reduce memory and time (though it may not have a significant effect on the problem as a whole). I would still like to try it. Could you please give me some help on how to put a short-circuit in the recursion? In your code above, I put a "break" at the fourth line from the bottom, but it did not seem to work.
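
One possible way to short-circuit, sketched under the assumption that only a single cover is needed: have each recursion level return a found flag and return immediately when a sub-call succeeds, so that every level unwinds at once. The function firstcover below is hypothetical and not part of the thread's code; it follows the same reduction steps as the answer above and treats "no columns left" as success.

```matlab
function [rows, found] = firstcover(A, rownum)
% Hypothetical helper (not from the thread): returns the first exact cover
% found, as original row numbers, and found = false if there is none.
    if nargin < 2, rownum = 1:size(A, 1); end
    rows = [];
    found = size(A, 2) == 0;          % success once every column is covered
    if found, return; end
    [~, c] = min(sum(A, 1));          % column with the fewest 1s
    for tryrow = find(A(:, c)).'      % loop body is skipped if column c has no 1s
        deletecols = A(tryrow, :) == 1;
        deleterows = any(A(:, deletecols) == 1, 2);
        [sub, found] = firstcover(A(~deleterows, ~deletecols), rownum(~deleterows));
        if found
            rows = [rownum(tryrow), sub];
            return;                   % short-circuit: skip the remaining rows
        end
    end
end
```

Called as [rows, found] = firstcover(M) on the 6-by-7 example from the question, this should give rows = [2 4 6]. A plain break in the all-solutions version only leaves the innermost loop; the callers still continue with their own remaining rows, which may be why it appeared not to work.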

Answer 2

Thomas Patterson on 6 Feb 2018

Please note that the statement near the end of my newdlx code

sol=sort(sol)

should be deleted. It interferes with the proper updating of sol when a failure occurs.
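
A tiny illustration of why sorting interferes with backtracking, using a made-up partial solution: the failure handling removes the last element of sol, which is only the most recent guess if the order of insertion has been preserved.

```matlab
sol = [5 2];                     % partial solution: row 5 chosen first, then row 2
sorted = sort(sol);              % [2 5]: the most recent guess is no longer last
afterSort   = sorted(1:end-1);   % leaves [2], so row 5 was dropped by mistake
afterNoSort = sol(1:end-1);      % leaves [5], so the latest guess (row 2) was dropped
```

If a sorted result is wanted, sorting once after the top-level call returns avoids the problem.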
