Sunday, 11 September 2011

Finding maximum zero submatrix with C#

After spending most of this evening attempting to solve this problem, it only feels right that I share my solution.

Suppose you have a binary matrix and you wish to find the largest zero submatrix, i.e. the largest rectangle of zeroes in the matrix (see below, highlighted in orange).

1 1 0 0 0 1 0
1 0 0 0 0 1 1
1 1 0 1 0 1 1
1 1 1 0 1 1 0

The brute-force approach to this problem isn't particularly efficient, with complexity O(n2m2). It involves looking at every position in the matrix, taking that position as an arbitrary point of the submatrix (e.g. the top-left point) and then testing all possible rectangles which originate at that point (e.g. all possible bottom-right points).

This algorithm finds the largest zero submatrix by looking at each position in turn and attempting to "grow" the submatrix up, to the left, and to the right. A dynamic programming approach is used and it keeps track of where the value 1 occurs. Array d contains the previous row where a 1 was found for each column. Array d1 contains the position of the left borders, and d2 contains the position of the right borders.

A complete code snippet for this program is available here. This has code to output the results and shows how to make use of the values calculated in this function.

static void MaxSubmatrix(int[,] matrix)
{
 int n = matrix.GetLength(0); // Number of rows
 int m = matrix.GetLength(1); // Number of columns

 int maxArea = -1, tempArea = -1;

 // Top-left corner (x1, y1); bottom-right corner (x2, y2)
 int x1 = 0, y1 = 0, x2 = 0, y2 = 0;

 // Maximum row containing a 1 in this column
 int[] d = new int[m];
 
 // Initialize array to -1
 for (int i = 0; i < m; i++)
 {
  d[i] = -1;
 }

 // Furthest left column for rectangle
 int[] d1 = new int[m];

 // Furthest right column for rectangle
 int[] d2 = new int[m];

 Stack<int> stack = new Stack<int>();

 // Work down from top row, searching for largest rectangle
 for (int i = 0; i < n; i++)
 {
  // 1. Determine previous row to contain a '1'
  for (int j = 0; j < m; j++)
  {
   if (matrix[i,j] == 1)
   {
    d[j] = i;
   }
  }

  stack.Clear();

  // 2. Determine the left border positions
  for (int j = 0; j < m; j++)
  {
   while (stack.Count > 0 && d[stack.Peek()] <= d[j])
   {
    stack.Pop();
   }
   
   // If stack is empty, use -1; i.e. all the way to the left
   d1[j] = (stack.Count == 0) ? -1 : stack.Peek();
   
   stack.Push(j);
  }

  stack.Clear();

  // 3. Determine the right border positions
  for (int j = m - 1; j >= 0; j--)
  {
   while (stack.Count > 0 && d[stack.Peek()] <= d[j])
   {
    stack.Pop();
   }
   
   d2[j] = (stack.Count == 0) ?  m : stack.Peek();
   
   stack.Push(j);
  }

  // 4. See if we've found a new maximum submatrix
  for (int j = 0; j < m; j++)
  {
   // (i - d[j]) := current row - last row in this column to contain a 1
   // (d2[j] - d1[j] - 1) := right border - left border - 1
   tempArea = (i - d[j]) * (d2[j] - d1[j] - 1);

   if (tempArea > maxArea)
   {
    maxArea = tempArea;

    // Top left
    x1 = d1[j] + 1;
    y1 = d[j] + 1;
    
    // Bottom right
    x2 = d2[j] - 1;
    y2 = i;
   }
  }
 }
}

1 comment:

  1. Good job! this is a very elegant implementation. It may be helpful to add to your description the following observation (which you used): The 'i' in the external loop on the rows (line 29) represents the lower end of the currently tested rectangles. Namely, after each iteration of the external loop, it is guaranteed that the currently established rectangle (x1,y1)x(x2,y2) has the maximal area among all rectangles ending on row 'i'.

    ReplyDelete