Tuesday, February 16, 2010

Solving Sudoku with MATLAB


Human puzzle-solvers and computer programs use very different Sudoku-solving techniques. The fascination with solving Sudoku by hand derives from the discovery and mastery of a myriad of subtle combinations and patterns that provide hints about the final solution. It is not easy to program a computer to duplicate these human pattern-recognition capabilities. For this reason, most Sudoku-solving programs take a very different approach, relying on the computer’s almost limitless capacity to carry out brute-force trial and error. That is the approach that I used for the MATLAB® program.

The Sudoku Challenge

As you probably know, solving a Sudoku involves filling in a 9-by-9 grid so that each row, column, and major 3-by-3 block contains all the digits 1 through 9. The initial grid is populated with a few digits, known as clues. In contrast to magic squares and other numeric puzzles, no arithmetic is involved; the elements in a Sudoku grid could just as well be letters of the alphabet or any other symbols.
Figure 1 shows an initial grid. I especially like the symmetry in this example, which is due to Gordon Royle of the University of Western Australia. Figure 2 shows the solution.
Figure 1. An example puzzle, with the clues shown in blue. This example has especially pleasing symmetry. Figure 2. The completed puzzle. The other digits have been inserted so that each row, column, and major 3-by-3 block contains the digits 1 through 9.
Figure 1. An example puzzle, with the clues shown in blue. This example has especially pleasing symmetry. Figure 2. The completed puzzle. The other digits have been inserted so that each row, column, and major 3-by-3 block contains the digits 1 through 9.

Solving Sudoku with Recursive Backtracking

Our MATLAB program uses only one pattern—singletons—together with a basic computer science technique, recursive backtracking.
To see how the program works, we can use a simpler 4-by-4 grid with 2-by-2 blocks. Such puzzles are called Shidoku instead of Sudoku because “Shi” is Japanese for “four.”
Figure 3 shows our first Shidoku puzzle. Figures 4 through 6 show its solution. In Figure 4, the possible entries, or candidates, are represented as small digits. For example, row two contains a “3” and column one contains a “1”, so the candidates in position (2,1) are “2” and “4.” Four of the cells contain only one candidate. These cells are the singletons. This first example can be solved easily by filling in the singletons. In Figure 5, we have inserted one of the singletons and recomputed the candidates. In Figure 6, we have inserted the remaining singletons as they appear to complete the solution.
Figure 3. Shidoku is Sudoku on a  4-by-4 grid. Figure 4. The candidates. The red candidates are the singletons.
Figure 3. Shidoku is Sudoku on a 4-by-4 grid. Figure 4. The candidates. The red candidates are the singletons.
Figure 5. Inserting the singleton “3” and recomputing the candidates. Figure 6. Insert the remaining singletons to complete the puzzle.
Figure 5. Inserting the singleton “3” and recomputing the candidates. Figure 6. Insert the remaining singletons to complete the puzzle.
An easy puzzle could be defined as one that can be solved by just inserting the singletons. Within this definition, our first example is easy, but the next example is not.
The input array for the puzzle shown in Figure 7 is generated by the MATLAB statement
X = diag(1:4) 
Figure 7. shidoku(diag(1:4))
Figure 7. shidoku(diag(1:4))
Because there are no singletons in this puzzle (Figure 8), we will use recursive backtracking. We select one of the empty cells and tentatively insert one of its candidates. We have chosen to consider the cells in the order implied by MATLAB one-dimensional subscripting, X(:), and try the candidates in numerical order. So we insert a “3” in cell (2,1), creating a new puzzle (Figure 9). We then call the program recursively.
Figure 8. The candidates. There are no singletons. Figure 9. Tentatively insert a “3” to create a new puzzle. Then backtrack.
Figure 8. The candidates. There are no singletons. Figure 9. Tentatively insert a “3” to create a new puzzle. Then backtrack.
The new puzzle is easy; the result is shown in Figure 10. However, this solution depends upon the choices that we made before the recursive call. Other choices could produce different solutions. For this simple diagonal initial condition, there are two possible solutions, which happen to be matrix transposes of each other. Since the solution is not unique, the grid in Figure 7 is not a valid puzzle.
Figure 10. The resulting solution. This solution is not unique; its transpose is another solution.
Figure 10. The resulting solution. This solution is not unique; its transpose is another solution.

Existence and Uniqueness

As mathematicians, we seek to prove that a solution to a problem exists, and that it is unique. With Sudoku, neither existence nor uniqueness can easily be determined from the initial clues. For example, with the puzzle shown in Figure 1, if we were to insert a “1”, “5,” or “7” in the (1,1) cell, the row, column, and block conditions would be satisfied but the resulting puzzle would have no solution. It would be very frustrating if such a puzzle were to show up in your newspaper.
Backtracking generates many impossible configurations. Our program terminates the recursion when it encounters a cell that has no candidates. Such puzzles have no solution.
Uniqueness is an elusive property. Most descriptions of Sudoku do not specify that there must be only one solution. Again, it would be frustrating to discover a solution different from the one given. Some of the puzzle-generating programs on MATLAB Central do not check uniqueness. The only way that I know to check for uniqueness is to exhaustively enumerate all possible solutions.

The Sudoku-Solving Algorithm

Our MATLAB program involves just four steps:
1. Fill in all singletons.
2. Exit if a cell has no candidates.
3. Fill in a tentative value for an empty cell.
4. Call the program recursively.
The key internal function is candidates. Each empty cell starts with z = 1:9 and uses the numeric values in the associated row, column, and block to zero elements in z. The nonzeros that remain are the candidates. For example, consider the (1,1) cell in Figure 1. We start with
z = 1 2 3 4 5 6 7 8 9 
The values in the first row change z to
z = 1 0 0 0 5 6 7 8 9 
Then the first column changes z to
z = 1 0 0 0 5 0 7 0 9
The (1,1) block does not make any further changes, so the candidates for this cell are
C{1,1} = [1 5 7 9]

A Difficult Puzzle

The puzzle shown in Figure 1 is actually very difficult to solve, either by hand or by computer. Figures 11 and 12 are snapshots of the computation. Initially, there are no singletons, so the first recursive step happens immediately. We try a “1” in the (1,1) cell. Figure 11 shows how the first column is then filled by step 22. But we’re still a long way from the solution. After 3,114 steps, the recursion puts a “5” in the (1,1) cell, and after 8,172 steps, it tries a “7.”
Figure 11. The situation after just 22 steps towards the solution of Figure 1. Values colored cyan are tentative choices made by the backtracking, and values colored green are the singletons implied by those choices. The “1” is the wrong choice for the (1,1) cell. Figure 12. After 14,781 steps, we appear to be close to a solution, but it is impossible to continue because there are no candidates for cell (1,9). The “7” is the wrong choice for the (1,1) cell.
Figure 11. The situation after just 22 steps towards the solution of Figure 1. Values colored cyan are tentative choices made by the backtracking, and values colored green are the singletons implied by those choices. The “1” is the wrong choice for the (1,1) cell. Click on image to see enlarged view. Figure 12. After 14,781 steps, we appear to be close to a solution, but it is impossible to continue because there are no candidates for cell (1,9). The “7” is the wrong choice for the (1,1) cell. Click on image to see enlarged view.
Figure 12 shows the situation after 14,781 steps. We appear to be close to a solution because 73 of the 81 cells have been assigned values. But the first row and last column, taken together, contain all the digits from 1 through 9, so there are no values left for the (1,9) cell in the upper right corner. The candidate list for this cell is empty, and the recursion terminates. Finally, after 19,229 steps, we try a “9” in the first cell. This “9” is a good idea because less than 200 steps later, after 19,422 steps, the program reaches the solution shown in Figure 2. This is many more steps than most puzzles require.

Solving Sudoku Using Recursive Backtracking

function X = sudoku(X)
% SUDOKU  Solve Sudoku using recursive backtracking.
%   sudoku(X), expects a 9-by-9 array X.
% Fill in all “singletons”.
% C is a cell array of candidate vectors for each cell.
% s is the first cell, if any, with one candidate.
% e is the first cell, if any, with no candidates.
[C,s,e] = candidates(X);
while ~isempty(s) && isempty(e)
X(s) = C{s};
[C,s,e] = candidates(X);
end
% Return for impossible puzzles.
if ~isempty(e)
return
end
% Recursive backtracking.
if any(X(:) == 0)
Y = X;
z = find(X(:) == 0,1);    % The first unfilled cell.
for r = [C{z}]            % Iterate over candidates.
X = Y;
X(z) = r;              % Insert a tentative value.
X = sudoku(X);         % Recursive call.
if all(X(:) > 0)       % Found a solution.
return
end
end
end
% ------------------------------
function [C,s,e] = candidates(X)
C = cell(9,9);
tri = @(k) 3*ceil(k/3-1) + (1:3);
for j = 1:9
for i = 1:9
if X(i,j)==0
z = 1:9;
z(nonzeros(X(i,:))) = 0;
z(nonzeros(X(:,j))) = 0;
z(nonzeros(X(tri(i),tri(j)))) = 0;
C{i,j} = nonzeros(z)’;
end
end
end
L = cellfun(@length,C);   % Number of candidates.
s = find(X==0 & L==1,1);
e = find(X==0 & L==0,1);
end % candidates
end % sudoku

The Origins of Sudoku

Most people assume that Sudoku originated in Japan. In fact, it is an American invention. It first appeared, with the name Number Place, in Dell Puzzle Magazine in 1979. In 1984, a Japanese publisher, Nikoli, took the puzzle to Japan and gave it the name Sudoku, which is a kanji acronym for “numbers should be single, unmarried.” The Times of London began publishing the puzzle in 2004, and it was not long before it spread to the U.S. and around the world.

EMBEDDED DEBUGGING

EMBEDDED SYSTEM DEBUGGING

Debugging tools
Application Debugging: Simulators and emulators are two powerful debugging tools which allow developers to debug (and verify) their application code. These tools enable programmer to perform the functional tests and performance tests on the application code. Simulator is a software which tries to imitate a given processor or hardware. Simulator is based on the mathematical model of the processor. Generally all the functional errors in an application can be detected by running it on the simulator. Since simulator is not actual device itself, it may not be an exact replica of the target hardware. Hence, some errors can pass undetected through the simulator. Also, the performance of an application can not be accurately measured using Simulator (it only provides a rough estimate). Generally most development tools come under an integrated environment, where Editor, Compiler, Archiver, Linker and Simulator are integrated together. Emulator (or Hardware Emulator) provides a way to run the application on actual target (but under the control of a emulation software) hardware. Results are more accurate with emulation, as the application is actually running on the real hardware target.

Hardware Debugging: Developer of an Embedded System often encounters problems which are related to the Hardware. Hence it is desirable to gain familiarity with some Hardware Debugging (probing tools). DVM, Oscilloscope (DSO or CRO) and Logical Analyzer (LA) are some of the common debugging tools, which are used in day to day debugging process.

Memory Testing Tools There are a number of commercially available tools which help programmers to test the memory related problems in their code. Apart from Memory leaks, these tools can catch other memory related errors - e.g. freeing a previously allocated memory more than once, writing to uninitialized memory etc. Here is a list of some freely (no cost) available Memory Testing tools:

Debugging an Embedded System

(a) Memory Faults
One of the major issue in embedded systems could be memory faults. Following types of Memory Fault are possible in a system
(i) Memory Device Failure: Some times the memory device may get damaged (some common causes are current transients and static discharge). If damaged, the memory device needs replacement. Such errors can occur in run time. However such failures are very rare.
(ii) Address Line Failure: Improper functioning of address lines can lead to memory faults. This could happen if one or more address lines are shorted (either with ground or with each other or with some other signal on the circuit board). Generaly these error occur during the production of circuit board, and post-production testing can catch such errors. Some times the address line drivers might get damaged during run time (again due to current transients or static discharge). This can lead to address line faults during run time.
(iii) Data Line Failure Can occur if the data lines (one or more) are shorted (to ground or with each other or with some other signal). Such errors can be detected and rectified during post-production testing. Again, the electric discharge and current transients can damage can damage the data line drivers, which might cause to memory failures during run time.
(iv) Corruption of few memory blocks : Some time a few address locations in the memory can be permanently damaged (either stuck to Low or stuck to High). Such errors are more common with Hard-disks (less common with RAMs). The test software (power on self test) can detect these errors, and avoid using these memory sectors (rather than replacing the whole memory).
(v) Other Faults : Some times the memory device may be loosely inserted (or may be completely missing) in to the memory slot. Also there is a possibility of Fault in Control Signals (similar to Address and Data Lines).

There are two types of sections in System Memory - Program (or code) sections, and Data sections. Faults in program sections are more critical because even the corruption of one single location can cause the program to crash. corruption of data memory also could lead to program crashes, but mostly it only cause erratic system behavior (from which the application could gracefully recover - provided that software design takes care of error handling).

Memory Tests
Following simple tests can detect memory faults:
(a) Write a known patter "0xAAAA" (All odd data bits being "1" and even bits being "0") in to the memory (across all address ranges) and read it back. Verify that the same value (0xAAAA) is read back. If any Odd Data line is shorted (with even data line or with Ground), this test will detect it. Now repeat the same test with data pattern "0x5555". This test will detect any shorting of the even Data line (short with ground or with odd data line). Also, these two test in combination can detect any bad memory sectors.
(b) Write a unique value in to each memory word (across entire memory range). Easiest way to choose this unique value is to use the address of given word as the value. Now read back these values and verify them. If the verification of read back values fails (whereas the test-a passes), then there could be a fault in address lines.

The tests "a" and "b" can be easily performed as part of power on checks on the system. However it will be tricky to perform these tests during run time, because performing these test will mean loosing the existing contents in the memory. However certain systems run such memory tests during run time (once in every few days). In such scenarios, the tests should be performed on smaller memory sections at a time. Data of these memory sections can be backed up before performing the test, and this data can be restored after test completion. Tests can be run one by one on each section (rather than running the test on entire memory at a time).

(b) Hardware Vs Software Faults
In Embedded System, Software is closely knit with the Hardware. Hence, the line dividing the Hardware and Software issues is very thin. At times, you may keep debugging you software, whereas the fault may lie somewhere in the Hardware (and vice versa). The problem becomes more challenging when such faults occur at random and can not be reproduced consistently. In order to disect and debug such tricky issues, a step-wise approach needs to be followed.

* Prepare a test case: When you observing frequent application crash because of some unknown issues, you should plan to arrive at a simpler test case (rather than carrying on debugging with entire application). There are two benefits of this approach: A simpler application will take less time to reproduce the error and hence total debugging time is fairly reduced. Secondly, there are less unknown parameters (which might be causing the error) in a smaller application, and hence less debugging effort is needed. You can gradually strip down the application (such that the error is still reproducible) to get a stripped down version of the application which can act as a preliminary version of the test case.
* Does the error change with change in operating condition: Does the error change (frequency of error or type of error) with change in the operating conditions (e.g. board temperature, processor speed etc)? If so, then your errors might be hardware related (critical timings or glitches on signals).
* Is the error reproducible: Arriving at a test case which can reliably reproduce the error greatly helps the debug process. A random error (not reproducible consistently) is hard to trace down, because you can never be sure as to what is causing the system failure.
* Keep your Eyes Open: Always think lateral. Do not be inclined towards one possibility of error. The error could be in the application software, in the driver software, in the processor hardware on in one of the interface. Some times there could be multiple errors to make your life miserable - such errors can be caught only with stripped down test cases (each test case will catch a different error). * Mantain Error Logs: While writing the application code, you should add the provision for system log in Debug version (Log can be disabled for the release version). Log of the events, just prior to system crash can tell you a great deal about the possible causes of failure.

(c) POST
Generally most systems perform a POST (Power On Self Test) on start up. POST may be during the pre-boot phase or during the boot phase. POST generally includes tests on memory and peripherals of a system.

PERIPHERALS in embedded


PERIPHERALS
Peripherals (of a processor) are its means of communicating with the external world.
(1) Peripheral Classification
Peripherals can be classified based on following characteristics
Simplex, Duplex & Semi Duplex
Simplex communication involves unidirectional data transfers. Duplex communication involves bi-directional data transfers. Full Duplex interfaces have independent channels for transmission and reception. Semi-duplex communication involves data bi-directional data transfers, however at a given time, the data transfer is only possible in one direction. Semi-duplex interfaces involves the same communication channel for both transmission and reception.
Serial Vs Parallel
Serial peripherals communicate over a single data line. The data at Tx end needs to be converted “Parallel to Serial” before transmission and the data at Rx end needs to be converted “Serial to Parallel” after reception. Serial peripherals imply less signal lines on the external interface and thus reduced hardware (circuit board) complexity and cost. However the data rate on serial interfaces are fairly limited (as compared to the parallel interface). At the same clock rate, parallel interface can transfer Nx data, as compared to the serial interface (where N is the number of Data lines).
Synchronous Vs Asynchronous
Synchronous transfers are synchronized by a reference clock on the interface. This clock signal is generally provided by one of the devices (who are communicating) on the interface, called master device. However clock can also come from an external source.
Data Throughput
Interfaces can also be classified based on the data throughput they offers. Generally parallel interfaces provide much more data throughput and are used for application data (this data needs to be processed by the application). Serial interfaces offer less data throughputs, and are generally used to transfer intermittent control data.
(2) Common Serial Peripherals
(a) UART (Universal Asynchronous Receiver Transmitter)
UART is one of the oldest and most simple serial interface. Generally UART is used to tranfer data between different PCBs (Printed Circuit Boards). These PCBs can be either in the same system or across differnt systems. In its simplest configuration, UART consists of two pin interface. One pin is used for Transmission, and other for Reception.
The data on UART is transferred word by word. A word consists of Start Bit, Data bits (5 to 8), (and optional parity bit) and (1, 1.5 or 2) Stop Bit. The individual bits of data word are transferred one by one on the serial bus.



Start Bit: The Tx Line of a UART Transitter is high during periods of inactivity (when no communication is taking place). When the transmitter wants to initiate a data transmission it sends one START bit (drives the Tx line low) for one bit duration.
Data Bits: Number of data bits can be configured to any value between 5 and 8. UART employs LSB first Transmission.
Parity Bit: One parity bit can be optionally transitted along with each data word. The parity bit can be configured either as Odd or as even.
Stop Bit: After each word transmission, transmitter transmits Stop bits (drives the Tx line high). Number of stop bits can be configured as 1, 1.5 or 2.
Asynchronous Transmission: UART data transfers are asynchronous. The transmitter transmits each bit (of the word being transmitted) for a fixed duration (defined by baud rate). The receiver polls the value of transmit line (of transmitter). In order to be able to receive the data correctly, receiver needs to be aware of the duration for which each bit is transmitted (it is defined by baud rate).
Baud Rate: Baud is a measurement of transmission speed in asynchronous communication. It is defined as the number of distinct symbol changes made to the transmission media per second. Since UART signal has only two levels (high and low), baud rate here is also equal to the bit rate.
RS-232 and DB-9
UART can be used to transfer data directly across any two devices. However the most common usage of UART involves transfer of data from a PC (or other host computer) to a remote board (other slave device). Under such scenarios (where distance between two devices is more than a few inches), physical interface between Tx and Rx devices is defined by RS-232 specifications. Signals at each end are terminated to a 9-pin (DB-9) connector.
Debugging UART Interface
Following steps could be helpful while debugging communication problems on a UART interface
(a) UART loop-back: Run the internal loop-back tests on both Rx and Tx (most UART devices provide this functionality). This will ensure that each device is functional (not damaged)
(b) Check the Configuration: If the communication between two devices is failing, there could be a configuration mismatch between Tx and Rx. Cross-check the configuration at both sides and ensure that it is identical.
(c) Check the Serial Cable: Generally two UARTs are connected through a serial cable (which has 9-pin connectors on both sides). The cable should be a cross-over (Tx on one side connects to Rx on other side). A faulty (damaged or wrong corssings) serial cable can also cause erratic behavior. Make sure that cable is not damaged.
(d) Probe the Tx signal: If UART communication still remains erratic (after checks a, b and c), the last resort would be to probe the UART signals using a scope.
Limitation: Both the sender and receive should agree to a predefined configuration (Baud Rate, Parity Settings, number of data and stop bits). A mismatch in the configuration at two ends (Transmitter and Receiver), will cause communication failure (data corruption). Data rates are very slow. Also, if there are more devices involved in communication, the number of external pins needed on the device increase proportionally.