During time the Logelium algorithm improved a lot. The first approach was based on a network of linked cells to model the grid. The tiles are then paths navigating along the cell links. It became clear very quickly that this method is a nice abstraction of many puzzles, but a computer program implementation has no good performance. Therefore bit arrays had been chosen for the basic data structure for the tiles.
The main design ideas of the Logelium algorithm will be explained with the TriTetraTan puzzle as an example. Each kind of puzzle has its own mapping into the Logelium data structure. The mapping is completely generic and defined by the configuration files of the puzzle. In some cases this step is really tricky.
The 18 tiles of the TriTetraTan puzzle consist of the four TriTans A,B,C,D and the 14 TetraTans. The tiles fit into a square grid, where each grid element is cut diagonally in both directions.
Each TriTetraTan tile can have up to eight shapes, which can be generated by rotation and mirror transformations. The number of shapes can be different with other grid types. All shape definitions are already included in the tile definitions of the puzzle, because this information is independent of the form to solve. Furthermore you can easily create a one-sided version of the puzzle by making the mirror shapes a tile of its own.
The next pictures shows the shapes of the tile »Q« of the TriTetraTans.
If the tiles are symmetric, not all shapes are different. The tile»X« has only one shape, »H« has two, »B«, »D«, »I«, »M«, »R«, »S«, »V«, »W« and»Z« have four shapes.
Every shape of a tile can occur in many positions inside the form to solve. Each position will be defined by the tile name, shape number and its X–Y coordinates of the first cell numbered bottom to top and left to right.
Example notation: Q.1@3.1 ( Tile »Q«, Shape 1, Position X=3 Y=1 )
The system of cells and atoms is very flexible and allows modelling of a large class of puzzles. The cells are defined by their column and row index. Within each cell lives at least one atom, in our case mostly four and sometimes two. Every atom is either completely used or not at all.
The picture below shows a solution with TriTetraTans and the numbering of the atoms. The order is bottom to top and left to right and a fix local order inside the cells.
The position of a tile is exactly described by the contained atoms and is represented by a bit array in the Logelium. The 136 atoms of the above form to solve are a bit array too, which is the sum of the bit arrays of the included positions. The following list describes the solution equation. The bit arrays are shown in hexadecimal MSB notation.
Z.1@1.1 A00000F0 00014000
00000000 00000000 00
K.2@1.1 5F00000C 00000000 00000000 00000000 00
Q.1@3.1 00F00001 68000000 00000000 00000000 00
F.1@4.1 000F0000 17000000 00000000 00000000 00
H.1@5.1 0000FF00 00000000 00000000 00000000 00
R.3@3.2 00000002 800017C0 00000000 00000000 00
C.3@5.2 00000000 00F00005 00000000 00000000 00
M.3@6.2 00000000 000C0000 F0000140 00000000 00
G.8@1.3 00000000 0002A800 05C00000 00000000 00
B.1@4.3 00000000 00000030 000F0000 00000000 00
I.6@5.3 00000000 0000000A 00001680 00050000 00
V.3@1.4 00000000 00000000 0A300017 00000000 00
S.2@4.4 00000000 00000000 0000E800 17000000 00
W.4@1.5 00000000 00000000 00000028 0000FA00 00
X.1@2.5 00000000 00000000 00000000 E80005C0 00
D.1@4.5 00000000 00000000 00000000 00C0003C 00
J.2@5.5 00000000 00000000 00000000 003A0003 C0
A.3@5.6 00000000 00000000 00000000 00000000 3F
Each grid type has its own specific arrangement of cells and atoms. All of them can not be shown here. Three dimensional grids are decomposed into a set of two dimensional layers, which then in turn are treated as above. This supports the graphical presentation as tableau too. Sometimes there is more than one choice for modelling a puzzle.
When reading the text file including the form to solve, the corresponding puzzle and grid definition is searched. Then follows a check, whether parameter and form description is correct and valid. A list of cells and its properties is built and all tiles with their shapes are checked and prepared for further use.
All positions of all shapes of tiles are computed and stored as bit arrays. The may result in a significant amount of memory with large puzzles. For each atom there are two reference lists. The main list contains all positions, which touch the corresponding atom. The secondary list contains all positions, where the index of the atom is minimal. This list maps each position uniquely to an atom.
To manage the symmetry of a form, all redundant positions of a selected tile will be cancelled, which can be transformed into each other by rotating and flipping. The selected tile must have the full range of shapes for this purpose. This method works only for two sided tile sets with no duplicate tiles.
In the next step all positions are cancelled, which have no continuation, because these ones are under no circumstances part of a solution. A good example is the scenario when a small area is cut off. To retain a particular position in the valid set we need for each atom at least one more position, which can be set into the form to solve. This method is repeated recursively until a stable subset of positions is left over.
The reduction of the position set is essential to make the Logelium algorithm effective.
Because of the twofold mirror symmetry only the shapes 1+2 of the tile »K« will be used.
Step 2 and 3 result in 1331 positions of the 18 tiles. With step 4 the number is reduced by 191 to finally 1140.
In the very simplest case for each recursion step we look up the smallest free atom index. Then the positions corresponding secondary list are tried to set into the form. With this method we are always sure that all atoms with smaller index are already set, so the positions of the secondary list are sufficient.
Checking of the position is very fast, because only logical AND operations are used, which are pretty basic for computers. After a positive check result the position will be set. This is done with a logical OR operation, which is fast too. The current situation is saved beforehand.
Naturally a solution is detected, if no more free bits are present.
If at a stage all positions of the list and all follow up stages of each one are calculated, the method goes back to the preceding stage to continue with the next list position there.
This method is well known as Backtracking.
You can check concurrently to the main algorithm flow additional conditions, which have to be met all the time. Conditions that define a number that adds up with the positions are usable for that purpose. Consequently this leads to the fact that the condition value for each solution of a form equals the sum of the conditions values for the contained positions. So with every step we can check if the target value can be reached with the remaining partial values.
The most important conditions of this kind are parity conditions and the most used and simplest is the chess parity. The parity value of a position can have positive or negative sign, but the absolute value is related to the tile only. The chess parity is calculated by the sum of atoms on white cells minus the sum of atoms on black cells.
It's very efficient to know conditions that guarantee for an intermediate stage, that it is never part of a solution. Again small cut off areas are the master example. As soon we have such scenario, it makes no sense to continue the recursion.
Despite of the good intention the detection of dead ends is critical. The effort of checks with all the recursion steps will reduce the possible gain of speed. There is a real danger that the dead end detection is finally counterproductive. So the effort and kind of such checks have to be balanced very carefully.
There are some good reasons to modify the order of visited atoms for the next stage. We have an extreme case when there is only one position left for a particular atom. This position is forced then and any other placement of the tile cannot lead to a solution. A generalization of this idea is to look for the atom with the least number of valid positions and continue the process at that point. The result is a fully dynamic selection of the next atom to cover. This approach is well known under the name "method of minimal branching". Note that in this case too the computing effort to find the atom with the minimal positions is significant and need careful attention. Therefore this method is used only up to a definite depth of the search and the rest is processed with the standard algorithm.
There are also reasons to cover some atoms very early, e.g. corners. This can be configured manually and leads to a static order of the selected atoms. The option of static order should always be accompanied by the dynamic order method to avoid bottlenecks.
The processing of the standard backtracking recursion did 69,625,204 successful piece placements.
Detection of 255,233 dead ends led to recursion shortcuts.
A total number of 31 solutions (PDF) were found.
It's an uttermost difficult question, at what point you can trust the result set of a technical computed puzzle solution. We face the fact that in case of a very large number of solutions the correctness of all of them can be answered only by another technical computed process. The same problem occurs with the question, if all found solutions are really unique.
So the fundamental question is multilayered and in the end there are still remaining questions.
At first you have to be pretty sure, that the concept of modelling the puzzle and the concept of the solution algorithm is correct. This is already difficult enough, but is on the level of usual mathematical and logical reasoning. The theoretical proof of correctness is impossible anyway, as we know.
Of course you can make the concept very simplistic, but the need for optimisation enhancements (e.g. the dead-end detection described above) makes the concept quickly complex and hard to comprehend. As I am no immortal god, I want to see the results during my lifetime.
The actual programming transforms the concept into a computer code written in a technical programming language. Everybody who knows a bit about such a process can imagine that many mistakes are possible on the way regardless how careful you work. The most annoying mistakes occur only in special constellations. These are therefore very subtle and can exist a long time unnoticed in the program. Another source of errors may be the computing hardware when running programs a very long time.
What remains is: test, test, test. The means on one hand comparing with manually found solutions of small forms and on the other comparing with results of independent concepts and solutions. This way the credibility of the solver is growing. Because the Logelium algorithm is not symmetric in repect of the forms, cross checks with rotated or flipped forms help a lot. Of course these forms must lead to the same solutions.
At the end of the day we find that we can increase the credibility, but 100% is absolutely impossible. We may not like this fundamental fact, but the problem is based on limitations of practical computability. If we look at the philosophic aspect, the question of credibility of any knowledge aquisition arises. I am convinced that any knowledge get less and less credible with growing complexity. I know that the generalisation from puzzle logic to any kind of logic is a very bold idea.