CHEST = CHEss problem analyST For Copyright notice read file "COPYRIGHT". $Id: README_LONG,v 1.7 1999/12/09 21:58:20 heiner Exp $ Introduction ============ If you want a quick go, try to follow the instructions in "README_QUICK". CHEST is an experimental program solving orthodox chess problems (like "white to move and mate in 5 moves"). It is written in ANSI C, and developed under UNIX. There is no user interface: CHEST just reads an input file and produces plain text on standard output. It is not hard to come up with a program that does the same job as CHEST, but it will be rather slow for non-trivial jobs. CHEST tries hard to be fast but still 100% correct. This document applies to CHEST version 3.19. Overview ======== These chapters follow: - Basic Operation =========================== Using CHEST: - Acceptable Chess Boards - Input - EPD Input - CHEST Native Input - Some Translations English / German - CHEST Options =========================== Explanations: - Job Types - Principles of Operation - Examples for EPD Input - Examples for Native Input - Implementation Highlights - Legend/Acronyms - Bugs and Caveats =========================== About CHEST: - Contact / Feedback - History - Future / Planned Features =========================== Compilation & Configuration: - Compilation - External Manifests - Configuration Manifests Basic Operation --------------- You give a board and a depth. CHEST tells you whether there is a forced mate in (at most) that depth, and if so, how to start that. =============================================================================== Using CHEST Acceptable Chess Boards ----------------------- A main part of the input to CHEST is a (description of a) chess board. A "legal" chess board is one, which could occur in a legal chess game (the moves may be arbitrarily stupid, but must be legal according to the rules). CHEST will accept every legal chess board, and some illegal ones. Proving or testing the strict legality of a board would imply the construction of a legal game leading to the board in question. That kind of backward analysis is called "retro analysis", and is (by definition) not part of CHEST's job. Retro analysis is not easy. See e.g. "http://www-leibniz.imag.fr/CONCUR/phs/Retros/Glos_Mutex.html". Castling- and e.p.-rights must be stated explicitly, CHEST will not invent them. However, CHEST requires the following conditions to always hold: - Each side has at least one, and at most 16 pieces/chessmen. - Each side has exactly one king. - The side to move must not be able to capture the other king. - For each enabled castling the appropriate pieces (king/rook) must be at the necessary board positions. - For an e.p. target there must be present an appropriate pawn with two empty squares behind it. Not enforced, but recommended is: - Pawns must not be placed into the first or last row (line/rank). - There should not be more than 9 (Q+P), 10 (R+P), (B+P) or (N+P). More generally, the set of pieces should be reachable in a legal game. Input ----- CHEST is reading its input either - from a file (if specified on the command line), or - from standard input (which may be your keyboard). An explicit empty filename or the file "-" is considered the standard input. This input is read line by line (max. line length 4K). For each input line leading and trailing white space (blanks and tabs) is stripped off. EPD Input --------- With option -b CHEST expects its input to be EPD. For each EPD input line it will try to solve that job, and if successful produces one EPD line as output. If the job cannot be solved, no output is generated. The following EPD tags are known to CHEST: acn number of nodes visited during analysis acs number of seconds spent in analysis am avoid moves (not yet generated) bm best moves (key moves of solution) ce centi pawn evaluation (see PGN doc) dm direct mate (depth in moves) pm proposed move (first move in "pv") pv principle variation These are generated or omitted by CHEST. All other opcodes are just copied into output (except for "id" which is extended by the first move for "-x"). CHEST accepts some input lines, which are not strictly EPD. If the first (non white) character is '%' recognized as PGN escape, and copied into output ';' recognized as PGN comment, and copied into output '#' recognized as comment, and discarded Empty lines are also discarded. With EPD input (currently) the job type is always normal (forced mate). The depth and style of the job to perform is derived from the EPD input line by using the first matching rule from: - If the "ce"-value is >32000 (should be even), a direct mate is computed. The depth is (32767-CE + 1)/2 moves. 32766 = #1, 32764 = #2 etc. - If the "ce"-value is <-32000 (should be odd), CHEST tries to prove that an unavoidable direct mate follows (lost in N). The depth of the mate is (CE+32767)/2 moves. - If a "dm"-value > 0 is specified, a mate in that depth is done. - If a "pv" is specified and ends with a '#' (to indicate mate) and contains M plies, then either a direct mate in (M+1)/2 moves (for M odd), or an unavoidable mate in M/2 (for M even) is done. - If a "ce"-value >600 is given, a mate with default depth is done. - If a "ce"-value <-600 is given, an unavoidable mate with default depth is done. - Otherwise, first a mate with default depth is searched, and if that fails, an unavoidable mate of that depth minus one is done. The default depth can be specified by option -Z. If not given, 5 is used. CHEST Native Input ------------------ Without option -b CHEST expects to recognize its own special input language. If you find the native input language of CHEST to be ugly, you are right, it is. It was never intended to be written by humans, directly, but rather to be produced by some user interface (which never got written). The first non-white character determines the meaning of the rest of the line: none Empty lines are ignored. # Introduces a comment: ignored. > Title line: is copied to the output (with '>' replaced by '#') % ; PGN escape lines and PGN comment lines are copied through. I Input language: the next non-white character determines the language of the (rest of the) input. Currently supported: - 'E' or 'e': English - 'G' or 'g': German - 'D' or 'd': Deutsch (= German) The initial default is "unspecified", which accepts mixtures, but prefers an interpretation in German (sequence of languages as given in "lang.h"). Note, that all FEN, EPD and PGN is always language independant. O Output Language: the next non-white character determines the language of the output. See above for choices. Currently, this influences just the characters used for color and pieces. Messages are still in English. The initial default is German (manifest DFT_LANG in "lang.h"). L Language: sets both, input and output language. It is recommended to define the wanted language to avoid misinterpretations: e.g. 'B' is the English "bishop", but the German "Bauer" (= pawn). f Forsyth line: after optional white space CHEST expects board contents. These are interpreted in the current input language. The board contents are given row by row, starting with row 8, down to row 1 (reading order). Rows may be separated by a slash (/), but need not. Uppercase letters are white pieces, lower case letters are black pieces. Digits are used to indicate a series of empty squares. F FEN/EPD line: after optional white space CHEST expects the first 4 fields of a FEN or EPD line: board, to move, castling and e.p. z Job depth: this is followed immediately by the depth of the job (in full moves), and the side to move. Example: z4w white to move and solve in 4 moves The job depth can be overriden by the -z option. c Enable (one) castling: followed by a colour-code and a castling code (see table below). To enable all 4 castling (in English): cws cwl cbs cbl NB: You must not enable castlings, for which the necessary pieces are not present at their proper place! C FEN/EPD castling, the third field of FEN or EPD. e Enable en passent (e.p.): followed by the algebraic position of a pawn which may be captured e.p., i.e. which has just done a double step. Assuming a white pawn on g5, and the last black move was h7-h5, this would be noted as eh5 E FEN/EPD e.p., the fourth field of FEN or EPD. j Job type: by default CHEST computes an orthodox mate problem. Some others are also implemented: o | n orthodox (normal) O | N stalemate h helpmate H helpstalemate s selfmate S selfstalemate For an explanation see below. . End of Job: if another dot follows (".."), this is also a logical end of input file. The double dot is useful when appending results behind the job in the input file: chest jobfile >> jobfile The double dot prevents the results to be interpreted as input. else Set a single piece: the colour of the piece, the code for the type of piece and its algebraic position. E.g. (in English) bke8 sets a black king on its original position "e8". NB: The king of each colour *must* be set before all other pieces of the same colour. Some Translations English / German ---------------------------------- As all 3 authors are Germans some German terms made it into the source code, and into the native input language. Language dependant codes for colour: w white weiss w b black schwarz s Language dependant codes for pieces: p pawn Bauer b n knight Springer s b bishop Laeufer l r rook Turm t q queen Dame d k king Koenig k PNBRQK BSLTDK Language dependant codes for castling: s short kurz k l long lang l short = king side, long = queen side Some other translations: mate Matt stalemate Patt promotion Umwandlung castling Rochade pin Fesselung CHEST Options ------------- Options are read from two sources - First, the environt variable CHEST_OPTS is scanned. - Then, the command line options are scanned. Options start with a minus sign. Multiple (single char) options may be clustered behind a minus sign. Some of the options are for the programmers of CHEST. For a normal user the following options may be of interest: -? Prints a usage and terminates. -V Prints version and terminates. -c Just checks the input board for legality, and does not compute any job. Even without this option a check for minimal legality is performed on the input board (to avoid crashes). -M N (only for ACM_DYN:) for the TT there are allocated N MB at run time. If N is -1, the compiled in default is used. If N is zero, the allocation and use of a TT is suppressed completely. -b Bulk mode: EPD in and out. See chapter "EPD input" above. This is very useful for test suites, and for large numbers of jobs produced automatically. Inverts option "-r". -l Increment depth of solution tree by one full move (default=0). -L Print complete solution tree (depth unrestricted). -r Suppress the output of the refutation table. Inverted by option "-b". -S Try to shorten (optimize) the solution tree output by using short move notation and using some wildcards: =*= all legal moves -*- any other legal move This option is used together with -l and -L, and is recommended. -u Increment dual suppression level (initially 0). This counts in full moves from the leaves of the tree. "Duals" are alternative solutions. This can greatly reduce the size of (deep) solution trees, especially when used together with -S. -U Complete suppression of duals, except at the top level, where dual computation/printing cannot be suppressed. -x From the given board, enumerate all legal moves, execute each of them, and for all the resulting boards independantly do the specified job (now with the other colour to move). This can be used to analyse "black moves but looses in x moves". -z N Override the depth in the input with N. -Z N Use depth N if no depth is provided by the input (nor -z). =============================================================================== Explanations Job Types --------- What exactly is a "white to move and mate in N moves" job? - Mate in 1 (#1): find all legal moves, which immediately result in a "mate". If there is no such move, the job fails (no solution). A "mate" occurs, iff the side to move is in check and has no legal move. A "stalemate" occurs, iff the side to move is *not* in check, and it has no legal move. - Mate in N (#N) with N>1: a "mate in N" is already solved, if there is a "mate in K" with K Shinkman: mate in 2 LE f 7b/8/Q7/8/1K4n1/8/1k1P4/7R z2w .. ---------- end of input ---------- ---------- begin of output ---------- CHEST version 3.16, 16-Jun-1999 Options = -r -LS Input file: 'i.x.shm' Reading job: # Shinkman: mate in 2 W: Kb4 Qa6 Rh1 Pd2 (4) B: Kb2 Bh8 Ng4 (3) FEN: 7b/8/Q7/8/1K4n1/8/1k1P4/7R w - - analysing (mate in 2 moves): Solution (in 2 moves): Qa6 - e2 Time (user) = 0.01 sec Qe2 Bc3+ dc3+ Bd4 d3+ Ka2 d4+ Ne3 de3+ -*- d3+ d4+ end of solution tree Total Time (user) = 0.02 sec ---------- end of output ---------- ---------- begin of input ---------- > KRK endgame LE wka1 wrh8 bke5 z10w .. ---------- end of input ---------- ---------- begin of output ---------- CHEST version 3.16, 16-Jun-1999 Options = -r Input file: 'i.x.ktk' Reading job: # KRK endgame W: Ka1 Rh8 (2) B: Ke5 (1) FEN: 7R/8/8/4k3/8/8/8/K7 w - - analysing (mate in 10 moves): No solution in 10 moves. Time (user) = 153.56 sec (ca. 2.6 min) ---------- end of output ---------- Implementation Highlights ------------------------- CHEST uses an untypically large board. Therefore, to take back moves an update/downdate mechanism is necessary; copying would not work well. Geometry: While a chess board logically is 2D, CHEST implements it 1D. That is a standard technique (e.g. 0x88). Currently we use a 16 column x 24 rows approach. This allows any legal delta between two non-border squares to be added (as an offset/step) to any legal square index/pointer without probing memory outside the 16x24 array. Per square redundancy: While the minimum necessary information per square can fit into 4 bits, CHEST separates even these few bits (into colour / piece type), and adds several redundant informations. Most notably are two sets of pieces: - those attacking this square directly - those attacking this square indirectly, i.e. they would attack directly if just one piece on the board would be taken off. These 2 sets are 32 bits each (not 64), by not referring to squares directly, but to indices into the global piece table: each side can have at most 16 pieces. All the redundant info in the squares is updated/downdated across moves. This overhead pays off, since the attack info greatly speeds up the generation of legal moves (as opposed to pseudo-legal moves): checks, illegal K-moves and pins are now easily discovered. In fact, the authors of CHEST consider this the most important invention they made for CHEST (around 1988). Initially motivated by legal move generation, it has proven to be highly valuable in many other aspects and heuristic computations. FFS: special move generators FFS: TT FFS: fac, mdb, defender heuristic Legend/Acronyms --------------- EPD Extended Position Description FEN Forsyth-Edwards Notation FFS For Further Study NB Nota Bene OS Operationg System PGN Portable Game Notation SAN Standard Algebraic Notation TT Transposition Table Bugs and Caveats ---------------- - There is no real user interface. - All input and all options are case sensitive. - The statically allocated move lists allow for up to 222 legal moves. The current known maximum is 218 (Dickins 1968): 3Q4/1Q4Q1/4Q3/2Q4R/Q4Q2/3Q4/1Q4Rp/1K1BBNNk w - - 0 1 There is no guarantee that 222 will always be enough, but if not, that would be at least a new world record (if the setup is legal). - CHEST does not strictly enforce a legal set of pieces. E.g. if one side is given 15 queens, CHEST will happily proceed. But now the limit of 222 legal moves may not be sufficient, any more. - CHEST does not yet take advantage of 64-bit operations, if present. Where necessary, they are composed from 32-bit operations. Some counters, which need more than 32 bits, are currently implemented as "double", with an expected accuracy of 53 bit (standard IEEE). - Configuration for the local machine is not (yet) automatic. FFS. - CHEST does not use any brand of endgame tables. There are plans to do so, but these are not yet implemented. An earlier experimental implementation was discarded by the author, since he was not satisfied with the effective compression. - External names are not always unique in their first 7 characters. This is not exactly maximal portability, but frankly: I don't care much. - CHEST has never been confirmed to be correct. Hence, do not blindly believe its results: you may stumble upon a bug. In fact, in 1998 I corrected a bug which were introduced in 1991, and never showed up in any test... but a bug is a bug. Most probably there are still more. =============================================================================== About CHEST Contact / Feedback ------------------ Real Name: Heiner Marxen e-mail: look@WebPageOfHeiner Web: http://turbotm.de/~heiner/ - I regularly read (and post at) CCC: "http://www.ICDChess.com/ccc/". New releases will be announced there. - I have left rec.games.chess.computer. - I want to get every bug report. IF it really is a bug. - Although CHEST is developed under UNIX, exclusively, it is intended to be portable to every platform, which supports ANSI-C. If you find and fix portablity problems, I would like to know the fix. - I will happily answer (polite) questions, as far as I find the time to do so. That may take some time. But, - Please do not send complete hacked versions. Instead send diffs, and explain them. In order to get your changes into the next release, you have to convince me. - I will not maintain different versions of CHEST for different platforms or whatsoever. There will always be one version (with compile time options (manifests)). So your proposed changes better be portable. History ------- 1973 Heiner Marxen writes a "mate in 2" solver in Fortran II. 1978 Solving "mate in N", written in IBM/370 Assembler. Adding selfmate, using complex kind of TT. 1983 Complete redesign in K&R C. Another kind of TT. 1987 Holger Pause and Thomas Rakovsky join the team. Complete redesign from scratch, K&R C. Adding the concept of direct/indirect attacks. Name is now CHEST. Some very productive years follow. 1988 Mate move generator. 1989 Defender heuristics. 1991 Snooping (ETC). 1993 Yet another kind of TT, the most simple design is really best. 1994 Switching to ANSI C. Adding special code for 2-mate candidates. 1996 Experiments with more attack info and lazy computation. Fails. 1999 First semi-public hand out to C.A.P. Implementing EPD mode, and going public. Future / Planned Features ------------------------- - endgame table bases - HTML output, board pictures - some kind of mate finder mode - more remis detection - test suite / self test - better documentation (also HTML), FAQ Don't hold your breath. =============================================================================== Compilation & Configuration Compilation ----------- Currently, there is no portable way to compile CHEST without help from you. [ You may succeed following the instructions in "README_QUICK". ] There is a "Makefile" template in "Templates/Makefile.simple", which may work for you, if you are on some kind of UNIX (e.g. Linux). Basically, CHEST is ANSI-C, and should be portable to any platform with an ANSI-C compiler. There is also some help built in for Windows NT: that should work also, if you figure out, how to call the compiler, and ... There is one important peculiarity: one of the header files, "lcltyp.h", is not distributed, but rather computed locally. It shall reflect properties of your local machine and your compiler. In "lcltyp.h" there are typedefs for integral types suitable to contain 8, 16 and 32 bit. In order to locally derive a suitable "lcltyp.h", you do: - compile the program "mklcltyp.c", - execute this program, - and save its output as "lcltyp.h". The Makefile template does this in the UNIX style. There are more computed header files (e.g. "inited.h"), but their content is portable and therefore distributed. The main points to configure in the Makefile are: - CC: the C compiler to use - ANSI: if your CC needs an option to do ANSI-C, it goes here. - OPTIM: the optimizer flags to use. Normally you want to optimize. - ACM_MEGS: the default number of MB to be used by CHEST for the TT. For "achest" that amount of memory is statically allocated, i.e. it cannot be changed by an option. For "dchest" this is just the default, and with option -M you can change your mind. CHEST itself will need 120-150 KB for the code, and 600 KB for data. This, together with the TT should fit into the available real memory. Otherwise, CHEST will become very slow due to paging to/from disk. External Manifests ------------------ Some manifests are recognized by CHEST somewhere in its sources, but are never defined by CHEST. They are expected to be set from the compiler or in the Makefile by a porting guru. __STDC__ Is used to decide for ANSI-C. If not defined, you'll most probably have problems to compile and run CHEST successfully. unix defined iff the hosting OS is some UNIX (Linux counts, NT does not) BSD if TRUE we expect the hosting OS to be a Berkley-derived version of UNIX. Used for bzero/bcopy/bcmp and getrusage. USG if TRUE indicates a System V. Ignored if defined(__STDC__). Used for timing by times(2). HP9000 Used to avoid an annoying habit of some HPUX compiler/loader. _WIN32 Indicates Borland/Inprise or Microsoft compiler. Causes special markers at main() and panic(). __cplusplus Only used if BSD. FFS: del? Configuration Manifests ----------------------- Most configuration manifests are for gurus, and not (yet) explained here. From the command line (-D option to cc) or from the Makefile: ACM_MEGS how many MB shall be allocated statically (!) for the TT. This should never be larger that you real memory, as otherwise paging/swapping will kill the performance. ACM_NODES The number of entries in the TT. See above. One entry is 40 bytes large (typically). ACM_DYN Causes the TT to be allocated dynamically (at run time). ACM_MEGS/ACM_NODES now is the default, if option -M is not specified (or its value is -1). PROD_LEV Determines the production level. Three values are supported: 0: development version. All features enabled. (default) 1: intermediate version. Basic debugging and basic statistics remain. 2: production version. No debugging, no statistics. The executable is somewhat smaller and faster. There are many more configuration manifests, but they are for gurus. Go read the sources and become a guru... or ignore them.