Asri-unix.460 net.chess utzoo!decvax!ucbvax!menlo70!sri-unix!YODER@USC-ECLB Thu Jan 7 22:53:28 1982 storing positions Something to think about if your concern is saving space in data files is to use symmetries. These are least useful in the opening and become more important as time progresses. What is noteworthy is that while the hacking to save the e.p. and castling bits reduces space usage by about 5.5%, using symmetries will get you a factor of 2, 4, or 16 depending on the situation. There is only one symmetry which is always valid: one can reflect the board so 1st and 8th ranks exchange and reverse the color of all the pieces. Once castling is no longer possible, one can add the reflection about a perpendicular axis (exchanging the rook files). Finally, when there are no pawns on the board, one has the full dihedral group of symmetries on the square board in addition to the reflection with color change (so the full group has 16 elements). It should be noted that for openings the alleged factor of 2 is largely useless, since the extra positions being stored don't generally come up (for example the reflection of the opening position is the same position, but with black to move). For endings, however, it clearly wins to use this method: if you have one pawn on the board, you need only know how to play with the pawn on the a, b, c, or d files to know the ending. You need not separately remember how to play on the e, f, g, or h files (or, for that matter, how to play if the pawn is a different color). Also note that the Huffman scheme is less efficient than simply storing n squares if the number of pieces, n, is 12 or less -- a board with only two kings takes 72 bits to encode, whereas 6*n is at most 72. Finally it should be observed that it may not be necessary to use any bits at all to encode a position if the encoding is implicit. This probably sounds like nonsense, but consider an array of 1000 integers. Where are the bits which encode the numbers 1 through 1000 which are the indices into this array? The same principle can be applied here, in fact I did use it a few years back when I was working on a program to play pawn endings. For a fixed pawn structure (only kings and pawns on the board), a "set of positions" was represented as an array indexed by king one's position of sets of king two's position. (This was Pascal -- In Ada you would want a two-dimensional array of booleans whose indices are the two king positions.) The desired information can be stored as a small number of such sets: e.g. those where white wins, those where advancing a pawn wins etc. This scheme is almost certainly about as compact as you can get provided of course that you wish to know the data being stored for a very large number of positions; a typical application would be a program which completely solved an endgame. Returning to the Huffman scheme: if you expect the average number of bits, k, to encode a "typical" position to be considerably less than 156, one can store them in such a way that the space used per position is about (k + log2(k*m)) where m is the number of positions stored. You use two files: one is a stream of bits which is the Huffman codes for all your positions, concatenated. The other is a hash table (encoded by the hash function of your choice on positions) whose contents are indices into the first file giving the location of the first bit, plus whatever data you wish to associate with the position. This will, however, probably incur a cost for wasted slots in the hash table, since it is a good idea to make it somewhat larger than the number of positions actually present to avoid collision overhead. I would be interested to hear of any other methods people have for saving space in encoding positions, in particular whether the methods above have ever been employed in working chess programs. The question of "how many bits per position are needed" would seem to be 156 for openings and 0 for endings with sufficiently few pieces. With more pieces (up to 12), one can use 6*n bits where n is the number of pieces. With the middlegame, one could try the hash scheme mentioned above, but I'm not sure there is a use for storing lots of middlegame positions. ------- ----------------------------------------------------------------- gopher://quux.org/ conversion by John Goerzen of http://communication.ucsd.edu/A-News/ This Usenet Oldnews Archive article may be copied and distributed freely, provided: 1. There is no money collected for the text(s) of the articles. 2. The following notice remains appended to each copy: The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996 Bruce Jones, Henry Spencer, David Wiseman.