
Spracklens interview

Posted: Wed Apr 08, 2020 10:20 am
by Sargon72
Hello All,

In the magazine Modul 4/1990 there is a very interesting interview with the Spracklens about dual-CPU machines like the EAG 2265-5.

Here it is, translated with Google. Enjoy!

TWO IS BETTER
THE SPRACKLENS ABOUT MULTI-PROCESSOR SYSTEMS

In PLY 1/90, the Swedes report a speed comparison on 24 test positions between an Elite No. 2 and the dual-processor version. The result: the Elite No. 5 ranged from 8% slower to 118% faster than the uniprocessor version! (The average speed increase was 59%.) Where do these widely varying values come from? Dan and Kathe Spracklen themselves answer this question in the operating instructions. We found their brief introduction to the problems of multiprocessor systems so interesting that we asked Fidelity for permission to reprint it, which was granted. We thank the company for its kindness.

***

If you let the Elite No. 5 calculate the same positions first in single-processor mode and then in dual-processor mode, you will find that the dual-processor version is about 1.7 times as fast as the single processor. You may wonder why it is not exactly twice as fast when two processors share the work. Below, therefore, are a few remarks on how the new Elite dual processor works.

Let's start with a little terminology! The processor that registers the keystrokes on a chess computer and lights the LEDs is called the "master". The processor that assists it in selecting a move is called the "slave". The process by which a move is selected is called the "search". The task of the master is to distribute the search, i.e. to perform the search work together with the slave and to decide on the selection of the best move. Most positions offer many possible moves, and the search must decide which is best.

A simple and obvious way of dividing the search would be to let the master and the slave each work on one move. As soon as either of them finished its move, it could start on the next move on the list, and so on until all moves have been examined. This method is quite workable and yields a speed increase of about 1.4 compared with a single processor working alone. Why so little? The reason lies in the nature of the alpha-beta algorithm. Without going into a long discussion of this method, it can be said simply that the first move examined takes significantly more time than any of the others. The reason is that examining the first move yields an evaluation that can be used to cut short the processing of the following moves. If the move list is split, with master and slave working on different moves, then both processors lack this comparison value and are condemned to long, laborious work. That means a lot of wasted effort, and the end result is anything but impressive. Nor would it help to let the slave sit idle and wait for the master to finish the first move; the result would be roughly the same.
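How much the first move's evaluation shortens the rest of the search can be seen in a toy sketch (our own illustration in Python, not Fidelity's code): the same alpha-beta search over the same game tree, once with each node's best move sorted first and once sorted last.

```python
import random

def alphabeta(tree, alpha, beta, counter):
    """Fail-soft negamax alpha-beta over a nested-list game tree.
    A leaf is a number (its score); an inner node is a list of subtrees."""
    if isinstance(tree, (int, float)):
        counter[0] += 1                      # count evaluated leaves
        return tree
    best = float("-inf")
    for child in tree:
        best = max(best, -alphabeta(child, -beta, -alpha, counter))
        alpha = max(alpha, best)
        if alpha >= beta:                    # cutoff: remaining moves skipped
            break
    return best

def minimax_value(tree):
    """Exact value, no pruning (used only to sort the toy tree)."""
    if isinstance(tree, (int, float)):
        return tree
    return max(-minimax_value(c) for c in tree)

def reorder(tree, best_first):
    """Sort each node's moves so the best (or worst) reply comes first.
    The mover prefers children whose own value is LOW (score = -child value)."""
    if isinstance(tree, (int, float)):
        return tree
    kids = [reorder(c, best_first) for c in tree]
    return sorted(kids, key=minimax_value, reverse=not best_first)

random.seed(42)
def make_tree(depth, width=5):
    if depth == 0:
        return random.uniform(-50, 50)
    return [make_tree(depth - 1, width) for _ in range(width)]

tree = make_tree(3)
good, bad = [0], [0]
v1 = alphabeta(reorder(tree, True), float("-inf"), float("inf"), good)
v2 = alphabeta(reorder(tree, False), float("-inf"), float("inf"), bad)
# Both orderings give the same value, but the good ordering evaluates
# far fewer leaves -- that saving is exactly what a naive move split loses.
```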

A far more effective method of splitting the search was suggested by Professor Tony Marsland of the University of Alberta. His method, which we also use, is based on carrying out the lengthy calculation of the first move together, and only then dividing up the remaining moves. With this method, the master gives the slave the task of working on certain parts of the search tree needed to calculate the first move. The result is the remarkable 1.7-fold speed-up achieved by our dual processor.

"Overhead" means the loss of time or power that occurs when using a certain programming method. In a multiprocessor system there are four types of "overhead", all of which contribute to the fact that the ideal speed increase by a factor of 2 cannot be achieved.

1. COMMUNICATION: This is the time lost because the two processors have to contact each other and exchange information. One processor has to compose messages informing the other about the direction and the results of the search work. The other processor must receive these messages and react to them. This is called the "COMMUNICATION OVERHEAD".

2. SYNCHRONIZATION: To understand this term, consider what happens when the search is almost over. Suppose the master is working on the last move and the slave on the next-to-last move. One of the two will finish earlier than the other and will then be "unemployed". The time a processor spends idle waiting for the other is called the "SYNCHRONIZATION OVERHEAD".
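The idle time is easy to quantify in a toy model (our own sketch; the per-move search times are invented): each processor grabs the next unexamined move as soon as it is free, and at the end one of them waits for the other.

```python
def split_dynamic(costs):
    """Two processors take moves from the list as they become free.
    Returns (wall_time, idle_time); idle_time is the synchronization
    overhead -- the time one processor spends waiting at the end."""
    master = slave = 0.0
    for c in costs:
        if master <= slave:   # whichever is free first takes the next move
            master += c
        else:
            slave += c
    wall = max(master, slave)
    idle = wall - min(master, slave)
    return wall, idle

# Made-up move times: the first move dominates, as alpha-beta predicts.
costs = [60, 10, 8, 7, 6, 5]
wall, idle = split_dynamic(costs)
speedup = sum(costs) / wall
# Master spends all 60 units on the first move; the slave finishes the
# other five moves in 36 units and then idles for 24 -- speedup only 1.6
# instead of 2.0, even before any other overhead is counted.
```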

3. DIVIDED SEARCH: Think back to what we said about dividing the search work by the method "you take one, I take one". We mentioned that the reason for the relatively low efficiency of this method is that the evaluation found when examining the first move speeds up the processing of the following moves. The better this first value, the faster the other moves are dealt with. For example, if our first move wins the opponent's queen, we need not spend long on other moves that win less material or none at all. If, on the other hand, our first move merely places a piece a little better, then we have to examine the other move options in more detail.

Thanks to the constant sorting of the move list, the first move is in many cases also the best. Sometimes, however, a move lower on the list turns out to be better. When this happens, the evaluation found for the first move is less useful for processing the other moves than the evaluation of the new move. The processor that found the better move will therefore immediately continue with the new evaluation. The other processor, however, has to carry on with the old value until it has finished processing its current move. The performance loss arising because the new evaluation is not immediately available to both processors is called the "DIVIDED SEARCH OVERHEAD".

An interesting side effect of this overhead shows up in tests based on solving mate problems. The idea behind using mate problems to measure the performance of chess computers is that there is a clear but hard-to-find winning move. The tester can measure the time the computer needs to find the winning move, which gives a fairly reasonable measure of how quickly and efficiently the chess computer works. Mate problems, by their nature, are positions in which the best move almost never coincides with the first move examined. These are positions with an artificially inflated divided search overhead! For this reason, programmers who want to test the performance of their divided search should use positions from master games rather than mate problems, on which a multiprocessor is not at its best.

4. SPLIT HASH TABLES: A hash table is a large memory area in which a processor can store information gained during the search. If the same position reappears in another variation, the processor can look up the result in the hash table instead of having to start the calculation all over again. Hash tables are particularly effective in endgames, where the search runs much faster with their help than without. The reason is that repetitions of positions occur especially often in the endgame. The "SPLIT HASH OVERHEAD" arises because each processor has its own hash table. The master cannot look into the slave's table, nor the slave into the master's. If one of the two reaches a position that is already stored in the other's hash table, it still has to start the calculation from scratch. In the opening, the middlegame and even the early endgame this makes no decisive difference. But when the number of pieces on the board shrinks considerably, the hash table overhead can drastically reduce the effectiveness of a multiprocessor system. In some cases, dual-processor systems even take longer on pawn endgames than single processors!
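The effect of the table, and the cost of splitting it, can be seen in a small sketch (our own Python model, not the Elite's table): a "move" removes one piece, so different move orders transpose into the same reduced position, just as captures do in a real endgame.

```python
def search(pos, table, counter):
    """Toy endgame search over a tuple of piece values. 'table' (a dict)
    plays the hash table; 'counter' counts positions actually computed."""
    if pos in table:                     # hash hit: no recomputation needed
        return table[pos]
    counter[0] += 1
    if not pos:
        score = 0
    else:
        # a "move" removes one piece; all capture orders transpose
        score = max(v + search(pos[:i] + pos[i + 1:], table, counter)
                    for i, v in enumerate(pos))
    table[pos] = score
    return score

pieces = (1, 2, 3, 4, 5, 6)

shared = {}
master_work = [0]
total = search(pieces, shared, master_work)   # master fills the table

slave_shared = [0]
search(pieces[1:], shared, slave_shared)      # SHARED table: pure lookup

slave_split = [0]
search(pieces[1:], {}, slave_split)           # own empty table: starts over
# The master computes each of the 2^6 = 64 reachable positions exactly once.
# With a shared table the slave's query costs nothing; with a split table
# it recomputes all 2^5 = 32 sub-positions from scratch.
```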

It was precisely this crippling effect of the SPLIT HASH OVERHEAD that prompted us to search with all our might for a way out. While keeping the sales department at arm's length, we looked for and found a solution: if splitting the hash tables caused the problem, why not work with SHARED hash tables? In this design, only one table would be used (the master's, because it is larger). The drawback would be increased communication overhead, because the two processors would have to exchange more information about the hash tables. In the hardware configuration originally intended for the dual processor, this method would not have been workable, because neither processor had any way of "tapping the other on the shoulder" and saying "I need hash table information" or "I have hash table information for you". Each processor would have wasted so much time waiting for the other to acknowledge its request that the resulting communication overhead would have been worse than the split hash overhead it was supposed to cure. The method could work, however, if the slave had the option of interrupting the master at any time with an "interrupt" to retrieve or store hash table information.

While we fought with the sales department to postpone the deadline, we sat down with the engineering department. And indeed: an ingeniously simple design change would allow us to use interrupts! As they say: the rest is history! Instead of slowing down in the endgame, the dual processor can now proudly point to a speed-up by a factor of 1.5 to 1.6.

Now you have an idea of how hard Fidelity's software and hardware experts have worked to give you even more pleasure in computer chess. We hope that you enjoy the exciting advances in the field of multiprocessors as much as we enjoyed bringing them to you.

Dan & Kathe Spracklen

Kr, Hans