A Study in Non-Orphaned Games

Mark Nelson

(Reprinted from Everything #93, October 1995)


In this article I present statistics on non-orphaned postal and CompuServe games. The analysed games have been culled from Everything #83 (January-December 1990) through Everything #92 (June 1995). Games played over America OnLine and Internet networks are excluded, in the former case because the sample size (eleven games) is too small. A game is considered orphaned if it had more than one GameMaster and/or if it was GM'ed in more than one zine. However, a game which had one GM and which appeared in two or more zines is not considered to be an orphan game if the change of zine name is "cosmetic". The change is considered to be cosmetic if the publisher either changes the name of the zine while the "nature" of the zine remains more or less the same (for example, The Boob Report to The Abyssinian Prince) or converts smoothly from non-warehouse to warehouse (for example, Penguin Dip to Black Tie Affairs).

Presentation of Statistical Data

Table 1 provides basic statistical data on the games, using the indicator functions introduced in an earlier article. In that article, I analysed non-orphan games reported in Everything #85 (May 1992) through Everything #88 (September 1993). The survey size in this earlier article was 112 postal games and 28 CompuServe games.

There is little difference between the statistics in the earlier article and those in the current article, suggesting that there has been little or no change in postal (or CompuServe) Diplomacy in recent years. The win number has decreased very slightly since the earlier article (where it was 49% and 46% for postal and CompuServe games respectively), but this change does not appear to be significant. The dropout number has also decreased slightly (from 2.86±1.65 and 2.07±1.36 for postal and CompuServe games respectively), but again this does not appear to be significant. The length number has decreased slightly since the first survey, where the figures were 1910±3.45 and 1909.75±3.44 for postal and CompuServe games respectively. The QR ranking for postal games has increased from 3.50 to 3.77, whilst that for CompuServe games has decreased from 4.71 to 4.58.

                 Postal          CompuServe
Games            273             80
Wins             131             35
Draws            142             45
Win Number       48.0            43.7
Av Dropouts      2.60±1.54       2.02±1.30
Av Gamelength    1909.80±3.89    1909.26±3.00
QR               3.77            4.58
Table 1: Basic statistical data on game results, average number of dropouts, average game length and QR rating for non-orphaned games
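
The win number is not redefined in this article. The short Python sketch below assumes it is simply the percentage of surveyed games that ended in a win, which reproduces the figures in Table 1. The dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Counts copied from Table 1.
TABLE_1 = {
    "postal": {"games": 273, "wins": 131, "draws": 142},
    "compuserve": {"games": 80, "wins": 35, "draws": 45},
}


def win_number(wins: int, games: int) -> float:
    """Percentage of games ending in a win (assumed definition)."""
    return 100.0 * wins / games


for medium, row in TABLE_1.items():
    # Every surveyed game ended in either a win or a draw.
    assert row["wins"] + row["draws"] == row["games"]
    print(f"{medium}: {win_number(row['wins'], row['games']):.2f}%")
```

Under this assumption the unrounded values are about 47.99% for postal games and 43.75% for CompuServe games, which Table 1 reports as 48.0 and 43.7.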

Table 2 contains the percentage Calhamer points gained by each power in non-orphaned games. In interpreting such data we normally have to worry about how significant any differences are (are the differences due to random fluctuations?). In this case it is clear that there are considerable differences between the results of postal games and the results of games played on CompuServe: Italy is underachieving in CompuServe games, scoring a dismal 3.75%.

What I find fascinating about this is how little Austria is benefiting from Italy's under-achievement. The beneficiaries are Germany (+3.95%), France (+2.73%), Austria (+0.52%), and England (+0.49%). Note that the Russian Calhamer point score has hardly been affected (-0.007%).

Power      Percentage Calhamer Points
           Postal     CompuServe
Austria    13.540     14.062
England    15.238     15.729
France     17.478     20.208
Germany    12.094     16.041
Italy      11.226      3.750
Russia     14.902     14.895
Turkey     15.519     15.312
Table 2: Percentage number of Calhamer Points achieved in non-orphaned games by each power
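
The shifts quoted in the text can be recomputed directly from Table 2. The sketch below subtracts the postal figure from the CompuServe figure for each power and sorts by the size of the gain. The function name and data layout are illustrative.

```python
# Percentage Calhamer points from Table 2: (postal, CompuServe).
TABLE_2 = {
    "Austria": (13.540, 14.062),
    "England": (15.238, 15.729),
    "France": (17.478, 20.208),
    "Germany": (12.094, 16.041),
    "Italy": (11.226, 3.750),
    "Russia": (14.902, 14.895),
    "Turkey": (15.519, 15.312),
}


def shifts(table):
    """Return (power, CompuServe minus postal) pairs, largest gain first."""
    diffs = {
        power: round(compuserve - postal, 3)
        for power, (postal, compuserve) in table.items()
    }
    return sorted(diffs.items(), key=lambda item: item[1], reverse=True)


for power, diff in shifts(TABLE_2):
    print(f"{power:8s} {diff:+.3f}")
```

The ordering that comes out is Germany, France, Austria, England, Russia, Turkey, Italy, with Italy's shift close to -7.48 percentage points.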

The win numbers for postal and CompuServe games are very similar: 48% and 43.7%. Is there any difference in the type of draws achieved in the two mediums? To answer this question, Table 3 contains the Draw Spectrum for each medium. Clearly the spectra are distinct: there have been no "large" (5-way or greater) draws in CompuServe games and, most importantly, just over three draws in four on CompuServe are two-way.

Table 3: Draw Spectrum for non-orphaned games


Suppose a group of Diplomacy players is randomly distributed into two groups. One plays postal Diplomacy and the other plays over CompuServe. Does the medium of play affect how long it takes to play the game? If the average gamelength varies significantly with the medium then this variation might be sufficient to explain any differences in Calhamer Point scores for the seven powers. For example, games which finish "quickly" are more likely to see Russian, rather than Turkish, wins, so that in a survey of quickly finished games, Russia would score higher, and Turkey lower, than in a survey based on all games.

If the number of original players who drop out is significantly different between the two mediums then this might cause a difference in the structure of results. One might expect that in games with a larger number of dropouts there are smaller draws, either because standby players are more willing to concede draws to the original players, who are fewer in number, or because the original players, who are fewer in number, have better positions because they have not NMR'ed.

In my earlier article, I analysed how the result of postal games varied with the number of original players to drop from the game. The data was scattered, but when presented in the form of rolling averages it suggested that games with few dropouts (0-2 and 1-3) or a large number of dropouts (5-7) were less likely to end in a win than games with an "average" number of dropouts (2-4, 3-5, and 4-6).

Table 1 shows that there is negligible difference in the average game length between the postal (1909.80±3.89) and CompuServe (1909.26±3.00) mediums, so any difference in Calhamer point scores cannot be explained on the basis of differing game lengths. In addition, Table 1 shows little difference in the number of dropouts between the two mediums (2.60±1.54 for postal and 2.02±1.30 for CompuServe). The number of wins is slightly lower for CompuServe games (43.7% against 48.0%) and this is consistent with the analysis of my earlier article (assuming that we can move from a discrete to a continuous analysis). However, the distribution of draws is opposite to that expected: the medium with fewer dropouts has an increased number of small-way draws.
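
As a rough check on how much of the gap between the two win shares (43.7% against 48.0%) could be sampling noise, the sketch below applies a standard pooled two-proportion z-test to the win counts in Table 1. This test is not used anywhere in the article; it is brought in here purely as an illustration, and it assumes the games are independent of one another, which may not hold when the same players appear in several games.

```python
import math


def two_proportion_z(wins_a, games_a, wins_b, games_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    # Pooled win share under the hypothesis that both mediums share one rate.
    pooled = (wins_a + wins_b) / (games_a + games_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / games_a + 1 / games_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Win counts and game counts from Table 1: postal first, then CompuServe.
z, p = two_proportion_z(131, 273, 35, 80)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With these counts z comes out well below 2, so on this simple test the difference in win share between the two mediums is within what chance alone could produce.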

Italy is considered by many to be the most difficult power to play, and this suggests an explanation for the dismal performance of Italy in CompuServe games: Is the selection of postal and CompuServe players equally random?

Many postal players played Diplomacy face-to-face prior to entering The Hobby. For those not having FTF experience, the "long" deadlines of postal play provide an opportunity to study the rulebook, to study the game position, and to play through suggestions mentioned in correspondence. In addition, there is the opportunity to play through games reported in zines that the player receives and to analyse current positions -- this is particularly easy these days since so many zines print game maps.

From my involvement in the Internet Diplomacy community, I know that there are Internet players who start playing only days (perhaps hours!) after discovering Diplomacy. Internet deadlines are "quick"; there is less time to study a position and to consider correspondence. In theory, it is easier to get complete games to play through; in practice, I believe most players (particularly beginners) pay no attention to games other than those in which they are playing. In particular, this means that absolute novices do not look at other games. In theory, Internet players have access to more games than postal players; in practice they access fewer games than postal players -- they do not access games unless they are playing. This means that a complete beginner has nothing to go on -- this would certainly explain some of the strange and bizarre openings that have been reported!

I suspect that something similar happens on CompuServe. It is almost too easy for an absolute novice to enter a game, with no exposure to other games and no expectations. As a consequence the "standard of play" in CompuServe games is lower than that in postal games, and the power which is the hardest to play (Italy) fares worse. However, it is difficult to believe that this could account for such a dramatic decline (-7.48%) in Italy's performance.

It is not presently possible to explain the poor performance of Italy in CompuServe games. What is required is a more in-depth survey of CompuServe games: is Italy scoring poorly because it is being eliminated unusually early in the game? Is Italy reaching four centers and then remaining static for the rest of the game? Is Italy surviving until the endgame but just not scoring results? An analysis based on supply center count variation during the course of each game would be required to answer these questions.

Whatever the explanation for Italy's poor performance, is it possible to explain why the main beneficiaries of weak Italian play are France and Germany? It would be more logical if the main beneficiaries were Austria and Turkey, and perhaps Russia.

In making the comments that I have made, I am assuming that each power has an intrinsic strength which is independent of the player pool. An analysis of British postal games played since 1969 shows that the relative strength of the powers has been constantly evolving. Two explanations are usually offered for this behaviour. First, Diplomacy statistics have often been widely reported in British zines, so the fact that one power has done well in the recent past results in it performing relatively poorly in the near future. Second, since the British Hobby is fairly compact, new ideas in "opening theory" are rapidly disseminated.

Whilst fluctuations of this kind could account for "small" differences between postal and CompuServe results, they cannot account for the large difference observed. I can only conclude that CompuServe players just don't know how to play Italy!

As noted above, the draw spectrum for CompuServe games is very different to that for postal play. My anecdotal explanation for this, which I noticed whilst entering data from Everything, is that CompuServe games are not DIAS. Although I haven't examined any CompuServe game positions, I believe that CompuServe players are agreeing to two-way draws prematurely. They are agreeing to draws when there is still play left in the game. If the standard of players on CompuServe is lower than that in postal games then it may be the case that players are not aware of minority stalemate lines. Additionally, small powers may not be using the threat to throw the game to gain larger draws.

I noted in my earlier article that it would be interesting to examine the effect that DIAS rules have on game results. I also commented that it is difficult to know which zines ran DIAS games. Without knowing the house-rules of the zines, it is still possible to gain an approximate insight into the DIAS effect. Although a draw which includes all survivors has not necessarily been played to DIAS rules, a draw which excludes one or more survivors is obviously not DIAS. By examining the number of survivors not included in the draw it is possible to gain an insight into the difference between DIAS/no-DIAS games. At the moment, my databases do not include this information (I didn't know it would prove interesting!). At a later date I hope to add this data and examine the DIAS/no-DIAS divide.
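
The classification rule just described can be sketched in a few lines of Python. The record layout, the names DrawnGame, excluded_survivors and is_provably_non_dias, and the two sample games are all hypothetical; the sample records are made up for illustration and are not taken from Everything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawnGame:
    """Hypothetical record of a drawn game."""
    survivors: frozenset  # powers still holding supply centers at the end
    in_draw: frozenset    # powers credited with a share of the draw


def excluded_survivors(game: DrawnGame) -> frozenset:
    """Survivors that were left out of the draw."""
    return game.survivors - game.in_draw


def is_provably_non_dias(game: DrawnGame) -> bool:
    """True when at least one survivor is excluded from the draw.

    A False result does not prove the game was DIAS; it only means the
    result is compatible with DIAS rules.
    """
    return len(excluded_survivors(game)) > 0


# Made-up example records, for illustration only.
games = [
    DrawnGame(frozenset({"France", "Russia", "Turkey"}),
              frozenset({"France", "Russia", "Turkey"})),
    DrawnGame(frozenset({"England", "Germany", "Italy"}),
              frozenset({"England", "Germany"})),
]

non_dias = [g for g in games if is_provably_non_dias(g)]
print(f"{len(non_dias)} of {len(games)} draws exclude at least one survivor")
```

Counting the excluded survivors per game, rather than only flagging the game, keeps open the option of tabulating the number of survivors left out of each draw.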


From Table 1 and the discussion above, I conclude that differences in results between postal and CompuServe games are not due to differences in the time taken to complete the game and the number of dropouts.

Any firm conclusions drawn from the data presented in this article are likely to be controversial. At first sight it appears that CompuServe players aren't up to the task of playing Italy, but the poor Italian results may be a result of strategic failure on the part of other powers. Austria is not benefiting greatly from Italy's dismal performance, Russia appears not to be touched by it, and Turkey appears to be suffering slightly -- it may be that CompuServe players don't know how to play Austria, Italy, Russia, and Turkey!

The draw spectrum? CompuServe players are weak-willed and easily dominated; they do not have the fighting spirit, preferring to head home for an early bath rather than to fight to the last breath. Shame on them. Shame.

Mark Nelson
University of Leeds, UK
