Danny Loeb

The Diplomacy Programming Project

The Observation Module

In the S1995M issue of Diplomatic Pouch, we described the functioning of the strategy finder, that is, the strategic core on which our Bordeaux Diplomat is based. This lower brain of the diplomat does all the calculations and thus forms the basis for the decisions taken by the upper brain, or negotiator part of the diplomat.

The negotiator does not know the details of the map, nor the rules of the game. On the other hand, it must be able to communicate with the other players, distinguish between friends and enemies, form alliances, and so on. To make decisions, the negotiator poses concrete questions to the strategic core, and following the answers to these questions, it chooses the appropriate moves to submit, or messages to send out.

The strategic core does not know what country the diplomat is playing. It is simply a disinterested, objective observer that calculates the best moves for the countries under a given set of constraints, along with the value of the resulting position for each country.

The negotiator determines these diplomatic constraints in two ways:

by negotiating with the other players, and
by observing the moves made on the board

In this article, we will discuss the observation module, which is responsible for this second task. This module has been partially implemented thanks to Arnaud Moulard and Christophe Moustier.

The program was written in LCS (a version of SML developed in Toulouse for parallel processing). It takes as input the adjudicated moves from a single season and a friendliness matrix. The program outputs a revised friendliness matrix. This 7 by 7 matrix of real numbers indicates the love/hate relationship between each pair of players. Positive values indicate friendliness and negative values indicate animosity.

Note that "trustworthiness" is tracked with an independent data structure called StabLimit. StabLimit is the observed minimum gain required for a country to violate its agreements. For example, in the game DipPouch, England stabbed France by opening to the English Channel. If the diplomat judged that such a move gained little for England, then England would be given a low StabLimit. Thus, even if England finally did make peace with France, the program would continue to anticipate that a stab would be made under similar circumstances.
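The StabLimit idea can be sketched in a few lines of Python. This is only an illustration, not the LCS code: the class name `StabLimits`, its methods, and the numeric scale used for "gain" are all assumptions made for the example. The article only specifies that the smallest gain for which a country has been seen to break an agreement is remembered.

```python
import math
from dataclasses import dataclass, field


@dataclass
class StabLimits:
    """Hypothetical record of the smallest gain that made each country stab."""
    limits: dict = field(default_factory=dict)

    def record_stab(self, country, gain):
        # Keep the minimum gain ever observed to trigger a stab.
        current = self.limits.get(country, math.inf)
        self.limits[country] = min(current, gain)

    def expects_stab(self, country, potential_gain):
        # A country never seen stabbing has an infinite limit.
        return potential_gain >= self.limits.get(country, math.inf)


stab_limits = StabLimits()
stab_limits.record_stab("England", 0.5)   # a cheap stab: low limit
stab_limits.record_stab("England", 1.5)   # a later, costlier stab changes nothing
```

With this record, England is expected to stab again whenever at least 0.5 (in the assumed units) is on offer, whatever the current friendliness value says.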

The following factors are used to update the Friendliness Matrix. (Factors not yet implemented are in parentheses.)

Violation of Specific Agreements. Agreements not to occupy a certain space DMZ ..., to make certain moves XDO ..., or not to make certain moves NOT(XDO ...) can be verified automatically. A bonus is given when the agreement is respected. In case of violation, a penalty is given. (The "ambassador" process for the nation in question is also signalled so that it can ask for an explanation. The bonus achieved by stabbing, or forgone by refusing to stab, is computed and stored as STABLIMIT data, and friendliness is updated accordingly.)
(Violation of General Agreements such as "peace" PCE, or absence of agreement in the case of a silent partner. Use the strategy finder to check whether the moves correspond better to those expected when the countries are allied or to those expected when they are not.)
Coordinated Moves. If a country supports or convoys the units of another country SUP, CNV, CTO, this indicates that the two countries are probably coordinating their moves, and a bonus is awarded. If a support or convoy was refused NSO, then this may indicate a possible stab, and a small penalty is given.
Attacks. Penalties are given for all conflicts, according to whether support was CUT, units bounced BNC, or the attack totally failed FLD. An especially big penalty is given if the defender is forced to retreat RET.
Movement. If an attempt is made to move, convoy, retreat, build or destroy a unit of player X, then the unit's distance to the nearest supply center of player Y before the order is compared with its distance after the attempted order. The distance is calculated according to whether the unit is an army or a fleet. In the case of units being built or destroyed, the distance is considered "infinite" for a nonexistent unit. This change in distance is filtered (0-1 is a big difference, 4-5 hardly counts at all) and reduced by 50% if the move failed FLD or bounced BNC. The filter function, with negative first derivative and positive second derivative, serves to weight the distance between the unit and the player in question: the larger the distance, the less important the unit. We thus weight successful advances more heavily than unsuccessful attacks.
History. To take into account the history of the game, we add in 70% of the previous friendliness matrix.
(Size. The fact that builds are always considered aggressive and removals are always considered friendly will allow us to concentrate on our most dangerous opponents. However, additional penalties must be given to any country much closer to victory than the other country. This could be measured by the number of supply centers or, better, by how many moves would be needed to capture 18 under ideal circumstances.)
Symmetry. The friendliness is stored in a matrix F. F is replaced with F + F-transpose. This represents the fact that friendliness is a symmetrical relationship: the relationship between X and Y is the same as that between Y and X.
Transitivity. F is then replaced with F plus a small multiple of F^2. This means that the friends of our friends and the enemies of our enemies have a tendency to be our friends, and the friends of our enemies and the enemies of our friends have a tendency to be our enemies.
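The Movement, History, Symmetry and Transitivity factors are simple enough to sketch in Python. The sketch below is not the LCS program. The filter `exp(-d)` is merely one function with negative first derivative and positive second derivative, chosen for illustration, and the names `distance_weight`, `movement_score`, `update_friendliness` and the value of `epsilon` (the "small multiple") are assumptions. Only the 70% history factor and the 50% reduction come from the description above.

```python
import math


def distance_weight(d):
    """Assumed filter: decreasing and convex, so 0 vs 1 matters far more
    than 4 vs 5, and a nonexistent unit (d = infinity) weighs 0."""
    return math.exp(-d)


def movement_score(d_before, d_after, succeeded):
    """Friendliness change of X toward Y caused by one order to a unit of X."""
    threat = distance_weight(d_after) - distance_weight(d_before)
    if not succeeded:       # move failed (FLD) or bounced (BNC)
        threat *= 0.5
    return -threat          # approaching is hostile, withdrawing is friendly


def update_friendliness(previous, observed, history=0.7, epsilon=0.05):
    """Combine this season's observed scores with the previous matrix.
    The diagonal of the result carries no meaning and is ignored."""
    n = len(previous)
    idx = range(n)
    # History: add in 70% of the previous matrix.
    f = [[observed[i][j] + history * previous[i][j] for j in idx] for i in idx]
    # Symmetry: F <- F + F-transpose.
    f = [[f[i][j] + f[j][i] for j in idx] for i in idx]
    # Transitivity: F <- F + epsilon * F^2.
    f2 = [[sum(f[i][k] * f[k][j] for k in idx) for j in idx] for i in idx]
    return [[f[i][j] + epsilon * f2[i][j] for j in idx] for i in idx]
```

In a three-player toy case where A is friendly to B and B to C (or A hostile to B and B hostile to C), the F^2 term gives A and C a small positive entry even though they never interacted directly.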

We thus calculate the friendliness matrix. We use so many different criteria so that it is as difficult as possible to fake true friendship or true war. Alliance thresholds are determined from this friendliness matrix in order to divide the seven countries into about 4 "blocs" or "alliances". These blocs are used as parameters of the strategy finder in its calculations.
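How the thresholds are chosen is not spelled out here. One plausible reading, sketched below in Python as an assumption rather than as the method actually used, is to link every pair of countries whose friendliness reaches a threshold, take the connected groups as blocs, and pick the threshold that yields a number of blocs closest to 4. The function names `blocs` and `choose_threshold` are hypothetical.

```python
def blocs(f, threshold):
    """Group countries linked by friendliness >= threshold (union-find)."""
    n = len(f)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if f[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def choose_threshold(f, target=4):
    """Try each matrix value as threshold; keep the one whose bloc count is
    closest to the target (ties go to the higher threshold)."""
    n = len(f)
    candidates = sorted({f[i][j] for i in range(n) for j in range(i + 1, n)},
                        reverse=True)
    best_t, best_gap = None, None
    for t in candidates:
        gap = abs(len(blocs(f, t)) - target)
        if best_gap is None or gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


# Toy symmetric 7 by 7 matrix: three friendly pairs and one loner.
F = [[-0.5] * 7 for _ in range(7)]
for a, b, v in [(0, 1, 0.9), (2, 3, 0.8), (4, 5, 0.7)]:
    F[a][b] = F[b][a] = v
```

On this toy matrix, the three friendly pairs and the isolated seventh country come out as four blocs.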

The rules allow 15 minutes for negotiation after a movement phase. Our observation module requires about 15 seconds to calculate the new matrix, leaving the bulk of the available time to negotiate and calculate new orders.

Danny Loeb
Universite de Bordeaux I
(loeb@delanet.com)
