This Functionality is Now in {peacesciencer} ⤵️

The processes described here have been included in {peacesciencer}, an R package for the creation of all kinds of peace science data. The strategic_rivalries data frame is still in {stevemisc} as a legacy. A slightly modified version is included in {peacesciencer} as the td_rivalries data. You can add strategic rivalry data to state-year or dyad-year data in {peacesciencer} with the add_strategic_rivalries() function. Please check out the website for {peacesciencer} for updates on its continued development.
Alfred Bettanier's (1887) painting stylizes the Franco-German enmity and its focal point at the 'black spot' of Alsace-Lorraine.

This is another one of those things I find myself doing from time to time in my research so it might be advisable to write it down and remember how to do it.

Rivalries are an important concept in the quantitative study of international conflict. This relates to one of the more important takeaways from the data on inter-state disputes: they’re not randomly assigned across dyads. A few dyads are disproportionately responsible for large conflicts and wars. I use the India-Pakistan dyad in my upper-division conflict course as an illustration of this phenomenon. The dyad was created in 1947 following the contentious partition of British Raj along ill-defined Radcliffe Line. War immediately followed for the cause of territorial consolidation and relations have been tense ever since. Indeed, India and Pakistan have had four wars since their mutual creation and have been in a MID (often a fatal MID) around 70% of their existence. The rivalry scholarship in international relations argues these previous crises lead to the emergence of rivalry relationships in which states view each other as threats and compete against each other. This makes future crises more likely.

There are a number of ways of identifying rivalry relationships. Most classic rivalry scholarship identified rivalries based on past disputes. Diehl and Goertz (2000), for example, develop a rivalry classification that depends on the volume of MIDs in a given window of time. However useful, this “dispute-density” approach uses past dispute history to predict future disputes in a matter not unlike how roll call votes in the past are use to predict roll call votes in the future. Any observed relationship might simply be a tautology.

For that reason, I’ve been to drawn to Thompson’s (2000) and later Thompson and Dreyer’s (2012) “diplomatic history” approach. Herein, researchers look into diplomatic relations to see how states interact with each other, whether diplomatic communiqués treat the other side as threats, and whether military exercises target the other side with that in mind. This approach might be more “perceptive” than the dispute-density approach but it is at least conceptually distinct from the phenomenon we want rivalry to explain (i.e. conflict recurrence).

However, the ensuing data are less of a data set one can download and more a history of rivalry relationships that Thompson and Dreyer summarize in their book. Nevertheless, the appendix of their book, and a few lines of R code, can create some data sets to use in our standard dyad-year modeling approach to conflict onset.

  1. The (Raw) Data
  2. Prep the Data
  3. Create (Non)-Directed Rivalry Year Data

The (Raw) Data

I scanned Thompson and Dreyer’s (2012) book at the appendix and created a spreadsheet that I make available in my {stevemisc} package as a data object titled strategic_rivalries. Alternatively, a .csv of the data is available here. Let’s load that raw data and take a gander at it. The data are purposely minimal right now because it’s a quick scan of the information from the appendix. The goal is to record it in a spreadsheet and extend it later.

# library(tidyverse)
# library(stevemisc)
data("strategic_rivalries")

# alternatively:
# strategic_rivalries <- read_csv("https://gist.githubusercontent.com/svmiller/63dace4aa5a00eddce307d964c7bac23/raw/3941ed5654cff77dc4509ff7b81e664cf8b0875e/strategic_rivalries.csv")

strategic_rivalries %>%
  head(5) %>%
    kable(., format="html",
        table.attr='id="stevetable"',
        caption = "The First Five Rows of the Strategic Rivalry Data",
        align=c("c","l","l", "l", "l", "c","c","l","c","c","c"))
The First Five Rows of the Strategic Rivalry Data
rivalryno rivalryname sidea sideb styear endyear region type1 type2 type3
1 Austria-France Austria France 1494 1918 European GPs spatial positional NA
2 Austria-Ottoman Empire Austria Ottoman Empire 1494 1908 European GPs spatial positional NA
4 France-Spain France Spain 1494 1700 European GPs positional spatial NA
3 Britain-France 1 Great Britain France 1494 1716 European GPs spatial positional NA
5 Ottoman Empire-Spain Ottoman Empire Spain 1494 1585 European GPs spatial positional NA

rivalryno is a unique rivalry number that starts at 1 and ends at 197. rivalryname is the name of the rivalry that Thompson and Dreyer give this particular observation. In most cases, it’s a simple concatenation of the two participants in an alphabetical order. Cases where there are multiple rivalries in a given dyad will have numbers next to them (e.g. Britain-Spain 1, Britain-Spain 2). sidea and sideb are the participants in the rivalry. The rivalry data are non-directed but sidea is whatever country comes first alphabetically. styear and endyear are the start year and end year (respectively) for the rivalry. Ongoing rivalries have a right bound of 2010, but could conceivably be extended to the present year in almost every case.

region is where Thompson and Dreyer code the rivalry as occurring. These regions that Thompson and Dreyer describe are multiple and mostly consistent across time and space, but users interested in regional rivalries may want to explore these locations and standardize them further. For example, the “Germany-United States 2” rivalry (1933-1945) and “Russia-United States 2” (2007-present) rivalries are both in “Multiple” regions despite some different focal points between the two whereas the “Russia-United States 1” (1945-1989) rivalry is in the “Global” region. Of note: European great power rivalries have their own region (“European GPs”).

Finally, type1, type2, and type3 variables describe the nature of the rivalry and what it concerned, in order of importance. Not every rivalry has a second or third dimension, but every rivalry must have a primary dimension coded in the type1 variable. There are four categories of rivalry in the Thompson and Dreyer (2012) data. “Spatial” rivalries are contested over the control of territory, broadly defined. The Armenia-Azerbaijan rivalry (1991-present) is a good example of an exclusively spatial rivalry since most of the relationship concerns Nagorno-Karabakh. “Posiitonal” rivalries are competitions for relative shares of influence in a region. The Iran-Israel rivalry (1979-present) is a good example of an exclusively positional rivalry since the concern largely hinges on Israel’s misgivings about Iran’s aspirations in the region after the overthrow of the Shah in 1979. “Ideological” rivalries are relationships where two sides contest virtues of competing economic/political systems. The “Costa Rica-Nicaragua 2” (1948-1990) rivalry is the only exclusively ideological rivalry in the data. Therein, democratic Costa Rica and Marxist Nicaragua actively advocated regime change for the other side. “Interventionary” rivalries are new types of rivalries that Thompson and Dreyer introduce to this project. These are relationships in which states intrude into the internal affairs of other states for sake of leverage in the other state’s decision-making. They are often done without clear spatial, positional, or even ideological reasons. The concept borrows from Cliffe’s (1999) discussion of “mutual intervention” in the Horn of Africa and it should be no surprise that all interventionary rivalries in the data are located in Central Africa or East Africa.

Many rivalries have a second dimension but very few have three dimensions. The West Germany-East Germany rivalry (1949-1973) is an accessible three-dimensional rivalry in this data. Therein, the rivalry was primarily ideological (type1), but had a secondary positional aspect (type2) and a minimal, but still important, spatial/territorial element (type3) on top of that.

There is more coding necessary to get the most use out of these data, but the raw data can already communicate some basic descriptive statistics about strategic rivalries. For example, here are the the 10 longest rivalries in the data. It’s unsurprising that the top seven are European great power rivalries.

strategic_rivalries %>%
  mutate(duration = (endyear - styear)+1) %>%
  arrange(-duration) %>%
  head(10) %>%
  select(rivalryname, styear, endyear, region, type1, duration) %>%
  kable(., format="html",
    table.attr='id="stevetable"',
    caption = "The Ten Longest Rivalries in the History of the World",
    align=c("l","c","c","l","l","c"))
The Ten Longest Rivalries in the History of the World
rivalryname styear endyear region type1 duration
Austria-France 1494 1918 European GPs spatial 425
Austria-Ottoman Empire 1494 1908 European GPs spatial 415
Ottoman Empire-Russia 1668 1918 European GPs spatial 251
Ottoman Empire-Venice 1494 1717 European GPs spatial 224
Britain-France 1 1494 1716 European GPs spatial 223
France-Spain 1494 1700 European GPs positional 207
France-Prussia 1756 1955 European GPs spatial 200
Colombia-Venezuela 1831 2010 South America spatial 180
Britain-Russia 1778 1956 European GPs positional 179
Bolivia-Chile 1836 2010 South America spatial 175

We can also do a basic summary of the distribution of rivalries by primary rivalry type. The modal category is clearly spatial rivalry, which coincides with over 45% of all primary rivalry types. Positional rivalries over relative shares of influence in a region are the next most common. The least common rivalry type is interventionary. These are rivalries exclusive to Sub-Saharan Africa.


strategic_rivalries %>% group_by(type1) %>% 
  summarize(n = n()) %>% ungroup() %>%
  arrange(-n) %>%
  mutate(percent = paste0(mround(n/sum(n)),"%")) %>%
  kable(., format="html",
        table.attr='id="stevetable"',
        caption = "The Distribution of Rivalries by Primary Rivalry Type, 1494-2010",
        align = c("l","c","c"))
The Distribution of Rivalries by Primary Rivalry Type, 1494-2010
type1 n percent
spatial 89 45.18%
positional 65 32.99%
ideological 30 15.23%
interventionary 13 6.6%

Prep the Data

A user will need to prep the data a little to get some usable dyad-year data from this raw list of strategic rivalries. Importantly, the data are non-directed but the dyad is “ordered” alphabetically rather than by a numeric coding system (a la Correlates of War [CoW] state system membership). This will make for some headaches in standard dyad-year data because Portugal (ccode: 235) precedes Spain (ccode: 230) in the rivalry list, but will never precede Spain in a non-directed dyad-year design that relies on CoW system membership data.

Fortunately, the {countrycode} package is great for this. We’ll first convert sidea and sideb to a ccodea and ccodeb. The package will take care of almost everything here, though there are a few caveats that I’ll highlight in the code below.

require(countrycode)

strategic_rivalries %>%
  mutate(ccodea = countrycode(sidea, "country.name", "cown"),
         ccodeb = countrycode(sideb, "country.name", "cown")) -> strategic_rivalries

# Austria is "Austria" in the rivalry data, but Austria-Hungary before it.
# We'll fix some of this a bit later too.
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Austria"] <- 300
# Prussia doesn't appear as a partial matching term for successor state Germany
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Prussia"] <- 255 
# countrycode instinctively gives Germany's ccode to West Germany
strategic_rivalries$ccodea[strategic_rivalries$sidea == "West Germany"] <- 260 
# Ottoman Empire doesn't appear as a matching term for successor state Turkey
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Ottoman Empire"] <- 640
# Silly error, but countrycode doesn't know between Vietnams
strategic_rivalries$ccodea[strategic_rivalries$sidea == "North Vietnam"] <- 816


strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Ottoman Empire"] <- 640
# Note: I'm creating this since Venice never appears in the CoW data. I won't ever use it.
# You probably won't either.
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Venice"] <- 324 
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Prussia"] <- 255
# countrycode always struggles with Serbia as successor state to Yugoslavia.
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Serbia"] <- 345 

Next, we’ll create a ccode1 and ccode2 variable that makes these non-directed. The lower country code will always appear first.


strategic_rivalries %>%
  mutate(ccode1 = ifelse(ccodeb > ccodea, ccodea, ccodeb),
         ccode2 = ifelse(ccodeb > ccodea, ccodeb, ccodea)) -> strategic_rivalries

Create (Non)-Directed Rivalry-Year Data

The process of extending these rivalry data into rivalry-year data is effectively identical to what I showed in my guide on how to create country-year, non-directed dyad-year, and directed dyad-year data. People who have read that guide will see what is happening; rowwise() and unnest() are doing all the heavy-lifting here. Do note we need a quick fix for that one Austrian rivalry that actually extends past 1918.

# NRY: non-directed rivalry-years

strategic_rivalries %>%
  # We don't need these two columns and they'll only get in the way.
  select(-ccodea, -ccodeb) %>%
  # Prepare the pipe to think rowwise. If you don't, the next mutate command will fail.
  rowwise() %>%
  # Create a list in a tibble that we're going to expand soon.
  mutate(year = list(seq(styear, endyear))) %>%
  # Unnest the list, which will expand the data.
  unnest() %>%
  # Minor note: ccode change for Austria, post-1918 for rivalryno 79.
  mutate(ccode1 = ifelse(ccode1 == 300 & year >= 1919, 305, ccode1)) -> NRY

This dynamic document isn’t also creating non-directed dyad-year data, but merging non-directed rivalry-year data into non-directed dyad-year data is easy since there would be common keys of ccode1, ccode2, and year. It would look like this.

# Assume an object (NDY) has complete non-directed dyad-year data.
# See: svmiller.com/blog/2019/01/create-country-year-dyad-year-from-country-data/

NRY %>%
  # Let's just select stuff we may want since we don't want too huge a data frame.
  select(ccode1, ccode2, year, type1:type3) %>%
  # Simple mutate: every row means there's an ongoing rivalry. Duh.
  mutate(ongorivalry = 1) %>%
  # And left_join...
  left_join(NDY, .) %>%
  # if ongorivalry is NA, it's actually zero
  mutate(ongorivalry = ifelse(is.na(ongorivalry), 0, ongorivalry)) -> NDY

That’s it. Doing this takes a table of 197 rivalries in Thompson and Dreyer’s appendix, entered to a spreadsheet in about 30 minutes (if I recall that effort correctly), saved as an R data set, and extends it into rivalry-year data to be quickly merged into dyad-year data. Just a few lines of R code from {tidyerse} with some light maintenance from the {countrycode} package are all you need.