Electioneering: Part I | C. Jake Barlow

As an Australian who has been resident in New Zealand for a couple of years I was entitled to vote in the recent general¹ election, and as a big fan of liberal democracy I was keen take part and see how the whole thing worked across the ditch. Although most democracies follow a “one man, one vote” system², the actual make-up of parliament can vary depending on the system used to translate votes into representation³. Unlike Australia which uses a Single Transferable Vote (STV) to assign seats in both the upper and lower houses, New Zealand uses Mixed Member Proportional (MMP) voting to allocate all seats in a unicameral parliament.

¹ Federal, for those of you used to states.

² Or do now, at least.

³ There are of course lots of ways that this can be distorted - like gerrymandering - but we’ll focus just on the voting system here.

I thought this was interesting, and so decided to run a few simulations on what the makeup of the Australian parliament, in particular the House of Representatives, would be under different electoral systems. The series is divided into three parts:

Part One
Will define the question, provide some overview of the different electoral systems, and clean the data.
Part Two
Conduct the analysis, and generate some preliminary results.
Part Three
Refine the analysis, and offer some interpetation.

The Contenders

We’ll review three different voting systems:

First Past the Post
FPTP is a simple and widely used system. Each voter casts a single vote for their preferred candidate, and the candidate with the most votes wins, regardless of whether they have an absolute majority. This is a winner-takes-all system, and so tends to favour larger, entrenched parties. It is still widely used in many countries, such as the US, the UK, and Papua New Guinea.
Single Transferable Vote
STV is the system used in Australian elections. Candidates have to reach the quota (>50% of the vote) to be elected, whilst voters preference candidates in order from most-preferred to least-preferred. The counting process occurs in a defined sequence⁴:
- All first-preference votes are tallied
- If a candidate reaches the quota, they win
- If no candidate reaches the quota, then the candidate with the lowest number of votes has their votes redistributed to the remaining candidates, based on the preferences of the voter
- Each candidates votes are then tallied again, and if the quota is reached then that candidate wins
- The redistribution-and-counting cycle continues until someone reaches quota
Mixed Member Proportional
MMP is the system used in New Zealand. It is a hybrid voting method that combines elements of both FPTP and proportional representation. Two votes are cast:
1. Electorate vote
  Vote for your local electorate - this uses FPTP. New Zealand has 70 electorates, and so 70 electorate MPs.
2. Party vote
  This is a national vote which allocates a number of further seats based on the proportion of votes that each party received. The NZ parliament contains 120⁵ total seats, so there are an additional 50 MPs who are placed from the party vote and don’t have an electorate - these are known as ‘list’ MPs. To be eligible to receive seats from the list, a party must win an electorate seat or receive >5% of the party vote.

⁴ This only applies to the lower house. The senate uses a slightly different system, which isn’t relevant to us here as we’re only looking at the House of Representatives.

⁵ -ish, see the discussion in part 2 about overhang.

The Plan

The goal is to see how the makeup of the Australian parliament differs under each of these systems. This will require:

I’ve made an editorial decision here that code used to conduct analysis will be included in the documents but most of the code used to present analysis will be hidden. This allows the logic of the analysis to be followed along with, without drowning the reader in cosmetics.

If you want to see (for instance) how the tables are put together, the full source code for this pages are available on the Github repo.

Voting data
We’ll use the 2022 federal election data, courtesy of the Australian Electoral Commission.
A map of the electorates
We’ll use the 2021 Commonwealth Electoral Divisions shapefile, courtesy of the Australian Bureau of Statistics⁶.
A blueprint for a MMP parliament
We’ll make this up.

⁶ The ABS is definitely my favourite government department (a tough decision, obviously) - there’s a huge amount of freely available data on a dizzying number of topics that are usually amenable to merging into medical datasets - if you’re doing any multi-site or multi-state analysis it is definitely worth poking around here.

Data Wrangling

First, we’ll do the usual environment set up:

# Libraries
require(tidyverse)    # For data wrangling
require(magrittr)     # For piping arithmetic (actually)
require(sf)           # For producing maps
require(ggiraph)      # For producing interactive maps without D3
require(patchwork)    # For combining maps

Voting Data

We’ve loaded up the voting data behind the scenes, and are going to tidy up the data types. I’m not sure if we’ll need all of these columns yet, but since there aren’t that many of them it’s convenient to just transform them all now.

We make the binary Elected variables into logical, and identify the key factor variables. We of course don’t need to touch the character or numeric variables as R has classified these already⁷.

⁷ We also do a cheeky find-and-replace on the DivisionNm variable (which identifies the electorates), as special characters can play havoc when it comes time to plot maps. Replacing hyphens with an en dash insures us against typographic shenanigans. I’ve got my eye on you, Eden-Monaro.

# Data cleaning
votes = votes %>%
  mutate(DivisionNm = gsub("-", "–", DivisionNm),
         across(ends_with("Elected"), ~ ifelse(. == "N", FALSE, TRUE)),
         across(c(StateAb, DivisionNm, PartyAb, CalculationType), as.factor)) %>%
  # Pivot out the vote counts and percentages
  pivot_wider(names_from = CalculationType,
              values_from = CalculationValue) %>%
  rename(prefCount = "Preference Count",
         prefPC = "Preference Percent",
         tranCount = "Transfer Count",
         tranPC = "Transfer Percent")


# Inspect the parties in more detail
parties = votes %>%
  group_by(PartyAb, PartyNm) %>%
  summarise()

head(parties)

PartyAb	PartyNm

AJP	Animal Justice Party
ALP	A.L.P.
ALP	Australian Labor Party
ALP	Labor
ASP	Shooters, Fishers and Farmers Party

The other thing that needs correcting are the party names and abbreviations as there are some clear duplicates. There’s a couple of judgement calls here, but we follow established conventions and I think the ambiguity reduction is beneficial.

votes = votes %>%
  mutate(PartyAb = PartyAb %>%
           forcats::fct_collapse("GRN" = c("GVIC"),
                                 "IND" = c("")),
         PartyNm = PartyNm %>%
           forcats::fct_collapse("Independent" = c(""),
                                 "Labor" = c("A.L.P.", "Australian Labor Party"),
                                 "Democratic Alliance" = c("Drew Pavlou Democratic Alliance"),
                                 "The Greens" = c("The Greens (WA)", "Queensland Greens"),
                                 "Katter's Australian Party" = c("Katter's Australian Party (KAP)"),
                                 "Liberal Party" = c("Liberal National Party of Queensland"),
                                 "The Nationals" = c("National Party"),
                                 "The New Liberals" = c("TNL"),
                                 "Western Australia Party" = c("WESTERN AUSTRALIA PARTY")))

Geometric Data

With the voting data cleaned, we can turn our attention to the geometric data. The first order of business is to check that the names of the electorates in our shapefile (ced) line up with the names in the voting data.

ced = ced %>%
  mutate(CED_NAME21 = gsub("-", "–", CED_NAME21))

x = ced$CED_NAME2
y = unique(votes$DivisionNm)

setdiff(x, y)
setdiff(y, x)
rm(x, y)

All names align except the “no usual address” geoms, which we will strip out. We’ll also transform our geographic data to use the EPSG:4326/WGS84 coordinate system. Coordinate systems are complex and interesting, it is enough for our purposes to say that this datum lets us plot latitudes and longitudes from a map without having to think too hard about it.

ced = ced %>%
  ### Drop the unnecessary data and restrict our map geoms to the electorates
  select(-c(CED_CODE21, STE_CODE21, AUS_CODE21, AUS_NAME21, LOCI_URI21)) %>%
  filter(CED_NAME21 %in% votes$DivisionNm) %>%
  mutate(DivisionNm = CED_NAME21) %>%
  # Transform the CRS
  st_transform(crs = 4326) %>%
  # Reduce the window to cover continental Australia
  st_crop(xmin = 110, xmax = 160,
          ymin = -45, ymax = -10)

We should also put together a list of the state capitals. Electorates are drawn with the aim of having a similar number of enrolled voters in each⁸, and so tend to cluster around major cities. We’ll put some insets into our graphs so changes won’t be overshadowed by the large electorates that dominate much of the landmass.

⁸ Although the actual number can vary quite a bit.

# Set geography for insets
# A list of cities, their latitude and longitude, and the radius around them that we'll capture the electorates from
loc = tribble(~ city, ~ lat      , ~ long     , ~ rad,
              "mel" , -37.8142354, 144.9668884, 45000,
              "syd" , -33.8386302, 151.0310312, 40000,
              "bri" , -27.4671551, 153.0169995, 15000,
              "ade" , -34.7894706, 138.5687859, 15000,
              "per" , -32.0212408, 115.8735055, 15000)

The last thing we need to do for the setup is define some cosmetic options, so that each party is drawn in an appropriate colour. To spare the tedium, we’ll do this off screen. If you want to follow along, the files (and the source data) are in the Github repository.

Next time - some actual analysis!