Gerrymandering and the 2016 Congressional Election

Are Voters Choosing their Representatives or Vice-versa?

Here's a map of the 2016 congressional election results. It shows which party each of the 435 districts voted for, but it does not really give insight into the possibility of gerrymandering, or setting the district boundaries with the intent of aiding a particular political interest. For that, we will have to look deeper. That is the aim of this project.

First, some background on gerrymandering. From Wikipedia, these are some of the principal tactics:

"Cracking" involves spreading voters of a particular type among many districts in order to deny them a sufficiently large voting bloc in any particular district. An example would be to split the voters in an urban area among several districts wherein the majority of voters are suburban, on the presumption that the two groups would vote differently, and the suburban voters would be far more likely to get their way in the elections.
"Packing" is to concentrate as many voters of one type into a single electoral district to reduce their influence in other districts. In some cases, this may be done to obtain representation for a community of common interest (such as to create a majority-minority district), rather than to dilute that interest over several districts to a point of ineffectiveness (and, when minority groups are involved, to avoid likely lawsuits charging racial discrimination). When the party controlling the districting process has a statewide majority, packing is usually not necessary to attain partisan advantage; the minority party can generally be "cracked" everywhere. Packing is therefore more likely to be used for partisan advantage when the party controlling the districting process has a statewide minority, because by forfeiting a few districts packed with the opposition, cracking can be used in forming the remaining districts.
"Hijacking" redraws two districts in such a way as to force two incumbents of the same political party to run against each other in one district, ensuring that one of them will be eliminated, while usually leaving the other district to be won by someone from a different political party.
"Kidnapping" aims to move areas where a certain elected official has significant support to another district, making it more difficult to win future elections with a new electorate. This is often employed against politicians who represent multiple urban areas, in which larger cities will be removed from the district in order to make the district more rural.

So what do we look for when investigating gerrymandering?

This report will merely scratch the surface, but will still attempt to effectively visualize the effects of districting lines on the 2016 election.

The first thing we'll do is examine the margins by which each district was won. Unlike the binary results map, a map of margins of victory can tell us something meaningful about the concentration of party voters in each district. How so? One would expect that if partisan gerrymandering is present, the offending party employs "packing" to give itself many small-margin victories while concentrating its opponents in a few huge-margin districts. Thus we would expect small margins of victory in districts won by the controlling party and large margins in districts won by the opposing party. On the other hand, if the controlling party holds a majority, it might "crack" its opposition, and we'd see a disproportionate number of districts going for the majority as a result. With that in mind, here's the map:

Now this map is quite a bit more informative. We can see a swath of deep red through the Appalachians and down to Texas and the South. Likewise we see slivers of deep blue along the coasts and around major cities. As for districting that might be suspicious, it's a little hard to tell. Some things grab the eye a bit, though. Namely, North Carolina has some very skinny deep blue districts surrounded by moderate red ones. Wisconsin, also facing a major court case on gerrymandering, shows a large Democratic majority in the 3rd district, but seems to have slightly smaller margins in Republican districts. On the other side of the aisle, if we're looking for packing, Oregon could be a candidate. Although Oregon is known as a blue state, its eastern part is actually deep red, and the more populous coast is moderate blue.
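For readers who want to reproduce a map like this, here is a minimal sketch of how a signed margin of victory could be computed for one district. The function name and the choice to use only the two-party vote are assumptions for illustration, not necessarily how the map above was built; the sign convention (negative = Democratic, positive = Republican) matches the one used in the plots later in this report.

```python
def signed_margin(dem_votes: int, rep_votes: int) -> float:
    """Signed two-party margin in percentage points.

    Negative values are Democratic wins, positive values are Republican wins.
    Assumption: third-party and write-in votes are ignored.
    """
    total = dem_votes + rep_votes
    if total == 0:
        raise ValueError("district has no two-party votes")
    return 100.0 * (rep_votes - dem_votes) / total


# Made-up vote counts, for illustration only
print(signed_margin(60_000, 40_000))  # -20.0: Democratic win by 20 points
print(signed_margin(0, 50_000))       # 100.0: uncontested Republican win
```

Under this convention a packed district shows up as a value far from zero, while a narrowly won district sits close to zero on either side.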

To take a deeper look at these margin distributions, we'll use a histogram. The following composite plot shows histograms of the margin of victory for the districts within each state. Along the x-axis is margin of victory (far left = big Democratic win, far right = big Republican win). Along the y-axis is the number of districts falling into each margin range. These plots would potentially be more informative if they had more bins, because, for example, the bin representing the smallest margin category (0-25%) covers a huge range. A 25% margin is a lot bigger than a 3% margin. However, there are not that many districts in a state, so increasing the number of bins would leave most bins with 0 or 1 districts in them, which obscures large-scale trends and simply isn't very interesting. So, here's the plot:

There are some interesting insights from these histograms. Perhaps the most striking feature of some of them is a dropoff at the 0% mark, i.e. far more small-margin wins for one party. This is perhaps most pronounced in Texas, Pennsylvania, and North Carolina. These districts can perhaps be thought of as pickup opportunities for the opposing party. In all of these states, there are many more small-margin Republican wins than Democratic wins. In other words, among what one might imagine would be "swing districts" in a gerrymandering-free world, an unlikely proportion are going Republican. A bit suspect? Probably. We also have to take into account that people are certainly not settled in a politically random fashion; Democrats probably tend to cluster more into cities, and Republicans may be naturally more spread out in more politically balanced rural areas. So in that regard, it's hard to know for sure. Nonetheless, let's now examine the distribution of margins for the nation as a whole. This will give more data points, and allow smaller bins to show more meaningful results and general trends. Here it is:

The main thing we can see from this histogram is that margins of victory are more spread out on the Democratic side. About as many districts are won by a 60-70% Democratic margin as by a 0-10% margin. On the Republican side, however, the most districts fall in the 20-30% margin category, and then the count drops off rather steeply to the right. This means that in general, Democrats win more districts by larger margins than Republicans do.
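The bin-width trade-off behind the state and national histograms can be sketched in a few lines. This is only an illustration under assumptions: the helper name, the lower-edge labeling of bins, and the sample margins are all made up, and signed margins run from -100 (Democratic) to +100 (Republican).

```python
import math
from collections import Counter


def bin_margins(margins, bin_width=25):
    """Count districts per margin bin; each key is the lower edge of a bin."""
    counts = Counter()
    for m in margins:
        lower = math.floor(m / bin_width) * bin_width
        # Clamp so a margin of exactly +100 lands in the top bin
        lower = min(lower, 100 - bin_width)
        counts[lower] += 1
    return dict(sorted(counts.items()))


# Made-up signed margins, for illustration only
margins = [-62.0, -3.0, 3.0, 12.0, 24.0, 31.0, 100.0]

print(bin_margins(margins, bin_width=25))
# {-75: 1, -25: 1, 0: 3, 25: 1, 75: 1}
print(bin_margins(margins, bin_width=10))
# {-70: 1, -10: 1, 0: 1, 10: 1, 20: 1, 30: 1, 90: 1}
```

With 25-point bins, the 3-point and 24-point wins fall into the same bin. With 10-point bins they separate, but on a sample this small every bin holds a single district, which is the reason the finer bins are only used for the national plot.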

So remember how people tend to settle in non-random, politically clustered ways that may skew the margin distributions in districts? Well, that's a topic for another project, and it makes investigating gerrymandering a bit more complicated. But perhaps we can assume that districts should ideally be somewhat random in terms of size, shape, and location. Thus, in a non-gerrymandered world, we would expect the districts to represent the distribution of the people. So one way we can get around this is to attempt to quantify how weird the district shapes are, and thus how much more likely they are to have been engineered to fit their residents, i.e. gerrymandered. We can quantify this with so-called compactness. There are several potential ways to calculate compactness, but for this project, the formula used was:

Determine the bounding rectangle of the district as it's projected onto the map. Call its width W and height H.
Set radius R to the square root of W² + H², divided by the square root of 2. Essentially this says: find the length of the hypotenuse of the right triangle with legs W and H, and then find the length of the legs of an isosceles right triangle with that same hypotenuse.
Find the area of the circle with radius R.
Find the actual area of the district (this is given in the dataset).
Finally, your compactness is A_district / A_circle, multiplied by a scaling constant since, in this case, the two areas are unfortunately not in the same units (pixels and sq. miles). Thus we just scale to put the numbers in a better range, i.e. 0 to 1. Intuitively, a circle is the most compact shape, so it should have maximum compactness. A squiggly, jagged shape will have compactness close to 0.
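The steps above can be sketched as a short function. The function and parameter names are assumptions for illustration, and `scale` stands in for the arbitrary scaling constant, whose actual value is not given here.

```python
import math


def compactness(width, height, district_area, scale=1.0):
    """Bounding-box compactness following the steps described above.

    width, height: sides of the district's bounding rectangle (W and H)
    district_area: the district's actual area
    scale: arbitrary constant to reconcile units (hypothetical default of 1)
    """
    if width <= 0 or height <= 0:
        raise ValueError("bounding rectangle must have positive width and height")
    # R = sqrt(W^2 + H^2) / sqrt(2)
    radius = math.sqrt(width ** 2 + height ** 2) / math.sqrt(2)
    circle_area = math.pi * radius ** 2
    return scale * district_area / circle_area


# Made-up shapes with both areas in the same units, for illustration only
square = compactness(2.0, 2.0, 4.0)    # 2 x 2 square filling its bounding box
sliver = compactness(10.0, 1.0, 2.0)   # thin shape covering little of a 10 x 1 box
print(square)  # about 0.318, i.e. 1 / pi
print(sliver)  # about 0.013
```

One thing worth noting about this particular measure: when both areas are in the same units and `scale` is 1, a square that fills its bounding box scores 1/π (about 0.32), while a circle scores 0.25. So a boxy district like Wyoming lands at the top of the range, and only the relative ordering of scores should be read into.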

So here's compactness plotted on the map:

Notice that the results of this mapping intuitively make rough sense. For example, consider how Wyoming, just a big squarish rectangle, falls into the most compact bin. Now look at some of the aforementioned suspicious skinny districts in North Carolina; they're all the way at the other end of the spectrum. This is not a perfect measure, and there are definitely better ways to measure compactness, for instance by considering the perimeter. However, we can still see that this plotting makes sense, and we can see that areas colored very light gray, such as those areas of NC, may be suspect.

Here's a histogram of the compactness scores of all the nation's districts, to help you get a sense of where each falls on the above map. Remember that because the number is multiplied by an arbitrary scaling factor, the compactness scores are only meaningful in relation to each other.

We can see from this histogram that compactness follows roughly a normal distribution, centered somewhere around 0.1. There are a few outliers with higher compactness. Can you find them on the map?

Lastly, we will look at how compactness and margin scores may correlate. Here's compactness plotted against margin of victory for both parties (-100 = 100% Democratic margin, 100 = 100% Republican margin).

There doesn't seem to be much correlation here, or at least it's really hard to tell.
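One way to put a number on "hard to tell" would be a Pearson correlation coefficient between the two columns. The sketch below is an assumption about how one might check this, not something computed for this report, and the sample values are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up compactness scores and signed margins, for illustration only
compactness_scores = [0.05, 0.08, 0.10, 0.12, 0.20]
signed_margins = [-60.0, 15.0, 22.0, -30.0, 35.0]

r_signed = pearson_r(compactness_scores, signed_margins)
# Signed margins can cancel out across parties, so the absolute margin
# (how lopsided a district is, regardless of winner) may be worth checking too.
r_absolute = pearson_r(compactness_scores, [abs(m) for m in signed_margins])
print(r_signed, r_absolute)
```

A value near 0 would back up the visual impression of little linear relationship, while values near -1 or 1 would suggest the scatter plot is hiding a trend.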