Updated GECC Methodology

This diary explains my methodology in producing the General Election Cattle Call. It was last updated on October 15, 2004.

The last several weeks have brought me into closer contact and discussion with a wider range of statisticians, political scientists and pollsters than I ever imagined would happen for this election. The result has been a complete rethinking of the way I am going to project the Presidential Election. In an effort to improve the transparency, statistical validity, accuracy and purpose of the General Election Cattle Call, I have almost completely revamped my methodology.

Step One: Projecting the National Popular Vote

Determining the Dataset
The most important factor in producing an accurate poll is making certain that the polling sample is representative of the population the poll is attempting to measure. Over the past several weeks, on both this blog and many others, there have been wide-ranging arguments over how polling firms should weight their polls in order to develop an accurate sample population when measuring candidate preference. Specifically, most of these discussions have centered around one topic: Party self-identification. Mystery Pollster has the most complete overview of this discussion.

My position in these discussions has been to argue that Party self-identification is more of a demographic than it is an attitude, and thus it is important for polling firms to weight by Party ID (here is my most recent post on the subject). Thus, in making my election projections, I feel it is necessary to only include polls that weight by Party ID.

Fortunately for me, the four polls that I have discovered weight by Party ID also happen to be the four tracking polls as of this writing: ABC News, Rasmussen, TIPP, and Zogby. While they all use different methods in weighting by Party ID, that all four weight is good enough for me:

  • ABC News weights its likely voter model partially by Party ID. Party self-identification within the registered voter pool is averaged with a fixed model of Party self-identification based on exit polls from previous elections. The ABC News likely voter model is thus a combination of fixed and dynamic Party ID weighting systems.

  • Rasmussen weights its likely voter model according to a fixed Party ID of 39% Democrat, 36% Republican, and 25% Independent / Other.

  • TIPP weights its likely voter model according to dynamic Party self-identification. The combined Party self-identification from TIPP polls over the past several months is taken, and the poll is weighted accordingly.

  • Zogby weights its likely voter model according to a fixed Party self-identification model of 39% Democrat, 35% Republican, and 26% Independent / Other.

All of these polls engage in other forms of weighting based on national demographics that I accept. Rasmussen, at least, uses automated telephone calls, a method that some do not accept, but I do.

These four tracking polls form my entire dataset for the national popular vote projection. I will only add another poll if it is a tracking poll and it weights by Party ID. I doubt that any other such polls will surface over the next eighteen days.

Combining the Tracking Polls
Because different polling firms use different questions, push undecideds to different degrees, and include different candidates, combining their results is a tricky, if not entirely misguided, business. Further, because of the Incumbent Rule, the presence of undecideds in polls that measure candidate preference in an election with an incumbent tends to skew the actual margin between the two candidates. However, I believe I have developed a set of steps that does a decent job of flattening out the differences between these four polls:

  • Turn all four Bush / Kerry raw scores into ratios. For example, a raw score of Bush 48, Kerry 44 would be a ratio of Bush 52.17, Kerry 47.83. This is done to flatten out the difference in the way different polls push undecideds.

  • Calculate the simple mean of both the Bush ratios and the Kerry ratios from all four tracking polls.

  • Calculate the simple mean of the combined “other” and “undecided” raw scores from all four polls. To be more specific, any response in any of the four polls that is neither “Bush” nor “Kerry” in terms of candidate preference is included in the calculation of this mean. This is done in order to flatten out the differences between polls that ask different questions in relation to third-party candidates.

  • Subtract the “other / undecided” simple mean from 100. This produces the existing two-party vote total.

  • Multiply the Bush ratio simple mean, expressed as a fraction, by the existing two-party vote total in order to determine Bush’s unadjusted vote. Do the same with the Kerry ratio simple mean in order to determine Kerry’s unadjusted vote.

  • Subtract two from the “other / undecided” simple mean in order to account for the third-party vote. This produces the “non-third party undecideds.” It is based on the assumption that the third-party vote will be approximately 2% in this election.

  • Add 80% of the non-third party undecideds to Kerry’s unadjusted vote in order to arrive at Kerry’s projected national vote. Add 20% of the non-third party undecideds to Bush’s unadjusted vote in order to arrive at Bush’s projected national vote. This is done in order to account for the Incumbent Rule, and specifically for how the Incumbent Rule applies to Presidential Elections.

This, I hope, results in a very close projection of how the election would break if it were held on the day of my projections.
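For anyone who wants to check the arithmetic, the steps above can be sketched in a few lines of Python. This is only an illustration of my reading of the steps, not the spreadsheet I actually use; the function names and parameter names (`two_party_ratio`, `project_national`, `third_party`, `challenger_share`) are made up for this sketch, and the four-poll input at the bottom is invented purely to show the call.

```python
def two_party_ratio(bush, kerry):
    """Rescale raw Bush / Kerry scores (in percent) so the pair sums to 100."""
    total = bush + kerry
    return 100 * bush / total, 100 * kerry / total


def project_national(polls, third_party=2.0, challenger_share=0.8):
    """Project the national vote from a list of (bush_raw, kerry_raw) pairs.

    third_party: assumed third-party share of the final vote (2% in the post).
    challenger_share: share of undecideds given to Kerry (80% in the post).
    """
    ratios = [two_party_ratio(b, k) for b, k in polls]
    bush_ratio = sum(r[0] for r in ratios) / len(ratios)
    kerry_ratio = sum(r[1] for r in ratios) / len(ratios)

    # Everything that is neither Bush nor Kerry counts as "other / undecided".
    other = sum(100 - b - k for b, k in polls) / len(polls)
    two_party_total = 100 - other

    bush_unadjusted = bush_ratio / 100 * two_party_total
    kerry_unadjusted = kerry_ratio / 100 * two_party_total

    # Set aside the assumed third-party vote, then split the rest 80 / 20.
    undecided = other - third_party
    bush = bush_unadjusted + (1 - challenger_share) * undecided
    kerry = kerry_unadjusted + challenger_share * undecided
    return bush, kerry


# The single-poll example from the list above: Bush 48, Kerry 44.
print(tuple(round(x, 2) for x in two_party_ratio(48, 44)))  # (52.17, 47.83)

# Four invented polls, only to show how the function is called.
hypothetical_polls = [(48, 44), (47, 47), (46, 48), (49, 45)]
print(project_national(hypothetical_polls))
```

Note that the two projected figures always add up to 100 minus the assumed third-party share, which is a quick sanity check on any hand calculation.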

Step Two: Calculating the Electoral Vote

As we all know from 2000, under our impossibly antiquated system for selecting a President, the winner of the national popular vote is not necessarily the person who wins the election. Thus, in order to determine who is winning the election, projecting the national popular vote is not enough. Each of the 15 or 20 closest states must also be analyzed.

Of course, the same problem in determining a representative sample population in national polls of candidate preference applies to state polls. However, there is not nearly the same amount of data available on state-level party self-identification (which is not the same as voter registration) as there is on a national level. Further, far more polling outfits conduct state polls than conduct national polls, and information about their methodologies is very difficult to come by. In order to compensate for these two problems when projecting states on an individual basis, I have adopted a metric known as the Partisan Index.

The partisan index shows the relative standing of Democrats and Republicans in a given state by comparing the state popular vote of the two parties with the national popular vote of the two parties. For example, in New Jersey in 1992, Clinton won with 42.954% to Bush’s 40.581%. However, in 1992 the partisan index favored the GOP by 3.2, since nationally Clinton had 43.007% of the vote and Bush had 37.448%.

1992 NJ
DNC  42.954-43.007 = -0.053 (0.053 absolute value)
GOP  40.581-37.448 =  3.133
Total                 3.186 (3.2)
With this in mind, here are the steps I use to project each state:

  • Combine the projected national popular vote with the partisan index for a given state in order to arrive at the “assumed” statewide vote. For example, on October 15th, the national popular vote was projected at 49.1% for Kerry and 48.9% for Bush. The 2000 partisan index for Arizona was RNC +6.8. Combined, the two figures (6.8 for the GOP, less Kerry’s 0.2% national lead) produce a margin of 6.6% for Bush in Arizona on October 15th. The assumed statewide vote in Arizona was Bush 52.3%, Kerry 45.7%.

  • Find the most recent poll of a given state. On October 15th, for Arizona this was the Northern Arizona University poll that completed its survey on October 11th. The results of this poll were 49% for Bush, and 44% for Kerry.

  • Combine the “other” and “undecided” totals from the most recent poll. In the case of the Northern Arizona University poll, that was 7%. In order to account for the third-party share of the vote, subtract 2% from this total. This results in the non-third party undecided vote (in this case, 5%).

  • Add 80% of the non-third party undecided vote to Kerry’s raw score in the most recent poll in order to account for the Incumbent Rule (44% + 4% = 48% in the example of Arizona on October 15th). Add 20% of the non-third party undecided vote to Bush’s raw score in the most recent poll in order to account for the Incumbent Rule (49% + 1% = 50% in the example of Arizona on October 15th). Respectively, these two figures are referred to as Kerry’s measured statewide vote and Bush’s measured statewide vote.

  • Average Kerry’s assumed statewide vote with his measured statewide vote in order to produce Kerry’s projected statewide vote. In the case of Arizona on October 15th, this meant averaging 45.7% and 48% to produce 46.85%. Then, average Bush’s assumed statewide vote with his measured statewide vote in order to produce Bush’s projected statewide vote. In the case of Arizona on October 15th, this meant averaging 52.3% and 50% to produce 51.15%.

Voilà. Bush is projected to win Arizona by 4.3%. Repeat this for every state, but never use one-night tracking polls of a state (for example, many Rasmussen polls). That simply introduces far too much volatility into the system.
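The Arizona example can be reproduced with the short Python sketch below. The function names are invented for illustration. One assumption is worth flagging: the steps give the assumed margin but do not spell out how it is turned into two vote shares, so the sketch splits the national two-party total evenly around that margin, which matches the 52.3% / 45.7% figures in the example.

```python
def assumed_state_vote(nat_bush, nat_kerry, index_toward_bush):
    """Shift the national margin by the state's partisan index.

    index_toward_bush is positive when the state leans Republican.
    Assumption: the state's two-party total equals the national one.
    """
    total = nat_bush + nat_kerry
    margin = (nat_bush - nat_kerry) + index_toward_bush
    return (total + margin) / 2, (total - margin) / 2


def measured_state_vote(poll_bush, poll_kerry, third_party=2.0,
                        challenger_share=0.8):
    """Allocate a poll's non-third-party undecideds 80 / 20 to Kerry / Bush."""
    undecided = 100 - poll_bush - poll_kerry - third_party
    bush = poll_bush + (1 - challenger_share) * undecided
    kerry = poll_kerry + challenger_share * undecided
    return bush, kerry


def project_state(nat_bush, nat_kerry, index_toward_bush,
                  poll_bush, poll_kerry):
    """Average the assumed and measured statewide votes for each candidate."""
    a_bush, a_kerry = assumed_state_vote(nat_bush, nat_kerry, index_toward_bush)
    m_bush, m_kerry = measured_state_vote(poll_bush, poll_kerry)
    return (a_bush + m_bush) / 2, (a_kerry + m_kerry) / 2


# Arizona on October 15th: national Bush 48.9 / Kerry 49.1, index RNC +6.8,
# Northern Arizona University poll Bush 49 / Kerry 44.
# The worked example above arrives at Bush 51.15, Kerry 46.85.
bush, kerry = project_state(48.9, 49.1, 6.8, 49, 44)
print(round(bush, 2), round(kerry, 2), round(bush - kerry, 1))
```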

A state is considered “solid” if one candidate is projected to win the state by 6% or more. There will never be any ties: I will carry the formula out to as many decimal places as necessary to break one.


