I live in a neighborhood in Philadelphia called University City. This past week, most of the students who live in the area have moved out, as the semester is over. After yesterday, I kind of feel like our little investigation into national Democratic nomination polling came to an end. Like a semester, it lasted around four months, and I think it was quite informative. Here are the conclusions I have drawn from our studies:
- National Democratic nomination preference polls cast too wide a net in their samples, typically somewhere between 35% and 50% of the Voting Age Population. Outside of New Hampshire, only about 10-15% of the Voting Age Population typically participates in Democratic Presidential primaries. However, at this early date, it would not be wise to significantly narrow the sample universe, as it is too early to know who will actually form the electorate in the Democratic primary / caucus season. That might change, come January 2008.
- Some early indications of voter turnout favor Clinton and Edwards, while others favor Obama. Specifically, Clinton and Edwards do well among older poll respondents, and Clinton does better among self-identified Democrats than among Independents who lean Democratic. However, Obama does better among poll respondents who are paying more attention to the campaign. When averaged together, these effects might very well cancel each other out / complement each other.
- Clinton does better in polls where undecided respondents are pushed to make a decision, thus emphasizing her advantage among voters who are not paying close attention to the campaign. However, Obama does better in automated IVR polls like Rasmussen that have a history of including more young voters in their samples. Once again, when combined, these skews might cancel each other out / complement each other.
- Al Gore draws a significant percentage of support (roughly 10-15%) from all three "top tier" candidates simply by being included in the question. This usefully shows, once again, that there is a significant amount of "soft" support for all candidates. However, Al Gore is also currently not running, thus making it quite difficult to justify including him in polls that are meant to be an accurate snapshot of public opinion on the current campaign. The solution here is probably for polls to offer "someone else" as an option for respondents, rather than to name specific candidates who have not announced. Overall, until polls settle on a consistent list of candidates to include in their questions, it will be necessary to collect two different polling averages, one with Gore, and one without.
- As demonstrated by the soft support of undecideds, the still large number of potential Gore supporters, the varying movement in the national campaign over the past couple of months, and the wide difference in results between different polls conducted at the same time, there is a lot of movement yet to be had in the Democratic primary season. However, it is probably wrong to assume that said movement is on the level of 2004, either to the degree to which early Lieberman "supporters" abandoned him throughout 2003, before Iowa, or to the degree that Democrats flocked to Kerry after the 2004 Iowa caucuses. Increased star power in the field, a higher level of voter engagement, increased Democratic satisfaction with the field, and the lack of a 2004 "perfect momentum storm" are among the factors that will probably reduce poll movement compared to the 2004 primary season.
- A few side notes. First, there does not appear to be a large "anti-Hillary" vote in the Democratic electorate. Second, social pressure to say you are voting for a woman or an African-American does not appear to be artificially inflating either Clinton's or Obama's poll numbers. Third, while Clinton performs slightly worse than Edwards or Obama in general election trial heats, the gap is not massive (currently between 2.8% and 6.9% depending on the matchup). While this is not currently indicative of an "electability" problem, and is more indicative of Clinton's longer exposure to the Republican Noise Machine, if these numbers hold, or even increase, through January of 2008, that could change.
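For anyone who wants to keep the two running averages mentioned above (one for polls that include Gore, one for polls that do not), here is a minimal Python sketch of how that bookkeeping might look. The poll names and every number in it are made-up placeholders, not real results, and the `includes_gore` field and `polling_averages` function are hypothetical names of my own choosing. It uses a plain unweighted mean, which is an assumption; weighting by sample size or recency would be just as reasonable.

```python
from statistics import mean

# Placeholder data for illustration only. These are NOT real poll results.
polls = [
    {"pollster": "Poll A", "includes_gore": True,
     "results": {"Clinton": 34.0, "Obama": 24.0, "Edwards": 12.0, "Gore": 14.0}},
    {"pollster": "Poll B", "includes_gore": True,
     "results": {"Clinton": 36.0, "Obama": 26.0, "Edwards": 14.0, "Gore": 12.0}},
    {"pollster": "Poll C", "includes_gore": False,
     "results": {"Clinton": 40.0, "Obama": 28.0, "Edwards": 16.0}},
    {"pollster": "Poll D", "includes_gore": False,
     "results": {"Clinton": 42.0, "Obama": 30.0, "Edwards": 18.0}},
]


def polling_averages(polls):
    """Return unweighted per-candidate averages, kept separately for
    polls that name Gore and polls that do not."""
    # Split the polls into the two groups so they are never mixed.
    groups = {True: [], False: []}
    for poll in polls:
        groups[poll["includes_gore"]].append(poll["results"])

    averages = {}
    for with_gore, results in groups.items():
        label = "with_gore" if with_gore else "without_gore"
        # Every candidate named in at least one poll of this group.
        candidates = sorted({name for r in results for name in r})
        # Average each candidate over the polls that actually listed them.
        averages[label] = {
            name: round(mean(r[name] for r in results if name in r), 1)
            for name in candidates
        }
    return averages


averages = polling_averages(polls)
for label, numbers in averages.items():
    print(label, numbers)
```

Keeping the two groups separate is the whole point: mixing them would blur exactly the Gore effect described in the list above.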
In the end, this leaves us roughly where we were back in January: averaging polls. However, I think we now have a much better idea as to why national polls can be so different from each other, and yet all still be valid. It has also left me with a methodology for measuring the current state of the national campaign in which I have a decent amount of confidence. Having a way to accurately measure the campaign is an important first step toward developing a means to influence it. To this end, it would be particularly useful if more national polls had larger sample sizes and released detailed crosstabs from those samples. This is why live-interview polls commissioned by large media outlets, which invariably have smaller sample sizes and are more hush-hush about their methodologies, remain our least useful measures of the national campaign. Despite this, we do have some good info now, and as such we can move forward. From now on, my discussions of polls will probably be restricted to updates on the state of the national campaign, and not spill over into meta discussions on polling itself. I hope you got as much out of our polling seminar as I did, and are now excited to move on to other, more qualitative topics.
The seminar's syllabus can be found in the extended entry.