Here's the raw output:
. reg pctmccain density
      Source |       SS       df       MS              Number of obs =    3114
-------------+------------------------------           F(  1,  3112) =  143.16
       Model |  2.62173277     1  2.62173277           Prob > F      =  0.0000
    Residual |   56.989607  3112  .018312856           R-squared     =  0.0440
-------------+------------------------------           Adj R-squared =  0.0437
       Total |  59.6113398  3113  .019149162           Root MSE      =  .13533
------------------------------------------------------------------------------
   pctmccain |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     density |  -.0000164   1.37e-06   -11.97   0.000    -.0000191   -.0000137
       _cons |   .5731997   .0024511   233.85   0.000     .5683937    .5780058
------------------------------------------------------------------------------
And for those of you who can't read statistical output, here's a pretty graph, courtesy of Stata, plotting population density in people per square mile versus the percentage vote for McCain:
I got an R² somewhat higher than Surlethe's, and the result is significant well beyond the 95% confidence level, although 0.044 isn't that much greater than 0.023. That is, 4.4% of the variation in the McCain vote share can be explained by population density. Also, looking at the graphics-enhanced version, the numbers make much more sense. The outliers that voted overwhelmingly for Obama were almost all in New York - Manhattan, the Bronx, and Queens in particular. However, the massive blob of green is so mixed up in the moderate- to low-density areas that it overwhelms the contribution from the large cities and reduces R². Makes sense to me. Obviously the trend line dips below 0, which is impossible for a vote share, so this isn't really a linear relationship, but it's not bad as a rough visualisation at least. For those of you curious, that one relatively dense area that McCain won with 64% is St Louis, Missouri.
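For anyone who wants to check the arithmetic, here's a rough sketch in Python (not Stata) that recomputes R², the t statistic, and the density at which the trend line hits zero. It only uses the rounded figures printed in the output above, so the results are approximate, and the variable names and the predicted_share helper are my own.

```python
# Figures copied from the regression output above (already rounded by Stata).
slope = -0.0000164      # coefficient on density, per person per square mile
slope_se = 1.37e-06     # standard error of that coefficient
intercept = 0.5731997   # _cons: predicted McCain share at zero density
model_ss = 2.62173277   # model sum of squares
total_ss = 59.6113398   # total sum of squares

# R-squared is the share of the total variation captured by the model.
r_squared = model_ss / total_ss

# t statistic for the slope: coefficient divided by its standard error.
t_stat = slope / slope_se

# Density at which the fitted line predicts a 0% McCain share.
zero_crossing = -intercept / slope


def predicted_share(density):
    """Fitted McCain share for a given density (illustrative helper)."""
    return intercept + slope * density


print(f"R-squared: {r_squared:.4f}")
print(f"t for density: {t_stat:.2f}")
print(f"Line hits zero near {zero_crossing:,.0f} people per square mile")
```

With these rounded inputs the line crosses zero at roughly 35,000 people per square mile, so any county denser than that gets a negative predicted share.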
Also, Surlethe - the data sets have a few mismatches in how they're organised. For instance, the population density list has Manhattan Island listed as "Kings County" while on the voting list it's just "Manhattan." So far as I can tell it just affects New York (due to NYC being nonstandard), Baltimore, St Louis, Missouri, and Virginia, all due to city/county mismatches. If you didn't correct for those, that might explain why my R² is higher than yours.
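Here's one way the mismatches could be hunted down before merging the two lists, sketched in Python. I'm assuming each list boils down to a set of place names. The function names, the alias table, and the sample labels are all made up for illustration, and this isn't necessarily how either of us actually cleaned the data.

```python
def normalise(name):
    """Lower-case a place name and drop a trailing 'county'.

    'city' is deliberately kept, so an independent city such as
    'Baltimore city' stays distinct from the county of the same name.
    """
    words = name.lower().replace(".", "").split()
    if words and words[-1] == "county":
        words = words[:-1]
    return " ".join(words)


def unmatched(density_names, vote_names, aliases=None):
    """Return (names only in the density list, names only in the vote list)."""
    aliases = aliases or {}

    def key(name):
        cleaned = normalise(name)
        return aliases.get(cleaned, cleaned)

    density_keys = {key(n) for n in density_names}
    vote_keys = {key(n) for n in vote_names}
    return sorted(density_keys - vote_keys), sorted(vote_keys - density_keys)


# Illustrative labels only, based on the mismatch described above.
density_names = ["Kings County", "Bronx County", "Queens County"]
vote_names = ["Manhattan", "Bronx", "Queens"]

# Without any fix, the mislabelled pair shows up on both sides.
print(unmatched(density_names, vote_names))

# A hand-written alias (hypothetical) reconciles the two labels.
print(unmatched(density_names, vote_names, aliases={"kings": "manhattan"}))
```

Anything left in either list after the aliases are applied is a row that would silently drop out of, or be mismatched in, the merged data set.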