Comments on Blank On The Map: Does Nate Silver get his maths wrong?

In events like elections where there is only one ...

2013-04-02T13:54:11.637+02:00

In events like elections where there is only one chance of testing the model, election results cannot be the basis for assessing the validity of the model. All statistical models have structural elements that underpin the relationships between the variables. It would be possible to verify these relationships from real data with good statistical confidence. Verifying the fit of such data would be a better way to evaluate the validity of the models. Unfortunately, since Nate Silver's model is proprietary, we have no way of checking this. Nate Silver will surely be sharpening his knife for the next iteration.

I sort of agree and disagree. To sort out the ques...

2012-11-10T17:18:08.036+01:00

I sort of agree and disagree. To sort out the question of correlation of polls might take that long (except of course remember that after each election you can compare the poll margins with the actual vote percentage and so learn more about the existence of any actual correlated bias). But the question of whether it is in general worthwhile including extra information via "fundamentals" can probably be disentangled much quicker.
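As a rough sketch of that after-the-fact comparison: if the polls share a bias, the state-by-state errors (actual margin minus polled margin) should have a mean that is large compared with their spread. The function name `poll_error_summary` and every number below are invented purely for illustration; they are not real poll or election data.

```python
from statistics import mean, pstdev

def poll_error_summary(poll_margins, actual_margins):
    """Return (shared_bias, residual_spread) of the poll errors, in points.

    shared_bias is the average error across states; residual_spread is the
    standard deviation of the errors around that average.
    """
    errors = [actual - poll for poll, actual in zip(poll_margins, actual_margins)]
    return mean(errors), pstdev(errors)

# Hypothetical final poll margins and actual margins for four states.
polls = [2.0, -1.0, 4.0, 0.5]
actual = [3.5, 0.5, 5.0, 2.5]

bias, spread = poll_error_summary(polls, actual)
print(f"shared bias: {bias:.2f} points, residual spread: {spread:.2f} points")
```

With these made-up numbers every state misses in the same direction, which is what a correlated bias would look like. One election only gives one draw of the shared error, though, so this does not get around the small-N problem.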

Of course I'm assuming science-like cooperation and information-sharing about the models, which won't be happening :-)

I disagree regarding how long it takes to get larg...

2012-11-10T17:11:25.626+01:00

I disagree regarding how long it takes to get large N. The individual state results (even the senate ones, though less so) are correlated, so there isn't that much independent data per election.

Nate had a 10% probability Obama would lose. This was entirely due to the possibility that the state polls were correlated and systematically wrong across the board. Without ~10 *national* elections I can't see how Nate vs Sam can be resolved. That's 40 years. Perhaps I could accept 20 years for some indication to develop. I don't know, maybe that actually isn't so long! (Note that any a posteriori application of these models to previous elections isn't acceptable because the models were built based on those elections.)
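A toy simulation shows why the correlation matters so much for that tail probability. This is only a sketch under stated assumptions, not Nate's or Sam's actual model: nine hypothetical states, each with a 2-point poll lead, normally distributed poll errors, and an "upset" defined as the leader losing a majority of the states. The function `upset_probability` and all of its parameters are made up for the example.

```python
import random

def upset_probability(leads, shared_sd, state_sd, n_sims=20000, seed=0):
    """Estimate the chance that the poll leader loses a majority of states.

    Each simulated election draws one shared error (hits every state alike)
    plus an independent error per state. All values are in points.
    """
    rng = random.Random(seed)
    upsets = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)
        wins = sum(
            1 for lead in leads
            if lead + shared + rng.gauss(0.0, state_sd) > 0
        )
        if wins * 2 < len(leads):  # fewer than half the states won
            upsets += 1
    return upsets / n_sims

leads = [2.0] * 9  # hypothetical: nine swing states, 2-point lead in each

# Same 2-point error size per state, but independent vs fully shared.
independent = upset_probability(leads, shared_sd=0.0, state_sd=2.0)
correlated = upset_probability(leads, shared_sd=2.0, state_sd=0.0)

print(f"independent errors: {independent:.3f}")
print(f"fully shared error: {correlated:.3f}")
```

With independent errors the misses mostly cancel out and an upset is rare. With a fully shared error the states flip together, so the upset probability is roughly the chance that the single shared error exceeds the lead.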

Generally speaking, I agree that state polls could...

2012-11-10T16:59:08.203+01:00

Generally speaking, I agree that state polls could have been correlated and this should have been accounted for in the probabilities. When I was too pessimistic to believe Sam Wang, part of the reason for that was I had read other sensible people saying the same thing about his method.

He does have a post on his website addressing this criticism. His answer is that adding a correlation to the model does not change his median prediction, and only changes the error bars slightly, so it constitutes unnecessary complication. The explanation is written in rather simplistic terms though, and as I have no detailed knowledge of his model I can't evaluate his claim.

I agree you need more election data to really make a sensible judgement about relative merits. If you count presidential elections in each state, plus senate elections, house elections and so on, then it shouldn't take too long to get a reasonably large N.

One thing to keep in mind (and this "Brier sc...

2012-11-10T16:41:31.010+01:00

One thing to keep in mind (and this "Brier score" doesn't) is that each one of Nate's state probabilities isn't independent from the rest. A lot of those states that were less than 90% certainties would probably have either *all* been Obama or *all* been Romney in most of Nate's individual events in his Monte Carlo.

Also, Sam Wang had a 100% prior that the polls were correct (as you mention in your post). If Nate had done that he'd have had the same near-certain probabilities. There will be an election at some point in the future where the polls are wrong and thus Sam will get most of the swing states wrong and his Brier score will be very poor, whereas (despite also getting them wrong) Nate's won't be as bad because of his conservatism.
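To put rough numbers on that, here is a minimal sketch using the usual definition of the Brier score (mean squared difference between the forecast probability and the 0/1 outcome, lower is better). The two forecasters and their probabilities are hypothetical stand-ins, not Sam's or Nate's actual state forecasts.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical probabilities that the poll leader wins each of five swing states.
confident = [0.99] * 5  # assumes the polls are right
hedged = [0.80] * 5     # leaves room for a shared polling error

polls_right = [1] * 5   # the leader wins every state
polls_wrong = [0] * 5   # systematic miss: the leader loses every state

for name, probs in [("confident", confident), ("hedged", hedged)]:
    right = brier_score(probs, polls_right)
    wrong = brier_score(probs, polls_wrong)
    print(f"{name}: polls right {right:.4f}, polls wrong {wrong:.4f}")
```

The confident forecaster scores better whenever the polls hold up (about 0.0001 against 0.04 here) but much worse in the one election where they all miss together (about 0.98 against 0.64).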

I honestly don't think it is really all that possible to properly judge the accuracy of these models without many more elections (because of the non-independence of each state/senate race there just isn't enough data).

Nice post though. I felt like you were reading my mind as I read it (then again, most people with a basic understanding of statistics might feel the same).