January 21, 2011

Lies, Damned Lies, and Statistics

I've been writing about numbers for almost fifteen years, and one thing I've learned is that you can make data say anything you want it to, depending on how you look at things -- and that doesn't make it any more predictive of an outcome.

And that's especially true in sports. How many times have you heard how Team A has never beaten Team B when trailing at the half, on the road, in January, when it's below forty degrees in Houston? Or that Player One has a perfect record against Player Two in games played in the southern hemisphere before noon Pacific Time? Add enough caveats -- no matter how ridiculous -- to an assumption, and you can make a case for any result.

Statistics certainly have the ability to skew our perception, and they don't always tell the true story of how a battle was fought. All you have to do is look at a couple of matches at this year's Australian Open to see that. There are some simple incongruities -- David Nalbandian and Lleyton Hewitt actually scored an equal number of points in their almost-five-hour slugfest, while Ivo Karlovic again found a way to lose a match when out-acing his opponent, forty-eight to ten. But it can get more complicated than that.

Dominika Cibulkova lost her third-round match to top-seeded Caroline Wozniacki in straight sets, but it was the teeny Slovak who came out the aggressor. She barreled off thirty-one winners against the world #1, who only scored eleven of her own. She was a little sloppy -- over forty unforced errors, almost four times Caro's -- but that's more an indication of the chances she took against a highly favored opponent. While Wozniacki seemed content to knock balls back over the net, it was Cibulkova who came up with the riskier, more imaginative play, and maybe should have won the match.

Robin Haase didn't last quite as long, but for the first half against veteran Andy Roddick, you might have thought you were witnessing an upset. The twenty-three-year-old Dutchman took control early, trapping his opponent at the net and allowing no break chances in the opening set. He won nearly eighty percent of the points on his first serve and about half of those on Andy's. In the second he kept it close -- the two were equal on both winners and errors, and neither made a dent on the other's serve. When Roddick won that set in a tiebreak, though, it was all over for Haase, and he dropped the next two sets in about an hour.

And then you have the surprising match between last year's runner-up Justine Henin and one-time French Open champ Svetlana Kuznetsova. The Russian had a lopsided 2-16 record against her third-round opponent over their eight-plus-year history and had never beaten her at a Slam -- in fact her only two wins came in tight three-setters. And as one of my friends pointed out, Henin had only lost at the Australian Open to players ranked -- either at the time or at some point -- #1 in the world, and Kuzey topped out at #2.

But it was Sveta who came out firing this time. Though she made only slightly fewer errors and a handful more winners, she was cleaner when it counted. She stayed strong after losing a break lead in the second set and after failing to serve it out. She saved set points and withstood faster serves, and after two hours, Kuznetsova was the one left standing. Maybe she is heading for the #1 ranking...

So what can we learn from all this? Well, for one thing, numbers -- whether stats, scorelines or match length -- clearly don't tell the whole story. And for another, past performance is no indication of future results. That's not to say that keeping and monitoring such minute records doesn't have its place -- but it certainly can't beat watching the darn games!
