The Baseball Encyclopedia, first published in 1969, blew people’s minds. The baseball public had never before seen every baseball stat gathered in one place with such authority and finality. Players from the old days who had been entirely forgotten were suddenly right there, on paper, in this impressive fat book that staked an unprecedented claim to truth and accuracy. Statistical books had been around for a while by that point, but they were limited to recent seasons and oriented around trivia and barroom argument-settling. Scraping up the name, age and number of every Johnny to put on a uniform seemed as reasonable as counting blades of grass on the lawn.
Fortunately a group of lunatics with a publisher ventured to solve this information problem. Instead of using the highly limited old books of statistics, a guy named David Neft hired a team of college interns to scour old box scores and newspaper accounts across the country and record their findings. Using computers for the first time on any significant scale to gather baseball data, Neft and his crew–the team of college kids always makes me think of the hapless interns from The Life Aquatic with Steve Zissou–reenacted the whole history of major league baseball. At bat by at bat, they tabulated and cross-referenced each tiny step towards the complete numerical narrative of the game. They depended on the original unsung heroes of baseball statistical analysis who had kept box scores and newspaper clippings and hand-assembled statistics long before anyone cared to see them (baseball’s early stat history consists entirely of these unheralded, isolated mavericks and mad men, as chronicled in the fantastic book The Numbers Game by Alan Schwarz).
So order was milled from chaos. Updated yearly, the great tome (over 2,000 pages when first published) corrected the errors of the past, brought old players into the present and provided a direct, specific reference to look to for info on any player who ever played. It was a kind of empowerment for the fan to have immediate access to a whole history, an entire universe of knowledge. Debates were quickly crushed, and arguments became refined and well-documented. Neft and company put in the legwork to provide the common fan with the tools to analyze the ballplayers themselves; it was an informational power transfer from the team owners to the people, and a watershed moment.
As remarkable as the Baseball Encyclopedia’s achievement was, the statistical advances of the last few years make that impressive book look like scribbles on a cocktail napkin. We live in a day when the smallest details on every player are available in seconds. Box scores and play-by-play accounts are just as accessible. Hordes of baseball stats analysts publish their work every day, working in a public realm that did not exist ten years ago. The stat world has been flattened, to coin a phrase, by the increasing power of amateur stat keepers and number wizards driven–as they’ve always been driven–by the will to learn, overtaking an older industry, typified by STATS, Inc., of authoritative unilateral providers.
So not only can you find the numbers on a player, you can find analysis of those numbers, and you can read the sequence of plays that led to each number. Computing power and Excel spreadsheets enable each of us to crunch whatever the hell we want to crunch, lickety-split. Stats in this modern age are as unbelievably fast and accessible as every other thing in the world.
What got me thinking about the statistical cloud that every baseball fan now contends with is an article by the media ecologist/general media badass and Kansas State professor Michael Wesch, “From Knowledgable to Knowledge-able: Learning in New Media Environments”. Here’s a piece of the article (he’s talking about higher ed students, but in the context of this post, we are all students. For the sake of my angle I would recommend just the first section of his article though by all means read the whole thing):
There is something in the air, and it is nothing less than the digital artifacts of over one billion people and computers networked together collectively producing over 2,000 gigabytes of new information per second. …nearly the entire body of human knowledge now flows through and around these rooms [i.e. us] in one form or another, ready to be accessed by laptops, cellphones, and iPods. Classrooms [i.e. the Baseball Encyclopedias] built to re-enforce the top-down authoritative knowledge of the teacher [i.e. single stats authority] are now enveloped by a cloud of ubiquitous digital information where knowledge is made, not found, and authority is continuously negotiated through discussion and participation.
Wesch could just as easily be describing the web world of baseball statistics. A massive cloud of numbers floats around us, sinking into our pores and threatening to overwhelm those of us who want to quote-unquote understand the game. All of these artifacts of human events–the games we’ve watched and long forgotten–are a part of our relationship to the game. If baseball fandom is living in a house, then these statistics are the closets packed with things, some wanted, some not. We may want to grab our snow boots from the shelf, but we risk pulling all of the other paraphernalia down on top of ourselves. As Wesch alludes to vis-a-vis the power-altered relationship between student and teacher, there is no longer a top-down authority to tell us that the RBI is grand and that the batting average is perfy to determine who’s a bum and who’s a stud. A chorus of voices screams at us at every step on the shadowy road to enlightenment.
What does Wesch have to say about managing this new universe? This:
As we increasingly move toward an environment of instant and infinite information, it becomes less important for students [i.e. baseball fans] to know, memorize, or recall information, and more important for them to be able to find, sort, analyze, share, discuss, critique, and create information. They need to move from being simply knowledgeable to being knowledge-able.
Knowledge-able. It has a Thomas Friedman-esque handiness to it, but I find it immensely relevant. He’s essentially saying that as the old school power structures now fill the town dump like dot matrix printers, we as students of the game are left with a vacuum. To navigate this newly abundant but unregulated world, we have to learn to fend for ourselves, to pick and choose the data and analysis that works for us and keep moving ever forward. In the case of baseball stats and analysis, the old power structure is The One Book to Rule Them All, the old Baseball Encyclopedia model. Do you have a Baseball Encyclopedia at home? Well I do, actually, but I don’t need it for anything beyond its antique value (which is about 8 bucks last time I checked eBay). I’ve got Baseball-Reference.com to tell me more than 26 volumes could ever hope to. But the information will not be ordered. It will be jumbled, subjective, political. I will find myself there via veiled links and by way of unedited whimsy. Once there, I must learn for myself what to trust and what not to, which gopher holes to shimmy down and which snake holes to avoid. It’s a digital world, Charlie Brown: come along skinny dipping or spend the evening alone on the front porch with the Reader’s Digests and the dust bunnies.
Wesch goes on to say that the digital world rewards “a spirit of interactivity, participation, and collaboration.” Does that sound familiar? Try Bill James and Project Scoresheet and Retrosheet and the legions of SABRmetricians who labor endlessly for little/no pay just to make things right and inform the wanting public. The difference now is that their efforts find immediate payoff in the online realm. Nowadays, people NOTICE, because they are looking for the data and know where to find it. Retrosheet, a grassroots effort to gather the aforementioned play-by-play and boxscores, is now the most powerful tool available to the stats analyst. It was built from a sense of good will, with massive volunteer assistance. The people gave the movement strength, but the Internet gave it power, a power that is relayed directly into the hands of the rabble.
The same goes for blogs. Users create the content, and users comment on the content. Even venerable sports writers have to endure critical commentary on their columns essentially IN their columns, or at least just below them. We’ve all got access to the ideas of others, and can toss our own ideas out there at will.
Certainly pseudo-authoritative figures have emerged, like Bill James and Rob Neyer and Nate Silver and the FanGraphs folks and others. But for the most part these figures are just doing the best with the info that everyone has access to. If I were a numbers wizard, there would be nothing to stop me from putting the current heads of state to shame. I’ve got access to the same data and the same tools and the same processing power that everyone does. Baseball stats, unlike intellectual property, are public access. The question is, and from now on always will be, what to do with that access? What will you create, how will you collaborate and participate, assuming you give a rat’s ass at all?
If you decide you’ll just stick to the game itself, on TV or in person, will you avert your eyes when the latest splits crawl across the bottom of the screen, or when an OPS pops up on the Jumbo-Tron? Will you choose not to care, or will you strip out of your civvies and into the pond?
I’ll leave you with a final comment-free line from Wesch’s article to pop your brain just for fun:
Our old assumption that information is hard to find, is trumped by the realization that if we set up our hyper-personalized digital network effectively, information CAN FIND US.