
Electioneering - Covering Election Day with data, maps and print

Election night is always tense in a newsroom - even when, as was the case with the Pennsylvania governor's race, the outcome isn't in doubt, there are still so many moving parts and so many things that can change. Whether it's a late-reporting county/precinct or trying to design a front page you've been thinking about for weeks, there's always something that can go wrong. That's why, this year, I tried to prep my part of the election coverage with as few manual moving parts as possible. Though (as ever) things did not go according to plan, it definitely provided a glimpse at how things might run — more smoothly — in the future.

I set out in the middle of October with two aspects of the coverage on my plate. The first, live election results, is something I've been in charge of since the first election after I arrived at the York Daily Record in 2012. I've always used some combination of Google Docs (the multi-user editing is crucial for this part, since our results are always scattered around various web pages and rarely scrape-able) and PHP to display the results, but this year I had my GElex framework to start from (even if I modified it heavily and now probably need to rewrite it for another release).
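A minimal sketch of the general Sheet-to-PHP pattern, assuming the sheet is published to the web and readable as CSV; the sheet ID, tab gid and column names below are placeholders rather than the actual GElex setup:

    <?php
    // Read one race's results out of a published Google Sheet as CSV.
    // Requires allow_url_fopen; sheet ID, gid and columns are placeholders.
    $sheetId = 'YOUR_SHEET_ID';
    $gid     = '0'; // one tab per race
    $url     = "https://docs.google.com/spreadsheets/d/{$sheetId}/export?format=csv&gid={$gid}";

    $rows   = array_map('str_getcsv', file($url, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
    $header = array_shift($rows); // e.g. Candidate, Party, Votes

    foreach ($rows as $row) {
        $result = array_combine($header, $row);
        printf("%s (%s): %s votes\n", $result['Candidate'], $result['Party'], $result['Votes']);
    }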

The results actually went incredibly smoothly and (as far as I know) encountered no problems. Everything showed up the way it was supposed to, we had no downtime and the interface is as easy to use as I can conceivably make it. You can take a gander at the public-facing page here, and the Sheet itself here. The one big improvement I made this time around was on embeds. Though there's always been the ability to embed the full results (example), this year — thanks to the move to separate sheets per race — it was possible to do so on a race-by-race basis.

This was especially helpful given our print workflow, in which election stories get written so the exact vote totals can be inserted later via a breakout box. Embedding the vote totals in the story meant we didn't have to go back in and manually add them on the web.

The governor's race stole pretty much all of the headlines (/front pages) in York County owing to its status as Tom Wolf's home county. For us, this meant we'd be doing twice as many live maps as usual. The county-by-county heat map is relatively cliché as political indicators go, but it's still a nice way to visually represent a state's voters.

Since Wolf is a native, we also decided this year to include a map of just York County, coding the various boroughs and townships according to their gubernatorial preferences. My first concern was online — we've done both of those maps in print before, so the worst-case scenario was coloring them in Illustrator before sending them to press.

I wanted interactivity, fidelity, reusability and (if at all possible) automation in my maps. When it came to reusability and fidelity, SVG emerged as the clear front-runner. It's supported in most major browsers (older flavors of IE excepted, of course) and on mobile, and it scales well.

The other options (Raphael, etc.) locked us into paths I wasn't really comfortable with looking ahead. I don't want to be reliant on Sencha Labs to a) keep developing it and b) keep it free when it comes to things like elections and maps. I would have been perfectly fine with a Fusion Table or the like, but I also wanted something that could be used for things other than geocoded data if the need arose.

Manipulating SVGs isn't terribly difficult ... sometimes. If the SVG code is injected directly into the page (I used PHP includes), it's manipulable using the normal document DOM. If you're including it as an external file (the way most people probably would), there are options like jQuery SVG (which hadn't been updated in TWO YEARS until its maintainer updated it less than a week before the election, too late for me to use) or this method (which I was unable to get to work). (Again, I just cheated and put it directly on the page.)
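For illustration, here's roughly what the put-it-directly-on-the-page approach looks like; the filename and wrapper markup are hypothetical:

    <div class="county-map">
    <?php
    // Inline the SVG markup so every <path> becomes a regular DOM node that
    // client-side script and CSS can target by id or class. The filename is
    // illustrative; the XML prolog is stripped because it has no business
    // sitting in the middle of an HTML document.
    $svg = file_get_contents(__DIR__ . '/maps/pa-counties.svg');
    echo preg_replace('/^<\?xml[^>]*\?>\s*/', '', $svg);
    ?>
    </div>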

Manipulating fills and strokes with plain colors is fairly easy using jQuery: just change the attributes and include CSS transitions for animations. The problem arises when you try to do patterns, which are a different beast entirely.

I wrote a tiny jQuery plugin (pluglet?) called SVGLite to assist with this, which you can read more about here. When backfilling older browsers, I figured the easiest thing to do was serve up PNG images of the files as they existed. Using everyone's favorite PHP library for ImageMagick, imagick, this was trivial. Simply running a few preg_replace() calls on the SVG file before handing it to Imagick got me the colors I needed.
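Something along these lines, with made-up filenames, colors and fill values, and assuming ImageMagick is built with an SVG delegate:

    <?php
    // Rough sketch of the PNG fallback: recolor counties by rewriting fill
    // values in the raw SVG text, then let Imagick rasterize the result.
    // The filename, colors and the fills being swapped are all placeholders.
    $svg = file_get_contents(__DIR__ . '/maps/pa-counties.svg');

    $swaps = [
        '/fill="#cccccc"/' => 'fill="#1b5e91"', // e.g. recolor one bucket of counties
        '/fill="#dddddd"/' => 'fill="#a62021"', // ... and another
    ];
    $svg = preg_replace(array_keys($swaps), array_values($swaps), $svg);

    $im = new Imagick();
    $im->readImageBlob($svg);   // requires ImageMagick's SVG delegate
    $im->setImageFormat('png');

    header('Content-Type: image/png');
    echo $im->getImageBlob();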

It turns out there aren't a lot of free options for scraping data live, and as I've mentioned before, free is pretty much my budget for these sorts of things. But there is one. Import.io, which has the classic engineer's design problem of making things more difficult by trying to make them easier, turned out to be just what we needed when it came to pulling down the governor's race data.

Working off the results site for each county, I set up a scraper API that trawled all 67 pages and compiled the data for Wolf and Corbett. This was then downloaded into a JSON file that was served to the live JavaScript and PHP/ImageMagick/PNG maps. Given that I didn't want to abuse the election results servers (or melt ours), I built a small dashboard that allowed me to control when to re-scrape everything.
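In rough strokes, the cache-and-serve idea looks like the sketch below; the scraper URL and response fields are stand-ins, not the actual import.io API call:

    <?php
    // Scrape-on-demand sketch: hit the scraper once, cache the totals as JSON,
    // and let the live JavaScript and the PHP/Imagick map read the cache instead
    // of hammering the county results servers. URL and fields are placeholders.
    $response = file_get_contents('https://example.com/governor-scraper-results');
    $counties = json_decode($response, true);

    $totals = [];
    foreach ($counties as $county) {
        $totals[$county['name']] = [
            'wolf'    => $county['wolf'],
            'corbett' => $county['corbett'],
        ];
    }

    file_put_contents(__DIR__ . '/data/governor.json', json_encode($totals));
    echo count($totals) . " counties cached\n";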

This part actually went almost as well as the live results, with one MASSIVE EXCEPTION I'll get to after the next part. The boroughs/townships data presented its own problems, in the form of being released only as PDFs.

Now, running data analysis on a PDF is not terribly difficult — if you're not time-constrained, I'd definitely recommend looking into Tabula, which did an excellent job of parsing my test data tables (2013 elections), as well as the final sheet when it was all said and done.

Unfortunately, processing each one took about 45 minutes, which wasn't really quick enough for what we needed. So we turned to the journalist's Mechanical Turk: freelancers and staff. Thanks to the blood, tears, sweat and math of Sam Dellinger, Kara Eberle and Angie Mason, we were able to convert a static PDF of numbers into this every 20 minutes or so.

It's always a good idea to test your code — and I did. I swear.

My problem did not lie in a lack of testing, but rather in a lack of testing with real numbers and real data. For readability purposes, the election results numbers are formatted with a comma every three digits, much the way numbers usually are when they're written for people rather than machines (e.g., 1,000 or 3,334,332).

UNFORTUNATELY, when I did all my testing, none of the numbers I used went above 1,000. Even when I was scraping the test data the counties were putting up to test their election results uploading capabilities, the numbers never went above 500 or so — or, if they did, they were tied (1,300 for Wolf, 1,300 for Corbett).

The problem lay in how the scraper worked. It was pulling all of the data as strings, because it didn't know (or care) that they were votes. Thus, it wasn't 83000, it was '83,000'. That's fine for display purposes, but it's murder on mathematical operations.

About an hour after our first results, the ever-intrepid and knowledgeable Joan Concilio pointed out that my individual county numbers were far too low - like, single or double digits, when the total vote count was somewhere north of 200,000. After walking all of my numbers back to import.io, I realized that I needed to be removing the commas and getting the intVal() (or parseInt(), where appropriate).
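The fix itself is small; here's the gist in PHP, with an illustrative helper name:

    <?php
    // Treat scraped totals as formatted strings and normalize them before
    // doing any math. The helper name is illustrative.
    function toVoteCount($raw)
    {
        // intval('83,000') is 83, because parsing stops at the comma;
        // strip the separators first, then cast.
        return intval(str_replace(',', '', trim($raw)));
    }

    echo toVoteCount('83,000') + toVoteCount('1,300'); // 84300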

(I also originally intended to provide the agate data using the same method, but the time it took to quash the number bug meant it was safer/wiser to go with the AP's data.)

Conclusion:

  1. Always test your data.

  2. Always make sure your data matches the type you're looking for.

  3. Sometimes the overhead of statically typed languages is worth the trouble.

Overall, everything went fairly well, with the exception of the aforementioned bug (which also made us double- and triple-check the print graphic). The advantage of SVG, aside from its digital flexibility, was that after a quick Save As and conversion in Illustrator, we had a working print file ready to go.

Another election, in the books.

I thought I was soooo smart linking to everything, except now all the links are dead and useless.