Kait

Computers will not replace reporters, except when they will

It’s viewed with disdain in some circles to come out and say this, but there are places in journalism for automatic writing. Not the Miss Cleo kind, mind you, but the kind done by computers. This is not a new trend (though news organizations, as ever, think things are invented only when they notice them), but it has received increasing attention given the continued economic decline of most news organizations, coupled with some high-profile examples.

The most recent example was the “Shamrock Shake” in LA (the St. Patrick’s Day quake), when the LA Times’ “Quakebot” generated a story on it three minutes after it happened.

Whenever an alert comes in from the U.S. Geological Survey about an earthquake above a certain size threshold, Quakebot is programmed to extract the relevant data from the USGS report and plug it into a pre-written template. The story goes into the LAT’s content management system, where it awaits review and publication by a human editor.
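To make that pattern concrete, here’s a minimal sketch in Python of the extract-and-template workflow described above. To be clear, this is not the LA Times’ actual Quakebot: the feed URL and field names follow USGS’s public GeoJSON earthquake feed, but the magnitude threshold, the template wording, and the submit_for_review stub standing in for the CMS handoff are all invented for illustration.

```python
"""A hypothetical quakebot-style sketch: poll the USGS earthquake feed,
keep events over a magnitude threshold, pour the fields into a canned
template, and hand the draft off for human review."""
import json
import urllib.request
from datetime import datetime, timezone

# Public USGS GeoJSON summary feed (all quakes in the past hour).
USGS_FEED = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_hour.geojson"
MAG_THRESHOLD = 3.0  # arbitrary cutoff for "worth a story"

TEMPLATE = (
    "A magnitude {mag:.1f} earthquake struck {place} at {when}, "
    "at a depth of roughly {depth_km:.0f} km ({depth_mi:.0f} miles), "
    "according to the U.S. Geological Survey. "
    "This post was generated automatically and is awaiting editor review."
)

def fetch_quakes(url: str = USGS_FEED) -> list[dict]:
    """Pull the latest USGS GeoJSON feed and return its feature list."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        feed = json.load(resp)
    return feed.get("features", [])

def draft_story(feature: dict) -> str:
    """Fill the pre-written template with the relevant fields from one event."""
    props = feature["properties"]
    lon, lat, depth_km = feature["geometry"]["coordinates"]
    when = datetime.fromtimestamp(props["time"] / 1000, tz=timezone.utc)
    return TEMPLATE.format(
        mag=props["mag"],
        place=props["place"],
        when=when.strftime("%H:%M UTC on %B %d"),
        depth_km=depth_km,
        depth_mi=depth_km * 0.621371,
    )

def submit_for_review(story: str) -> None:
    """Stand-in for pushing the draft into a CMS queue for a human editor."""
    print(story)

if __name__ == "__main__":
    for quake in fetch_quakes():
        if (quake["properties"].get("mag") or 0) >= MAG_THRESHOLD:
            submit_for_review(draft_story(quake))
```

The point is how little there is to it: fetch, filter, fill in the blanks, and queue the result for a human to check before it goes out.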

Where many could (and did) look upon this story only to gasp in horror and pull their hair out in despairing hunks, I saw it and thought, “Huh. That sounds like a pretty perfect system.” Imagine no Quakebot existed and an earthquake happened. The first thing a modern news organization does is get a blurb on its site that says something to the effect of “An earthquake happened.” That blurb then gets shared on social media.

Meanwhile (if the organization is doing it right; if not, this happens in sequence), a reporter is calling the USGS or surfing over to its web page, trying to dig up the relevant information. They will then plug it into a fairly formulaic story (“The quake was x.x on the Richter scale, with an epicenter about 2 miles deep. It was felt …”). If they can get ahold of a geologist who isn’t busy (either geologisting [as we would hope, given that an earthquake just happened] or on the phone with other media outlets), you might get a quote along the lines of, “Yup, there definitely was an earthquake. There will probably be aftershocks because there usually are, although we have absolutely no way of knowing for certain.”

What’s the difference between the two stories, aside from the fact that one showed up much faster? Data-based reporting falls squarely into my crusade to automate all tasks that don’t actually require a human. The automated method of initial reporting on the quake is essentially identical to the non-computer method, except that it a) takes less time and b) frees up a reporter to go do actual reporting that a computer can’t do.

The computer can’t make a qualitative assessment of how the quake is affecting people’s moods, or how anxious people are about aftershocks. Reporters should be out talking to people, rather than querying one computer to get data that another computer can easily understand and process.

Perhaps the most cogent argument against computer-generated stories is the potential proliferation of such content. After all, one might argue, if every California news outlet had a quakebot, we’d have dozens of stories that all said the same thing without reporting anything new.

(This is me laughing quietly to myself. This is the sound of everyone waking up to the current problem with media when you no longer have a geographic monopoly thanks to the internet.)

No one is saying that all stories, or even most, will be written by computers, but it’s not difficult to imagine that a good number of them will be, simply because most stories today have significant chunks that aren’t deeply reported. They’re cribbed from press releases, interpreted from box scores, or condensed from the wire. If we leave the drudge work to the computers, we can free up reporters to do things that computers can’t, and actually produce more, better content. It’s a win-win. The primary losers will be those companies that buy too deeply into the idea that they can generate all their content automatically.

I still wholeheartedly think that entirely generated content is essentially useless to end-users.