    May 29, 2012, 1:50 p.m.
    Reporting & Production

    3 new ideas on the future of news from MIT Media Lab students

    Adding metadata to hyperlinks, finding stories in ordinary datasets, providing context for impossibly big numbers.

    Ethan Zuckerman of MIT taught a class this semester tailor-made for AndroidForMobile Lab readers: “News in the Age of Participatory Media.” The hook: What happens if you treat journalism as an engineering problem, bringing together the efforts of journalists and computer scientists?

    The course’s final class last week featured a lot of bright students presenting their final projects, each of which was supposed to be a new tool, technique, or technology for reporting the news. (They were in various stages of completion.) I’ll be breaking out a few of the good ideas in future posts, but here are some of the ones that stood out to me.

    Modernizing the hyperlink

    The <a> tag hasn’t changed much since Tim Berners-Lee introduced it 20 years ago. Hyperlinks are the fiber of the web. But Neha Narula, a Ph.D. student of computer science at MIT, finds herself frustrated with writers who abuse them. Blog posts littered with too many links lead to “cognitive overload,” she says. “As I explored this topic a little more,” she said, “I found what I was annoyed with was not linking too much but not linking well.” If a publication is mentioned in copy, does it have to be linked to its home page? Does the same link need to appear multiple times in one story?

    Narula proposed using the little-known rev attribute to attach meaning to links, allowing browsers to handle different kinds of links differently. (rev is supposed to represent a reverse link: the relationship of the linked document back to the current one. All major browsers, when faced with a rev attribute now, just ignore it. It’s like a cousin to rel.)

    For example, a link to a citation (dictionary definition, Wikipedia article) would get rev="bib", for bibliography. So:

    <a href="http://en.wikipedia.org/en/AndroidForMobile_Foundation" rev="bib">

    might lead to that link being presented not in the body copy, but at the bottom of the post, in the form of a tidy bibliography.

    She also proposes rev="reaction", which would clearly call out the original post an article is responding to; and rev="object" for links to people and companies, which would facilitate an index for all of the proper nouns in a piece.
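
    To make the idea concrete, here is a minimal sketch of how software might sort links by their rev value and pull the rev="bib" ones out into a bibliography. It is not Narula's code (her mockup used JavaScript and CSS); the class name, the function names, and the "plain" bucket for untagged links are all assumptions made for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class RevLinkCollector(HTMLParser):
    """Group the links in an HTML fragment by their rev attribute."""

    def __init__(self):
        super().__init__()
        self.links_by_rev = defaultdict(list)  # rev value -> [(href, text)]
        self._open_link = None  # [rev, href, text parts] while inside <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr_map = dict(attrs)
            # Untagged links go into a hypothetical "plain" bucket.
            rev = attr_map.get("rev") or "plain"
            self._open_link = [rev, attr_map.get("href") or "", []]

    def handle_data(self, data):
        if self._open_link is not None:
            self._open_link[2].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._open_link is not None:
            rev, href, parts = self._open_link
            self.links_by_rev[rev].append((href, "".join(parts).strip()))
            self._open_link = None


def collect_links(html):
    """Return a dict mapping each rev value to its (href, text) pairs."""
    parser = RevLinkCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.links_by_rev)


def render_bibliography(links_by_rev):
    """Format the rev="bib" links as a numbered plain-text list."""
    bib_links = links_by_rev.get("bib", [])
    return [f"{n}. {text} <{href}>" for n, (href, text) in enumerate(bib_links, 1)]
```

    Feeding a post's HTML to collect_links() and passing the result to render_bibliography() yields lines that could be appended at the bottom of the post; the "reaction" and "object" buckets could drive a callout or a proper-noun index in the same way.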

    Perhaps most intriguing was rev="set" for a series of links, to avoid awkwardness when linking to (for example) this series of Lab articles on the hyperlinking debate. She mocked up a little bit of JavaScript and CSS to demonstrate the idea. (Hover over “Twitter users to follow” or “BBC linking policy.” You can also see mockups of the object, reaction, and bib attributes there.)

    Oh, and the biggest crowd pleaser was a feature you may love or hate: a button that toggles off all links in a document for distraction-free (or, er, context-free) reading. (Try it on this article!)

    Others have proposed their own approaches to adding metadata to links. Zuckerman suggested Narula create WordPress and Drupal plugins to encourage adoption. Getting the rest of the web on board would be a tall order.

    Searching for correlations in a haystack

    Eugene Wu, a graduate student of computer science at MIT, demonstrated a suite of tools called DBTruck that makes data comparison a snap. Enter the URL of a CSV file, JSON data, or an HTML table and DBTruck will clean up the data and import it to a local database. Normally you might go to a web page, select and copy the table, paste it into an Excel spreadsheet, then spend 15 minutes trying to fix the misplaced cells and formatting issues. DBTruck is automated and fast.
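
    The clean-and-import step can be approximated in a few lines. The sketch below is not DBTruck's code or API; it is a stand-in built on Python's standard csv and sqlite3 modules, and the function names and cleanup rules (lowercased headers, trimmed cells, malformed rows skipped) are assumptions chosen for illustration.

```python
import csv
import io
import sqlite3


def clean_header(name):
    """Turn a messy column header into a lowercase SQL-friendly name."""
    cleaned = "".join(ch if ch.isalnum() else "_" for ch in name.strip().lower())
    return cleaned.strip("_") or "column"


def import_csv(conn, table_name, csv_text):
    """Load CSV text into a new SQLite table and return the row count.

    Assumes the first row is a header with distinct column names and
    that table_name is a trusted identifier.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = [clean_header(name) for name in header]
    # Identifiers cannot be bound as parameters, so they are quoted inline.
    column_sql = ", ".join(f'"{col}" TEXT' for col in columns)
    conn.execute(f'CREATE TABLE "{table_name}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    # Trim stray whitespace and skip rows whose width does not match the header.
    cleaned = [
        [cell.strip() for cell in row]
        for row in body
        if len(row) == len(columns)
    ]
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', cleaned)
    conn.commit()
    return len(cleaned)
```

    With conn = sqlite3.connect(":memory:"), calling import_csv(conn, "towns", text) on a small CSV string creates the table and loads every well-formed row, after which ordinary SQL queries work against it.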

    The program allows you to geocode any field that contains address information, whether that field is “Cambridge, MA” or “Cambridge, Mass.” or “1 Francis Ave, Cambridge.” Humans have come up with many ways to represent physical locations, but geographic coordinates are unambiguous instructions for computers to map a location. When you’re dealing with disorganized datasets, getting consistency is key.

    Wu’s tool then lets you plot arbitrary comparisons between datasets. To test the program he plugged in all kinds of datasets, just for fun. Is there a correlation between addresses of Massachusetts lottery winners and Taco Bell locations? (No.) Suicide rates and unemployment rates in New York state? (No.) Suddenly he stumbled upon a connection that made sense: In New York state communities, high teen pregnancy rates correlated strongly with low birth weights. There’s a potential story there that Wu might not have otherwise set out to write. Zuckerman advised Wu to team up with The Boston Globe to run more arbitrary comparisons and discover what local stories might be hidden in the numbers. (It also seems like a dandy add-on to efforts to build platforms for in-house newsroom databases.)
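
    Running arbitrary comparisons like these comes down to computing a correlation for every pair of columns and looking at the strongest ones first. The sketch below uses the standard Pearson coefficient; the article does not say which statistic Wu's tool relies on, so treat that choice, and the function names, as assumptions.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def rank_correlations(columns):
    """Compare every pair of columns and sort by strength of correlation.

    columns maps a column name to a list of numbers, aligned row by row
    (for example, one row per community).
    """
    results = [
        (a, b, pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(results, key=lambda item: abs(item[2]), reverse=True)
```

    The more pairs you test, the more likely some will correlate by chance, so a high-ranking pair is a lead to report out rather than a finding.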

    How many Rhode Islands is that?

    AndroidForMobile Fellow Paul Salopek and Knight Science Journalism Fellow/Reuters correspondent Alister Doyle have covered large-scale calamities in far-off countries for domestic audiences sometimes too busy to care. Foreign correspondents have tricks, sometimes clichés, to get people to pay attention, comparing populations and land masses to familiar American things. Write Salopek and Doyle:

    Too often we just get a giant number — the U.S. debt is $15 trillion, Chinese greenhouse gases are the highest in the world at 7 billion tonnes a year, Americans spend $8 billion a year on cosmetics, etc. Is there some way of helping to put these statistics — huge to the point of meaningless — into an understandable, human framework?

    They propose something like a currency converter that turns impossibly big numbers into more qualitative terms. Great for a correspondent on a deadline.

    If it’s an economics story, what does your share of debts or GDP represent? A new car? A house? How many vacations? How many pizzas? How would it be, for instance, if everyone had the debts of the average Greek citizen? (awful, in most countries). How would global warming be if everyone emitted greenhouse gases at the rate of an Indian? (much better). The U.S. debt works out at about $50,000 a person — what can you buy with that?
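
    The arithmetic behind such a converter is simple division. In the sketch below, the $15 trillion debt figure comes from the proposal itself, while the population of 300 million (a round number) and the $10 pizza are assumptions made only to show the calculation; a real site would pull cited values from its datasets.

```python
def per_capita(total, population):
    """Split a national total evenly across the population."""
    return total / population


def in_units(amount, unit_cost):
    """Express an amount as a count of some familiar item."""
    return amount / unit_cost


def describe(total, population, unit_name, unit_cost):
    """Build a one-line, human-scale description of a big number."""
    share = per_capita(total, population)
    units = in_units(share, unit_cost)
    return f"${share:,.0f} per person, or about {units:,.0f} {unit_name}"


US_DEBT_DOLLARS = 15_000_000_000_000  # figure quoted by Salopek and Doyle
ASSUMED_US_POPULATION = 300_000_000   # rounded assumption, not from the article
ASSUMED_PIZZA_PRICE = 10              # hypothetical price in dollars

print(describe(US_DEBT_DOLLARS, ASSUMED_US_POPULATION, "pizzas", ASSUMED_PIZZA_PRICE))
# prints "$50,000 per person, or about 5,000 pizzas"
```

    Under those assumptions the per-person share matches the "about $50,000" in the proposal; swapping in a car or house price changes only the last two arguments.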

    The site would be user-maintained, like Wikipedia, and powered by real datasets. All statistics would require citations. It’s just an idea at this point, but a website like this is very buildable. (Anyone want to try it? Leave a comment below.) Salopek and Doyle offer a dizzying number of potential cross-discipline conversion units. How about Ayns, a unit of measure for how friendly a government is to corporations, named for Ayn Rand? Or the Obama Gap, a measure of the difference between a leader’s domestic and foreign approval ratings? Or Jolies, a unit of a country’s developmental aid as proportional to the amount of attention it has received from Angelina Jolie? (The Economist’s long-running Big Mac Index is of similar spirit.)

    Along with the three projects mentioned above, a couple of others caught my eye: Nathan Matias’s tool, which slurps up all the Twitter handles mentioned on a webpage and builds a Twitter list that follows those people, and Arlene Ducao’s much larger project, which overlays multiple layers of satellite imagery on a map.

    To paraphrase Zuckerman, I hope these ideas earn at least 40 of your attention today.

    Photo used under a Creative Commons license.
