Collaboration can harness the power of technology and data for better story discovery
An event report from Story Discovery At Scale, bringing people together to work on tools and sustainability for local journalism
This article was originally published at Big Local News.
Email alerts sent to community journalists about newsworthy items uncovered by automating the process of sifting pages of city council agendas. Tip sheets for state Capitol reporters generated from a system that processes hours-long committee meeting videos and transcripts. An infrastructure of tools that could help smaller newsrooms produce impactful stories even though they lack technical resources.
At a conference hosted in late March at Stanford University by Big Local News, journalists, researchers, and technologists converged to discuss these efforts and more, all aimed at developing shared solutions for local journalism. The event, titled Story Discovery At Scale, brought together practitioners and thought leaders to discuss the best way to build and sustain tools for the practice of journalism.
A key goal of the gathering was to connect participants who are trying to solve the biggest issues in the industry and to plant seeds of collaboration among those working on similar projects.
Marc Lavallee with the Knight Foundation, which co-sponsored the event with Big Local News of Stanford University, said the focus of the conference was to explore the best ways to increase the volume and relevance of data journalism within the current news landscape, no matter the size of a newsroom.
“We know that it’s not just a tech problem,” said Lavallee. “The hardest thing to solve is adoption around the usage of these tools—the right place, the right time, the right information, especially in putting these into the hands of folks who are not 100% of the time data journalists—they’re doing 15 other things in the span of a day.”
Knight Foundation is interested in helping put a “scaffolding in place” for local journalism, Lavallee said. The goal is to reduce duplication of effort and to assist in projects that could benefit a wide array of people and newsrooms in the industry.
This is especially critical in the current moment, with the recent developments in artificial intelligence as well as advances in machine learning and natural language processing, said Cheryl Phillips, founder of Big Local News, a program of the Stanford Journalism & Democracy Initiative. At the same time, newsrooms across the country continue to see dwindling resources.
“This moment was coming and what we want to avoid is a lot of solutions to the same problem,” said Phillips.
The two-day conference featured presentations from journalism organizations and individuals who are creating tools for gathering, analyzing and visualizing data for storytelling. Researchers presented their latest work on using automation, natural language processing and artificial intelligence to generate articles and predict newsworthiness of events. Attendees also got a chance to talk with one another about challenges they face—from difficulties in deploying tools that are accessible to all newsrooms to creating and managing complicated data streams for news storytelling.
“The hardest thing to solve is adoption around the usage of these tools: the right place, the right time, the right information,” said Marc Lavallee, director of technology product and strategy for the journalism program at the Knight Foundation.
Some of the projects highlighted at the conference involve the different ways reporters and the general public can keep local and state governments accountable. The efforts range from training hundreds of citizen journalists to collect information at public meetings (a project called Documenters) to Council Data, a search engine-style platform that allows users to sift through videos and transcripts of city council meetings by topic.
Another project from Big Local News, called Agenda Watch, will automate the collection of city council documents and agendas with the goal of providing alerts to local journalists who are stretched too thin. That project came about thanks to a mix of collaborative support from the Brown Institute for Media Innovation, the Lenfest Institute and the Reynolds Journalism Institute (RJI) at the University of Missouri. It is just the kind of web of support that many similar efforts need.
“The idea is to hoover up those documents and put them into a single web platform where any reporter or member of the public can go in and do research and get an early warning that there might be something newsworthy coming up,” said Serdar Tumgoren, associate director for tools with Big Local News and Lorry I. Lokey Visiting Professor. “That is the starting point.”
Another tool being developed by CalMatters in partnership with Foaad Khosmood at Cal Poly University, called Digital Democracy, will scrape public data on bills, meeting videos and transcripts, votes on legislation, political contributions and other information to create individual tipsheets for each of California’s state legislators.
“It is trying to create more transparency in this opaque process,” said Dave Lesher, co-founder of CalMatters, a nonprofit statewide news organization. “Can technology, with the help of a handful of reporters, meaningfully cover 120 legislators?”
Tumgoren said it was exciting to be in the same room as others working in different but complementary ways to unlock documents and data from local governments.
“There’s an army of foot soldiers that are forming around the effort to liberate documents and data that are vital for the public interest from local government agencies such as school boards, city councils, planning and zoning commissions,” said Tumgoren. “By joining forces, we’re going to be able to really scale our efforts in ways that wouldn’t be possible if we continue to work alone.”
One pressing problem that conference attendees explored was how to increase the volume and relevance of data-driven journalism, and how best to empower the smallest newsrooms, which have few resources, to produce data stories with impact.
The answer might be in some of the projects shared at the conference as well as research being done at universities—from Datasette, an open-source tool that makes it easy to explore and publish data, to DocumentCloud, for analyzing, annotating and publishing documents, and MuckRock, for facilitating public records requests. MuckRock, for example, is using its massive database and a machine learning algorithm to help journalists become more successful in obtaining public records.
Prem Ramaswami, a product manager at Google, walked through the website datacommons.org, which allows users to access standardized, linked data from multiple sources.
“This can be particularly useful for journalists and policymakers who lack the resources of large corporations and academic institutions,” Ramaswami said.
Meanwhile, researchers such as Sachita Nishal and her colleagues at the Computational Journalism Lab at Northwestern University are working to build automated tools that support journalists in their work. One project uses AI for generating news angles from political press releases. Another project provides investigative journalists access to a database of algorithms currently used by governments at the federal, state, and local levels.
Alexander Spangher, a PhD student at the University of Southern California (USC) who is spending the academic year working with Big Local News, looks at how machine learning can fit into a journalist’s process of finding and writing stories. By looking at patterns of a news article and when certain events are written about, Spangher said, there could be tools created around sourcing and generating background information for stories. Spangher has also analyzed newspaper home pages to predict newsworthiness of events and has looked at the evolution of breaking news stories.
“What I really need are collaborators, interested stakeholders, people to talk to about this stuff,” said Spangher. “If you’re generating that data, think about not throwing that data away… if you have specific problems, we should talk about how we can be capturing data in a way that gets academics interested in working on your problems and really facilitate better communication and collaboration.”
Those working with the California Reporting Project and Community Law Enforcement Accountability Network, a consortium of data scientists, tool builders, journalists, and public defenders looking into police misconduct, emphasized the importance of specialized tools that can be easily integrated into the workflow of data journalists.
The shared problems many newsrooms and journalists continue to face include making government meetings more understandable to communities, making sure relevant data is shown with proper context, standardizing information, especially in the realm of criminal justice and political systems, and finding ways to sustain the tools that are being built.
Conference attendees came up with potential ways to help journalists adopt more tools and to motivate newsrooms to implement systems for better data storytelling.
Some suggested creating a type of “apps store” for tools that includes reviews, case studies and curation. Such a store could make it easier for journalists to find the right one for a specific project they’re working on. In addition, a helpdesk with additional coaching would allow more journalists to feel confident in adopting new tools. Newsrooms could also contribute to a wiki system for datasets as a way to share knowledge. There could also be a fellowship that puts those creating tools in the center of a newsroom so they can better understand the reporting process.
Katherine Ann Rowlands, president of Bay City News Service, a news agency covering 12 counties in the San Francisco Bay Area, said the gathering was useful because she learned about projects in development and it helped her think about how to harness artificial intelligence and data scraping to surface stories of interest for her organization.
“It’s too easy to get siloed into the way you currently do things in your own organization,” Rowlands said. “Having a chance to talk with others who are in different places and bringing different skill sets and different ideas to the table is always helpful.”
Rowlands said Bay City News already plans to assist Agenda Watch with testing out its features and she is also interested in many of the projects that were highlighted at the conference—especially tools that can make covering a whole region more efficient.
Since the conference, conversations have continued between news organizations, with some exploring potential partnerships and applications for grant funding.
“One of the biggest takeaways is a common desire to think about shared funding models, and also shared management models,” said Tumgoren. “How do we build this tier of services that any project can use and lots of organizations can come together to share the burden of managing.”
Irene Casado Sanchez of Big Local News contributed to this story.