It's time for a real debate on reader privacy

Last week longtime local publisher Howard Owens, founder of the online news site the Batavian, launched a new publication covering Wyoming County in upstate New York. Buried in a parenthetical within his welcome message to readers was a fascinating promise: “We’ll also respect your privacy by not gathering personal data to distribute to multinational media conglomerates for so-called ‘targeted advertising.’”

This kind of explicit promise regarding reader privacy is increasingly important and all too rare.

Even though stories about government surveillance, commercial tracking, and financial data theft have become commonplace in the press over the last two years, news organizations are still loath to talk about their own practices with regard to reader privacy. It’s time for some real talk about what we owe our readers in the age of big data and mass surveillance.

Just last week the Freedom of the Press Foundation published an analysis of news organizations’ use of encrypted HTTPS connections. “Virtually none of the top news websites,” writes Kevin Gallagher, “including all those who have reported on the Snowden documents—have adopted the most basic of security measures to protect the integrity of their content and the privacy of their readers.” Without this encrypted connection it becomes possible to essentially eavesdrop on what people are reading online, as the NSA did with people who visited the WikiLeaks website.

Earlier this year, in a report on the challenges of encrypting news websites, the Washington Post pointed out how much this kind of surveillance can reveal about someone. “Among the issues potentially illuminated by what you choose to read, advocates say, are your health concerns, financial anxieties, sexual orientation and political leanings.”

And yet, the use of encrypted connections on news websites is just one part of a much larger and more complex issue.

How Much Information Are News Sites Collecting?

News organizations have long collected subscriber data, but more and more news sites are asking everyone who visits the site to sign in and create an account to access basic functionality like commenting. This means that news organizations are housing more and more of our personal data without clearly communicating how that data is being stored, secured and used. News organization privacy policies are as dense and impenetrable as other companies’ terms of service.

As analytics software develops, news organizations are collecting vast amounts of data, not just about what people read but how they read it—how fast, where they linger on the page, and so forth. There are good reasons for news organizations to measure that kind of engagement, but we should also engage our readers in a conversation about what data we are collecting, why we are collecting it, and how we are protecting personally identifiable information in the process.

Earlier this year, Kashmir Hill wrote about how the New Yorker had exposed its subscribers’ passwords and some credit card info through its subscription management software. Essentially all you needed was the info on a magazine’s mailing label to gain access to a person’s full New Yorker account. The New Yorker fixed the issue quickly, but the case is emblematic of an industry that is still adapting to new kinds of security threats.

In July, the Wall Street Journal’s computers were attacked and a hacker claimed to have possession of personal information for users of WSJ.com. At the time, the Journal said there was no evidence that customer data had been affected and noted that the same hacker had targeted other media organizations like Vice. Last year, computers at the Washington Post, New York Times, and Bloomberg News were all infiltrated by Chinese hackers.

At a time when trust in the media has once again dropped to historic lows, it is in news organizations’ best interests to be more transparent about how they collect and protect user data. In an era of data breaches and targeted hacking, strong and clear privacy policies that create a safe and secure place to read the news may become a competitive advantage.

Advertisers, Surveillance and Journalism

As an industry we need to address our increasing reliance on advertising tools that collect massive amounts of data about people who visit our sites. How much do we disclose about these third-party programs to our readers, and what is our responsibility when those tools are used against them?

For example, last December the Washington Post revealed that the NSA had piggybacked on one of Google’s cookies to track users and “pinpoint targets for hacking.” This Google cookie is present not just on Google’s websites but also anywhere a Google service or widget is embedded, including on many news organization websites. “This shows a link between the sort of tracking that’s done by Web sites for analytics and advertising and NSA exploitation activities,” Ed Felten, a computer scientist at Princeton University, told the Washington Post.

I installed the browser plug-in Ghostery, which tells you what trackers are active on the sites you visit, and went on a short tour of some news websites. Of the sites I visited, the Wall Street Journal topped the charts at 62 trackers. The Atlantic had 41. Forbes clocked in at 28. The New York Times had 26. Vox had 23. The Huffington Post had 19. The San Francisco Chronicle had 17. Yahoo News had 10.

When I go to the Washington Post to learn about gov data tracking, I’m hit by *fifty* commercial data trackers. pic.twitter.com/aRIIZ4ufSi

— dan sinker (@dansinker) June 7, 2013

It should be noted that not all of these trackers were from ads on the site. As noted in the Google example above, all kinds of services track users. In June, Jason Kint, the CEO of Digital Content Next, an online publishers association, wrote, “Facebook dropped a bomb on the industry with the announcement it will target ads based on the browsing histories of its users…Every page you visit with the ‘Like’ button sends data back to Facebook regardless of whether you ‘like’ it or not.”
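For readers curious how a tally like the ones above comes together, here is a minimal sketch in Python. It is not how Ghostery works (that tool matches requests against a curated list of known trackers). It only counts the distinct outside hosts a page contacts, which is a rough upper bound, since not every third-party host is a tracker. The function names, the two-label shortcut for deciding what counts as "the same site," and the example addresses are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def site_key(host):
    """Crude stand-in for a registrable domain: the last two labels of a host.

    Assumption: this is wrong for suffixes such as .co.uk; a real tool
    would consult the Public Suffix List instead.
    """
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def third_party_hosts(page_url, request_urls):
    """Return the sorted, de-duplicated hosts that belong to another site."""
    first_party = site_key(urlsplit(page_url).hostname or "")
    hosts = set()
    for url in request_urls:
        host = urlsplit(url).hostname
        # Skip malformed URLs and anything served by the news site itself.
        if host and site_key(host) != first_party:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical requests recorded while loading one article page.
requests_seen = [
    "https://static.example.com/app.js",       # same site, not counted
    "https://ads.tracker-one.test/pixel.gif",  # outside host
    "https://ads.tracker-one.test/sync",       # same outside host again
    "https://widgets.social.test/like.js",     # embedded "Like"-style widget
]
outside = third_party_hosts("https://news.example.com/story", requests_seen)
print(len(outside), outside)
```

Note that the embedded widget is counted even though no ad was loaded, which matches the point that ads are not the only source of tracking.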

The advertising industry has argued that this tracking software is essential to maintain and expand ad revenue by presenting readers with more personal and relevant ads. Given the financial challenges many news organizations have faced in the last five years, it is unlikely that we’ll see news organizations abandoning targeted ads wholesale.

“Once we’ve assumed that advertising is the default model to support the Internet, the next step is obvious: We need more data so we can make our targeted ads appear to be more effective,” writes Ethan Zuckerman in the Atlantic. “So we build businesses that promise investors that advertising will be more invasive, ubiquitous, and targeted and that we will collect more data about our users and their behavior.” Zuckerman calls advertising the original sin of the Internet.

But at least the original original sin brought with it new knowledge. In contrast, much of how online ads work, and the surveillance they enable, remains hidden from view. For Zuckerman, the best solution is to pay for the services we use and “abandon those that are free, but sell us—the users and our attention—as the product.” I think that is likely part of the answer, but when it comes to access to news and information I don’t think people should have to pay for privacy.

In his response to Zuckerman, journalism professor Jeff Jarvis acknowledges that the system as currently structured is broken but argues that we shouldn’t give up on advertising. News organizations, writes Jarvis, need to restructure their business model as a service to readers and community, built on the pillars of transparency, accountability and user control. I’d take this idea a step further and argue that we need journalists to actually advocate for readers’ privacy (just as we need the public to advocate for press freedom and journalists’ rights).

It’s Time For Newsrooms to Lead

Journalism has long claimed to serve the public interest. In the digital age, part of that service should be standing up for its users and pushing the ad industry to strike a better balance between privacy and tracking. We don’t have to abandon advertising, but as journalists and news organizations we should be forceful advocates for better advertising systems that give people more control over how their data is used.

News organizations could also help educate readers by more actively informing people about how the ads on their sites function and what steps users can take to protect themselves. See for example how sites in the U.K. have had to adapt since a law prohibited tracking without consent. This kind of active digital literacy, explicitly notifying and educating users, goes beyond passive transparency (i.e., posting a notice in your privacy policy).

Some in media, however, are going in the opposite direction. Yahoo, for example, recently announced it would no longer honor people’s use of “Do Not Track”—a privacy tool built into browsers. (Yahoo News is regularly ranked the most popular news website by traffic numbers.) Search Engine Land explains:

The Do Not Track browser setting (also referred to as Tracking Preference Expression) allows users to send a signal to websites that they don’t want to be tracked or have their information passed along to entities like analytics and advertising networks with a header request. However, websites and advertisers can choose to ignore Do Not Track requests without penalty.
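To make the mechanics concrete: a browser with Do Not Track switched on adds the header `DNT: 1` to each request, and it is then entirely up to the site whether to act on it. The sketch below shows what honoring the signal could look like on a news site's server. It is an illustration under stated assumptions, not any publisher's actual code; the function names and script file names are hypothetical.

```python
def wants_no_tracking(headers):
    """Return True when the request carries the Do Not Track signal (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") == "1"


def scripts_to_serve(headers, essential, trackers):
    """Hypothetical policy: leave tracker scripts out for readers who sent DNT: 1."""
    if wants_no_tracking(headers):
        return list(essential)
    return list(essential) + list(trackers)


# A reader with Do Not Track enabled gets only the scripts the page needs.
print(scripts_to_serve({"DNT": "1"}, ["comments.js"], ["ad-network.js"]))
# A reader who sent no signal gets both.
print(scripts_to_serve({}, ["comments.js"], ["ad-network.js"]))
```

The check itself is a single line. Whether a site honors Do Not Track is a policy decision, not a technical hurdle.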

As I have written before, tools like Do Not Track are useful but limited. This past July, ProPublica and Mashable reported on a new tracking tool that is nearly impossible to block. According to the report, this new, “extremely persistent” online tracking technology (called canvas fingerprinting) was found on “thousands of top websites, from WhiteHouse.gov to YouPorn.com.” The source of this canvas fingerprinting was AddThis, a social sharing widget used by many news and media sites. AddThis lists ABCNews, DailyMotion, The Today Show, and financial website The Motley Fool as clients (as well as 14 million other websites).

As part of its report, ProPublica let users see how “your browser generates a unique fingerprint image,” offered a sidebar with six tips for how to try to block fingerprint collection, and provided a link to other tools readers can use to protect themselves. Finally, ProPublica’s privacy policy highlights its commitment to user privacy and security in clear and easy to understand language. Similarly, when The Intercept launched earlier this year, staff there went into great detail about the steps they had taken to invest in secure tools and protections for their readers as well as their journalists.

I’d like to see more news sites educate and advocate around these issues. But as a starting place, the industry has to at least acknowledge its own role in this debate over privacy and security in a digital age. The Society of Professional Journalists just revised its code of ethics. The Online News Association has launched a DIY code of ethics project. Poynter is investigating the intersection of algorithms and ethics.

We should also consider these questions about data collection and reader privacy in the context of journalism ethics.

Since the revelations brought about by Edward Snowden’s leak of NSA documents, there has been renewed attention and debate about journalists’ security and their ability to protect sensitive reporting materials and sources. Those press freedom issues are critical, as governments around the world crack down on leakers and threaten journalists.

However, our readers and communities are also stakeholders in this debate, and they have largely been left out of it. It’s time for that to change.

This post originally appeared on Medium and was reprinted with permission.

Photo via Brenda Starr/Flickr (CC BY 2.0)

News sites need to open up about their privacy policies and data usage habits.
