Google has no right to read the news
Robert Thomson, editor of the Wall Street Journal, recently described companies that aggregate mainstream media content without paying a fee as the “parasites or tech tapeworms in the intestines of the internet.” Thomson was referring primarily to Google News, the largest aggregator of mainstream news content. Alexander Macgillivray, senior product and intellectual property counsel at Google, responded that Google only shows “snippets and links under the doctrine of fair use enshrined in the United States Copyright Act.” These snippets are generally the headline followed by the lead, the first sentence of the story.
Ask any journalist what the most important section of text in a news story is and they will tell you it is ‘the lead’. Following the inverted pyramid style of news writing, the lead signposts almost all the information that will follow and allows the old-school reader to quickly scan the newspaper and decide whether to continue reading any particular article. As the old-school reader scans the newspaper, jumping between various headlines and leads, his eyes hover past advertisements, and the advertised products can successfully catch the attention of this old-school reader and consumer. The consumer then pays for a product, the product pays the newspaper and the newspaper pays the journalist. This is why news has financial value.
As the new-school reader scans Google News, jumping between various headlines and leads, his eyes could also hover past advertisements and these products can successfully attract the attention of the new-school consumer. The consumer then pays for a product, the product pays Google and somebody else can somehow pay the journalist. This is, in a nutshell, why newspapers are going broke despite more people reading the news than ever before. Newspapers no longer control the distribution of news content; Google does. Google might not advertise on Google News yet, and Google insists it isn’t violating any copyright, but the company seems to overlook the innate purpose of copyright. Copyright exists to provide a financial incentive for the creation of original content. Google News uses original news content created by journalists but does not provide any financial incentive for its creation. Google News generates an incredible income from that all-important lead the journalist composed, just as the newspapers of yesteryear did, but Google doesn’t pay anyone for it.
Marissa Mayer, Google’s vice president of search products and user experience, recently spoke to a United States Senate hearing that was investigating ways the government could aid the failing newspaper industry. Mayer explained how Google News acted as a sort of conduit that channeled web traffic to the newspaper sites:
Google News and Google search provide a valuable free service to online newspapers specifically by sending interested readers to their sites at a rate of more than 1 billion clicks per month. Newspapers use that web traffic to increase their traffic and generate additional revenue.
Google functions similarly to newspapers in that it channels information to end-users. The information that Google channels is the links or ‘directions’ to the entire content. Mayer describes how Google News uses the headline and lead to direct the reader to the newspaper’s website:
We show people just enough information to invite them to read more – the headline, a line or two of text, and a link to the news publisher’s website.
Bill Grueskin, academic dean of the School of Journalism at Columbia University, says that may be true, “but many web readers are entirely satisfied with just a headline and summary.” It is the very nature of news. Report the basic facts. That is all most readers want from the news. The problem is, who is going to pay the journalists to go out and report and record these headlines if Google doesn’t?
Google claims that its use of the leads, copied verbatim from the original content creators, is fair use under the United States Copyright Act. The Copyright Act sets out four factors for courts to weigh in deciding whether a use is fair: the purpose and character of the use, including whether it is transformative rather than merely copying; the nature of the copyrighted work, such as whether it is factual or fiction; the amount and substantiality of the portion copied; and the effect of the copying on the market or potential market for the work. All of these factors are relatively difficult to judge with regard to the copying of news story leads for the purpose of aggregation, but news publishers are beginning to believe a judgment must be made soon in order for the industry to move forward.
With recent slides in profit for most news publishers, in particular News Corp, which posted a 47 percent drop in operating profit on May 6, publishers are beginning to complain that online aggregation is having a dangerously adverse effect on the market. In April, Rupert Murdoch said that Google was stealing the news:
The question is, should we be allowing Google to steal all our copyright… People reading the news for free on the web, that’s got to change.
The Associated Press, which currently has a licensing agreement with Google that is due to expire this year, has also lashed out at online aggregation, with AP chairman Dean Singleton saying:
We can no longer stand by and watch others walk off with our work under misguided legal theories. We are as mad as hell, and we are not going to take it anymore.
The primary issue is what constitutes fair use. Does the use of the headline and lead of a news story, what most journalists would consider the heart of the story, amount to an insubstantial part?
In Australia, the recent ruling in the IceTV v Nine Network case provides guidance on how copyright law might apply to the aggregation and linking of news stories.
IceTV produces electronic television program guides that paying users subscribe to. The Nine Network produces a Weekly Schedule of programs to be broadcast on their free-to-air television stations and supplies the schedules to various third parties known as “Aggregators” who in turn collate this information with schedules provided by other networks to produce the “Aggregated Guides” which form the basis of most TV guides published across different media. IceTV was not one of these third parties.
The Nine Network accused IceTV of stealing their original “literary work” by taking part of the time and title from the Aggregated Guides. According to the Nine Network, the Ice Guide published by IceTV infringed copyright by reproducing a “substantial part” of the Weekly Schedule produced by the Nine Network. IceTV accepted that the Weekly Schedule produced by the Nine Network was an original literary work and that copyright subsisted in it but they denied that they reproduced a substantial part.
It is interesting to note that the only things IceTV was reproducing were facts. IceTV was copying the name of a program, such as A Current Affair, and the time it was scheduled to appear. Facts are generally considered to be outside of copyright, but Section 42 of the Copyright Act covers fair dealing with a copyrighted work only where it is for the purpose of reporting the news. If the Sydney Morning Herald were to publish an article about a controversial program that was scheduled to air, and detailed the time it was to air, it would be covered under “fair dealing for the purpose of the news.” But IceTV isn’t a news publisher, and this consideration of fair dealing is murky for the production of guides and aggregators.
These facts weren’t necessarily facts, as the TV programs hadn’t been broadcast yet. It could be the equivalent of reproducing some modern day Nostradamus’ predictions that in 2011 Eddie McGuire will grow a beard and the world will end. It might actually happen, it could be a potential fact, but reproducing the original literary work would be an infringement of copyright. That modern Nostradamus put a lot of time and effort into producing that literary work, just as the Nine Network put a lot of time and money into producing their Weekly Schedule. The Nine Network’s argument, however, was an extraordinarily dangerous one and could have led to all sorts of seemingly innocent information being considered off-limits due to copyright infringement. Sporting schedules, flight schedules, train timetables and university semester dates would all be protected under copyright if the Nine Network’s case had been successful. But IceTV didn’t argue that they were reporting facts; they argued that what they reproduced was an insubstantial amount of Nine’s literary work.
IceTV reproduced only the title and the time of the program. IceTV did not reproduce the synopses of the television programs but wrote their own. The information they did reproduce was in essence a set of links. The title of the program indicated what the program was and the time indicated when it could be seen. These would function the same way in Google News as the headline of a news story and the accompanying URL that directs the reader to the original content. The lead that is reproduced, however, summarises the news story and could infringe copyright; it would be the equivalent of IceTV reproducing the synopses from the Aggregated Guides.
In the primary judgment of the IceTV case the judge ruled that the “lengthy preparatory work involved [in the Weekly Schedule] was directed to the conduct of the business of Nine in broadcasting programmes that would attract viewers”. The judgment emphasized that the appropriation of skill and labour does not necessarily constitute infringement. In the final judgment the judge stated it was not helpful to refer to the University of London Press Ltd v University Tutorial Press Ltd case and “the rough practical test that what is worth copying is prima facie worth protecting.” Similarly, the judgment stated that it was unhelpful to refer to “commercial value of the information because that directs attention to the information itself rather than to the particular form of expression”, and it is the particular form of expression that is significant.
The judgment ruled against the Nine Network stating that the information reproduced by IceTV was insubstantial because the “slivers of information” and their particular form of expression were not particularly original. The judgment referenced the Ladbroke case that showed that the reproduction of an unoriginal part of an original whole would not be an infringement:
The reproduction of a part, which by itself has no originality, will not normally be a substantial part of the copyright and will therefore not be protected.
The headlines and leads that are produced in the news, however, are original parts of an original whole, even though in reporting the news they are hopefully relaying plain facts. In essence, if news publishers were to pursue Google for copyright infringement regarding the republishing of their headlines and leads, they would have a case. This was demonstrated in 2006 when a Belgian court ruled that Google violated the law by publishing copyrighted content on Google News from Copiepresse, a group of 18 French and German language publications. Google argued that the reproduction of original content was to the benefit of the news publishers:
We believe that Google News is entirely legal. We only ever show the headlines and a few snippets of text and small thumbnail images. If people want to read the entire story they have to click through to the newspaper’s website. Search tools such as Google Web Search and Google News are of real benefit to publishers because they drive valuable traffic to their websites and connect them to a wider global audience.
Google was initially threatened with a €1 million fine per day of infringement but the fine was dropped to €25,000 a day. Google stated that any news publisher can ask Google to remove its content at any time, or can design its news website to prevent Google from having access to it. Copiepresse argued that it was not their responsibility to prevent Google from infringing on their copyright and that Google should request to use their content rather than the onus being on the publisher to request Google not to use their content. After the judgment in the Belgian court, Google removed the Copiepresse content from both Google News and Google search. Copiepresse felt that the removal of their content from Google search was quite extreme. Margaret Boribon, Secretary General of Copiepresse, said:
They have done it to punish us. They have a bad attitude.
This is probably what prevents most news publishers from pursuing Google for copyright infringement; they do not want to be removed from Google Search because disappearing from Google means disappearing from the web. Google has over a 75 percent share of the search engine market. It monopolises the internet. News publishers still want to be a part of Google; they just want Google to pay licensing fees for using their material. As Boribon said:
Yes we have a problem with Google, but we don’t want to be out of Google. We want Google to respect the rules. If Google wanted to index us they need to ask.
Google refuses to pay licensing fees to most publishers. Conversely, the Nine Network refused to license their material to IceTV, preventing IceTV from having any access to their material. It is an interesting contrast between the power of monopolies and one of the few encouraged forms of monopoly, that of copyright. Boribon hoped that the success of the Copiepresse case would encourage other news publishers to follow suit:
What I’m achieving now is getting information to my European colleagues so we will have other publishers taking part in the court case. Then maybe Google will change its mind. If they see this is not a Belgian case but a concern for all publishers all over the world, they will have to review their business model.
Boribon is hoping that if enough news publishers withdraw their content Google will be forced to negotiate. Yet if all the publishers were to form some sort of cartel and remove their content from Google they would be breaking most countries’ anti-competition laws. A few major publishers need to lead the way, and from the murmurs currently coming from News Corp and the Associated Press this could be happening in the near future.
But it isn’t simply a case of new media versus old media. There was a recent uproar among independent bloggers about the aggregation of their material. The bloggers were concerned not only with the copyright infringement but also with the infringement of their moral rights. It occurred when bloggers Matt Haughey and Joshua Schachter complained about the reproduction of their content on the Wall Street Journal’s subsidiary site All Things Digital. Matt Haughey wrote on his blog:
This is weird, apparently the Wall Street Journal’s All Things D does a reblogging thing. I sure wish they asked me first though. That’s a hell of a lot of ads on my ‘excerpt’. If they’re just trying to drive traffic to articles, why have comments on excerpts? That makes no sense to me.
Blogger Danny Sullivan, from SearchEngineLand.com, justified the aggregation saying “that is a compliment, allthingsd liked your article enough to feature you on their homepage.” This is similar to Google’s argument that the aggregation of their material is a benefit to the news publishers because it directs traffic to the publisher’s site. Web veteran and independent blogger Merlin Mann disagreed that this was fair compensation:
In the case here, for Matt and Josh, that compensation was “a link” and — what? — I guess the opportunity to pretend that you write for a giant-for-profit corporation. And because, as the story goes, every blogger writes primarily (or even exclusively) in order to generate page views that bolster his site’s advertising revenue, they/we/I should all be grateful for the largesse of our True Fourth Estate. Even if a giant for-profit corporation’s re-use of that work actually undermines the real motivations, it would be uncivil, ungrateful, and untoward for us to not thank them for helping us out with our little projects. Right?
All Things Digital published a substantial excerpt of the original blog post in a format indistinguishable from their own original content, including bylines and author photos, making it appear that the content was written for All Things Digital. Bloggers felt this compromised their independence by implying they were associated with All Things Digital. Daring Fireball blogger John Gruber, who is often featured on All Things Digital, was uneasy about the association:
I don’t like it. The look and feel of the Voices pages suggests that I’m somehow affiliated with AllThingsD, but I am not. I obviously don’t have any problem with AllThingsD, or anyone else, linking to and quoting portions from my articles at Daring Fireball, but the presentation on their Voices pages seems to imply something else.
These are bloggers that are no strangers to aggregation, and most reached success through aggregators such as Digg or Reddit, but these sites are purely aggregators, like Google News, and not publishers. All Things Digital’s aggregation of their material blurred the distinction between publishing their material and simply linking to their material. As Merlin Mann noted:
Republishing online work without consent and wrapping it in ads is often called ‘feed scraping.’ At AllThingsD, it’s called ‘a compliment.’
Most independent bloggers license their material under Creative Commons licences, generally with the permission to share and remix the content under the conditions that the work is attributed fairly and used for non-commercial purposes. The primary problem that the bloggers had with All Things Digital’s aggregation of their content was that it didn’t comply with those two conditions. Under the Attribution condition the republisher must “attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).” As demonstrated, the style of All Things Digital’s attribution gave the impression that the blogger endorsed the website, so it did not comply with this condition. The Noncommercial condition states that the republisher “may not use this work for commercial purposes.” All Things Digital also failed to comply with this condition by wrapping each post in ads.
The primary concern for the bloggers, however, was that it violated their moral rights and compromised their independence. As Merlin Mann succinctly put it:
Nobody but me is allowed to decide why I make things. And — if and when I choose to give away the things that I make — nobody but me is allowed to define how or where I’ll do it. I am independent.
The independence of these blogs is vital to their success; the bloggers are respected for their individual opinions and analysis and they make their living generally through speaking engagements and. If their independence is threatened, their income is threatened. These blogs may be thriving on the web and the bloggers may be critical of the “giant-for-profit corporations” and “the True Fourth Estate.” They may believe that the failure of newspapers to survive the Internet is simply due to old media “not getting it”, but blogging is not reporting the news.
Bloggers may provide opinions and analysis but most bloggers don’t have the resources to investigate and break stories on an hourly basis. Bloggers and Google, who have the web at their fingertips, might believe that the news is simply out there, only one click away. Bloggers might believe that the headlines that keep filling up Google News and are the source for most of their stories are like a never-ending well. But the well might dry up. Sometimes all we need is a headline and a lead, and if Google News doesn’t value this original content, doesn’t pay for it, then the news will stop coming. This is why we have copyright. It isn’t to hinder the dissemination of knowledge, as the open source community might have you believe; it is to ensure that people are encouraged to continue to create. Google has no right to read the news and not pay for it.