
What the Google-Genius Copyright Dispute Is Really About

Reported by WIRED:

The antitrust pitchforks are out for big tech. First came the European Union, then Washington, DC. Not to be left out, now comes hip hop lyrics.

Over the weekend, the music annotation site Genius publicly accused search juggernaut Google of stealing its crowdsourced song transcripts and natively publishing them on its search pages in knowledge panels Google calls its “One Box.” Doing so, Genius alleges, hurts Genius’ bottom line by diverting traffic away from Genius and keeping people on Google’s monetized search page. As Genius sees it, this is an example not just of lyric lifting but of Google using its scale to unfairly home in on a smaller competitor’s territory, which experts say could be an antitrust matter. Google strongly denies all of it, blaming a contractor for any similarity between its lyrics and Genius’.

How could Genius be so sure the lyrics on Google came from its community? Its engineers employed a clever trick, as the Wall Street Journal reported: They boobytrapped a selection of their transcripts, secretly embedding a watermark in order to track who copied their lyrics across the web.

Genius first used the trick on an occasional basis in 2016, when it began to worry that Google was copying its lyrics and sent the company a letter asking it to stop. Last year it ramped up into a systematic approach: Genius engineers alternated a pattern of straight and curly apostrophes in the transcripts that, in Morse code, reads out the phrase “red handed.” Genius sent 100 examples of transcripts it says it found on Google with the watermark to Journal reporters, who verified that the secret code was present in three songs randomly picked from that bunch.
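As a rough illustration of how such a watermark could work, here is a minimal Python sketch. It is not Genius’ actual implementation, and the mapping is an assumption: a straight apostrophe stands for a Morse dot and a curly apostrophe for a dash. The function names are hypothetical. Because two symbols cannot carry the gaps between Morse letters, the check compares the apostrophe sequence against the expected signature rather than decoding it back to text.

```python
# Morse codes for just the letters in "red handed".
MORSE = {"A": ".-", "D": "-..", "E": ".", "H": "....", "N": "-.", "R": ".-."}
STRAIGHT = "'"       # assumed to stand for a dot
CURLY = "\u2019"     # assumed to stand for a dash
APOSTROPHES = (STRAIGHT, CURLY)


def morse_signal(message):
    """Concatenate the Morse code of each letter, dropping spaces and letter gaps."""
    return "".join(MORSE[ch] for ch in message.upper() if ch != " ")


def embed(lyrics, message="red handed"):
    """Rewrite the first apostrophes in the lyrics so they spell the signal."""
    signal = morse_signal(message)
    out, used = [], 0
    for ch in lyrics:
        if ch in APOSTROPHES and used < len(signal):
            out.append(STRAIGHT if signal[used] == "." else CURLY)
            used += 1
        else:
            out.append(ch)
    if used < len(signal):
        raise ValueError("not enough apostrophes to carry the watermark")
    return "".join(out)


def extract(text):
    """Read the apostrophes in order as a string of dots and dashes."""
    return "".join("." if ch == STRAIGHT else "-" for ch in text if ch in APOSTROPHES)


def is_watermarked(text, message="red handed"):
    """True if the text's apostrophe sequence begins with the expected signal."""
    return extract(text).startswith(morse_signal(message))
```

Under these assumptions the signal for “red handed” is 22 symbols long, so a transcript needs at least 22 apostrophes to carry it. Converting every apostrophe to a straight one, as the article says later happened in Google’s results, erases the signal without changing any words.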

But the day after the story published, Genius noticed something: The watermark had disappeared on most of those lyrics now on Google. Now, for most of the 100 songs that Genius had sent the Journal, all the apostrophes are straight in Google results. Had Google tried to scrub the evidence of its pilfering? That’s how it seems to Genius. “Google has known exactly what is going on for two years,” says Ben Gross, Genius’ chief strategy officer. “Now that the issue is public, they are apparently removing evidence of their behavior without addressing the underlying problem. Google is still displaying lyrics copied from Genius.”

The engineering team at Genius has been keeping track of what appears on Google lyrics One Boxes since last October, scraping and caching hundreds of Google song lyrics results every day. So they went and looked back at the daily caches to see when the watermark disappeared. They found that the watermark had been present on all the sample lyrics until June 12, and then it disappeared on June 13. WIRED examined the HTML of a random selection of these cached pages, and they do appear to show the watermark present until June 12. Though the WSJ story published on June 16, Genius says it had been in contact with WSJ reporters before June 12, raising the possibility that the watermark was scrubbed after journalists began reaching out for comment.
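The look-back through daily caches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the snapshots are held in a dictionary keyed by date, the function name is hypothetical, and the presence of any curly apostrophe is used as a crude stand-in for the full watermark check.

```python
from datetime import date

CURLY = "\u2019"


def last_day_with_curly(snapshots):
    """Return the latest date whose cached text still has a curly apostrophe, or None."""
    days = [day for day, text in snapshots.items() if CURLY in text]
    return max(days) if days else None


# Made-up snapshot text for one song, one entry per daily cache.
snapshots = {
    date(2019, 6, 11): "I don\u2019t know",
    date(2019, 6, 12): "I don\u2019t know",
    date(2019, 6, 13): "I don't know",
}
last_seen = last_day_with_curly(snapshots)
```

Run over every tracked song, a check like this would show whether the change landed on the same day across the board, which is what Genius reports finding.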

When reached for comment, Google denied making any changes. A spokesperson for Google insists the company doesn’t create any lyric transcripts itself or scrape any websites for lyrics, relying instead on multiple third-party providers to source song lyrics for its knowledge boxes. It pointed WIRED to Canada-based lyric transcription service LyricFind, which on Monday publicly took the blame for the Genius watermark showing up on Google Search pages (while disputing the framing of most of the reporting on the issue).

“It’s basically indisputable that this Google contractor LyricFind was just copying their lyrics from Genius,” says John Bergmayer, senior counsel at the nonprofit Public Knowledge, who has worked on numerous antitrust issues involving Google and has been watching the Genius allegations closely. LyricFind says it previously used Genius as a legitimate “reference” for its transcriptions, as it did many other sources, and is now reassessing that practice.

When asked directly if LyricFind had gone through and scrubbed the secret Morse code from the lyrics that it was providing to Google, LyricFind CEO Darryl Ballantyne did not confirm or deny doing so, saying that he believed he had answered that question in the company’s public blog post from Monday. “Our team is currently investigating the content in our database and removing any lyrics that seem to have originated from Genius,” the company wrote.

So is the watermark disappearing proof that someone was trying to cover their tracks, as Genius suggests, or that LyricFind was actually removing Genius-sourced lyrics from its database, as its CEO seems to be suggesting? It’s unclear. Genius says it hasn’t gone through to see if all the Google lyric results it was tracking now differ from Genius transcripts in any way beyond the apostrophes, though the engineering team says that, of the handful they have carefully checked, the only change they can find is those apostrophes. If the lyrics were now being sourced from somewhere else, they presume there would be other differences, like typos or dissimilar words or punctuation.
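The comparison Genius’ engineers describe, checking whether two transcripts differ in anything beyond the apostrophes, could be automated roughly as follows. This is a sketch using Python’s standard difflib module, not the team’s actual tooling; the function names are hypothetical.

```python
import difflib


def normalize(text):
    """Replace curly apostrophes with straight ones so only other changes remain."""
    return text.replace("\u2019", "'")


def differences_beyond_apostrophes(ours, theirs):
    """List the line-level changes that remain once apostrophe style is ignored."""
    a = normalize(ours).splitlines()
    b = normalize(theirs).splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

An empty result means the two texts are identical apart from apostrophe style. Any typo, changed word or other punctuation difference would show up as an entry, which is the kind of evidence Genius presumes a genuinely different source would leave.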

One thing that some news stories have missed about Genius’ allegations is that Google is far from alone in surfacing lyrics that may have originated from Genius. Microsoft Bing and Amazon Music also appear to have Genius-watermarked lyrics. Genius would not comment on other sites’ apparent use of its transcripts. Keeping the focus on Google may be a way to emphasize Google’s market power and thereby its anticompetitiveness, Bergmayer suggests.

Interestingly, though Google’s results are now mostly clean of the watermark, some of those other sites are still displaying the same LyricFind-sourced lyrics bearing Genius’ watermark. At the time of writing on Tuesday, Bing, for instance, was surfacing lyrics for the song “Not Today” by Alessia Cara with the curly/straight code clearly visible. That doesn’t mean Google or LyricFind necessarily changed the Google results and not the others—it could be that Microsoft just updates its feed from LyricFind less frequently, for one thing. (Microsoft has not responded to a request for comment.)

And the watermark disappearing on Google pages doesn’t change the fact that Genius appears to be right: Its lyrics seem to have been copied and pasted all over the web. But the thing is, as icky as that is, it’s not illegal. Genius doesn’t hold the copyright to these transcripts. The publishers and songwriters do. No matter how much work Genius or its community puts into compiling the lyrics into text, the song lyrics still don’t belong to them. Rather, they license them and print them with permission.

Both Genius and Google hold a license from the music publishers to print song lyrics. And because the publishers don’t have a canonical database of lyrics for licensees to plug into, every license holder is left to cobble the text together however they can. If that means copying and pasting from one another, well, that’s fair game from a copyright perspective, according to Bergmayer. The publishers themselves could even copy Genius’ lyrics and give them to other licensees if they wanted to. “So it’s a very awkward situation for Genius,” Bergmayer says, “given the fact that both Google and Genius have licensing rights from the rights holder.”

It’s even more awkward when you remember that at Genius’ inception it had no licensing agreement with the publishers at all and took tons of heat for building a site that transcribed and annotated songs it had no right to, without sharing any revenue with the creators. Genius underscores that it has grown up a lot since then. A spokesperson says it now works closely with the industry to ensure songwriters make money when Genius makes money. One of its fiercest early critics, songwriter David Lowery, came out in support of Genius this week.

Even if Genius has no copyright claim here, Google or its contractors copying from Genius still might be unfair from a competition standpoint. “It’s still potentially an antitrust problem if Google is using its search monopoly to enter some unrelated market and tie that product to the search engine in a way that gives it a huge advantage over competitors,” Bergmayer says.

That’s the real question, and one which applies not just to lyrics and Genius but to all information that appears in Google One Boxes. Is Google entering an unrelated market (i.e., music lyric transcription) by presenting lyrics on its pages, or is it just improving its own search results? It’s not always easy to tell. When you search Google for the answer to a math question, and the search engine completes the equation for you rather than surfacing a calculator, is that anticompetitive with calculator sites? Bergmayer argues no; that’s just improving Google’s product to make it work the way people want it to.

When it comes to things like reviews sites or travel booking, the anticompetitiveness argument is easier to make, as the likes of Yelp and the travel industry repeatedly have. When Google not only surfaces factual information like a place name or a flight time but actually allows you to post a review or book a hotel or buy something, it’s leaving its bread-and-butter search behind and doing something new. That’s the kind of behavior that got it fined by the EU for prioritizing its own products over possibly superior ones from competitors.

Some see the whole trend toward One Boxes as part of Google’s focus on keeping people within its ecosystem: sending people to its products but also just keeping them on search pages filled with Google ads. Genius says traffic from Google to its site has dropped since Google began surfacing lyrics on its search results pages. The harm there is clear: Whether those lyrics are taken from Genius or not, when Google doesn’t send people over to Genius, Genius loses out on the chance to get people more involved in its community and to sell ads against its traffic numbers. This is true for sites like Wikipedia as much as it is for Genius.

In the end, this could even hurt Google. At its most basic, Google is a repository, be it of links or of actual knowledge, and it depends on knowledge-creating sites for that data. If Google imperils the ability of those sites to make money, Google imperils itself. “It’s like, are you eating your own seed corn?” Bergmayer says. “If Google is a good product because of all this information that is out there on the web, then you want to make sure you’re not inadvertently destroying the vibrancy of the web.” Google has always been better at organizing the web’s information—not cannibalizing it.
