Google’s Stealing Content, and Genius Has Had Enough

I was recently trying to search for some information on two different skateboards. I Googled for a comparison, but all I could find was an article about the introduction of one of the new decks I was comparing… written by me. But I didn’t see this as a regular search result. I saw it as information scraped from my page and displayed in a text box on Google’s page. Now, my post didn’t have the information I was looking for (obviously), but it was the most relevant result. Let’s say my page did have what you were looking for, and Google displayed it on their page. Let’s say the segment contained all the information you needed, so you never went to my site. Google would get the ad revenue that I should have gotten, all because they copied my work. Now I make nothing, and Google profits from my work without my permission.

Not fair, right? Now, usually these snippets draw the reader into a page like mine. Usually you can’t just find a few sentences that give a person the entire perspective that a full review or news article (like those you’d find on Leaf and Core) offers. Usually, Google’s system of skipping to the relevant information benefits a site like mine.

But what if it doesn’t? What if they just scrape all of your content and display it in the Google result? What if no one ever goes directly to your site due to this scraping of all of your content?

That’s where lyrics platform Genius finds itself. They claim Google has been scraping their lyrics and displaying them in search results in full, stealing the money they could otherwise make from selling products or ads. Now they’re suing Google and LyricFind over that scraping, seeking $50 million.

What Google and LyricFind Did

Rather than obtaining lyrics through legal means, Google, through their partnership with LyricFind, copied lyrics directly from Genius through scraping. This is the equivalent of simply copying all the content out of the page and pasting it into results. If you’ve taken a class that involved writing papers, you know this is a big no-no. Genius was able to prove that Google had been stealing their copies of the lyrics by alternating apostrophe styles programmatically. This unique apostrophe structure, which a human wouldn’t notice but computers could differentiate, proves that Google was copying the lyrics verbatim, doing no extra processing on the inputs.
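To make the apostrophe trick concrete, here is a minimal sketch of how such a watermark could work. This is not Genius’ actual code: the bit pattern and the function names `embed_watermark`, `extract_watermark`, and `carries_watermark` are all hypothetical, and the sketch assumes the only two apostrophe styles in play are the straight `'` and the curly `’`.

```python
STRAIGHT = "'"       # plain ASCII apostrophe
CURLY = "\u2019"     # typographic (curly) apostrophe

def embed_watermark(lyrics: str, bits: str) -> str:
    """Rewrite every apostrophe as curly ('1') or straight ('0'),
    cycling through the hypothetical bit pattern `bits`."""
    out = []
    index = 0
    for ch in lyrics:
        if ch in (STRAIGHT, CURLY):
            out.append(CURLY if bits[index % len(bits)] == "1" else STRAIGHT)
            index += 1
        else:
            out.append(ch)
    return "".join(out)

def extract_watermark(text: str) -> str:
    """Read the apostrophes back out as a string of bits."""
    return "".join(
        "1" if ch == CURLY else "0"
        for ch in text
        if ch in (STRAIGHT, CURLY)
    )

def carries_watermark(text: str, bits: str) -> bool:
    """True if the apostrophes in `text` follow the repeating pattern."""
    found = extract_watermark(text)
    if not found:
        return False
    expected = (bits * (len(found) // len(bits) + 1))[: len(found)]
    return found == expected

original = "I can't say I don't know why it's over, 'cause we're done"
marked = embed_watermark(original, "1001")
print(carries_watermark(marked, "1001"))    # the marked copy matches
print(carries_watermark(original, "1001"))  # the unmarked copy does not
```

A human reader sees the same lyrics either way, but a verbatim copy carries the pattern along with it. With only a handful of apostrophes a match could be coincidence, so a real check would want a long pattern across many songs.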

Normally, this would be very illegal. However, it lands itself in a new legal gray area. As it turns out, Google has the rights to share the lyrics from the record labels that hold the copyrights to these songs. However, they got the lyrics through Genius, not through the record labels. That’s where it became theft.

Theft? Seriously?

Yes, seriously. For those of you who haven’t worked in tech, specifically music tech, let me do some broad-stroke explaining. When putting together a catalog of songs, there’s a lot of information you get from publishers, called metadata. This includes song length, artist, album, song notes, copyright information, attributions, and lyrics. Each company has a slightly different process, and formats the information slightly differently. Therefore, you have what’s often called an “ingestion process.” This pulls this data in from many sources, reformats it depending on where it came from, and turns the metadata into database entries that match the format of your company. It’s what makes using a product like Genius seamless.
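For a rough idea of what an ingestion step looks like, here is a small sketch under stated assumptions. The two publisher formats, their field names, and the `Track` schema are all invented for illustration. Real feeds are far messier, and nothing here reflects how Genius actually does it.

```python
from dataclasses import dataclass

@dataclass
class Track:
    """The single in-house format every source gets converted to."""
    title: str
    artist: str
    length_seconds: int
    lyrics: str

def parse_duration(value) -> int:
    """Accept either a number of seconds or an 'M:SS' string."""
    if isinstance(value, int):
        return value
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)

def from_publisher_a(record: dict) -> Track:
    # Hypothetical publisher A: flat fields, length as 'M:SS', lyrics as one string.
    return Track(
        title=record["title"].strip(),
        artist=record["artist"].strip(),
        length_seconds=parse_duration(record["length"]),
        lyrics=record["lyrics"].strip(),
    )

def from_publisher_b(record: dict) -> Track:
    # Hypothetical publisher B: different names, milliseconds, lyrics as a list of lines.
    return Track(
        title=record["TrackName"].strip(),
        artist=record["Performer"].strip(),
        length_seconds=record["DurationMs"] // 1000,
        lyrics="\n".join(line.strip() for line in record["LyricLines"]),
    )

# One adapter per source; adding a publisher means adding an entry here.
ADAPTERS = {"publisher_a": from_publisher_a, "publisher_b": from_publisher_b}

def ingest(source: str, record: dict) -> Track:
    """Convert a raw record from a known source into the in-house format."""
    return ADAPTERS[source](record)

track = ingest(
    "publisher_b",
    {"TrackName": "Song", "Performer": "Band", "DurationMs": 205000,
     "LyricLines": ["la la", "la la la"]},
)
print(track)
```

Every new source means another adapter, and every adapter breaks whenever the publisher changes its format. That upkeep is where a lot of the cost comes from.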

Ingestion is an expensive process with a lot of upkeep. Large teams of people work to translate gigantic streams of terabytes of data daily. There’s more going on in music than you realize. Figuring out what parts of the song are verses, choruses, interludes, rap segments, and more, can be difficult, especially if any of that information isn’t included for one reason or another. Often it requires a human touch.

Getting the picture? This is a lot of work. It’s a lot of processor work. It takes many employees creating code to run on many servers along with many people reading over those results to perfect them.

If I were to equate it to anything, it would be like writing an essay, or perhaps a blog post like this one. You take many sources, perhaps 20 sources, do a lot of hard work, and put it together into one item, one source. Now imagine if, every day you went to school with your essay, some bigger kid came up, stole the money in your pocket, and copied your essay verbatim. You wouldn’t be too happy, would you?

Right now, Genius is doing the work that Google has more than enough resources to do. Instead of doing it themselves, Google is bullying the smaller company, stealing its work, and even stealing the money it’s making off of that work.

This is actually a huge problem.

What’s Next?

Genius already asked Google nicely to stop this behavior in June of this year. Google continued stealing Genius’ work. Now Genius is taking Google to court for $50 million in damages. A pittance for Google, but it’s what Genius thinks will make up for the theft of their labor. Unfortunately, legal experts say Genius may not have strong legal ground. They’re going to have to prove that the ingestion process creates a unique work. It does. Take it from this engineer, it most certainly does. Scraping itself is legal, but theft of that labor is a different story. However, explaining that to a judge and jury, who will have zero experience in software engineering, may be a struggle. Legal experts believe Google may have the upper hand.

However, if Genius can make their case that their work, their employees’ labor, and their server time were stolen when Google copied their work without paying for it and without driving traffic to their site to pay for it, then they may be able to win the jury over. No one likes to see a bully like Google come in and crush a small business because it can.

Software engineers have a lot to lose if Genius loses this battle. It’ll minimize the value of their work, and make U.S. law more akin to Chinese law when it comes to copying and stealing work. To fuel innovation, we need work to have value. We can’t let Google’s bottom line hold innovation back.

