Over thirty thousand DMCA notices reveal an organized attempt to abuse copyright law
The Lumen Project’s database houses copies of millions of content removal requests originally sent to online service providers. These copies form the foundation for a wealth of research related to, among other topics, illegitimate attempts at chilling online free speech.
Between June 2019 and January 2022, the Lumen Database received copies of almost 34,000 notices that appear to be deliberately fraudulent attempts to misuse the DMCA notice-and-takedown process. In this post, I will discuss certain features of the notice set, including how I assessed the notices to be fraudulent, the likely motivation behind this abuse of the DMCA, and the potential impact of such organized takedown attempts.
How has the DMCA process been abused?
Since February 2022, I have been researching the notices within the Lumen Database, looking for evidence of misuse of the DMCA process. As part of this research, I found a set of notices that are, in all likelihood, an organised attempt to abuse the DMCA notice-and-takedown process in order to have legitimate news articles and related critical information taken down from the internet.
The notices I found use the “back-dated article” technique. With this technique, the wrongful notice sender (or copier) creates a copy of a ‘true original’ article and back-dates it, creating a ‘fake original’ article (an article that is a copy of the true original) that at first glance appears to have been published prior to the true original. Then, based on the claim that this back-dated article is the ‘original’, the copier sends a DMCA notice to the relevant Online Service Providers, alleging that the true original is the copied or ‘infringing’ article and that the copied article is the original, and requesting the takedown of the true original article. The wrongful notice sender then removes the fake original url after sending the DMCA request, likely in order to ensure that the article does not stay online in any form. If the takedown notice is successful, information that is most likely legitimate speech disappears from the internet.
My research within the Lumen Database found 33,988 notices sent to Google between June 2019 and January 2022, by over thirty different notice senders, targeting over 550 online domain names. All of these notices used ‘today-news.press’ as the domain where the ‘fake original’ url was created and then used as the basis of sending the DMCA request.
Most of the 550+ domain names targeted by the 33,988 notices appear to be online news forums. I manually investigated a randomly selected representative pilot set of 500 DMCA notices within the larger notice set. In this smaller set, most of the domains targeted were Lithuanian, Ukrainian and Russian online news outlets, including Horoshiye News, Antikor News, Radio Free Europe, Pravda News, Glavnik News, and Respublika among many others. Further, the material located at the allegedly infringing urls in this subset of notices all relates to allegations of misconduct, corruption, sexual harassment and other wrongdoing against the same set of individuals, making it quite plausible that these notices were all part of a systematic and organized attempt to remove critical news articles.
To investigate whether the takedown attempts had been successful, I cross-referenced the set of 33,988 DMCA notices with Google’s transparency report and Google search results. 99.2% of the takedown attempts were unsuccessful and the true original urls continue to be indexed in Google, while 0.8% of the DMCA notices resulted in the delisting of legitimate content. Despite this small percentage, the large overall scale of this coordinated attempt to abuse the DMCA process means that approximately 300 URLs with legitimate content are no longer present in Google’s search results. I was able to retrieve additional data and statistics from the notice set with exceptional support from Justin Clark, a data scientist and one of Lumen’s developers.
The findings from the notice set:
The DMCA takedown requests in the present notice set of 33,988 notices listed sixteen different jurisdictions (as marked by the notice sender), including (in descending order of frequency) USA, Ukraine, Russia, France, UK, Netherlands, Germany and Switzerland. Over 60% of the DMCA notices were sent from the USA and 28% were sent from Ukraine and Russia.
All the notices in the present notice set of 33,988 notices claim to have the ‘fake original’ URL or URLs they cite (the false basis for takedown) published in the blog ‘today-news.press’. A search of https://today-news.press indicates that there is no currently live website with this domain name and accordingly, none of the ‘fake original’ urls with the domain name today-news.press that have been used to send the 33,988 notices are live either.
Not all back-dated articles have been, well… back-dated. There are notices in the notice set where the wrongful sender has not even attempted to ‘back-date’ the article. Instead, the sender has merely copied the article and “published” it with a date that is later than the date of the ‘true original’ url, but has nevertheless claimed that the original is a copy and has requested the takedown of the original article. For example, this notice in the database requests the removal of eight news articles, one of which is titled “Miami Is ours: Why Russian Businessmen and Bandits Settle in Trump Towers,” dated July 11, 2016. The basis on which this ‘allegedly infringing url’ is requested to be taken down is a today-news.press blog post, which is no longer live, although a screen capture of the url in the Wayback Machine indicates that the ‘fake original’ page was dated July 25, 2016. In this specific instance, the true original url which was the subject of the takedown request was not de-indexed from Google, although it is unclear whether the lack of back-dating was the reason for this decision.
Another observation from this set of notices is that not only do they recycle the same fake domain, but exactly the same fake original url has been used in order to request the takedown of multiple articles. This means that not all of the true original articles have been copied to create fake originals prior to seeking their removal. In many cases, a fake original of one article has been used as the basis to request the takedown of over a dozen other articles with different text. For example, the fake original url in this DMCA notice has not only been used to attempt the takedown of a true original article with the same name, but was also used to attempt the removal of over 5000 other true original urls across over 15 news domains. In this DMCA notice, the same fake original url from today-news.press is being used to attempt the removal of 10 true original urls. Examples of other such notices are here, here, here, here and here. Similarly, fifteen ‘allegedly infringing urls’ have been taken down based on a single ‘fake original’ url in this notice.
Not only has the same fake original url been used to attempt the removal of thousands of true original urls, but in this set of notices, multiple notice senders used the same fake original url for a range of different materials. For example, over ten senders used the same fake original url to attempt the removal of over 750 true original urls in a series of notices available in this search query. This is another piece of evidence that points toward the conclusion that the different senders are a facade for a single systematic effort to abuse the DMCA process.
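The reuse pattern described above can be made visible by grouping notices by the ‘fake original’ url they cite. The following is a minimal sketch only: the record layout, the field names (`sender`, `original_url`, `infringing_urls`) and the sample values are assumptions made up for illustration, not Lumen’s actual data format and not the tooling used for this research.

```python
from collections import defaultdict

# Hypothetical, simplified notice records. The field names and values are
# assumptions for illustration, not Lumen's actual schema or real notices.
notices = [
    {"sender": "Sender A",
     "original_url": "https://today-news.press/post-1",
     "infringing_urls": ["https://news-one.example/a",
                         "https://news-two.example/b"]},
    {"sender": "Sender B",
     "original_url": "https://today-news.press/post-1",
     "infringing_urls": ["https://news-three.example/c"]},
    {"sender": "Sender A",
     "original_url": "https://today-news.press/post-2",
     "infringing_urls": ["https://news-one.example/d"]},
]


def summarize_reuse(records):
    """Map each claimed-original url to (distinct senders, distinct targets)."""
    senders = defaultdict(set)
    targets = defaultdict(set)
    for record in records:
        key = record["original_url"]
        senders[key].add(record["sender"])
        targets[key].update(record["infringing_urls"])
    return {url: (len(senders[url]), len(targets[url])) for url in senders}


summary = summarize_reuse(notices)
for url, (sender_count, target_count) in sorted(summary.items()):
    print(f"{url}: {sender_count} sender(s), {target_count} targeted url(s)")
```

A claimed original that is cited by many nominally unrelated senders against many unrelated articles is the kind of signal worth a manual look; the counts alone do not establish that a notice is fraudulent.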
Motivation behind abuse and its impact:
The content of all of the articles targeted by the fraudulent DMCA notices relates to allegations of criminal conduct, including corruption, child abuse, sexual harassment, human trafficking and financial fraud, against US, Russian, and Kazakhstani bureaucrats, people allegedly belonging to the Russian mafia and individuals with ultra-high net worth, with certain high-profile bureaucrats mentioned in most, if not all, of the material. Taken together, the material located at the urls in question reveals the relationships between a powerful group of individuals and alleges the ways in which that power is abused.
Several of the domains targeted (with the true original articles) have also been banned by Roskomnadzor, Russia’s federal service for the supervision of communications, information technology and mass media, which is infamous for its overbroad censorship. Roskomsvoboda, a Russian non-governmental organisation founded by the Pirate Party of Russia that aims to counteract censorship on the internet, maintains a register of banned sites and has several urls from the targeted domain names on its register, including Horoshiye, Antikor and Pravda among others.
Sets of notices similar to the one I describe here have previously been studied by organisations such as Rest of World and Qurium, which have exposed DMCA takedowns that have been falsely orchestrated in order to manage the reputation of bureaucrats and public officials. The findings from both this prior research and the present notice set indicate that this sort of business finds at least some of its clients among individuals or organisations that require reputation management services.
For example, in his work for Rest of World, Peter Guest found that “Among the thousands of names listed as clients (of a reputation management firm) are the former foreign minister of the Dominican Republic, an individual indicted in Argentina for his role in a cryptocurrency pyramid scheme, and people accused of corruption worldwide, all apparently looking to erase information about themselves from the internet.” These clients are willing to pay amounts of up to $30,000 to target articles and pages that are critical of them. It seems plausible that this uniform method for sending wrongful takedowns could be a standardised business model relying on abuse of the DMCA to remove legitimate content. Evidence of this lies in the present notice set itself, where the streamlined methodology for managing individuals’ reputations is apparent because the notices all target articles critical of the same group of powerful individuals.
Fortunately, this technique has not gone unnoticed by at least one of the major Online Service Providers: Google. Google has specifically addressed this form of DMCA misuse in its transparency reporting on content de-listing. On its transparency page on content de-listings due to copyright, Google highlights the example of a DMCA notice in which an “individual claiming to represent a news site filed a copyright complaint against a second, reputable news site for use of their article”. Google noted that it did not take action to remove the ‘allegedly infringing’ url because its investigation showed that “the individual’s site had been created and then back-dated for the purpose of filing this takedown request.”
Even though Google presumably has ways of identifying such misuse of the takedown process, the set of DMCA notices I have identified shows that, despite Google’s efforts and willingness to detect DMCA abuse, some attempts nevertheless make it through the filter and have a chilling effect on free speech online. For example, as a result of the present notice set, approximately 300 articles are no longer a part of Google search results. This data set is just one example among several such systematic attempts and techniques to misuse the DMCA takedown process. It is also likely that preventing damage to personal reputation is only one of many motivations underlying similar abuses of the DMCA.
Such research into the abuse of DMCA notices is possible because of the transparency that Google offers by sharing copies of content removal requests with the Lumen Database. Similar research cannot be done or validated through various other online platforms since they do not share content removal requests with Lumen. In order to make such research possible, we encourage more online service providers to come forward and share copies of content removal requests with the Lumen Database. For more information or to get access to the notices in the database, reach out to Lumen at firstname.lastname@example.org.
Note on Methodology (How I did this research and how you can too):
My intention starting out was to find evidence of abuse of the DMCA notice process with an underlying motivation of personal reputation management. I first searched multiple terms in the Lumen Database, including ‘Counternotice’, ‘Criticism’, ‘President’, ‘Ministry’, and ‘Corruption’. I then used the ‘advanced search’ tool to conduct country-specific searches. For each country, I opened all notices from the first three or four result pages in the hopes of finding a counternotice that contested a wrongful takedown. In one of the notices with Russia’s country code within the results of a search for the word ‘corruption’, I found ‘today-news.press’ as the domain of the ‘fake original’ url. Upon further investigation, I found that it wasn’t live.
I then conducted additional searches for the term ‘today-news.press’ and found tens of thousands of notices that used it as the basis for takedown. I also searched for the same domain name along with the names of individuals mentioned in many of the notices, to investigate whether the notices had been sent for articles about the same or a similar group of people. To determine whether the notices were indeed an attempt to back-date articles and abuse the DMCA process, I used the Wayback Machine to find screen captures of the today-news.press urls that had been alleged to be the original urls, in order to check whether they had existed online prior to the alleged copies (which are the true originals).
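The Wayback Machine check described above was done by hand, but a similar lookup can be scripted. The sketch below is an illustration under assumptions rather than the method used for this research: it queries the Internet Archive’s public availability endpoint, asks for the capture closest to a deliberately early date as a rough stand-in for “earliest known capture”, and compares that with the publication date of the true original. The function names and the sample response are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime

AVAILABILITY_API = "https://archive.org/wayback/available"


def parse_snapshot_time(response):
    """Return the capture time of the closest snapshot, or None if absent."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest:
        return None
    # Wayback timestamps are formatted as YYYYMMDDhhmmss.
    return datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")


def closest_snapshot_time(url, near="19960101"):
    """Look up the capture closest to `near`; this performs a network call."""
    query = urllib.parse.urlencode({"url": url, "timestamp": near})
    with urllib.request.urlopen(f"{AVAILABILITY_API}?{query}", timeout=30) as resp:
        return parse_snapshot_time(json.load(resp))


def needs_manual_review(claimed_original_seen, true_original_published):
    """Flag claims whose 'original' has no capture older than the target."""
    if claimed_original_seen is None:
        return True
    return claimed_original_seen > true_original_published


# Hand-written sample response for illustration only (not real capture data).
sample = {"archived_snapshots": {"closest": {"timestamp": "20160725000000"}}}
first_seen = parse_snapshot_time(sample)
print(needs_manual_review(first_seen, datetime(2016, 7, 11)))
```

A capture date only shows that a page existed by that time, so a flag from this check is a prompt to look at the archived page itself (including the date it displays), not proof of back-dating.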
Using the resulting set of notices as my initial finding, I continued to manually explore the types of content that had been the subject of the DMCAs in question. Since most senders and articles seemed to be in Russian, downloading an automatic language detection and translator extension was very helpful in making the search efficient. Using Lumen’s advanced search tools, I found the final dataset of over 33,000 notices that I used as the notice-set for the findings in this post.
Author: Shreya Tewari is a Research Fellow at Berkman Klein Center’s Lumen Project.