According to analysis from several sources, the leaked Yandex source code does not contain user data, but it does contain more than 1,900 search ranking factors, among other material.
The source code repository of search engine and technology giant Yandex has been leaked as a torrent. The data of the company, often called the Russian Google, was posted on Breached Forums, a hacker forum that emerged as an alternative to the popular and now-seized RaidForums.
The incident should not come as a surprise, since Yandex and its products are frequent targets of cyber attacks. In 2016, Hackread.com exclusively reported on a dark web vendor selling data from 6.3 million Yandex user accounts.
In September 2021, the Russian search engine giant was hit by one of the largest DDoS attacks powered by 200,000 compromised IoT devices.
What Was Leaked?
The leaker shared a magnet link to 44.7GB of files linked to Yandex git sources. The files were allegedly stolen from Yandex in July 2022. Apart from anti-spam guidelines, the repositories are believed to contain Yandex’s source code.
The leak revealed 1,922 ranking factors that the search engine uses in its search algorithm. Per the analysis posted by Twitter user Alex Buraks, these include text relevancy, PageRank, content age, and freshness, among others.
The list also covers several end-user behaviour factors, link-related factors, and host reliability. SEOs have found some of the ranking factors unusual, such as the number of unique visitors, average domain ranking across queries, and percent of organic traffic.
“You probably heard about Yandex, it’s the 4th biggest search engine by market share worldwide. Yesterday proprietary source code of Yandex was leaked. The most interesting part for SEO community is: the list of all 1922 ranking factors used in the search algorithm,” Alex Buraks (@alex_buraks) wrote on Twitter on January 27, 2023.
According to data leak researcher Arseniy Shestakov, the leaked Yandex Git repository contained technical data and code related to Yandex’s major products, including:
- Yandex Taxi
- Yandex Mail
- Yandex Maps
- Yandex Market
- AI assistant Alice
- Yandex Direct Ads service
- Workspaces service Yandex360
- Cloud storage service Yandex Disk
- Travel booking service Yandex Travel
- Payment processing service Yandex Pay
- Yandex Cloud
- Internet analytics solution Yandex Metrika
Shestakov further noted that the leak contains some API keys, which were most likely used for testing deployments.
Yandex Denies Hacking Attempt
Yandex says that it is aware of the leak and has already initiated an investigation into how the source code ‘fragments’ were exposed to the public. It is worth noting that the leak doesn’t include personal data of users or employees.
However, considering Yandex’s significance to Russia’s IT infrastructure and the nature of the leaked data, the leak may have been motivated by Russia’s invasion of Ukraine, in which case pro-Ukraine hackers could be involved.
In its official statement, Yandex clarified that the company wasn’t hacked and that a former employee could be involved in leaking its source code publicly. Russia’s leading IT firm noted that the leaked archive includes code fragments from an internal repository, and that their content differs from the current version of the repository.
“Yandex was not hacked. Our security service found code fragments from an internal repository in the public domain, but the content differs from the current version of the repository used in Yandex services,” the company’s statement read.
Nevertheless, source code leaks pose serious security risks to organizations, since threat actors can study the company’s intellectual property and system details. Leaked source code can also help attackers create targeted exploits.