Data scraping firm leaks 235m Instagram, TikTok, YouTube user records

According to researchers, the trove of data was left for public access without any security authentication.



While we do as much as we can to protect our personal information, a lot of it is still out in the open thanks to social media networks. This has led many agencies and individuals to scrape this information in bulk and use it for marketing purposes, despite the fact that it is illegal.

See: Database with millions of Instagram influencers’ info leaked online

One such agency is Social Data, which offers marketers access to influencer data. However, things went wrong recently when Bob Diachenko from Comparitech found one of the company's databases exposed on August 1st. It contained 235 million user profiles from different social networks, broken down as follows:

  1. Instagram – total 192,392,954 records from 2 datasets
  2. TikTok – 42,129,799 records
  3. YouTube – 3,955,892 records


Here’s what the data looked like to the public:

Screenshot from the leaked data (Credit: Comparitech)

The data appears to originate from another, now-defunct company named Deep Social, which allegedly has no link with Social Data but was nonetheless notorious for illegal data scraping in the past.


The exposed data includes usernames, full names, contact information, images, follower statistics, age, gender, and a few more details. Most of this is publicly available, but it is important to remember that social media networks nonetheless prohibit such scraping activities.

(Credit: Comparitech)

Yet Social Data seems to differ on this, with its spokesperson telling Comparitech:

Please, note that the negative connotation that the data has been hacked implies that the information was obtained surreptitiously. This is simply not true, all of the data is available freely to ANYONE with Internet access….Social networks themselves expose the data to outsiders – that is their business – open public networks and profiles. Those users who do not wish to provide information, make their accounts private.


The database was taken offline after the report, but it is still unclear who may have accessed the data before then. If a threat actor did, they could use it for a range of nefarious purposes such as spearphishing and spam campaigns, social engineering for sophisticated attacks, and even plain cyberharassment.

Going forward, we hope social networks will further improve their anti-scraping measures to decrease the likelihood of automated scrapers escaping detection.

See: Misconfigured Amazon S3 Buckets Exposed US Military’s Social Media Spying Campaign

The incident should not come as a surprise, since misconfigured databases have exposed billions of sensitive records in the last couple of years. In fact, the situation is so critical that, according to a new poll, database configuration errors are the number one threat to cloud security.

Remember that last year, LexisNexis, a legal search engine providing “computer-assisted legal research” and known as the world’s largest people search engine, exposed its databases online. It took hackers just a few days to access both databases, which they ended up selling online.

