Data scraping firm leaks 235m Instagram, TikTok, YouTube user records

According to researchers, the trove of data was left for public access without any security authentication.

August 19, 2020

While we do as much as we can to protect our personal information, a lot of it is still out in the open thanks to social media networks. This has led many agencies and individuals to scrape this information in bulk and use it for marketing purposes, despite the fact that it is illegal.

One such agency is Social Data, which sells marketers access to influencer data. However, things went wrong recently when Bob Diachenko of Comparitech found one of the company's databases exposed on August 1st. It contained 235 million user profiles from different social networks, broken down as follows:

Instagram – total 192,392,954 records from 2 datasets
TikTok – 42,129,799 records
YouTube – 3,955,892 records

Here’s what the data looked like to the public:

Screenshot from the leaked data (Credit: Comparitech)

The data appears to originate from Deep Social, a now-defunct company that allegedly has no link to Social Data but has been notorious for illegal data scraping in the past.

The exposed data includes usernames, full names, contact information, images, follower statistics, age, gender, and a few more details. Most of this is publicly available, but it is important to remember that all social media networks prohibit such scraping activities regardless.

Yet Social Data seems to disagree, with its spokesperson telling Comparitech:

Please, note that the negative connotation that the data has been hacked implies that the information was obtained surreptitiously. This is simply not true, all of the data is available freely to ANYONE with Internet access….Social networks themselves expose the data to outsiders – that is their business – open public networks and profiles. Those users who do not wish to provide information, make their accounts private.

The database was taken offline after the report, but it is still unclear who may have accessed the data before then. If a threat actor did, they could use it for a range of nefarious purposes such as spearphishing and spam campaigns, social engineering for more sophisticated attacks, and even plain cyberharassment.

Going forward, we hope social networks will further improve their anti-scraping defenses to make it less likely that automated scrapers escape detection.

The incident should not come as a surprise, since misconfigured databases have exposed billions of sensitive records in the last couple of years. In fact, the situation is so critical that, according to a new poll, database configuration errors are the number one threat to cloud security.

Remember that last year, LexisNexis, a legal search engine providing “computer-assisted legal research,” and Pipl.com, known as the world’s largest people search engine, exposed their databases online. It took hackers just a few days to access both databases, which they then sold online.
