The Facebook Data Leak Explained

Ivan Righi
April 8, 2021 | 8 Min Read

Over the weekend, the press reported a significant data leak containing the records of 533 million Facebook users. The records were posted for free on multiple cybercriminal forums. The incident exposed users’ personal information, including phone numbers, email addresses, full names, occupations, and birth dates. Much of this information was likely scraped from public Facebook profiles, as Facebook indicated in a statement made on 06 Apr 2021. However, the leak also included data that users had not made public, such as their phone numbers. In this blog, we dive into what happened, how the information was exposed, who has claimed responsibility for the attack, and the risks to affected users.

What is the Facebook Data Leak?

The initial incident started in mid-to-late 2019. Threat actors are believed to have scraped Facebook’s website to acquire the information of millions of users. Web scraping is the use of automated scripts or bots to harvest public information from sites, such as anything users make publicly available on their profiles (name, city, education, and so on).

Scraping is not a new technique, and it occurs daily. Cybercriminals frequently scrape Facebook, Twitter, Reddit, and many other sites. They can use the extracted data for a variety of purposes, including spamming, information gathering, and social engineering attacks. They can also sell scraped data for a profit to other cybercriminals, marketing companies, or call centers.

Figure 1: Raidforums user advertising scraped Instagram database

As previously mentioned, data scraped from sites is usually public data. If users set their emails, names, and locations to be public, then that data can be viewed and harvested by virtually anyone. However, the Facebook exposure was not a typical scraping incident. Threat actors were able to harvest users’ phone numbers, even if the users had set their number to be private on their Facebook profiles. Facebook stated that it believed cybercriminals accomplished this by exploiting Facebook’s “contact importer” feature, which allows users to find other users by their phone numbers.

This feature could have been exploited by uploading large sets of phone numbers and identifying which Facebook profiles matched the numbers. Facebook stated that this feature was fixed in September 2019, following the discovery that threat actors were abusing the feature. However, while Facebook fixed the feature in 2019, the phone numbers of 533 million users had already been harvested by malicious individuals, along with other identifying information on users.
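
The matching behaviour described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions: the `Profile` class, the `PHONE_INDEX` directory, and the `match_contacts` and `candidate_numbers` helpers are all hypothetical names invented for this example, the data is made up, and nothing here reflects Facebook’s actual implementation, which has not been published in detail. It only shows why a matcher that ignores a user’s visibility setting can link a private number to a name.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    user_id: int
    name: str
    phone: str
    phone_public: bool  # the user's own visibility setting


# Hypothetical in-memory directory standing in for a user database.
PROFILES = [
    Profile(1, "Alice Example", "+15550000001", phone_public=False),
    Profile(2, "Bob Example", "+15550000004", phone_public=True),
]
PHONE_INDEX = {p.phone: p for p in PROFILES}


def match_contacts(uploaded_numbers):
    """Toy contact matcher (assumption): returns a profile for every uploaded
    number that belongs to a user, without consulting phone_public."""
    return {n: PHONE_INDEX[n] for n in uploaded_numbers if n in PHONE_INDEX}


def candidate_numbers(prefix, start, count):
    """Build a sequential block of numbers, as a bulk upload might contain."""
    return [f"{prefix}{i:07d}" for i in range(start, start + count)]


uploaded = candidate_numbers("+1555", 0, 10)
matches = match_contacts(uploaded)
leaked_private = [p.name for p in matches.values() if not p.phone_public]

print(f"{len(matches)} of {len(uploaded)} uploaded numbers matched a profile")
print("Private numbers linked to names:", leaked_private)
```

In this sketch, a matcher that respected `phone_public`, or that capped how many numbers a single account could submit, would return far fewer matches. Whether Facebook’s September 2019 fix worked along those lines is not stated in the source.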

How Was the Facebook Data Distributed in the Cybercriminal World?

Initially, attackers offered the data at quite a steep price. As the data began circulating in open and gated cybercriminal forums in 2020, a listing on Russian-speaking cybercriminal forum XSS in August 2020 advertised the sale of this data for “only” USD 25,000 (see Figure 2). Listings were identified across several other forums, such as Raidforums. The sheer size of the data leakage and the wide geography it covered (106 countries) made the data a gold mine for cybercriminals. Therefore, these listings often caught the interest of multiple threat actors.

Figure 2: XSS user advertises Facebook leak in August 2020

The XSS user who initially shared the data was allegedly responsible for the attack. When other forum members questioned the origin of the breached data, the original poster claimed that they had exploited a zero-day vulnerability on Facebook’s website. This vulnerability allegedly allowed the threat actor to grab users’ data from their Facebook ID (see Figures 3-4). The user also stated that the data extracted dated from 01 Jan 2020, as Facebook had patched the vulnerability by then. The user did not provide further information.

Figure 3: XSS user claims that they exploited a vulnerability to crawl Facebook users
Figure 4: XSS user provides more information on how they claim to have acquired the Facebook data leak

Cybercriminals often purchase data to resell it to other cybercriminals for a profit, and the price of the data tends to fall with each transaction. Between 2019 and 2021, the data likely changed hands multiple times, an activity frequently observed in cybercriminal forums. Eventually, breached data loses its value, and users release it for free to gain reputation or notoriety within a cybercriminal forum. This is the most likely explanation in the case of the Facebook breach. On 03 April 2021, a user on the English-speaking cybercriminal forum Raidforums uploaded the entire Facebook dataset for a negligible cost of eight forum tokens (approximately USD 2.52).

Figure 5: Raidforums user exposes the Facebook data leak for free

Within 5 days, more than 4,800 forum members had unlocked the data with their tokens; the thread received over 1,000 replies and 200,000 views, making it one of the most viewed threads on the criminal forum. The data was an instant success within the cybercriminal community. The data leakage and free download links have since been reposted across multiple deep and dark web forums. The data can now be easily acquired by any cybercriminals who wish to use it.

What Data Was Included in the Breach?

Virtually every individual included in the data leakage had their phone number exposed, including Mark Zuckerberg himself and other founding members of Facebook. Beyond the phone number, the extent of the exposure likely depended on how much information each user had left public on their profile. Any data that was public on the affected Facebook profiles was likely harvested. The dataset typically included victims’ full names, locations, phone numbers, Facebook IDs, employers, and birth dates.

Figure 6: Mark Zuckerberg’s data exposed in the Facebook leak (phone number censored)

Email addresses were another high-value, sensitive piece of personal data exposed in this leakage. However, not all records contained email addresses; security researchers suggested that only accounts whose email addresses were set to public in 2019 were affected. Digital Shadows identified more than 122 million email addresses listed in the data leak. Most of these were Facebook.com addresses in the format Facebook_ID@Facebook.com, which were likely used for Facebook messages rather than being users’ personal email addresses. Removing these gives a more realistic number of emails exposed in the breach. The remaining email addresses were distributed as follows.

Category                                               Email addresses
Total emails exposed (excluding Facebook.com emails)   3,300,747
.com emails                                            2,602,626
.edu emails                                            5,997
.org emails                                            3,428
.gov emails                                            514
Others (.de, .net, .fr, .co.uk, .ru, etc.)             688,182
Table 1: Number of email addresses exposed in the Facebook data leak
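
The filtering and grouping behind Table 1 can be sketched as follows. This is an illustrative example only: the `tally_emails` function, its parameters, and the sample addresses are assumptions made for this sketch, not the method Digital Shadows used. It drops Facebook.com addresses, groups the rest by their final domain suffix, and then checks that the category counts in Table 1 add up to the stated total.

```python
from collections import Counter


def tally_emails(emails, excluded_domain="facebook.com",
                 tracked=(".com", ".edu", ".org", ".gov")):
    """Count addresses per final domain suffix, skipping the excluded domain.

    Suffixes outside `tracked` (such as .de or .uk) are grouped as "other".
    """
    counts = Counter()
    for email in emails:
        local, sep, domain = email.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # not a usable address
        if domain == excluded_domain:
            continue  # platform-generated address, not a personal one
        suffix = "." + domain.rsplit(".", 1)[-1]
        counts[suffix if suffix in tracked else "other"] += 1
    return counts


# Made-up sample records for illustration.
sample = [
    "100001234@Facebook.com",
    "alice@example.com",
    "bob@university.edu",
    "carol@example.co.uk",
    "not-an-email",
]
counts = tally_emails(sample)
print(dict(counts))

# Consistency check on the figures reported in Table 1.
table_1 = {".com": 2_602_626, ".edu": 5_997, ".org": 3_428,
           ".gov": 514, "other": 688_182}
print(sum(table_1.values()) == 3_300_747)
```

Grouping by the last label of the domain keeps the sketch short; it means an address under .co.uk is counted under "other" rather than as its own category.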

What is Your Risk as a Facebook User?

If you believe that your email address or phone number was affected by the breach, you can check whether your data was exposed with the service HaveIBeenZucked.

Fear not! The leaked data included no passwords, and it is unlikely that cybercriminals can use the information by itself to hack into your accounts. However, users who had their data exposed should be wary of suspicious and unsolicited emails, phone calls, and messages from unknown sources. Considering the high interest that this leakage has attracted within cybercriminal communities, it is highly likely that criminals will attempt to use the data to launch social engineering attacks or spam users with unwanted messages. Call centers may also use this data to continue launching vishing (voice phishing) attacks on unsuspecting victims.

While this data may be “old,” it is likely that the information has remained unchanged for most users. After all, individuals do not usually change their phone number and email address every year or two. High-profile Facebook users, such as politicians, company executives, and public figures, are most likely to be targeted by attacks, but all affected users should proceed with care. Data leakages such as this one are common, and if your information wasn’t affected by this leakage, it might have been exposed in other incidents. As security experts often say, it is not a matter of if data has been exposed, but when. Therefore, it is crucial for users to always be cautious and follow security best practices wherever possible.

Annex A

The list of countries affected, along with the number of records exposed:

Country         Records exposed
Egypt           45,183,147
Italy           35,677,337
USA             32,315,291
Saudi Arabia    28,804,686
France          19,848,557
Turkey          19,638,821
Morocco         19,147,770
Colombia        17,957,906
Iraq            17,116,398
South Africa    14,323,766
Mexico          13,330,561
Malaysia        11,675,893
United Kingdom  11,522,328
Algeria         11,505,898
Spain           10,894,206
Russia          9,996,405
Sudan           9,464,722
Nigeria         9,000,127
Peru            8,075,316
Brazil          8,064,915
Australia       7,320,478
UAE             6,978,927
Syria           6,939,528
Chile           6,889,082
Tunisia         6,247,880
India           6,162,449
Germany         6,054,422
Netherlands     5,430,387
Oman            5,048,532
Yemen           4,617,359
Kuwait          4,502,021
Libya           4,204,514
Israel          3,956,428
Bangladesh      3,816,531
Canada          3,494,385
Palestine       3,367,570
Kazakhstan      3,214,290
Belgium         3,183,540
Jordan          3,105,988
Singapore       3,073,009
Iran            3,057,522
Bolivia         2,959,209
Hong Kong       2,937,841
Qatar           2,789,724
Poland          2,669,381
Argentina       2,339,557
Portugal        2,227,361
Cameroon        1,997,658
Lebanon         1,829,661
Guatemala       1,645,068
Switzerland     1,592,039
Uruguay         1,509,317
Panama          1,502,310
Costa Rica      1,464,002
Ireland         1,449,921
Bahrain         1,424,219
Finland         1,381,569
Czech Republic  1,375,988
Austria         1,249,388
Sweden          1,092,140
Ghana           1,027,969
Philippines     889,629
Mauritius       848,558
Taiwan          734,807
China           670,334
Croatia         659,115
Denmark         639,841
Greece          617,722
Afghanistan     558,393
Angola          508,903
Albania         506,602
Norway          475,809
Bulgaria        432,473
Japan           428,615
Macao           414,284
Namibia         409,356
Jamaica         385,890
Hungary         377,045
Ecuador         318,824
Botswana        240,632
Slovenia        229,039
Lithuania       220,160
Brunei          213,798
Luxembourg      188,201
Serbia          162,898
Puerto Rico     138,183
Indonesia       130,321
South Korea     121,744
Cyprus          119,022
Malta           115,367
Azerbaijan      99,472
Georgia         95,193
Estonia         87,533
Maldives        86,337
Moldova         46,237
Iceland         31,343
Honduras        16,142
Burundi         15,709
Haiti           15,407
Djibouti        14,327
Ethiopia        12,752
Burkina Faso    6,413
Fiji            5,364
El Salvador     4,479
Cambodia        2,838
Table 2: Records exposed per country (ordered from largest to smallest)
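
To put the per-country figures in proportion, the rows of Table 2 can be parsed and expressed as a share of the roughly 533 million records reported. This is a minimal sketch: the `parse_row` and `share_of_total` helpers are hypothetical names, and the denominator is the approximate headline figure from this article, not an exact count of the dataset.

```python
import re

# A country name, optional whitespace, then a count with thousands separators.
ROW = re.compile(r"^(?P<country>.+?)\s*(?P<count>\d{1,3}(?:,\d{3})*)$")

# Approximate headline figure from the article (assumption for the denominator).
REPORTED_TOTAL = 533_000_000


def parse_row(line):
    """Split one table row into (country, record count)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    return match.group("country"), int(match.group("count").replace(",", ""))


def share_of_total(count, total=REPORTED_TOTAL):
    """Return the count as a percentage of the reported total."""
    return 100 * count / total


# The three largest entries from Table 2.
sample_rows = ["Egypt 45,183,147", "Italy 35,677,337", "USA 32,315,291"]
for country, count in map(parse_row, sample_rows):
    print(f"{country}: {count:,} records "
          f"({share_of_total(count):.1f}% of ~533 million)")
```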
