Data Exfiltration

What is Data Exfiltration?

A critical function of computer and network security is keeping sensitive data inaccessible to unauthorized entities. Data exfiltration, a form of cybercrime in which sensitive data is illegally leaked from an individual's or organization's systems, poses a serious threat to that goal. Google describes data exfiltration as occurring “…when an authorized person extracts data from the secured systems where it belongs, and either shares it with unauthorized third parties or moves it to insecure systems. Authorized persons include employees, system administrators, and trusted users. Data exfiltration can occur due to the actions of malicious or compromised actors, or accidentally.”¹ Current techniques for preventing data exfiltration attacks include firewalls, intrusion detection systems, intrusion prevention procedures, and anti-virus and anti-malware programs.

Despite heavy use of these prevention techniques, attackers continue to succeed. A study by Nyakomitta & Abeka² explored how the commonly used techniques fail to prevent data loss. The authors noted that data exfiltration attacks are hard to catch because they are often achieved through social engineering, as when an insider leaks information such as usernames and passwords. An attack can also begin with an email attachment or link that, once clicked, installs malicious software, or with an insider who plugs in a USB device infected with malware. Because these actions are hidden behind legitimate processes, anti-malware, anti-virus, and host-based intrusion detection systems often do not detect them. Further, attackers often combine several approaches, which makes forensic analysis of an attack difficult.

Nyakomitta & Abeka found that current data exfiltration prevention mechanisms fall short of expectations for effective and efficient data loss prevention. They recommend that organizations adopt an algorithm based on three elements: information entropy, to segregate plaintext from encrypted traffic; heuristics, to scan the behavior of network traffic; and functional correlations of network traffic, to intercept traffic that relays a different format of communication than expected. The comparison of expected and received formats is based on four decision trees: expected encrypted, received encrypted; expected encrypted, received plaintext; expected plaintext, received plaintext; and expected plaintext, received encrypted.
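The entropy element and the four expected/received outcomes can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the 7.0 bits-per-byte threshold, and the choice to flag any mismatch between expected and received format as suspicious are all assumptions made for illustration. Note also that compressed data has high entropy too, so entropy alone cannot prove that a payload is encrypted.

```python
import math
from collections import Counter

# Assumed cutoff in bits per byte; not taken from the survey and would need tuning.
ENTROPY_THRESHOLD = 7.0


def shannon_entropy(payload: bytes) -> float:
    """Return the Shannon entropy of payload in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(payload: bytes) -> bool:
    """Treat high-entropy payloads as encrypted-looking (a heuristic only)."""
    return shannon_entropy(payload) >= ENTROPY_THRESHOLD


def classify(expected_encrypted: bool, payload: bytes) -> tuple[str, bool]:
    """Return (outcome label, suspicious flag) for one payload.

    The flag is True when the received format differs from the expected one,
    which is this sketch's assumption about what should be intercepted.
    """
    received_encrypted = looks_encrypted(payload)
    expected = "encrypted" if expected_encrypted else "plaintext"
    received = "encrypted" if received_encrypted else "plaintext"
    suspicious = expected_encrypted != received_encrypted
    return f"expected {expected}, received {received}", suspicious


if __name__ == "__main__":
    plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
    high_entropy = bytes(range(256)) * 4  # deterministic stand-in for ciphertext
    print(classify(False, plaintext))
    print(classify(False, high_entropy))
```

In this sketch, a channel that is expected to carry plaintext but delivers a high-entropy payload falls into the "expected plaintext, received encrypted" outcome and is flagged for closer inspection.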

The use of cloud services introduces new and different data exfiltration risks, which mostly concern the actions of entities who use features of cloud servers in insecure ways.¹ Any entity that can manipulate virtual machines (VMs), deploy code, or make requests to cloud storage or computation services can introduce data exfiltration potential. Data exfiltration in cloud service use can be prevented by maintaining specific and narrowly scoped permissions, logging comprehensively, scanning, enforcing compliance with security policies, identifying and redacting sensitive data, compartmentalizing data, creating redundancy to increase accountability, monitoring employees with impending termination, and creating a baseline of normal data flows against which abnormal behaviors can be identified.
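The last of these measures, comparing activity against a baseline of normal data flows, can be sketched as a simple statistical check. This is only one possible approach and is not drawn from the cited sources: the function name, the use of a z-score, the threshold of 3.0, and the sample figures (daily outbound megabytes for one hypothetical account) are all assumptions.

```python
from statistics import mean, stdev


def is_abnormal(baseline: list[float], observed: float, z_threshold: float = 3.0) -> bool:
    """Return True if observed exceeds the baseline mean by more than
    z_threshold sample standard deviations.

    baseline holds past measurements of one data flow, for example daily
    outbound megabytes for a single account (hypothetical data).
    """
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two measurements")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any increase counts as abnormal.
        return observed > mu
    return (observed - mu) / sigma > z_threshold


if __name__ == "__main__":
    history = [100.0, 110.0, 90.0, 105.0, 95.0]  # made-up daily outbound MB
    print(is_abnormal(history, 108.0))
    print(is_abnormal(history, 500.0))
```

A check like this only flags unusually large outbound volumes; a real baseline would likely also consider destinations, timing, and the kind of data being moved.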

¹ Google Cloud, 2021, “Preventing Data Exfiltration”

² Nyakomitta & Abeka, 2020, “A Survey of Data Exfiltration Prevention Techniques”