Data Loss Prevention (DLP) is a relatively new area of information security and assurance that has emerged in recent years. A host of software products, controls and solutions have found their way onto the market to help prevent data loss, whether malicious or inadvertent. The market is still fledgling but is maturing as time goes on. People are just starting to understand the effects of losing data, most of which is lost by mistake. Around 77% of data loss is “inadvertent” and unintended. Basically, people make mistakes. A much lower percentage of data loss is malicious. Compliance seems to be a major driver for implementing these solutions, and many key security players are positioning DLP as a core element of their ongoing strategy. The question I have is: at this stage, are we ready to effectively apply AI (Artificial Intelligence) based systems whose intended objective is to scan, analyse and, more importantly, classify information as sensitive or unimportant?
The DLP market does seem to be a slow starter: only a very small percentage of companies intend to deploy, and only a fraction of that minority actually has a deployed system. The bulk of these solutions are what Gartner terms “content aware”. They generally monitor network and email traffic, and at the same time deploy agents which scan internal network resources (file shares, etc.) for sensitive data that is available where it shouldn’t be. The idea is that when sensitive information is located, it should be removed, quarantined, blocked in transit, or authorised to remain in place or be distributed. The problem is that while it is easy enough to recognise information like credit card numbers, it becomes far more difficult for these systems to understand more qualitative content. Qualitative content (i.e. information that is expressed in free-form wording rather than in distinctive formats or patterns) is difficult for an AI system to match against a particular pattern or template, and so difficult to classify effectively. Examples of this type of information might include a new product idea at an investment bank, a groundbreaking formula for a new medicine at a pharmaceutical company, or perhaps even a World Cup winning strategy for a national football team. Information of this nature is usually specific to each company and to each case. One sports team’s strategy may not look anything like another’s.
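To make the contrast concrete, here is a minimal sketch of the “easy” case: spotting credit card numbers by their format. It assumes a simple approach of a regular expression plus the Luhn checksum, which is one common way to do this and not necessarily how any particular DLP product works. The names `CARD_CANDIDATE`, `luhn_valid` and `find_card_numbers` are made up for illustration.

```python
import re
from typing import List

# Assumption: a candidate is 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return digit strings that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Run against `"Card: 4111 1111 1111 1111 on file"` (a widely used test number), `find_card_numbers` returns one hit. Run against a sentence such as `"Our new formula combines compound A with a slow-release coating."`, it returns nothing, even though that sentence may be far more sensitive to the company concerned. Qualitative content offers no fixed format or checksum to test against.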
It is for this reason that the term “False Positive” is becoming widely used in the market, and anyone who’s worked with DLP systems (or tried to deploy one) will …