Data is growing at a phenomenal rate in every type of organization. With that expansion comes a renewed need to make sure that sensitive personal and business information isn’t exposed, and that the way it is stored and handled doesn’t violate the strict new compliance laws being enacted around the world.
In the previous part of this series, we introduced the challenge of identifying and understanding the sensitivity of information held in structured and unstructured data across enterprise storage silos. How are companies defining what sensitive data is? What kinds of technologies can they rely on to identify this information, so that they aren’t exposed to serious privacy breaches and compliance violations?
As you can guess, some pieces of information are more difficult to detect than others. The following sections illustrate this point by outlining several approaches to detecting, mapping, and categorizing sensitive data: rule-based methods and machine learning methods, the latter covering both supervised learning and unsupervised learning (clustering).
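To make the "easy versus hard to detect" distinction concrete, here is a minimal sketch of a rule-based detector in Python. It is an illustration only, not a method described in this series: the names `EMAIL_RE`, `CARD_RE`, `luhn_valid`, and `find_sensitive` are hypothetical, and the patterns are deliberately simplified and would likely need tuning for real data.

```python
import re

# Simplified, illustrative patterns (assumptions, not production-grade rules).
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for each rule that fires."""
    findings = []
    for m in EMAIL_RE.finditer(text):
        findings.append(("email", m.group()))
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        # The checksum filters out digit runs that only look like card numbers.
        if luhn_valid(digits):
            findings.append(("card_number", m.group()))
    return findings


sample = "Contact jane.doe@example.com, card 4111 1111 1111 1111, order 1234 5678 9012 3456."
print(find_sensitive(sample))
```

Rules like these work for data with a fixed, checkable format. A sentence such as "The patient was diagnosed with diabetes last year." contains sensitive health information but no pattern for a rule to match, which is the kind of case where the machine learning methods become relevant.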