
GDPR: Finding your data is the first step

Posted by CloverDX on January 09, 2018

Complying with GDPR means looking at three aspects of your business: the legal side, the processes you manage, and the data itself. Many people focus on the first two, but your data is the basis for every other GDPR effort you make.

Do you know where your data is?

To manage your data properly, you first need to know where it is, and that's not necessarily easy to determine. In large, complex organisations especially, personally identifiable information (PII) can be stored in many places, each potentially managed by a different region, department or individual.

Even if you think you know where your data is, it's likely that what you actually know is where your data is supposed to be. For example, you might think a database contains no personal information because it has no fields for 'name', 'address' or equivalent information. But what about a 'description' field? As soon as someone enters a note there that includes a phone number, that field holds PII. Most organisations have many thousands of these data points, so finding this data manually, wherever it lives, is not practical.
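To make the 'description' field example concrete, here is a minimal sketch of scanning free-text values for phone-like strings. The pattern, the digit-count thresholds and the sample rows are illustrative assumptions, not how the CloverDX Harvester works; real detection would need locale-aware rules and more data types.

```python
import re

# Hypothetical pattern: an optional "+", then digits mixed with spaces,
# dots, hyphens or parentheses. It is deliberately loose.
PHONE_RE = re.compile(r"(?<!\w)\+?\(?\d[\d\s().-]{7,}\d(?!\w)")


def find_phone_numbers(text):
    """Return phone-like substrings that contain 9 to 15 digits."""
    hits = []
    for match in PHONE_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 9 <= len(digits) <= 15:  # assumed plausible length range
            hits.append(match.group())
    return hits


# Made-up rows from a table with no 'name' or 'address' columns.
rows = [
    {"id": 1, "description": "Customer prefers delivery after 5pm"},
    {"id": 2, "description": "Call back on +44 20 7946 0958 before Friday"},
]

# Ids of rows whose free-text field contains a phone-like string.
flagged = [row["id"] for row in rows if find_phone_numbers(row["description"])]
print(flagged)
```

A loose pattern like this will produce false positives (order numbers, dates), which is why a review step is still needed after any automated scan.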

Why it’s important to know where your data is

Without knowing where you are storing data, you can't put the necessary legal and process changes in place for compliance. Establishing rules to ensure that your organization is processing data legally is a crucial part of adhering to GDPR (the 'legal' and 'process' parts of the solution), and one that relies on having a unified view of your data.

Under GDPR, you have obligations to provide information to individuals about what personal data you hold on them and how it’s processed. If there’s data lurking that you are unaware of, there’s a good chance you could breach these obligations.

And a key implication of GDPR is the subject’s ‘right to be forgotten’. If a request to erase personal data needs to be dealt with, you need to know where this information is stored across your organization. You can’t delete it if you don’t know where it is.

Step 1: Finding your data for GDPR

Determining where all your data is can be a daunting task, but the CloverDX GDPR approach can help, with the CloverDX Harvester managing the first step of finding your data. The CloverDX Harvester crawls all your data and finds where you have PII. Not where you think it is, but where it actually is. The Harvester builds a complete map of where your data exists so that you have a comprehensive view, which is the first step in establishing rules and processes for how your systems work.

How CloverDX Harvester creates a map of your organization's data

  • The CloverDX Harvester profiles your data and finds sensitive information in various data sources
  • The Cleaner module can remove explicitly defined entities from the system
  • The Pseudonymizer performs the anonymization transformation process

GDPR: How CloverDX maps your data

  1. CloverDX Harvester receives a list of sensitive domains and data samples
  2. CloverDX Harvester profiles database columns and reports statistics on the sensitive matches, including weighting scores
  3. A business analyst examines and refines the results, e.g. by excluding false positives
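The three steps above can be sketched in a few lines. This is only an illustration of the general idea of column profiling, not the Harvester's implementation: the domain patterns, the weights, the scoring formula (match rate multiplied by weight) and the function names are all assumptions made for this example.

```python
import re

# Step 1 (assumed shape): sensitive domains as name -> (pattern, weight).
DOMAINS = {
    "email": (re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"), 1.0),
    "phone": (re.compile(r"\+?\d[\d\s().-]{7,}\d"), 0.8),
}


def profile_columns(rows):
    """Step 2: count matches per column and domain, with a weighted score."""
    report = {}
    columns = rows[0].keys() if rows else []
    for column in columns:
        values = [str(r[column]) for r in rows if r.get(column) is not None]
        for domain, (pattern, weight) in DOMAINS.items():
            matches = sum(1 for value in values if pattern.search(value))
            if matches:
                rate = matches / len(values)
                report.setdefault(column, {})[domain] = {
                    "matches": matches,
                    "rate": rate,
                    "score": round(rate * weight, 2),
                }
    return report


def drop_false_positives(report, rejected):
    """Step 3: remove (column, domain) pairs that an analyst has rejected."""
    reviewed = {}
    for column, domains in report.items():
        kept = {d: s for d, s in domains.items() if (column, d) not in rejected}
        if kept:
            reviewed[column] = kept
    return reviewed


# Made-up sample data.
rows = [
    {"sku": "A-100", "notes": "email jane@example.com for invoice"},
    {"sku": "B-200", "notes": "left voicemail on 020 7946 0958"},
    {"sku": "C-300", "notes": "no contact details"},
    {"sku": "D-400", "notes": "gift wrapped"},
]

report = profile_columns(rows)
reviewed = drop_false_positives(report, rejected={("notes", "phone")})
```

Here `report` flags the `notes` column for both domains and leaves `sku` out, and `reviewed` keeps only the entries the analyst accepted.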

Step 2: Keeping your data useful

Anonymizing your PII is one way of complying with GDPR requirements, but truly anonymized data (for example, data where sensitive information has been replaced with asterisks) has limitations when you use it for analysis. If you're getting statistical insights about your customers' geographies, that information can be lost when data is anonymized in this way. The answer can be pseudonymization: breaking the link between personal information and other data while retaining some of the data's previous qualities or characteristics. The CloverDX anonymization engine has been built specifically to address this challenge and to keep your data usable, allowing you to extend your GDPR compliance to broader uses of your data, for example sharing portions of the pseudonymized data with your software vendors as reliable test data sets.
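As an illustration of the difference, the sketch below replaces a direct identifier with a stable token while leaving the geography intact. It uses a keyed hash (HMAC-SHA256) as one common pseudonymization technique; this is an assumption chosen for the example, not a description of the CloverDX engine, and the field names and key handling are hypothetical.

```python
import hashlib
import hmac

# Assumption: in practice the key is stored separately from the data,
# with restricted access. This literal is a placeholder.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymize(value, key=SECRET_KEY):
    """Map a value to a stable token using HMAC-SHA256 (truncated to 16 hex chars)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def pseudonymize_record(record):
    """Drop the name, tokenize the email, keep the fields needed for analysis."""
    return {
        "customer_token": pseudonymize(record["email"]),
        "country": record["country"],
        "order_total": record["order_total"],
    }


# Made-up record.
original = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "country": "UK",
    "order_total": 42.5,
}
safe = pseudonymize_record(original)
```

Because the same email always maps to the same token, orders can still be grouped per customer and per country, which masking with asterisks would not allow. Anyone holding the key can recompute a token from a known email and so re-link records, so data treated this way is pseudonymized rather than anonymized.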

Step 3: Managing consent

GDPR is very clear in requiring affirmative consent to how a subject's personal data will be used. It also suggests that multiple consents for various uses can be given, and independently withdrawn at any point. Many software vendors are working hard to address this by building consent management into their products. However, GDPR leaves much to interpretation when it comes to what the right way of managing consent looks like. Do you need to keep an encrypted record of the consent transaction, timestamped and signed by a trusted third party? Or will a simple record in a database suffice?

Either way, the problem quickly outgrows individual applications and becomes an exercise in managing consent centrally across the entire organization. In many cases this can require additional data connections between applications, sharing the information about where, when and which consent is valid or withdrawn.
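A central record of consent can be sketched as a small registry that tracks one consent per subject and purpose, so that each can be withdrawn independently. The class and field names are hypothetical, and this in-memory version deliberately takes the "simple record" side of the question above; it makes no claim about what GDPR requires.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Consent:
    subject_id: str
    purpose: str
    granted_at: datetime
    withdrawn_at: Optional[datetime] = None

    def is_valid(self):
        return self.withdrawn_at is None


class ConsentRegistry:
    """Hypothetical central store keyed by (subject, purpose)."""

    def __init__(self):
        self._consents = {}

    def grant(self, subject_id, purpose):
        consent = Consent(subject_id, purpose, datetime.now(timezone.utc))
        self._consents[(subject_id, purpose)] = consent
        return consent

    def withdraw(self, subject_id, purpose):
        consent = self._consents.get((subject_id, purpose))
        if consent is not None and consent.is_valid():
            # Keep the record and note when it was withdrawn.
            consent.withdrawn_at = datetime.now(timezone.utc)

    def has_consent(self, subject_id, purpose):
        consent = self._consents.get((subject_id, purpose))
        return consent is not None and consent.is_valid()


registry = ConsentRegistry()
registry.grant("subject-1", "newsletter")
registry.grant("subject-1", "analytics")
registry.withdraw("subject-1", "newsletter")
```

After these calls, `has_consent("subject-1", "analytics")` is still true while the newsletter consent is not, and the withdrawn record keeps both timestamps for later reference.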

Step 4: Executing data rights

With GDPR, your data subjects are empowered to exercise their fundamental privacy rights, such as the right to erasure (the 'right to be forgotten'), the right of access or the right to data portability. We can expect a significant increase in requests like these, which calls for automating the process from the ground up. That means using the 'PII map' to identify where information about a data subject sits, and automating the fulfillment of these requests with little to no human intervention.
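As a rough sketch of how a PII map could drive an erasure request, the example below walks a hypothetical map of tables and columns and clears the mapped fields for one subject. The map layout, table names and the choice to null out columns rather than delete rows are all assumptions for illustration; it uses an in-memory SQLite database so it can run on its own.

```python
import sqlite3

# Hypothetical PII map: table -> subject key column and PII columns.
PII_MAP = {
    "customers": {"key": "customer_id", "pii_columns": ["name", "email"]},
    "tickets": {"key": "customer_id", "pii_columns": ["description"]},
}


def erase_subject(conn, subject_id):
    """Null out every mapped PII column for one subject; return rows touched per table."""
    touched = {}
    for table, entry in PII_MAP.items():
        assignments = ", ".join(f"{col} = NULL" for col in entry["pii_columns"])
        # Table and column names come from the trusted map;
        # the subject id is bound as a query parameter.
        cursor = conn.execute(
            f"UPDATE {table} SET {assignments} WHERE {entry['key']} = ?",
            (subject_id,),
        )
        touched[table] = cursor.rowcount
    conn.commit()
    return touched


# Made-up data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, email TEXT)")
conn.execute("CREATE TABLE tickets (customer_id INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Jane Doe", "jane@example.com"), (2, "John Roe", "john@example.com")],
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(1, "Call Jane on 020 7946 0958"), (1, "Resend invoice"), (2, "Address change")],
)

touched = erase_subject(conn, 1)
```

The returned counts per table give a simple record of what the request changed, which can feed the reporting step.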

Step 5: Reporting and auditing

An important part of GDPR compliance is having full reporting, auditing and logging capabilities. Being able to log every operation on sensitive data, and to build audit trails for all incoming and outgoing data that show where it came from and where it is being sent, helps to reinforce your compliance.
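A minimal sketch of such an audit trail is shown below: each event records who did what to which dataset, and where the data came from or went to. The class, field names and sample events are hypothetical; a production system would also need durable, tamper-resistant storage.

```python
import json
from datetime import datetime, timezone


class AuditTrail:
    """Hypothetical append-only list of events about sensitive data."""

    def __init__(self):
        self._events = []

    def record(self, actor, action, dataset, source=None, destination=None):
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "dataset": dataset,
            "source": source,
            "destination": destination,
        }
        self._events.append(event)
        return event

    def history(self, dataset):
        """All recorded events for one dataset, oldest first."""
        return [e for e in self._events if e["dataset"] == dataset]

    def to_json_lines(self):
        """Serialize events one per line, e.g. for shipping to a log store."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self._events)


# Made-up events.
trail = AuditTrail()
trail.record("etl-job", "import", "customers", source="crm-export.csv")
trail.record("analyst", "read", "orders")
trail.record("etl-job", "export", "customers", destination="vendor-test-set")

customer_history = trail.history("customers")
```

Filtering by dataset gives the trail of where that data came from and where it was sent.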

Solid data foundations

Ensuring you can map and navigate your data landscape is essential for successful implementation of GDPR policy. CloverDX’s Harvester helps you solve the data challenges that need to be overcome in order to ensure that your organization complies with the new requirements.

Start now – contact us for an assessment

As you plan your GDPR policy and implementation, talk to us for a free, no-obligation assessment of your data situation, and advice on what your first step towards GDPR should be.

