Data is the new gold. Information is gathered from a variety of sources: IoT devices, M2M communication, facial recognition software, etc. These extensive datasets provide companies with immense opportunities to better understand customer needs, detect important correlations between seemingly unrelated variables, support decision making, and predict customers’ behavior, weather conditions, market fluctuations, you name it.
Collecting, structuring, and processing data is not the easiest task, but it’s perfectly attainable even for small businesses. Given that companies can extract so much value at a relatively low cost, this data-collection gold rush shouldn’t be surprising.
However, with great power comes great responsibility. Any piece of personal information that becomes accessible to a company is a serious opportunity to invade privacy and damage an individual’s fundamental rights. Unauthorized collection, weak protection, or irresponsible processing poses serious risks for both the individual and the company. Despite the immense benefits that big data analytics brings to the table, data should never jeopardize privacy. Balancing the risks and power of big data should be one of an organization’s top priorities.
Today, big data consulting is more relevant than ever: failing to comply with security and privacy regulations can result in huge fines, lawsuits, and a permanently damaged public image. For example, GDPR infringement fines can go up to 20 million EUR or 4% of a company’s worldwide annual turnover, whichever is higher. Moreover, authorities can prohibit companies from processing data at all, which in many cases means going out of business.
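To make the fine ceiling concrete, here is a minimal sketch of the arithmetic. It assumes the upper tier of GDPR fines, where the cap is the greater of 20 million EUR or 4% of worldwide annual turnover. The function name and the sample turnover figures are hypothetical, and the result is only the theoretical maximum, not a prediction of what a regulator would actually impose.

```python
FIXED_CAP_EUR = 20_000_000   # fixed ceiling of the upper fine tier
TURNOVER_PERCENT = 4         # percentage of worldwide annual turnover


def max_gdpr_fine_eur(annual_turnover_eur: int) -> float:
    """Return the theoretical fine ceiling: the greater of the two caps."""
    turnover_based_cap = annual_turnover_eur * TURNOVER_PERCENT / 100
    return max(FIXED_CAP_EUR, turnover_based_cap)


# Hypothetical companies, for illustration only.
small_company_cap = max_gdpr_fine_eur(50_000_000)      # 4% is 2M, so the fixed cap wins
large_company_cap = max_gdpr_fine_eur(2_000_000_000)   # 4% is 80M, so the turnover cap wins
```

For a small business the fixed 20 million EUR ceiling dominates, which is why even modest companies cannot treat the percentage as a comfortable upper bound.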
Big Data Privacy Challenges
Data analysis is not a revolutionary concept. Businesses have been using data analytics for decades to gain competitive advantage and increase profits. The obvious question here is, why did big data privacy only recently become a pressing issue?
The problem is in the scale. The volume and variety of data have skyrocketed due to significant technological advancements, making privacy concerns more apparent. In this regard, we can identify these data privacy challenges:
Lack of Control
The extensive variety of data sources across different systems makes control over personal information much more complex. In many cases, people have no clue what personal information is gathered and how it is processed. This raises a significant technological challenge for companies to inform their users accordingly. Data captured with security cameras or web search cookies are great examples of often ungoverned data processing.
Companies collect data for a reason. For example, healthcare organizations need your medical history to treat you safely. However, when this data is placed in the hands of third parties without your consent, it can raise many ethical concerns.
The proliferation of AI algorithms allows us to combine and process seemingly unrelated data sets and deanonymize highly sensitive personal information. This poses a significant threat to one’s confidential information. One of the most famous scandals related to data inference was reported in 2012, when Target inferred a teenage girl’s pregnancy by analyzing her purchase history and sent her discount coupons for associated products.
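The mechanics of such inference do not even require sophisticated AI. As a minimal sketch, the snippet below joins an "anonymized" dataset with a public one on shared quasi-identifiers (ZIP code, birth date, and sex). All records, field names, and the `link_records` helper are made up for illustration; this is a plain join, not the method used in any real case.

```python
from typing import Dict, List, Tuple

# "Anonymized" records: names removed, quasi-identifiers kept (hypothetical data).
medical_records = [
    {"zip": "02138", "birth_date": "1965-07-31", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1980-01-15", "sex": "M", "diagnosis": "diabetes"},
]

# A public list that includes names (hypothetical data).
public_roster = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1965-07-31", "sex": "F"},
    {"name": "Bob Sample", "zip": "02140", "birth_date": "1980-01-15", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link_records(anonymous: List[Dict[str, str]],
                 named: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Re-identify anonymous records whose quasi-identifiers match exactly one name."""
    names_by_key: Dict[tuple, List[str]] = {}
    for person in named:
        key = tuple(person[field] for field in QUASI_IDENTIFIERS)
        names_by_key.setdefault(key, []).append(person["name"])

    matches = []
    for record in anonymous:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        candidates = names_by_key.get(key, [])
        if len(candidates) == 1:  # a unique match means the record is re-identified
            matches.append((candidates[0], record["diagnosis"]))
    return matches


reidentified = link_records(medical_records, public_roster)
```

Here only the first record is linked to a name, because the second differs in ZIP code. The point is that removing names alone does not make a dataset anonymous when other attributes remain unique enough to act as a fingerprint.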
Excessive Automation and Profiling
Companies using big data in ecommerce, for example, are looking to turn it into profit. Targeting plays a huge role here. The more you know about your customer, the more effective your marketing and pricing strategies are.
However, customer profiling can often lead to discrimination and unfair allocation of benefits based on race, age, location, etc. For example, today’s online retail platforms frequently apply ML-based price differentiation, which has proved to be an effective method for driving revenue. However, when the underlying data is incomplete, such practices may lead to mistakes that deprive perfectly qualified people of the prices or bonuses they are entitled to.
EU Data Protection Acts
With the all-permeating digitalization of our world and the rapid advancement of big data, the EU had little choice but to enforce new regulations to protect the personally identifiable information of its residents. The GDPR (General Data Protection Regulation) replaced the Data Protection Directive, which was adopted in 1995. The new regulation’s main goal is to make companies more responsible for how they collect, store, and process the data of people in the EU. The GDPR became enforceable on May 25, 2018, and applies directly in every EU member state, regardless of the legislative stance of local governments.
There are two main roles identified in the GDPR:
- Data Controller is the main body that determines the purposes and means of processing personal data. Essentially, this is the business that needs the data to operate.
- Data Processor is the party that processes personal data on behalf of the controller. In the majority of cases, this is the company that builds or runs the software that processes data for the business. This is of the highest importance, as responsibility is now split between both roles, making software development companies pay more attention to securing data privacy from the get-go.
The GDPR defines personal information as every little piece of information that can be linked to a particular person. Besides obvious characteristics like full name, occupation, race, and physical characteristics, any hints left on the internet that can be traced back to a person are subject to GDPR compliance. For example, the nickname you once used on a long-forgotten fan page ten years ago can still be used to identify you as a person and, therefore, is also considered protected under the GDPR.
Next, let’s briefly go through the main principles set by the GDPR:
- The reason for collecting, storing, and processing data must be supported by a legal document, such as a written user consent or a specific contract.
- The above-mentioned documents must be understandable, transparent, and concise. In a nutshell, vague terms or tiny fine print are not tolerated by the GDPR.
- Data may be collected only for explicit, specified purposes. Basically, businesses can gather only the data identified in the consent and use it only for the stated reason.
- Businesses may collect only the minimum data necessary for their operations. This is important because today’s automation tools and seemingly infinite cloud storage make it tempting to collect data just for the sake of it or even for ill-intended purposes.
- Data must be accurate and up to date. Even small inaccuracies may constitute a GDPR infringement. Companies need to be equipped with up-to-date data cleaning techniques to ensure compliance. Inaccurate data has to be erased or rectified without delay.
- Data that identifies individuals may be kept for no longer than necessary. For example, if a service is no longer provided to a specific person, this person’s data may not be stored or processed.
- Companies now take full responsibility for any data loss, damage, or unlawful processing. Safeguards such as pseudonymization and encryption are expected to protect users’ identity and keep their data confidential.
- Businesses must keep records of all data-related activities. Any act that concerns data collection or processing must be documented and justified. Authorities now have the right to request documents that prove GDPR compliance.
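Several of the principles above (data minimization, storage limitation, and protection of identity) translate fairly directly into code. The following is a minimal sketch, not a compliance recipe: the allowlist, the one-year retention period, the secret key, and all function names are assumptions made for illustration. Keyed hashing with HMAC is used here as one possible pseudonymization technique.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

# All three values below are hypothetical and would come from policy and a secret store.
ALLOWED_FIELDS = {"user_id", "email", "last_active"}
RETENTION = timedelta(days=365)
SECRET_KEY = b"replace-with-a-key-from-a-secret-store"


def minimize(record: dict) -> dict:
    """Data minimization: keep only the fields the service actually needs."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def is_expired(last_active: datetime, now: datetime) -> bool:
    """Storage limitation: flag records older than the retention period."""
    return now - last_active > RETENTION


def prepare_for_storage(records: list, now: datetime) -> list:
    """Apply minimization, retention, and pseudonymization before storing."""
    stored = []
    for record in records:
        slim = minimize(record)
        if is_expired(slim["last_active"], now):
            continue  # drop data that is no longer needed
        slim["email"] = pseudonymize(slim["email"])
        stored.append(slim)
    return stored


# Hypothetical input: one active user, one long-inactive user.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
raw_records = [
    {"user_id": 1, "email": "a@example.com", "shoe_size": 42,
     "last_active": datetime(2023, 11, 1, tzinfo=timezone.utc)},
    {"user_id": 2, "email": "b@example.com", "shoe_size": 38,
     "last_active": datetime(2021, 5, 1, tzinfo=timezone.utc)},
]
stored_records = prepare_for_storage(raw_records, now)
```

Only the active user survives, without the unneeded `shoe_size` field and with the email replaced by a keyed hash. Note that pseudonymized data is still treated as personal data under the GDPR, since whoever holds the key can link it back to a person, so this reduces risk rather than removing the obligation.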
These rules concern both old and new software products. It’s also worth noting that even if a company doesn’t currently have EU clientele, it may well acquire one in the foreseeable future, so GDPR compliance is still worth building in. Unsurprisingly, it is much easier to develop new software with GDPR compliance in mind than to tailor existing apps to the new regulations.