A growing number of data breaches, coupled with the expanding reach of global data privacy laws, is raising awareness of why data protection policies must not only be formalized within the enterprise, but also fully integrated into the enterprise architecture and the system development lifecycle.
Many organizations have relied on traditional perimeter security as a defense against unauthorized data exposure. However, with an increasing tendency toward cloud migration and hybrid information environments, perimeter security is more complicated to deploy. And, in the event that perimeter security is breached, unprotected data assets are an open target.
At the same time, there is an increased demand for data democratization and the ability to provide data access to a variety of data consumers with potentially different data visibility profiles. Data democratization influences how different data “owners” within the enterprise manage and share their data assets so as to reduce data extracts and the corresponding data redundancy (and subsequent inconsistency). And while conventional relational database systems may provide some granularity with respect to managing access privileges, the set of variables that affect data access controls is ballooning far past traditional role-based access control (RBAC) frameworks, especially as data sharing expands beyond the conventional corporate firewall through mechanisms such as data federation and access via APIs.
These issues are driving different facets of organizational implementation of data protection protocols, such as the aforementioned role-based control policies as well as newer techniques such as attribute-based access control (ABAC), in which access privileges are determined by attributes of the actor, the data asset, and the circumstances of the request. And while there are practical differences between these types of data protection policies, they share some characteristics that can help frame the process for soliciting, interpreting and synthesizing data controls, as well as simplify procedures for defining and describing data protection directives.
One approach is to look at data protection policies as a function of variables. Here is one attempt at identifying some of the key variables:
- Asset – the data asset that requires a protection policy, such as a database table
- Level of granularity – the component of the asset subject to the policy, such as a data attribute or column that is to be restricted in some way
- Actor – the user, role and/or group that is subject to the policy
- Privilege – the permissions associated with accessing or using the asset, such as selecting attributes from a database table
- Constraint – the restriction imposed by the policy, such as not allowing certain individuals to see protected values
- Context – the circumstances under which the constraint is effective, which might depend on characteristics of either the actor or the asset
- Duration – the timeframe within which the policy is in effect
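As a minimal sketch, the seven variables listed above could be captured as a single record type. The class name, field names, field types, and sample values here are illustrative assumptions, not part of any particular product or standard.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProtectionPolicy:
    """One data protection policy, expressed as the seven variables."""
    asset: str                      # the data asset, e.g. a database table name
    granularity: Tuple[str, ...]    # columns/attributes covered by the policy
    actor: str                      # user, role, or group subject to the policy
    privilege: str                  # e.g. "select"
    constraint: str                 # e.g. "redact" or "mask"
    context: str                    # condition under which the constraint applies
    duration: Optional[str] = None  # assumption: None means "no end date"


# Hypothetical values, for illustration only.
policy = ProtectionPolicy(
    asset="Orders",
    granularity=("card_number",),
    actor="analyst",
    privilege="select",
    constraint="mask",
    context="actor is outside the billing team",
)
print(policy)
```

Making the record immutable (`frozen=True`) is one possible design choice: a policy that cannot be changed in place is easier to audit, and any revision becomes a new record.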
In turn, data protection policies that depend on these variables can be structured using a layout such as this:
Within the context, the actor’s privilege is limited via the constraint in accessing the asset at the level of granularity during the duration.
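To illustrate how this layout could be filled in mechanically, here is a small sketch that substitutes values for the seven variables into a sentence template. The template wording is adapted from the layout above, and the sample values are hypothetical.

```python
# Sentence layout with one placeholder per policy variable.
TEMPLATE = (
    "Within {context}, {actor}'s {privilege} privilege is limited via "
    "{constraint} in accessing {asset} at the level of {granularity} "
    "during {duration}."
)


def render_policy(variables):
    """Fill the layout; raises KeyError if any of the seven variables is missing."""
    return TEMPLATE.format(**variables)


# Hypothetical values, for illustration only.
variables = {
    "context": "a request from outside the billing team",
    "actor": "the analyst",
    "privilege": "select",
    "constraint": "masking",
    "asset": "the Orders table",
    "granularity": "the card_number column",
    "duration": "the current fiscal year",
}

sentence = render_policy(variables)
print(sentence)
```

Because a missing variable raises an error rather than producing a partial sentence, an incomplete policy definition is caught at the moment it is written down.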
Let’s look at a specific example: An online game company collects information about its customers, including demographic data (such as name, birthdate, home address, and telephone number), and stores it in a customer database table called Customer. Since these data attributes could be considered personally identifiable information (PII), the company has decided that employees in the accounting department can see all these data attributes, but when other employees run queries, the resulting records will have those data attribute values redacted.
In this case the asset is the Customer table; the level of granularity is the collection of data attributes name, birthdate, home address, and telephone number; the actor is the employee; the privilege is selecting and viewing; the constraint is redacting the values; the context is that the employee is not in the accounting department; and the duration can default to forever. The policy then can be stated as:
“When the employee is not in the accounting department, the employee’s ability to select and view data is limited to redacted values when accessing the Customer table’s data elements of name, birthdate, home address and telephone number forever.”
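The sketch below shows one way an automated process might interpret this policy: the policy is held as plain data, and a function redacts the protected columns in query results unless the employee is in the accounting department. The key names, column names (`home_address`, `telephone_number`, `customer_id`), the `[REDACTED]` placeholder, and the sample row are all assumptions made for illustration. Duration is omitted because the example policy applies indefinitely.

```python
REDACTED = "[REDACTED]"

# The example policy encoded as plain data. Key names are illustrative.
CUSTOMER_PII_POLICY = {
    "asset": "Customer",
    "granularity": ["name", "birthdate", "home_address", "telephone_number"],
    "privilege": "select",
    "constraint": "redact",
    "exempt_department": "accounting",  # context: applies to everyone else
}


def apply_policy(rows, department, policy=CUSTOMER_PII_POLICY):
    """Return copies of the rows, redacting protected columns when the policy applies."""
    if department == policy["exempt_department"]:
        return [dict(row) for row in rows]
    protected = set(policy["granularity"])
    return [
        {col: (REDACTED if col in protected else val) for col, val in row.items()}
        for row in rows
    ]


# Hypothetical query result from the Customer table.
rows = [{
    "customer_id": 1,
    "name": "Ada Example",
    "birthdate": "1990-01-01",
    "home_address": "1 Main St",
    "telephone_number": "555-0100",
}]

print(apply_policy(rows, "accounting"))  # values visible
print(apply_policy(rows, "marketing"))   # PII columns redacted
```

In practice this kind of filtering would more likely be enforced in the database or a data access layer than in application code; the point here is only that the same policy record can drive both the human-readable statement and the enforcement logic.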
Perhaps this formulation is a bit clunky, and from a linguistic perspective it could be massaged to read more clearly. From a formalization perspective, though, the goal is a framework that gives the data security steward defining the policy a reasonable description, yet can be interpreted just as easily by an automated process that enforces the policy. Both of these features will be critical when maintaining an inventory of data protection policies that is likely to grow rapidly as more data assets and actors are identified within the enterprise.