Chapter 6. Access Controls

Overview

Access controls are a crucial layer of data management and security. By “access controls,” we mean any electronic mechanisms designed to limit the availability of data to users within an application. (Electronic mechanisms that limit access to the application itself, such as a login and password, would fall under information security, discussed in Chapter 4.)

Of all the capabilities discussed in this book, access controls may be the most malleable in terms of supporting diverse privacy requirements. Numerous aspects of an enterprise’s privacy policy will rely on access controls as a means of ensuring proportionality in data access, controlling data usage, and enhancing security beyond the broad system-access level. Access controls are quite versatile—there are ways to control access to the information itself (access control models), and ways to control what you can do to the information once you have access to it (access types). Access controls can even be used to limit the knowledge of the data’s existence—separate from the contents of the data. This hiding of the existence of a record can be an important part of safeguarding privacy.

The more precisely access controls can be defined and the more flexibly they can be applied, the more policy options that become available to those trying to create a robust privacy-protective regime around the use of the technology. Such flexibility reduces friction with the technology and supports creative innovation.

Keep in mind that when we refer to a system, we’re speaking about the overarching, coherent data system, which will likely be composed of multiple servers responsible for different aspects of functionality. The realization of some of the access control patterns we describe will probably require interlocking pieces of functionality running on multiple, logical (i.e., virtual or corporeal) computer systems.

Access-Control Models

Potential use cases for an application may involve the use of the same core data in a variety of contexts and by multiple people fulfilling various roles in an organization. An effective, flexible security environment, then, must allow many users to have multiple and varying access permissions in order to utilize data with multiple and varying access restrictions. There is no single ideal configuration for this security environment. For example, small organizations with a limited number of users who access data for the same general purpose may only need a few basic access control options because users are authorized to see broad cuts of the data. But larger organizations with a significant number of users each accessing data for a wide variety of purposes will likely need greater diversity of access-control options. Selecting the right one will depend on the specific context in which the system will operate. System designers can choose between several levels of access control granularity:

Network/system-level security

Users in this configuration are given broad access to the data in a system. They can access either an entire network of databases or the entirety of a single database.1 You can envision this as the electronic equivalent of a large filing cabinet with a single lock on it. When the single lock is opened, the entire contents of the filing cabinet are available.

Collection-level security

Data within the system is maintained in various distinct data structures, such as database tables. Users can access an entire, coherent collection of information within the larger data system but be restricted from others. In this case, each drawer in our filing cabinet has a separate lock with different users having different sets of keys. Once they open the drawers, they have access to all the file folders and their contents within that drawer.

Record-level security

Users are restricted from accessing information on a record-by-record basis. In this case, our filing cabinet metaphor starts to become insufficient, but imagine that once a user has opened the file drawer there were some means of controlling which folders the user could open, at which point the user would be able to view all of the documents within that folder.

Cell-level security

In this configuration, access is controlled on a data-point-by-data-point basis. For each level of granularity of chopped-up data in your system, there is a separate access control for that discrete segment of information. By now, our metaphor has all but fled the field, but you could think of this as some means of redacting the documents within the folder on a page-by-page basis depending on each user’s authorization.

Sub-cell-level security

Here, you are not only controlling access to the data in your database but also data about the data (the metadata). Metadata may include information about the time and date of the creation of the data in the system, the system user who originally input the data, previous modifications of the data, and other information relevant to the creation and use of the data. At this point, our metaphor is in tatters. One way to understand this would be to imagine a separate filing cabinet next to our original filing cabinet, where the new filing cabinet has all of the aforementioned data about the first filing cabinet stored with its own cell-level access controls.

However, not all systems will need controls as granular as sub-cell level security. Each of the levels above can be optimal in the right context. Selecting the right level of granularity for a specific application will depend on the situation, and a few considerations can help you determine the appropriate level. For instance:

  • How big is the system in terms of both users and data? The more users and the more data in the system, the more likely the system will contain data not every user needs to see.

  • Will some or all of those users have different uses for the system? A system with a single purpose will likely require fewer access controls because everyone who logs in to the system will probably need to see the same data. More complex systems involving a wider variety of data used for a greater array of purposes are likely to have more complex access requirements.

  • Can the data be structured in a way that lends itself to logical divisions? The level of granularity necessarily depends on how small you can chunk the data while still being sensible to the user. If the data is only understandable when viewed at the record-level, then it would be useless to enable access control at the cell level. This would be like allowing a system to secure words one letter at a time—it’s hard to find the point of securing every other letter in a w_r_, for example.

These factors will all interact to help determine the right level of access control. Consider the following examples:

Electronic library card-catalog system

Many users may access the system, but generally, all are using the system for the same purpose—to find books. The data looks like the standard card catalog information, with a row of the database table representing a single index card where each cell in that row contains information such as title, author, ISBN, Dewey decimal system number, etc. In this case, there might be little need for a very granular access control regime, as libraries typically do not deny people access to portions of the card catalog. However, perhaps the library has a separate children’s/young adult catalog (as often is the case), and the library would rather not have minors browsing some of the racier titles on its shelves. In this case, it might opt for table-level security, where one table contains children’s titles, and the other, adult titles. The library is unlikely to require cell-level security, however, because there is usually not any reason to obscure portions of a library index card.

Alternatively, using a record-level approach, the same library could structure its card catalog as a single table of cards and restrict access to just the racier titles in the library (assuming they are marked accordingly). This would allow young adults free reign of the whole library, minus some select works of Henry Miller. While the record-level security requires greater granularity in marking individual records than the two catalogs approach, it maximizes the ability of any patron, regardless of age, to access as much of the library as possible—something that is probably core to the mission of the library.

Health clinic patient-record system

A limited number of nurses and physicians in a clinic might end up treating any patient who walks in the door. In a greatly simplified version of an electronic medical record (EMR) system, let’s say a patient record is contained in a single row in a database table with each cell containing various bits of information such as contact information, prior conditions, notes on the diagnoses, tests performed, drugs prescribed, and so on. If the system is only accessed by the doctors and nurses to manage patient records, then system- or collection-level security would be sufficient. However, suppose the system is also used to manage patient billing. Now billing clerks also require access to the system. They might need to see the list of tests performed as well as basic patient contact information, but they would not need to see other information in the record, such as potentially sensitive patient history. In this scenario, cell-level security might be more privacy-protective because it would allow the clerks to have access to only those portions of the patient record necessary to do their jobs. Figure 6-1 illustrates alternative views of an EMR using cell-level security.

Intelligence analysis system

Consider a massive database, one that might be used by a national government intelligence agency. Data comes from a variety of sources, including highly-protected operational assets (i.e., spies) or sophisticated and secret electronic-collection methods. The collected data may be used by potentially hundreds or even thousands of analysts for a variety of purposes—counter-terrorism, detection of economic espionage, defense of electronic networks, counter-intelligence, support for foreign policy initiatives, etc. Operational security is very high and includes multiple security clearance classifications that must be attached to the data. In addition, there is a high level of public demand for protection of individual privacy. This should lead to policies designed to minimize exposure of data to only those analysts with an absolute need to view it (and official authorization to see the information). Here, sub-cell level security is best suited to provide the flexibility essential to managing this complex spider web of security needs. Securing data on a point-by-point basis allows administrators to carefully control the potential exposure of information. In addition, separately securing the metadata also ensures that critical information about the data remains available to those who need it while safeguarding potentially sensitive details from other users (e.g., information about a data source or the system user who created it might reveal closely-held details about a particular source or method of collection).

aopr 0601
Figure 6-1. Cell-level security reveals only the information necessary to support a particular user’s needs.

While these examples are fairly notional, they serve to highlight the balance that must be struck between effectiveness—a system delivering its intended utility for its users—and privacy. Questions of which type of access control method to implement may also be constrained (or at least made more difficult) by what is technically and administratively feasible, given that access controls often rely on good metadata with which to make decisions. Properly tagging whole data sets with useful metadata can be a very tough organizational challenge.

When designing a new system, it’s important to preserve flexibility for future decisions that would require finer-grained access control than currently in place. For example, if you build in mechanisms that can enable sub-cell access controls from the start, then you will never have to sacrifice more granular access controls in the future as a result of technical decisions made before their requirement arose.

Types of Access

The most basic access control system is binary—either a user has total control over a set of data or a user has absolutely no control over that data (and may not even be aware of its existence). The user can see and edit a patient record or an intelligence report, or the user cannot.

However, as noted above, the more flexibility available within a system, the more likely it can be tailored precisely to maximize privacy protection and effectively accomplish whatever purposes it has been designed to achieve. Some users may only be able to see data. Others might be able to control every aspect of the data, including granting access to others. Still others may only be granted limited awareness of the data they are seeking. A system that allows a variety of types of access increases configuration options by allowing degrees of control over data.

Basic Access

At a minimum, an access control system will rely on four basic types of access to data:

No access

A user cannot see the data at all. Sometimes, there is not even an indication that the data exists in the system. If the user types “John Doe” into the search field, the system will return no results even if “John Doe” is in the system.2

While “no access” may seem like a straightforward concept to implement, it’s trickier than it may first appear. There are numerous pitfalls that involve accidentally leaking the existence of information without leaking the information itself. For example, imagine a record made up of ten cells, two of them restricted as “no access” to a set of users. If the routine that calculates the summary of the record does not take the access controls into account, it may report to users that there are ten cells that make up this record while only showing eight—effectively leaking to the user that there are two cells of the record that are hidden to their level of access. Similarly, a user may see the name of a database table in a listing of tables that they cannot read rows from.

In a “no access” scenario, it’s important to think carefully about whether just the data, or the fact of its existence (or even potential to exist), needs to be hidden from certain users.

Read access

A user can view the data but cannot modify it. Typing “John Doe” into the search bar returns networks, databases, or records (depending on the granularity of the access control model implemented) containing the term “John Doe” and the user can view that information. However, the user cannot edit the information, nor can the user grant or deny access to the information to other users.

Write access

Not only can a user view the data, but the user can also modify it. Typing “John Doe” into the search bar pulls up data concerning John Doe, and the user can add new information and save it to the system for other users to view.

Owner access

A user has complete control of the data. The user can retrieve the data in the system, make modifications to it, delete the data, and grant or deny access to other users in the system.

Discovery Access

Beyond these basic access types, systems can be configured to offer even more nuanced access options. In some cases, data owners might want to allow users to know that information exists within a system without revealing specific details. This type of access can increase user confidence that they are more likely to find information relevant to their use of the system while preventing overbroad access.

A frequent argument against the use of rigid access controls is that they might prevent users from seeing information that could be vital to accomplishing their objectives. At the same time, it may not always be possible to determine immediately whether information is relevant or not. Such determinations might be context-specific to changing conditions such as user roles, real-world events, exigent circumstances, and other factors. Discovery access provides a mechanism to err on the side of privacy while slightly cracking the door open to access.

Discovery access can take a variety of forms. For example, a search for “John Doe” could return a message that information concerning John Doe does exist within the system but does not return the details of that actual John Doe data. Instead, a user might be able to access the John Doe record (a row in the database) where the user can see some of the data in that record (a selection of cells within the row) and also can see a message that more data (additional cells within the row) is available.3 The message could also be configured to provide some indication as to the type of data made available. For example, it might tell the user, “There is more data available on John Doe: phone number, address, and medical records.” Equipped with this information, the user could then go to a system administrator or anyone in the system with “owner access” to the data and request access (generally, the message to the user would contain instructions, both administrative and technical, for how the user could obtain access to the data).

However, discovery access can also create some security risk by allowing users to draw potentially privacy-infringing inferences even from the limited information available. Take a very basic example: if a database contains a list of sex offenders and a user types “John Doe” and receives a message that a John Doe record is in the system, then an analyst can easily conclude that John Doe is a sex offender. This becomes potentially even more damaging if the database contains mixed information. For instance, if the aforementioned database contained both the names of sex offenders and their victims, then a search hit regarding John Doe might lead someone to erroneously conclude that John Doe is a sex offender when in fact he was a victim.

Discovery permissions are also vulnerable to more sophisticated attacks, particularly through the use of conjunctive searching. Imagine a database that contains medical records where someone wants to determine if a patient has a certain medical condition. If the attacker searched for “John Doe AND syphilis” and received a positive match, the user would then know that there was a record in the system containing both those terms together and draw a conclusion about John Doe’s past or present medical condition. Some organizations choose to disable conjunctive searching for this reason. Others allow it but use careful auditing and oversight to find patterns of abusive conjunctive searching to guard against attacks.

These vulnerabilities should not discourage you from considering using discovery access in your access control design. Rather, you should carefully weigh the security risks and their potential consequences for data subjects against the overall benefit of more nuanced and controlled sharing decisions enabled by this permission.

Managing Access

On top of these building blocks of access, you can layer additional limitations designed to control who can see the data, when they can see the data, and what they can do with the data once they can see it.

Role-Based Access

Role-based access refers to access to data that is based on the characteristics of the individual user. These characteristics can involve such things as the user’s position within an organization, their mission (i.e., what they do every day), or their level of security clearance. For commercial systems, these characteristics might be simply based on the subscription level of a customer paying for access to data. Once these characteristics are identified, users are organized into groupings called access control lists (ACLs). Each ACL is then granted one of the above types of access to a discrete set of data.

Role-based access regimes can be very basic or they can be as complex and granular as the access control model allows. Most systems will, at the very least, have an administrator ACL consisting of individuals with complete access to the data in the system, the ability to create new ACLs, and the ability to add or remove users from them as needed. Beyond that, there is no set standard for how to create ACLs, and the organization of your particular access control regime will largely depend on the context in which it is built (i.e., who your users are and the data they need to use).

For instance, imagine a cell-level security data system designed for a major metropolitan hospital. A variety of roles might be reflected in the organization of the ACLs. Some ACLs might be organized based on particular hospital departments—Oncology, Epidemiology, Cardiology, etc. Others might be built according to the organizational structure of the personnel—surgeons, nurses, pharmacists, lawyers, etc. The ACLs can be hierarchical, containing sub-ACLs such as Supervisors and General Users within each category. The following figure offers a sample map of such an access control regime.

Time-Based Access, or Data Leasing

Access controls can also be temporal in nature, allowing data stewards to share information for only a designated period of time. Temporal access controls give users access to the data for a predetermined time at the end of which the system automatically revokes the privileges. This type of “data leasing” encourages data sharing when circumstances require it, but provides assurances that access privileges will revert to the status quo once there is no longer a need, without any further action on the part of the data steward. This can be particularly useful in a large, complex access control regime where data stewards may already be overwhelmed by sizeable data management responsibilities.

Temporal access controls are generally best for circumstances where the access period is predetermined. For example, an employee is temporarily seconded to another department with new data access needs for 30 days. Of course, circumstances can change, so some systems might also use an automatic notification (such as a pop-up message on login or some kind of email alert) notifying a user that their access privileges are expiring. Such a notification could advise the employee to contact the data steward to request an extension of those privileges.

Temporal access controls can be applied either because of temporal factors related to the status of the person accessing the data (as above), or based on the temporal relevance of the data itself. For example, automated license plate reader (ALPR) data is very useful in tracking down stolen cars, but can be thought of as having a finite window of relevance—most stolen cars will be dumped, broken down piecemeal and sold for parts, or have their plates changed within a matter of days. Allowing investigators unfettered access to search a week’s worth of data could greatly aid in those investigations while still preventing improper use of the larger collection of license plate records.

Functional Access

Functional access control applies not to access to the data in the system but to the system’s functionality itself. Data analytics systems offer a variety of tools to the users, from basic capabilities like cutting and pasting text to sophisticated analytic capabilities specially designed to extract specific insights from a data set. Functional access controls allow administrators to control how users interact with the data to which they have access. Administrators concerned about exfiltration of important data might allow only certain, trusted users to utilize built-in data export or sharing features. In other cases, administrators might want to prevent selected users from accessing certain analytic tools that could lead to the revelation of particularly sensitive insights about a data subject that should not be available to those users.

Strengths and Weaknesses of Access Control

The assorted variations of access control can be combined in nearly infinite configurations in order to meet the specific requirements of a data system. Consequently, it can be difficult to evaluate their general effectiveness without a specific operational context. This section discusses the strengths and weaknesses of access controls in very general terms, and it’s important to recognize that the relative degree of these strengths and weaknesses may vary considerably depending on the specific characteristics of the system in question.

Strengths

The primary strengths to consider are as follows:

Flexibility

System architects have an almost overwhelming number of options when it comes to selecting the overarching access control model, generating the ACL membership rolls, and determining the type of access that each of those ACLs should have. Systems can secure data broadly through a few large ACLs, or they can be composed of thousands of different ACLs with painstakingly intricate access permissions attached to each data point in the system. Therefore, system architects will have multiple configuration options to choose from when attempting to design a system that meets both analytic and privacy imperatives.

Security

Access controls offer an additional layer of security beyond the initial user login to the system—just because someone has access to the system does not mean they can see everything in it. This allows the consolidation of more data within a single data system without necessarily exposing all of that data to every system user. Without internal access controls, it would be more difficult to build large data systems with a significant number of users without substantially increasing risks to privacy.

Control

The word speaks for itself. Access controls give data stewards far greater control over their data. They can precisely manage who can see data and how they can interact with it. Such control often encourages greater data sharing. Data owners can be more confident that when data is shared with users, they can be selective about these sharing decisions. Data owners can carefully limit user actions to those allowed by law and/or necessary to accomplish a specific function.

Dynamism

Access controls can be configured in ways that allow frequent changes to permissions depending on changing circumstances. Through temporal controls, conditional controls, and ad hoc decisions made by users with owner access, data access can be expanded or contracted in response to ever-changing technical, legal, and ethical imperatives. No access control regime has to be permanent, which allows data stewards to err on the side of privacy with the confidence that adjustments can be made as warranted.

Weaknesses

Following are a few weaknesses of access controls:

Scale

As systems scale both in terms of users and data, it can become increasingly difficult to maintain a granular access control regime. Managing millions of records within a cell-level or sub-cell-level security model would require a significant commitment of personnel and time in order to make effective data-management decisions. Without those resources, a data steward is more likely to grant broad access to users to ensure they have access to the data they need, thereby negating the potential privacy benefits of a granular regime tailored to specific needs. However, careful generation and capture of metadata that can automate the application of access controls can offset some of the undue burden of painstaking and error-prone manual administration of access control lists.

Ease of use

Access control user interfaces are notoriously weak and often last on the list of priorities for user experience (UX) in software development. In order to make access control decisions effectively, a data steward must know, at the very least, the complete membership of an access control list as well as details about the complete set of data to which those lists are being granted access. This requires a user either to know this information already or somehow to be provided with this information when they begin to interact with the interface. If a data steward wants to allow ACL X, comprised of 100 system users, to see Database A, comprised of 100 records, how can they be provided the necessary feedback to clearly understand the implications of this decision? Are they fully cognizant that every user in ACL X needs to see every record in Database A? Not only do most user interfaces not provide the necessary feedback to give a user complete information, but any attempt to do so would quickly become unwieldy.

Dynamism

This strength can also be a potential weakness. Allowing access controls to be adjustable to account for changing circumstances also means they can be changed for nefarious reasons. Consequently, an access control regime with any dynamism also makes data potentially vulnerable because it must rely to some extent on trust in those charged with responsibility for managing access decisions. Other capabilities, such as the oversight mechanisms discussed in Part III, can mitigate some of this vulnerability.

Access Controls and the Fair Information Practice Principles (FIPPs)

Access controls are enormously helpful in implementing each of the Fair Information Practice Principles:

Collection limitation

Although access control itself is managed after collection, the architectural decisions implementing it do have an effect on how data is collected. Adopting a granular security model for your data allows more precise data collection decisions, enabling data stewards to manage only the exact data points necessary to accomplish a goal.

Data quality

Careful control over who can access data and what those users can do with that data can help to maintain data quality by ensuring that only appropriately authorized and suitably trained users can make modifications or deletions.

Purpose specification/use limitation

Access controls can be configured to provide data access only to those system users who will handle the data in accordance with the authorized purposes for which the data was collected.

Security safeguards

Access controls provide additional layers of security for data beyond just general access to the system.

Accountability

Access controls can be used to establish roles within the system based on the authorities and skills of the ACL members. Those with more access and more functional capabilities are held responsible for the management of the data under their purview.

When to Use Access Controls

Put simply: always use access controls.

Almost every system will have some sort of login and password (i.e., system-level security) in order to control who can use the system. Generally, any system with multiple users and more than 100 or so records will not need to share all data with every user. These systems might be able to operate within the letter of the law without access controls or with very basic access control regimes, but it’s likely they will be sharing some data with users who do not need to see it to do their jobs. By more carefully controlling your data with sufficiently granular access controls, you create a system that can tailor data access exactly to data needs and therefore reduce any unnecessary infringement of privacy.

Further considerations will affect the specific structure of your access control regime. Several factors will influence your design decisions:

Data scale/granularity

The numbe of data points in the system. This does not refer to the number of bytes of data in the system—a massive system could contain a few very large data files. Instead, this refers to the number of discrete pieces of information that could be consumed by a human user. For example, a record contains a person’s name and a phone number. This record actually could contain up to seven (or more) data points that could be controlled—first name, middle name(s), last name, country code, area code, prefix, and line number.

Data schema

The organization of the data itself. This might sit on a spectrum between a hierarchical structure (e.g., databases, records, data points, etc.) and a huge number of single data points with no order at all.

User scale

The number of system users that will interact with the data.

System uses

The potential applications of the system. A system could be designed to interact with data for a single use, or it could aggregate data and make it available for a nearly infinite variety of uses on the frontend.

Dynamism

The potential for these and other factors to change over time. A static system might involve the same basic users, data, and functionality over a long period, whereas a dynamic system might involve constant flows of new data, new users, and new uses.

Support personnel

The people who are charged with managing the data and the access control lists. This could range from a single person charged with multiple administrative tasks with little time and/or knowledge for access management to an elaborate regime of multiple data administrators solely tasked with managing system access.

These factors will play off of each other in a variety of ways. A system with a large number of data points, large user base, and dynamic data flows will need a large number of support personnel to manage the data at a granular level. If such infrastructure is not available, then you might want to go with a less granular access control regime. On the other hand, a static system with even a large number of users and data points might allow for a more granular regime if time can be initially invested to set a workable access control regime that rarely requires adjustment. Meanwhile, a small data set with a large number of users might only need a few basic data controls—the only data management investment would be putting users into appropriate ACLs.

As you can probably guess, the possibilities are as extensive as the number of combinations of the various types of access controls. It will be up to you as the system architect to design a system that maximizes data utility and privacy protection.

Conclusion

You can think of application-level security (as opposed to the system- and physical-level security of the machines themselves) as having two key aspects: restriction and oversight. Restriction allows the architect to build a system that enforces its own rules without human intervention. Oversight lets careful observations of the use of the system limit the abuse a user can inflict given the access that they have. Both are necessary and neither is sufficient to create truly robust privacy protections alone.

As such, ACLs play a huge role in creating a privacy-protected system as they are the mechanism for application-level restrictions. While intelligently implementing ACLs may seem complicated (and it is), they are half of the heart and soul of privacy protections in your software. As an architect, expect to spend a good deal of time on this aspect of your system.

Finally, while it’s important to be careful and rigorous in the ACL regimes you create, ACLs can be thought of as a primitive functionality that is ripe for innovation. A fresh, usable-but-effective take on managing access control lists could spawn a new long-lasting paradigm in application design. Get creative.

1 Generally, this will be done through a basic login and password, at which point all data in the system will be available to them. In this case, there are no additional access-control mechanisms—the access control is synonymous with the system login. We include it here for the sake of completeness.

2 We’re using simple searches as examples here, but these access types can be applied to any means of retrieving information in the system.

3 This is only possible if the system is configured using the cell- or sub-cell-level security model.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.67.70