In this chapter, you will learn about
• Components of a privacy program framework
• Well-known control frameworks
• Significant privacy laws and regulations
• Data governance and data management
• Data classification, data retention, and data loss prevention
• Data minimization and de-identification
• Working with data subjects
• Privacy metrics
• Tracking technologies and countermeasures
This chapter covers the Certified Information Privacy Manager job practice II, “Privacy Program Framework.” The domain represents approximately 14 percent of the CIPM examination.
When properly implemented, governance is a process whereby senior management exerts strategic control over business functions through policies, objectives, delegation of authority, and monitoring. Governance is management’s continuous oversight of an organization’s business processes that is intended to ensure that these processes effectively meet the organization’s business vision and objectives.
Organizations often establish governance through a committee or formalized position that sets long-term business strategies and makes changes to ensure that business processes continue to support business strategies and the organization’s overall needs. Effective governance is enabled through the development and enforcement of documented policies, standards, requirements, and various reporting metrics.
The framework of a program such as a privacy program comprises the structure and organization of all of its parts. These parts are represented by artifacts, including
• Privacy program charter This describes the program and its purpose, scope, objectives, roles and responsibilities, and budget.
• Privacy standards These are prescribed practices, methods, protocols, and vendors to be used in privacy and security processes and systems.
• Privacy processes These are business processes in the privacy program that apply to normal day-to-day operations as well as to incident response and other unplanned (but expected) events.
• Privacy guidelines This information helps personnel better understand how to comply with privacy policies and standards.
• Controls These formal statements of expected activities or outcomes ensure that privacy policies and processes are correctly followed.
A charter is a formal document used in some organizations to define and describe a major business activity and/or department. Typically, a privacy program charter will contain these elements:
• Program name
• Program purpose
• Executive sponsorship
• Roles and responsibilities
• Primary business processes
• Budget and other resources
Organizations that collect personal information from persons outside the organization often also have internal, nonpublic privacy policies that affect their employees and other workers. Such policies, like cybersecurity policies, describe the required characteristics and expectations of staff members as well as information systems used by the organization.
If an organization that processes personal information employs any service providers that also store, process, or have access to the personal information used by the organization, the organization will also impose requirements upon those service providers. This aspect is discussed in Chapter 3.
Virtually all organizations possess and use information about their employees and other workers. In many countries, these organizations are required to disclose to employees that the organization maintains a policy of confidentiality and to inform workers that their personal information is used only in the organization’s role as an employer to carry out activities required by law.
Organizations with more mature privacy and security programs will have detailed privacy policies that define expected behaviors of their workers and the required characteristics of their information systems. In many cases, their privacy policies will exist as a part of the information security policies that protect this information. At a minimum, privacy or security policies will include general statements that sensitive or personal information shall be used “for sanctioned business purposes only,” without further detail.
Organizations lacking a formal privacy program often stop there and make no further attempt to define the meaning or enforcement of “for sanctioned business purposes only.” This can, however, lead to improper use of personal information through tactical decisions made by nearly any staff member, who may have little or no awareness of applicable laws on this topic.
• Roles and responsibilities for the organization’s privacy program. This should include
• Those who have data management responsibilities
• Those who approve and review access to personal information
• Those who review and approve new uses for personal data
• Those who receive and process subject data requests
• Those who have responsibility for monitoring uses of personal data
• Those who have responsibility for responding to incidents that represent the misuse of personal information
• Those who review privacy business processes
• Those who audit privacy business processes
• Business processes governing the use of personal information. This should include periodic reviews of business processes to ensure that they remain compliant with applicable laws and regulations.
• Provisions for the review and audit of privacy business processes. This will describe reviews that take place within privacy business processes, as well as audits of privacy business processes. Reviews will typically be performed by process owners to confirm that all necessary actions take place as required. Audits of privacy business processes will be performed by persons outside of the organization’s privacy office.
• Descriptions of measurements of privacy business processes. Any statistics, metrics, key performance indicators (KPIs), key risk indicators (KRIs), and leading or trailing indicators required in the privacy program will be described here.
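One way to make such a metric description concrete is to compute it from subject-request records. The short Python sketch below, using hypothetical dates and an assumed 30-day response target (actual targets vary by law), computes the percentage of requests completed on time:

```python
from datetime import date

# Hypothetical subject-request records: (date received, date completed).
requests = [
    (date(2023, 1, 3), date(2023, 1, 20)),
    (date(2023, 1, 10), date(2023, 2, 25)),
    (date(2023, 2, 1), date(2023, 2, 14)),
]

SLA_DAYS = 30  # assumed response target, for illustration only

# Count requests completed within the target window.
on_time = sum(
    1 for received, completed in requests
    if (completed - received).days <= SLA_DAYS
)
kpi = on_time / len(requests) * 100
print(f"Requests completed within {SLA_DAYS} days: {kpi:.1f}%")  # 66.7%
```

A real privacy program would pull these records from a case-management system rather than a hard-coded list, but the KPI calculation itself stays this simple.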
Privacy laws enacted in many regions of the world have brought about the near universality of public-facing privacy policies posted on web sites or in the form of visible notices at business locations. Driven by these privacy laws, public-facing privacy policies generally include the following:
• Descriptions of the methods used to collect personal information
• Descriptions of the methods used to protect personal information
• Descriptions of primary and secondary uses of personal information
• Descriptions of any international transfers of personal information
• Descriptions of any third parties that may store or process personal information on behalf of the organization
• Descriptions of any tracking or logging activities performed by the organization, such as the use of web browser cookies
• Any procedures that may be available to explain how personal information has been used or is being used
• Statements describing the legal rights of data subjects concerning the organization’s use of their personal data, often including several subsections for specific regions or localities
• Contact information or procedures for persons who want to initiate subject data requests or inquiries or lodge complaints regarding the organization’s use of their personal information
• Contact information or procedures for regulators or other authorities to whom persons can make inquiries or lodge complaints regarding the organization’s collection or use of their personal information
Policies define what is to be done, and standards define how policy is to be carried out. For instance, a policy may stipulate that strong passwords are to be used for end-user authentication. A password standard would then be more specific, defining the length, complexity, and other characteristics of a strong password.
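To make the policy/standard distinction tangible, the sketch below checks a password against a hypothetical standard. The minimum length and required character classes are illustrative assumptions, not values prescribed by any particular policy:

```python
import re

# Illustrative password standard: a minimum length plus required
# character classes. These thresholds are hypothetical examples.
MIN_LENGTH = 12
REQUIRED_CLASSES = {
    "uppercase": r"[A-Z]",
    "lowercase": r"[a-z]",
    "digit": r"[0-9]",
    "symbol": r"[^A-Za-z0-9]",
}

def check_password_standard(password: str) -> list[str]:
    """Return a list of standard violations; an empty list means compliant."""
    violations = []
    if len(password) < MIN_LENGTH:
        violations.append(f"shorter than {MIN_LENGTH} characters")
    for name, pattern in REQUIRED_CLASSES.items():
        if not re.search(pattern, password):
            violations.append(f"missing {name} character")
    return violations

print(check_password_standard("Tr0ub4dor&3xtra"))  # → []
print(check_password_standard("password"))         # → four violations
```

The policy statement ("use strong passwords") never changes here; only the standard's parameters (length, character classes) would be revised over time.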
Policies are designed to be durable and long-lasting, at the expense of being somewhat unspecific. Standards, being more specific, are affected by change more frequently, because they are closer to the technology and are concerned with the details of implementing policy.
Standards need to be developed carefully, so that
• They properly reflect the intent of one or more corresponding policies.
• They can be implemented.
• They are unambiguous.
• Their directives can be automated, where large numbers of systems, endpoints, devices, or people are involved, leading to consistency and uniformity.
Several types of standards are in use, including the following:
• Protocol standards Examples include Transport Layer Security (TLS) 1.2 for web server session encryption, Advanced Encryption Standard (AES) 256 for encryption at rest, Institute of Electrical and Electronics Engineers (IEEE) 802.11ac for wireless networking, and Security Assertion Markup Language (SAML) 2.0 for authentication.
• Vendor standards For example, cloud storage will be implemented through Box.com.
• Configuration standards For instance, a server-hardening standard would specify all of the security-related settings to be implemented for each type of server operating system in use.
• Programming language standards Examples include C++, Java, and Python. Organizations that do not establish and assert programming language standards may end up with programs written in dozens of different languages, which may drive up the cost of maintaining them.
• Methodology standards Examples include the use of Factor Analysis of Information Risk (FAIR) risk analysis techniques; Operationally Critical Threat, Asset, and Vulnerability Evaluation (OCTAVE) for security assessments; and SMART (specific, measurable, attainable, relevant, and timely) for the development of strategic objectives.
• Control frameworks These include National Institute of Standards and Technology (NIST) Privacy Framework, Payment Card Industry Data Security Standard (PCI DSS), Health Insurance Portability and Accountability Act (HIPAA), and ISO/IEC 27701.
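As a small illustration of enforcing a protocol standard such as the TLS 1.2 example above, a client can be configured to refuse older protocol versions. This is a sketch using Python's standard-library ssl module, not a directive from any particular standard:

```python
import ssl

# Build a client context reflecting a "TLS 1.2 or newer" standard.
# Handshakes with anything older (TLS 1.0/1.1, SSLv3) will fail.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.minimum_version.name)  # TLSv1_2
```

Codifying the standard in configuration like this, rather than in prose alone, is what makes the directive automatable across large numbers of systems, as noted earlier.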
As the organization develops its privacy framework, it may discover that one or more of its standards are impacted by new requirements or that new standards need to be developed. Because regulations, practices, and technologies change so rapidly, standards are often reviewed and updated more frequently than policies. New privacy strategies are not the only reason that standards are reviewed and updated; other reasons include the enactment of new or updated privacy laws, new techniques for data protection, and the acquisition of new network devices and applications.
Laws passed by governments at national, state, and provincial levels impose data collection, handling, and retention requirements on organizations that store or process personal information about natural persons. These laws have been enacted in response to citizens’ outcry at the abuse of their personal information and violations of their desire for privacy as the use of technology and the Internet has become part of daily life. Activities such as telemarketing and the tracking of people’s locations and habits have been the focus of growing concern among citizens and privacy advocates. Advances in the capabilities of information systems and the data mining of large databases containing personal information have enabled practices that many private citizens find inappropriate and even intrusive. Abuses of such capabilities would allow the creation of a new and very invasive form of surveillance state, which is not difficult to imagine, as some governments in the world are already there. Legislators in many countries have sought to counterbalance these capabilities by defining the rights of natural persons concerning the data that is collected about them and used in various ways.
The most influential privacy laws include the European Union General Data Protection Regulation (EU GDPR), HIPAA, the US Fair Credit Reporting Act (FCRA), the US Electronic Communications Privacy Act (ECPA), the Canadian Personal Information Protection and Electronic Documents Act (PIPEDA), the California Consumer Privacy Act (CCPA) and California Privacy Rights Act (CPRA), and the Chinese Cybersecurity Law (CCSL). These and other laws are discussed in the remainder of this section. This is not intended to be an exhaustive or authoritative list, but is instead a sampling of better-known privacy laws.
One of the most notable modern privacy laws, the GDPR enacts sweeping requirements upon organizations within and beyond the European Union that store, process, or transmit personal data about EU citizens and residents. The GDPR was passed in April 2016 and became effective in May 2018. The main provisions of the GDPR include the following:
• Rights of data subjects Any organization collecting information about EU citizens and residents is required to operate with transparency in collecting and using their personal information. Chapter III of the GDPR defines eight data subject rights that have become foundational for other privacy regulations around the world:
• Right to access personal data. Data subjects can access the data collected on them.
• Right to rectification. Data subjects can request a modification of their data to correct errors and the updating of incomplete information.
• Right to erasure. Also referred to as the right to be forgotten, data subjects can request that their personal data be erased from an entity’s processing activities.
• Right to restrict processing. In certain circumstances, data subjects can request that processing of their personal data be stopped.
• Right to be notified. Data subjects must be notified about what information is being collected at or before collection, and the use of that information.
• Right to data portability. Data subjects can request that their personal data be provided in a commonly used, machine-readable format.
• Right to object. Data subjects can object to the processing of their personal data; the controller must then cease processing unless it demonstrates compelling legitimate grounds that override the interests, rights, and freedoms of the data subject, or unless the processing is needed for the establishment, exercise, or defense of legal claims.
• Right to reject automated individual decision-making. Data subjects can refuse the automated processing of their personal data to make decisions about them.
• Definitions of data controller and data processor The GDPR defines a data controller as an organization that directs the use of personal data. A data processor is an organization that processes personal data as directed by a data controller. Controllers and processors are required to maintain records regarding the processing of personal information.
• Data protection and privacy by design and by default Organizations are required to design, operate, and maintain their business processes and information systems with privacy and security as a part of their default design.
• Cybersecurity Organizations that process personal information are required to enact cybersecurity capabilities to protect that data.
• Breach notification Organizations are required to notify supervisory authorities and affected data subjects in the event of a privacy or security breach of personal data.
• Data protection impact assessment (DPIA) The DPIA is a formal process to identify and minimize the data protection risks of a data processing activity. Organizations are required to perform DPIAs whenever they are implementing new, or making significant changes to, business processes or information systems.
• Data protection officer (DPO) An organization is required to appoint a DPO if it processes personal data on a large scale. The DPO is expected to have expert knowledge of data protection law and practices.
• Certification The GDPR permits the creation of certification authorities and voluntary certifications of organizations that process personal data.
• Cross-border data transfers The GDPR contains rules regarding the transfer of personal data out of the European Union.
• Binding corporate rules The GDPR accommodates the use of binding corporate rules that multinational organizations may enact to ensure the protection and appropriate use of personal data when transferred outside the European Union to their overseas systems.
• Supervisory authority Each EU member state establishes a supervisory authority that is responsible for monitoring GDPR compliance.
• Penalties The GDPR empowers supervisory authorities to impose fines on organizations that violate any terms of the GDPR. Administrative fines can be as high as €20 million or 4 percent of global turnover, whichever is greater.
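The upper-tier fine ceiling lends itself to a one-line calculation. The sketch below encodes the "€20 million or 4 percent of global turnover, whichever is greater" rule; the turnover figures are hypothetical:

```python
def gdpr_fine_ceiling(annual_global_turnover_eur: float) -> float:
    """Upper-tier administrative fine ceiling: EUR 20 million or
    4 percent of worldwide annual turnover, whichever is greater."""
    return max(20_000_000.0, 0.04 * annual_global_turnover_eur)

# With EUR 1 billion turnover, 4 percent (EUR 40 million) governs.
print(gdpr_fine_ceiling(1_000_000_000))  # 40000000.0
# With EUR 100 million turnover, the EUR 20 million floor governs.
print(gdpr_fine_ceiling(100_000_000))    # 20000000.0
```

Note that this is the maximum a supervisory authority may impose; actual fines are set case by case, considering factors such as the nature and gravity of the infringement.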
The GDPR claims extraterritorial jurisdiction over companies not based in the European Union that provide services to EU citizens in EU member states. In November 2019, the European Data Protection Board issued its final guidelines on the territorial scope of the GDPR. The guidelines are designed to assist both companies and regulators in assessing whether certain data processing activities are within the scope of the GDPR. While claims of jurisdiction beyond the EU’s own borders have yet to be seriously tested in courts, the final guidelines provide some clarity for the data processing activities of a non-EU entity. For an organization offering goods or services in the EU to become subject to the GDPR, these activities should result from intentional actions rather than occur inadvertently or incidentally.
Enacted in 1996, HIPAA includes two rules that are concerned with the protection of protected health information (PHI) and electronic protected health information (ePHI). Covered entities are organizations that store or process medical information and are subject to one or both of these rules:
• Security Rule This part of HIPAA requires that organizations enact several administrative, physical, and technical safeguards to protect ePHI. Many of these controls are compulsory, while others are “addressable” (an organization can document why it implements an alternative measure or why the safeguard does not apply).
• Privacy Rule Effective in 2003, this part of HIPAA requires that organizations protect the privacy of PHI in any form, including hard copy.
Note that additional provisions of HIPAA are not related to security or privacy.
HIPAA defines various civil and criminal penalties for organizations that violate the law. Covered entities are required to enact business associate agreements (BAAs) with all third parties that store or process ePHI on behalf of covered entities.
Health Information Technology for Economic and Clinical Health Act Enacted in 2009, HITECH extends the Security Rule and Privacy Rule in HIPAA by expanding security breach notification requirements and expanding the disclosures of the use of a patient’s PHI.
Enacted in 1970, the FCRA uses disclosure requirements, remedies, and civil liability to help ensure that the information contained in consumers’ credit histories is accurate. Because credit reports are used by banks and other financial institutions (as well as by employers in many states when making hire/no-hire decisions), the FCRA provides a means for consumers to obtain copies of their credit reports and procedures for correcting erroneous information contained in those reports.
FCRA is an early example of a privacy law that provides persons with the ability to know what personal information is being stored, how it is used, and what methods can be used for obtaining copies and requesting corrections.
If consumers’ rights are violated, they may recover damages, attorney fees, and court costs, and punitive damages may be awarded if actions against them were willful.
PIPEDA went into effect in 2000 and seeks to ensure consumer data privacy in the context of e-commerce. In part, PIPEDA was enacted to provide assurances to European countries and consumers that their personal information present in Canadian companies’ information systems would be safe and free from abuses.
PIPEDA also gives Canadians the right to know why organizations collect, use, or disclose their personal information, along with the assurance that their information will not be used for any other purpose. They can further learn who within an organization is responsible for protecting their information. They can contact organizations to ensure that their personal data is accurate and can lodge complaints if they believe their privacy rights have been violated. Canadian companies must obtain consent for the collection of personal information, and they cannot refuse to provide service to a Canadian citizen who declines to provide such consent.
Made effective on January 1, 2020, CCPA is a state law designed to improve California residents’ privacy rights. Provisions of CCPA give California residents certain rights, including the following:
• Knowledge of what personal information is being collected
• Notification of whether such personal information is subsequently transferred or disclosed to another party
• The ability to prohibit an organization from transferring or selling personal information
• The ability to examine the personal information held by organizations, with the right to request that the information be corrected or removed
• The freedom from discrimination should individuals choose to exercise their privacy rights
Californians can sue for damages when unencrypted and unredacted personal information is subject to unauthorized access and exfiltration, theft, or disclosure as a result of a business’s violation of the duty to implement and maintain reasonable security procedures.
Like the EU GDPR, the CCPA claims jurisdiction over companies not located in California if they collect personal information about California residents. And like the GDPR, this provision has yet to be tested in court.
Approved by ballot measure in 2020 and effective in 2023, the CPRA expands the CCPA’s provisions, including
• Tripling the fines imposed on violators
• Permitting civil penalties for theft of login information
• Creating the California Privacy Protection Agency, which implements and enforces California privacy laws
Enacted in 2016 and effective in 2017, the CCSL consists of three main parts:
• Data protection Organizations holding personal information about Chinese citizens must take measures to protect that information.
• Data localization Organizations collecting information about Chinese citizens must keep such data within China. Organizations that want to transfer data out of China must undergo a data assessment by the Chinese government.
• Cybersecurity Organizations are required to enact controls to prevent malware, intrusions, and other attacks. The law includes mandatory standards, assessments, and certifications for network devices.
CCSL’s data protection principles resemble those of the GDPR. Both laws require that citizens be informed if an organization holds their personal data; the organization must explain the use of their personal data, identify collection methods, and obtain consent for continued use.
Enacted in 2019 and effective in August 2020, Brazil’s Lei Geral de Proteção de Dados (LGPD) is similar to the EU GDPR. The LGPD establishes a National Data Protection Authority that is designated as the federal agency responsible for overseeing the data protection regulation.
The LGPD establishes a number of individual rights over personal data, many of which are similar to what is provided by the GDPR. However, some additional rights are provided by the LGPD, including access to information about entities with which an organization has shared the individual’s personal data.
Like the GDPR and the CCPA, the LGPD claims jurisdiction over companies not located in Brazil if one of the following criteria is met:
• The processing operation is carried out in Brazil.
• The purpose of the processing activity is to offer or provide goods or services to individuals located in Brazil.
• The personal data was collected in Brazil.
Similar to the GDPR and the CCPA, this provision has yet to be tested in court.
In the United States, the Federal Trade Commission (FTC), the agency that monitors consumers’ rights, has brought legal action against scores of companies for alleged violations of consumer protection laws, including violations of posted privacy policies, deceptive practices with regard to US-EU Safe Harbor registration, breaches of personal information, improper collection of personal data without consent, and failures to protect personal information.
The FTC enforces more than 70 laws, including the following:
• Federal Trade Commission Act, which protects consumers from unfair and deceptive practices
• Children’s Online Privacy Protection Act (COPPA), which protects children’s online privacy, particularly those under age 13
• Do-Not-Call Implementation Act, which provides for consumers to opt out of most telemarketing calls
• Controlling the Assault of Non-Solicited Pornography and Marketing Act (CAN-SPAM), which prohibits misleading commercial e-mail and requires that recipients be able to opt out
• Gramm–Leach–Bliley Act (GLBA), which protects and secures the privacy of consumer personal information by financial institutions
• HITECH, which extends the scope of HIPAA
• Identity Theft and Assumption Deterrence Act, which provides a central clearinghouse of identity theft complaints
• Fair Credit Reporting Act (FCRA), which protects personal information collected by consumer credit bureaus, medical information companies, and tenant screening services
• Clayton Antitrust Act, which prevents illegal contracts, mergers, and acquisitions
Multinational organizations, as well as organizations doing business with citizens in many countries, need to be aware of the presence of and requirements imposed by international data-sharing agreements. These agreements between governments serve as implementation requirements that organizations must follow.
These international data-sharing agreements have a history of changing more frequently than privacy laws. Case in point: the International Safe Harbor Privacy Principles (known to most in the privacy business simply as Safe Harbor) were developed in the late 1990s. In 2015, the European Court of Justice invalidated Safe Harbor. In 2016, the EU-US Privacy Shield was developed, but US court decisions weakened it, and the EU-US Umbrella Agreement was approved in 2017. In 2020, the EU Court of Justice struck down the EU-US Privacy Shield. The uncertainty continues.
International organizations are able to make use of Binding Corporate Rules and Standard Contractual Clauses, which define international transfers and the protection of data within and between organizations. Binding Corporate Rules most often apply to intra-group transfers, such as the employment and human resources records of an organization’s workforce.
Organizations in Europe and other places are often required to engage with local works councils, which are groups not unlike labor unions that represent organization workers regarding matters of collection and use of their personal information. Unlike US-based organizations, which are less constrained on the collection and use of personally identifiable information (PII), European organizations, and European branches and subsidiaries of non-European organizations, must make a business case with local works councils to win approval for data collection as well as various types of information systems monitoring and recordkeeping.
In addition to abiding by applicable laws and regulations, organizations may negotiate additional terms and conditions regarding the protection of personal information. Such obligations may be related to specific services rendered by third-party service providers that store or process personal information on behalf of other organizations. For instance, a service provider may be required to comply with all applicable laws, agree to specific protective measures, undergo periodic audits or examinations, or provide specific reporting regarding the storage or use of personal data.
Within the context of data privacy and privacy regulations, organizations must identify specifically the legal basis under which they are collecting and/or processing personal information. Simply put, it must be lawful for the organization to collect and use data subjects’ personal information, and the organization must be able to cite specifically how it is lawful.
In addition to the data subject’s consent, Article 6.1 of the GDPR provides five further avenues of legal basis (quoting directly from the GDPR):
• processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract;
• processing is necessary for compliance with a legal obligation to which the controller is subject;
• processing is necessary in order to protect the vital interests of the data subject or of another natural person;
• processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;
• processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection. [This does not apply to public authorities in the performance of their duties.]
Individual EU member states are permitted to include additional provisions.
The CCPA does not treat legal basis in the same way as the GDPR; instead, entities subject to the CCPA must follow the law in general. Certain use cases, such as healthcare and employment, are addressed with specific provisions.
The term legitimate interest is used in the GDPR as one of the lawful bases for collecting and processing personal information. One might consider legitimate interest a loophole, but it is the author’s opinion that this provision was included so that the GDPR did not need to enumerate every possible use of personal information (which would soon be out of date, requiring frequent updates to the law).
Legitimate interest gives organizations a basis for collecting and processing personal information if they benefit from doing so. But it doesn’t end there: the real issue of legitimate interest lies in the fact that organizations must balance their interests with those of the data subject.
Online advertising is an interesting case to consider. Advertisers have an interest in serving advertising content to readers and viewers. To stay in business and earn advertising revenue (which is the business model for a considerable number of web sites), advertisers claim that readers and viewers benefit by advertising that is aligned with topics of interest to them (the readers’ and viewers’ interest). Thus, there is an argument that online advertising benefits the advertiser (and web site operator) as well as the viewer (the data subject).
Before we dive into the topic of technical privacy controls, we first need to discuss the concept and application of controls in general. This section contains a brief discussion of controls; a comprehensive discussion of controls and control frameworks can be found in CISM Certified Information Security Manager All-In-One Exam Guide.
Controls are statements that define required outcomes. Controls are often implemented through policies, procedures, mechanisms, systems, and other measures designed to reduce risk. An organization develops controls to ensure that its business objectives will be met, risks will be reduced, and errors will be prevented or corrected. Controls are used for two primary purposes in an organization: they are implemented to ensure desired outcomes and to avoid unwanted outcomes. In the context of privacy and information security, controls should be defined and implemented to ensure the protection and proper handling of personal information.
Control objectives are statements of desired states or outcomes from business operations to mitigate risks. When building a security program, and preferably before selecting a control framework, you need to establish high-level control objectives. Example control objective subject matter includes the following:
• Confidentiality and privacy of personal and sensitive information
• Proper and sanctioned use of personal information
• Protection of IT assets
• Accuracy of transactions
• Availability of IT systems
• Controlled changes to IT systems
• Compliance with corporate policies
• Compliance with applicable regulations and other legal obligations
Control objectives are the foundation for one or more controls. For each control objective, one or more control activities will exist to ensure the realization of the objective. For example, the “availability of IT systems” control objective could be implemented via several control activities, including these:
• IT systems will be continuously monitored, and any interruptions in availability will result in alerts sent to appropriate personnel.
• IT systems will have resource-measuring capabilities.
• IT management will review capacity reports monthly and adjust resources accordingly.
• IT systems will have antimalware controls that are monitored by appropriate staff.
Together, these four (or more) controls contribute to the overall control objective of IT system availability. Similarly, other control objectives will include one or more controls that will ensure their realization.
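The relationship between a control objective and its supporting control activities can be sketched as a simple data structure. This is an illustrative sketch only, using the availability example above; the identifiers and schema are assumptions, not part of any standard framework.

```python
# Illustrative sketch: mapping one control objective to the control
# activities that support it, using the availability example from the
# text. All names and structures here are hypothetical.

control_objectives = {
    "availability-of-it-systems": {
        "description": "Availability of IT systems",
        "activities": [
            "Continuous monitoring with alerts on interruptions",
            "Resource-measuring capabilities on IT systems",
            "Monthly capacity report review by IT management",
            "Antimalware controls monitored by appropriate staff",
        ],
    },
}

def activities_for(objective_id: str) -> list[str]:
    """Return the control activities that support a given objective."""
    return control_objectives[objective_id]["activities"]
```

In practice, a governance, risk, and compliance (GRC) tool would hold this mapping, but the one-objective-to-many-activities shape is the same.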
After establishing control objectives and defining the control activities that will support the objectives, the next step is to design controls. This can be a considerable undertaking when done in a vacuum. A better approach is to utilize one of several high-quality, industry-accepted control frameworks as a starting point.
If an organization elects to adopt a standard control framework, its next step is to perform a risk assessment to determine whether controls in the control framework adequately meet each control objective. Where there are gaps in control coverage, additional controls must be developed and put in place.
Privacy control objectives resemble ordinary control objectives but are set in the context of privacy and information security. Following are some examples of privacy control objectives:
• Protection of personal information from unauthorized personnel
• Protection of personal information from unauthorized modification
• Integrity of personal information
• Controlled use of personal information
An organization will probably create several additional information systems control objectives on other basic topics such as malware, availability, and resource management, many of which directly or indirectly contribute to the protection and proper use of personal information.
While every organization may have unique missions, objectives, business models, tolerance for risk, and so on, organizations need not invent governance frameworks from scratch to manage their privacy and security objectives.
In the context of strategy development, some organizations may already have suitable control frameworks in place, while others may not. Although it is not always necessary for an organization to select an industry-standard control framework, it is advantageous to do so. Industry-standard control frameworks have been used in thousands of companies, and they are regularly updated to reflect changing business practices, emerging threats, and new technologies.
Information security is a somewhat more mature profession than privacy. Thus, organizations developing privacy programs and privacy controls are often overlaying privacy controls over an existing information security framework. Doing this is the expected approach, and this idea is reinforced with the recent publication of privacy-specific control frameworks that are extensions of existing information security and risk management frameworks.
It is often considered a mistake to select a control framework because of the presence or absence of a small number of specific controls. Usually, such selection is made on the assumption that control frameworks are rigid and inflexible. Instead, the strategist should take a different approach: select a control framework based on industry alignment and then institute a risk management process for developing additional controls based on the results of risk assessments. This is precisely the approach described in ISO/IEC 27701, as well as in the NIST Privacy Framework. Start with a well-known control framework and then create additional controls, if needed, to address risks specific to the organization. When assessing the use of a specific framework, you may find that a specific control area is not applicable. In those cases, do not just ignore the section; instead, document both the business and technical reasons why the organization chose not to use the control area. If a question about the decision is raised in the future, this information will indicate why the decision was made not to implement the control area. The date and those involved in the decision should also be documented.
Several standard privacy and security frameworks are discussed in the remainder of this section:
• ISO/IEC 27701
• NIST Privacy Framework
• NIST SP 800-122
• NIST Cybersecurity Framework (CSF)
• ISO/IEC 27001
• NIST SP 800-53
• Center for Internet Security Critical Security Controls (CIS CSC)
• PCI DSS
ISO/IEC 27701:2019, Security techniques – Extension to ISO/IEC 27001 and ISO/IEC 27002 for privacy information management – Requirements and guidelines, is an international standard that directs the formation and management of a Privacy Information Management System (PIMS), including the controls and processes to ensure privacy by design and proper ongoing monitoring and management of personal information. The first version of this standard was published in August 2019.
ISO/IEC 27701 follows a similar structure to ISO/IEC 27001 and is divided into three main sections: requirements, guidance, and controls.
Requirements This section describes the required activities of an effective PIMS. The section follows the same seven-section structure used in ISO/IEC 27001.
Guidance This section provides direction on how privacy programs can utilize ISO/IEC 27002 within the PIMS and provides specific considerations for controllers and processors. The section is divided into three groups: PIMS-specific guidance related to ISO/IEC 27002, additional ISO/IEC 27002 guidance for PII controllers, and additional ISO/IEC 27002 guidance for processors.
PIMS-specific guidance related to ISO/IEC 27002 expands on the 14 control categories to enable an organization to evaluate the control objectives and controls in the context of risks to information security as well as risks to privacy. The sections on additional ISO/IEC 27002 guidance for PII controllers and additional ISO/IEC 27002 guidance for processors create specific guidance on privacy management for controllers and processors.
Controls This section contains a baseline set of controls for controllers and processors that can be used in context with ISO/IEC 27001. The controls for controllers and processors in ISO/IEC 27701 are described in these four categories:
• Conditions for collection and processing
• Obligations to PII principals
• Privacy by design and privacy by default
• PII sharing, transfer, and disclosure
The NIST Privacy Framework is a guide for organizations that need to protect and properly handle personal information. The Privacy Framework is deliberately organized similarly to the NIST CSF to facilitate the parallel use of both tools. The framework consists of three parts:
• Core This set of privacy protection activities facilitates communication about those activities. There are five core functions: Identify-P, Govern-P, Control-P, Communicate-P, and Protect-P.
• Profile This is an organization’s current set of activities used to protect personal information. You can think of the profile as a baseline that can be referenced at a future date to gauge progress, as well as a foundation for defining the desired future state of a privacy program.
• Implementation Tiers These are maturity levels, from least to most mature: Partial, Risk Informed, Repeatable, and Adaptive.
Similar to the way ISO/IEC 27701 extends ISO/IEC 27001, integration with the NIST CSF is highlighted in the Privacy Framework Core with a key that identifies whether each control objective is identical to one in the CSF or merely aligns with the CSF, with descriptions adapted for privacy programs. This approach reinforces the idea that an effective privacy program requires integration with information security and risk management programs.
Organizations seeking to adopt the framework will find a wealth of information, including crosswalks (mappings to other standards), profiles, guidance, and tools to build and improve their privacy practices.
NIST SP 800-122, Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), contains directives for protecting personal information. Although the guidelines are required of US government agencies, many other organizations employ them, as they are considered good practices.
HIPAA established requirements for the protection of ePHI. These requirements apply to virtually every corporate or government entity (known as a covered entity) that stores or processes ePHI. HIPAA requirements fall into three main categories:
• Administrative safeguards
• Physical safeguards
• Technical safeguards
Several controls reside within each of these three categories. Each control is labeled as Required or Addressable. Required controls must be implemented by every covered entity. Addressable controls are considered optional for each covered entity, meaning the organization does not have to implement an Addressable control if it does not apply or if the risk of not implementing it is negligible.
The NIST CSF is a risk-based life-cycle methodology for assessing risk, enacting controls, and measuring control effectiveness, not unlike ISO/IEC 27001. The components of the NIST CSF are as follows:
• Framework Core This set of functions—Identify, Protect, Detect, Respond, and Recover—makes up the life cycle of high-level functions in an information security program. The Framework Core includes a complete set of controls (known as references) within the five functions.
• Framework Implementation Tiers These are maturity levels, from least mature to most mature: Partial, Risk Informed, Repeatable, and Adaptive.
• Framework Profile This is an alignment of elements of the Framework Core (the functions, categories, subcategories, and references) with an organization’s business requirements, risk tolerance, and available resources.
Organizations implementing the NIST CSF would first perform an assessment by measuring their maturity (Implementation Tiers) for each activity in the Framework Core. Next, the organization would determine the desired levels of maturity for each activity in the Framework Core. The differences found would be gaps that would need to be filled through several means, which could include the following:
• Hiring additional resources
• Training resources
• Adding or changing business processes or procedures
• Changing system or device configuration
• Acquiring new systems or devices
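The current-versus-desired tier comparison described above is, at its core, a simple gap calculation. The sketch below is a hypothetical illustration (the tier values assigned to each function are invented); it uses a numeric encoding of the four Implementation Tiers.

```python
# Hypothetical sketch of a NIST CSF gap analysis: compare current and
# desired Implementation Tiers per function. Tier values are encoded
# 1=Partial, 2=Risk Informed, 3=Repeatable, 4=Adaptive.

TIERS = {1: "Partial", 2: "Risk Informed", 3: "Repeatable", 4: "Adaptive"}

# Invented example assessment results for one organization.
current = {"Identify": 2, "Protect": 2, "Detect": 1, "Respond": 1, "Recover": 2}
desired = {"Identify": 3, "Protect": 3, "Detect": 3, "Respond": 2, "Recover": 2}

def maturity_gaps(current: dict, desired: dict) -> dict:
    """Return the functions whose current tier falls short of the target."""
    return {fn: desired[fn] - current[fn]
            for fn in desired if desired[fn] > current[fn]}

gaps = maturity_gaps(current, desired)
```

Each nonzero gap would then be addressed through the means listed above (hiring, training, process changes, and so on), with the largest gaps typically prioritized first.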
ISO/IEC 27001, Information technology – Security techniques – Information security management systems – Requirements, is an international standard for information security and risk management. This standard contains a requirements section that outlines a properly functioning information security management system (ISMS) and a comprehensive control framework.
ISO/IEC 27001 is divided into two sections: Requirements and Controls. The Requirements section describes required activities found in effective ISMSs. The Controls section contains a baseline set of controls that serve as a starting point for the organization. The standard is updated periodically; the latest version is known as ISO/IEC 27001:2013. The requirements in ISO/IEC 27001 are described in seven sections: Context of the Organization, Leadership, Planning, Support, Operation, Performance Evaluation, and Improvement.
Although ISO/IEC 27001 is a highly respected control framework, its adoption has been modest, partly because a single copy of the standard costs more than US $100. Unlike NIST standards, which are free of charge, ISO standards must be purchased, and students or professionals are unlikely to pay this much just to learn more about a standard. Despite this, ISO/IEC 27001 is growing in popularity in organizations throughout the world.
NIST SP 800-53, Security and Privacy Controls for Federal Information Systems and Organizations, is one of the most well-known and adopted security control frameworks. NIST SP 800-53 is required for all US government information systems, as well as all information systems in private industry that store or process information on behalf of the US government.
Even though the NIST 800-53 control framework is required for US federal information systems, many organizations that are not required to employ the framework have utilized it, primarily because it is a high-quality control framework with in-depth implementation guidance and because it is available without cost.
NIST SP 800-53A, Assessing Security and Privacy Controls in Federal Information Systems and Organizations: Building Effective Assessment Plans, is the companion standard to NIST SP 800-53 that defines techniques for auditing or assessing each control in NIST SP 800-53.
The CSC framework from CIS, or CIS CSC, is a control framework that traces its lineage to the SANS Institute. The framework is still commonly referred to as the “SANS Top 20” or “SANS 20 Critical Security Controls.”
PCI DSS is a global control framework specifically for the protection of credit card numbers and related information when stored, processed, and transmitted on an organization’s networks. The PCI DSS was developed by the PCI Security Standards Council, a consortium of the world’s dominant credit card brands, namely Visa, MasterCard, American Express, Discover, and JCB.
PCI DSS is mandatory for all organizations that store, process, or transmit credit card data. Organizations with larger volumes of card data are required to undergo annual onsite audits. Many organizations use the controls and the principles in PCI DSS to protect other types of financial and personal data such as account numbers, Social Security numbers, and dates of birth.
Frequently, organizations find themselves in a position where more than one control framework needs to be selected and adopted. The primary factors driving this are as follows:
• Multiple applicable regulatory frameworks
• Multiple operational contexts
Organizations with multiple control frameworks often crave a simpler framework for their controls. Often, organizations will “map” their control frameworks together, resulting in a single control framework with controls from each framework present. Mapping control frameworks together is time-consuming and tedious, although in some instances the work has already been done. For example, Appendix H in NIST SP 800-53 contains a forward and reverse mapping between NIST SP 800-53 and ISO/IEC 27001. Other control mapping references can be found online. Some must be built manually.
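A forward-and-reverse mapping of this kind can be sketched as a dictionary plus a derived inverse. The control IDs below are real identifiers from NIST SP 800-53 and ISO/IEC 27001 Annex A, but the pairings shown are illustrative only and should not be taken as the authoritative Appendix H mapping.

```python
# Illustrative sketch of mapping controls between two frameworks so
# that one assessment can satisfy both. The pairings are examples,
# not the official NIST SP 800-53 Appendix H mapping.

nist_to_iso = {
    "AC-2": ["A.9.2.1"],   # account management <-> user registration
    "AU-6": ["A.12.4.1"],  # audit review <-> event logging
    "CP-9": ["A.12.3.1"],  # system backup <-> information backup
}

def reverse_mapping(forward: dict) -> dict:
    """Build the reverse (ISO -> NIST) mapping from the forward one."""
    reverse: dict = {}
    for nist_id, iso_ids in forward.items():
        for iso_id in iso_ids:
            reverse.setdefault(iso_id, []).append(nist_id)
    return reverse
```

Because one control in framework A can map to several controls in framework B (and vice versa), both directions of the mapping need to be maintained, which is part of what makes manual mapping tedious.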
Once an organization selects a control framework and multiple frameworks are mapped together (if the organization has decided to undertake that), security managers will need to organize a framework of activities around the selected/mapped control framework.
Risk Assessment Before a control can be designed, the privacy or security manager needs to have some idea of the nature of risks that a control is intended to address. In a running risk management program, new risks may have been identified during a risk assessment that led to the creation of additional controls. In this case, information from the risk assessment is needed so that the controls will be properly designed to handle these risks.
If an organization is implementing a control prior to a risk assessment, it may not design and implement the control properly. Here are some examples:
• A control may not be rigorous enough to counter a threat.
• A control may be too rigorous and costly (in the case of a moderate or low risk).
• A control may not counter all relevant threats.
In the absence of a risk assessment, the chances of realizing one or more of these undesirable outcomes are quite high. Before implementing a control, then, a risk assessment needs to be performed. If an organization-wide risk assessment is not feasible, a risk assessment focused on the control area should be performed so that the organization knows what risks the control is intended to address.
Control Design An early step in control use is its design. In a standard control framework, the control language itself appears, as well as some degree of guidance. The privacy or security manager, together with personnel who have responsibility for relevant technologies and business processes, needs to determine what activities should occur. In other words, they need to figure out how to operationalize the control.
Proper control design will potentially require one or more of the following:
• New or changed policies
• New or changed business process documents
• New or changed information systems
• New or changed business records
Control Implementation After a control has been designed, it needs to be put into service. Depending upon the nature of the control, this could involve operational impact in the form of changes to business processes and/or information systems. Changes with greater impact will require greater care so that business processes are not adversely affected. For instance, an organization may implement a control that requires production servers and other devices to be hardened from attack to comply with recognized standards such as CIS Benchmarks. After the hardening standards are developed (no easy task, by the way), they need to be tested and implemented. If a production environment is affected, it could take quite a bit of time to ensure that none of the hardening standard items adversely affects the performance, integrity, or availability of affected systems.
Control Monitoring After an organization has implemented a control, it needs to monitor the control. For this to happen, the control needs to have been designed so that monitoring can occur. In the absence of monitoring, the organization will lack the methodical means for observing the control to determine whether it is being operated correctly and whether it is effective.
Some controls are not easily monitored. For instance, a control addressing abuse of intellectual property rights includes the enactment of new acceptable use policies (AUPs) that forbid employees from violating intellectual property laws and copyrights. There are many forms of abuse that cannot be easily monitored.
Control Assessment Any organization that implements controls to address risks should periodically examine those controls to determine whether they are working as intended and as designed. There are several available approaches to control assessment:
• Security review One or more information security staff members examine the control along with any relevant business records.
• Control self-assessment (CSA) Control owners answer questions and provide any relevant evidence that demonstrates a control’s operation.
• Internal audit The organization’s internal auditors (or information security staff) perform a formal examination of the control.
• External audit An external auditor formally examines the control.
An organization will select one or more of these methods, guided by any applicable laws, regulations, legal obligations, and results of risk assessments.
A nearly worn-out but still highly relevant cliché in information security is this: you cannot protect what you don’t know you have. This statement underscores the need for effective asset management at all levels, because only specifically identified information can be managed and protected.
For a privacy program to be effective, an organization must have a complete and accurate inventory of all the personal information it has collected. Although an inventory of structured information (data residing in application database management systems) will remain fairly static, the transient nature of unstructured data creates additional challenges. Somehow, organizations must identify means for knowing about all structured and unstructured data, particularly when it contains personal information that is in scope of relevant privacy laws. Proactive data discovery, discussed later in this chapter in the section “Data Loss Prevention Automation,” can be put in place to provide visibility into the creation and use of unstructured data.
For an organization’s data inventory to remain current, three activities need to be included in business-as-usual processes:
• Change management processes must require updates to data inventory whenever an addition or change to an information system impacts the data inventory.
• Business processes that interact with personal information must be documented.
• Periodic reviews of the data inventory should be performed to confirm its accuracy.
Periodic data inventory reviews should not only catalog existing instances of sensitive and personal information, but they should also determine in each instance whether data should exist where it is found. By understanding the business processes that interact with personal information, you can better identify where personal information is collected, processed, and stored. Thus, a data inventory should be thought of less as a census and more as a gap analysis. Every instance of sensitive information should be examined through the lens of the business processes and data management policy to determine whether each instance should exist and whether current protective controls are adequate.
Organizations with lower process maturity are more likely to use unstructured means for performing procedures and completing tasks. Often this will result in a greater use of e-mail for process workflow and a greater use of unstructured data stores for storing data. E-mail, file servers, and cloud storage services represent the majority of unstructured data in many organizations.
When inventorying data, you should include the following information in each catalog entry:
• Name of the file(s) or directory/directories
• Description of the contents, including PII data fields
• Date of last update (this will aid in the removal of old data)
• Access permissions
• Data owner
This information will be useful for the development of data flow diagrams, as discussed later in the section “Data Flow and Usage Diagrams.”
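The catalog entry fields listed above can be captured in a small record type. This is a minimal sketch under the assumption of a simple flat schema; the field names are illustrative, not a standard inventory format.

```python
# Minimal sketch of a data inventory catalog entry containing the
# fields listed in the text. Field names and the example values are
# assumptions for illustration only.
from dataclasses import dataclass
from datetime import date

@dataclass
class InventoryEntry:
    path: str                      # file(s) or directory/directories
    description: str               # contents, including PII data fields
    last_updated: date             # aids in the removal of old data
    access_permissions: list[str]  # groups or users with access
    data_owner: str

entry = InventoryEntry(
    path="\\\\fileserver\\hr\\benefits",
    description="Employee benefits enrollment forms (name, SSN, DOB)",
    last_updated=date(2020, 11, 3),
    access_permissions=["HR-Staff"],
    data_owner="HR Director",
)
```

Keeping entries in a structured form like this (rather than free text) makes the periodic reviews described above far easier: stale entries can be found by sorting on `last_updated`, and overly broad access by filtering on `access_permissions`.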
Many types of information reside in an organization’s information systems. Some of this information is highly sensitive because it contains personal information, intellectual property, and internal financial information; some information is important but not sensitive at all; and some is not very important. Because resources are required to protect information, it doesn’t make much sense to apply equal rigor to protect both unimportant data and highly secretive or sensitive information. To this point, former US national security advisor McGeorge Bundy is credited with saying, “If we guard our toothbrushes and diamonds with equal zeal, we will lose fewer toothbrushes and more diamonds.”
A data classification policy is a formal and intentional way for an organization to define levels of importance or sensitivity of information. A typical data classification policy will define two or more (but rarely more than five) data classification levels.
Along with defining levels of classification, a data classification policy will include examples that show the classification levels that should be assigned to various datasets. Table 2-1 provides examples of this concept.
Table 2-1 Examples of Information at Varying Data Classification Levels
Data classification policy can go still further and emphatically state the classification levels that are, by policy, assigned to specific datasets. This is shown in Table 2-2.
Table 2-2 Examples of Official Data Classification Levels
Because so much information is handled by personnel on a daily basis, data classification policy goes still further to define acceptable handling procedures for data at various levels of classification and in numerous types of situations. Often called data handling standards, these procedures provide real-world guidance that workers can easily follow and use. Because information can be used and moved in many different ways, data handling standards should clearly state what is expected of personnel when handling sensitive data.
Data handling standards usually take the form of a matrix, with various levels of classification as the columns and different data handling situations as rows. Each individual cell defines the standard for handling data at a given classification level in a certain way. Table 2-3 shows a part of such a matrix.
Table 2-3 Example Data Handling Standards Matrix
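A data handling standards matrix like Table 2-3 amounts to a two-dimensional lookup: handling situation by classification level. The sketch below is a hypothetical fragment; the level names and rules are invented for illustration and are not taken from the table.

```python
# Hypothetical sketch of a data handling standards matrix as a lookup
# table: handling situations are rows, classification levels are
# columns. All level names and rules here are invented examples.

handling_standards = {
    "email-external": {
        "Public": "Permitted",
        "Internal": "Permitted with manager approval",
        "Confidential": "Encrypted only",
        "Restricted": "Forbidden",
    },
    "usb-storage": {
        "Public": "Permitted",
        "Internal": "Encrypted drives only",
        "Confidential": "Encrypted drives only",
        "Restricted": "Forbidden",
    },
}

def handling_rule(situation: str, classification: str) -> str:
    """Look up the required handling for data in a given situation."""
    return handling_standards[situation][classification]
```

Encoding the matrix this way also gives DLP tooling something machine-readable to enforce, rather than leaving the standard only in a policy document.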
To help the workforce better understand handling standards, organizations should develop training content or tutorials to explain in detail their meanings and to introduce appropriate procedures.
The Data Handling Culture Shift
Privacy professionals in organizations introducing data classification, handling standards, training, and automation must understand that such an undertaking may represent a significant culture shift. It’s potentially a tall order to expect a workforce that formerly took data handling for granted to become data-aware and to understand and follow new procedures. Such a change does not happen overnight. Even when executives lead by example and in the presence of an internal marketing plan, many workers’ responses will range from confusion to resistance and outright evasion. It is, therefore, important for the workforce to understand the purpose of data classification.
Along with defining levels of classification, a data classification policy will define policies and procedures for handling information in various settings at these levels. For instance, a data handling standard will state the conditions at each level in which sensitive information may be e-mailed, faxed, stored, transmitted, and shipped. Note that some methods for handling may be forbidden—such as e-mailing a registered document over the Internet.
Relying on an organization’s workers to apply data handling standards consistently is chancy at best—not because of the lack of good intentions, but because workers simply will not have safe data handling on their minds all of the time. This situation is not unlike workers who click a link on the occasional phishing message despite having attended effective security awareness training. People are simply not “on their guard” all of the time.
Data loss prevention (DLP) systems can greatly aid in the effort to provide visibility and even control over the use of personal and other sensitive information. Approaches to the implementation of DLP capabilities fill the remainder of this section.
Static DLP Static DLP tools scan static data stores to identify files containing data matching specific patterns. Most often, static DLP scanning is performed on file servers—both the on-premises and cloud varieties. DLP scanning can also be performed on database management systems, either by scanning specific tables or by scanning flat-file exports of databases.
Organizations undertaking static DLP scanning for the first time may find an abundance of files containing PII and other sensitive data. Privacy managers need to keep in mind that such data may have been accumulated over a long period of time, and it may represent current practices or former activities that are no longer practiced.
A careful analysis of the results of an initial DLP scan should be undertaken to determine the following:
• The age of files containing PII found in file stores
• The extent to which files containing PII are still being deposited in file stores
• The access rights of files containing PII
• Which users actively access the files (available in some DLP static scanning tools)
• Whether current use is following sanctioned policies, procedures, and practices
Privacy managers should not be overly hasty in the quest to “solve” any or all of the discovered instances of PII in static data stores. Some uses may be a part of key business processes along with adequately restricted access controls. Often, privacy managers will find that files containing PII in file stores are the result of one-time or ad hoc activities. For instance, a business analyst may be asked to perform a research task on the demographics of customers; the business analyst would perform a query or run a report on the customer relationship management (CRM) system, export the report to a spreadsheet, and save the spreadsheet on a file server. After completing the task, the business analyst will keep the file there in case questions are asked about it later. Soon the existence of the spreadsheet containing PII is forgotten, and there it will reside in perpetuity unless some purge, cleanup, or DLP scan is performed that discovers its existence.
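The pattern matching at the heart of static DLP scanning can be illustrated with a toy example. Real products use far more robust detection (check-digit validation, context keywords, proximity rules); the regular expressions below are deliberately simplified assumptions.

```python
# Toy illustration of the pattern matching used by static DLP
# scanners. These simplified regexes are assumptions; commercial
# tools validate far more carefully to reduce false positives.
import re

PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
}

def scan_text(text: str) -> dict[str, int]:
    """Count pattern matches in a blob of text, per PII type."""
    return {name: len(pat.findall(text)) for name, pat in PATTERNS.items()}

sample = "Customer 123-45-6789 paid with 4111 1111 1111 1111."
hits = scan_text(sample)
```

Note how easily such patterns can misfire: any 16-digit string (an order number, for example) would match the card pattern, which is exactly the false-positive problem discussed later under dynamic DLP.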
Data Tagging Through a process similar to static DLP analysis, data files can be tagged, or marked in some way, if they are found to contain PII or other sensitive information. Such tagging can take on several forms, including these:
• Metadata tagging The metadata of a data file can be updated to include a specially coded tag that will be recognized by DLP tooling.
• Watermarking Visible or invisible watermarks can be added to data files.
• Marking According to data file-marking policy, a human-readable word or phrase can be added to the header, footer, or other location in a data file (such as “XYZ Company restricted to internal use only”).
Including human-readable watermarks or other markings on documents provides a visual reminder to workers using data files about the sensitivity of the files. The main purpose of including machine-readable marks or tags in a data file is to facilitate appropriate action by dynamic DLP tools, discussed next.
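Conceptually, machine-readable tagging pairs a file with a marking that downstream DLP tooling can recognize. The sketch below tracks tags in a simple registry for illustration; real tools write the tag into the file’s own metadata, and the tag value shown is invented.

```python
# Hypothetical sketch of machine-readable data tagging. Here the tag
# lives in a registry keyed by file path; production tools embed the
# tag in file metadata. The tag value is an invented example.

SENSITIVE_TAG = "XYZ-RESTRICTED"

tag_registry: dict[str, str] = {}

def tag_file(path: str, contains_pii: bool) -> None:
    """Record a machine-readable tag for files found to contain PII."""
    if contains_pii:
        tag_registry[path] = SENSITIVE_TAG

def is_tagged(path: str) -> bool:
    """Check whether dynamic DLP tooling should treat this file as sensitive."""
    return tag_registry.get(path) == SENSITIVE_TAG
```

The payoff of tagging is speed and reliability: a dynamic DLP tool can check for a tag instantly rather than re-reading and re-classifying a file’s contents at the moment of transfer.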
Dynamic DLP Dynamic DLP represents a variety of technologies used to detect and even intervene in the transfer of PII and other sensitive information. Dynamic DLP tools can take the form of network devices or tools running on operating systems with the ability to observe data in motion in many different circumstances. These tools can be configured to identify the specific sensitivity of data files in motion by reading their contents to determine whether they contain specific PII or other sensitive information, or they can be configured to look for previously applied tags.
There are several common forms of dynamic DLP:
• E-mail DLP DLP tools can examine the contents of an outgoing e-mail message to determine whether it contains specific sensitive information. E-mail–based DLP will consider whether the information is being sent to internal or external recipients.
• USB storage control Host-based DLP tools restrict USB usage to company-approved (and usually encrypted) USB drives only, or they block USB storage entirely.
• Local file storage control Host-based agents observe and optionally block actions violating policy, such as local storage of highly classified documents.
• File server storage control Host- or server-based agents observe and optionally block actions violating policy, such as storage of highly classified documents on shares with broad access.
• Cloud server storage control DLP capabilities in cloud storage services are configured to mimic local or file server DLP controls to monitor and optionally block actions violating policy.
• Network DLP Network devices, or DLP modules in next-generation firewalls, observe the content of data in motion.
Most of these forms of dynamic DLP, when operating in the context of an interactive user, can respond in one of the following ways:
• Silently note the occurrence.
• Block the action and inform the user.
• Warn the user that the intended action is forbidden by policy, but permit the user to complete the action anyway after entering a business justification.
Other tools may be used to assist in dynamic DLP efforts:
• Firewall Blocks access to/from specific networks or systems
• IDS/IPS Blocks access to/from networks and systems thought to be hazardous
• Web content filter Blocks browser access to sites based on policy
• Cloud access security broker (CASB) Monitors and controls access to cloud-based service providers based on organization policy
• NetFlow Monitors network traffic and produces alerts when anomalous traffic is seen on the network
Dynamic DLP controls, in any of the forms discussed, can be highly valuable for preventing the mishandling of sensitive information. Unfortunately, dynamic DLP is also adept at interfering with legitimate business processes by blocking activities approved by management (including the privacy or security manager). Legitimate activities are blocked either because the DLP system is misconfigured or because of a “false positive” situation where the DLP system misidentifies data. Examples of such false positives include strings of numerals that are mistaken for social insurance numbers, bank account numbers, phone numbers, or credit card numbers.
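One common way to reduce false positives of the kind just described is to validate candidate matches against a checksum before flagging them. The sketch below, which is illustrative and not tied to any particular DLP product, applies the standard Luhn checksum used by payment card numbers so that phone numbers and other numeric strings are not misidentified as cards:

```python
import re

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    checksum = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def probable_card_numbers(text: str) -> list[str]:
    """Find 13- to 16-digit strings that also pass the Luhn check."""
    candidates = re.findall(r"\b\d{13,16}\b", text)
    return [c for c in candidates if luhn_valid(c)]

# "4111111111111111" is a well-known Luhn-valid test number; the
# phone-number-like string fails the checksum and is not flagged.
print(probable_card_numbers("Card 4111111111111111, ref 2065551234567"))
```

A checksum alone does not eliminate false positives, but it filters out the large class of arbitrary digit strings that happen to match a length pattern.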
Run in Learn Mode First Despite what experienced users of DLP systems or DLP vendors themselves may say, it is strongly recommended that any dynamic DLP system first be run in “learn” mode for an extended period of time. The best way to do this is to configure the DLP system to silently log occurrences that are thought to be file-handling policy violations. This enables privacy and security personnel to see whether the DLP system properly identifies the actual movement of private and other sensitive information.
Another highly useful benefit of running DLP in learn mode is similar to that of running static DLP scans: to determine what data movement currently takes place in the organization to help personnel better understand existing business processes, as well as those occasional (hopefully infrequent) actions that represent actual violations of policy.
When privacy and security personnel become confident in their dynamic DLP system’s ability to identify sensitive data movement correctly without false positives, they can begin activating the preventive actions performed by the DLP system. Organizations should proceed slowly as they build confidence in the proper operation of the system. Additionally, once a DLP system is in preventive mode, there is an expectation that alerts corresponding to blocked actions will be properly investigated and addressed.
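The learn-then-enforce progression can be modeled as a per-rule mode setting. The structure below is a hypothetical sketch (rule names and fields are illustrative, not taken from any real product): every rule starts in monitor mode, which only logs matches, and each rule is individually promoted to block mode after its hits have been reviewed.

```python
from dataclasses import dataclass, field

@dataclass
class DlpRule:
    name: str
    mode: str = "monitor"                       # every rule starts in learn mode
    hits: list = field(default_factory=list)

    def evaluate(self, event: str, matched: bool) -> str:
        """Log every match for review; block only if the rule was promoted."""
        if not matched:
            return "allow"
        self.hits.append(event)                 # always record the occurrence
        return "block" if self.mode == "block" else "allow"

rule = DlpRule("ssn-in-outbound-email")
print(rule.evaluate("mail-1", matched=True))    # learn mode: logged, allowed
rule.mode = "block"                             # promote after the review period
print(rule.evaluate("mail-2", matched=True))    # enforcing: blocked
```

The key design point is that logging is unconditional, so the review record is the same in both modes; only the enforcement decision changes.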
Develop Response and Exception Procedures Before activating any rules in a dynamic DLP system that will block file handling actions, privacy and security personnel—preferably in cooperation with the IT service desk—should develop a playbook of response procedures for when end users encounter DLP systems blocking their intended actions. Rather than be caught by surprise, IT service desk personnel (or others designated to work with end users) must have clear procedures in place for handling users who insist that the DLP system is blocking legitimate actions. Often, IT service desk personnel will need to contact privacy or security personnel who can look into these matters to determine the best course of action. The response will often require that a privacy or security manager approve exceptions and direct the configuration of the DLP system to permit specific activities that are contrary to policy. Recordkeeping of all such exceptions is important, whether as part of an existing change control, incident management, or policy exception process, or as part of another process.
Data Security Has Arrived
As far back as computers have existed and have been used to store and process personal information, data security has been on the minds of management. For decades, data security has come in the form of firewalls and other network access controls, system hardening, event monitoring, and other measures—most of which protected the information enclave while virtually ignoring access to and use of personal information.
Emerging privacy laws, as well as emerging capabilities such as DLP, have created the first real opportunity for organizations to enact actual data security capabilities focused on the data itself instead of merely keeping intruders out and preventing workers from removing data. There is much progress yet to be made, but the capabilities exist today for organizations to focus specifically on data they intend to protect and manage.
One could say that privacy laws are “sunshine laws” that pertain to organizations’ use of personal information. They have sparked lively debate in organizations that were accustomed to doing “whatever they wished” with personal information they collected from customers and others, with little or no scrutiny. Indeed, prior to modern privacy laws, organizations had little accountability regarding creative uses of the personal information they held. Privacy leaders often face an uphill battle as they compel organizations to rein in and control uses of personal information.
Organizations need to include governance structures to provide visibility and control on the topic of data usage. Such a structure would include the following:
• Internal policies stating permitted use of personal information
• Establishment of controls to ensure these outcomes
• Monitoring of these controls to verify their effectiveness
• Corrective action to remedy all deviations
• Metrics that measure all of the foregoing
The following sections discuss various privacy principles that privacy leaders need to transform into intentional practices.
A key tenet of the GDPR and other privacy laws is the concept of limiting the use of personal information. Following numerous abuses of PII by private organizations, privacy laws now restrict how organizations can use the personal information they collect.
Article 5(1)(b) of the GDPR reads,
Personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes; further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes (“purpose limitation”).
Title 1.81.5 of the California Civil Code, which contains the CCPA, states,
A business that collects a consumer’s personal information shall, at or before the point of collection, inform consumers as to the categories of personal information to be collected and the purposes for which the categories of personal information shall be used. A business shall not collect additional categories of personal information or use personal information collected for additional purposes without providing the consumer with notice consistent with this section.
Privacy leaders face challenges on the topic of data use limitation. Organizations that are accustomed to a freewheeling “we do anything we want with customer information” attitude are now seeing constraints in the form of privacy laws and changing social norms. In response, organizations often establish data governance councils or other bodies to oversee the uses of personal information and ensure that the organization doesn’t run afoul of privacy laws or its own stated privacy policies.
The concept of data minimization refers to the practice of collecting and retaining only those specific data elements necessary to perform agreed-upon functions. In other words, organizations should be careful to collect or accept only those specific PII details required to perform whatever services they provide.
Historically, organizations have retained business records containing personal information for long periods of time. Privacy laws counter this tendency, which leads many privacy leaders to enact changes in organizational processes and systems to minimize the collection and retention of personal information.
In the same way that double-entry accounting describes each item on a balance sheet as both asset and liability, personal and sensitive information can provide value to an organization as an asset, but it also represents a liability. Although the asset value aspect of personal and sensitive information may be clear, organizations are slower to realize that accumulating and retaining personal and sensitive information also represents a liability.
Unfortunately, the financial liability portion of retained personal information rarely shows up on an organization’s financial balance sheet. And yet it is indeed a liability: the impact upon an organization if cybercriminals steal that information or if the information is misused is real, in the form of breach response costs, the costs related to reducing harm inflicted on affected parties (think of credit monitoring services that are a frequent remedy for stolen credit card numbers), the costs of fines from governmental regulators, and the occasional class-action lawsuit.
Data minimization has multiple dimensions:
• Collect only required data items.
• Collect only required records.
• Retain only as long as is needed.
• Pseudonymize or anonymize as soon as possible.
• Reduce accessibility.
These and other concepts are discussed in the remainder of this section.
Collecting Only Required Data Items When collecting personal information directly from data subjects, organizations should collect only data items that are required for the organization to fulfill the intended purpose for the collection. Every item collected must be rationalized and the reason for collecting it documented. Data items that cannot be justified as necessary should not be collected.
Any data item proposed to be collected that does not have a present business purpose should not be collected. Such collection would introduce liability to the organization with no corresponding benefit. For example, suppose an e-commerce company that sells books develops a customer portal where a customer can select favorite categories of books and save a shipping address so that the customer does not need to reenter them for each order. An analyst proposes that the organization collect the date of birth (including year) for each customer so that a birthday discount code can be sent during the month of their birthday. The privacy officer successfully argues that this function requires only the month of the customer’s birth, not the day or year. The organization decides not to collect the birth year and day for its customers because no identified purpose requires it (such as selling adult-only merchandise).
Though in the preceding example a valid decision was made, more is required. Organizations need to create a business record in the form of a detailed inventory of data items collected, including the collection purpose(s). This way, months or years later, a privacy professional can examine business records and understand the reasons for specific data collection decisions without relying upon the memory of people who may not even be employed in the organization any longer.
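A data collection inventory of this kind can be as simple as one structured record per data item, capturing the purpose and the rationale at the time of the decision. The field names below are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CollectedItem:
    field_name: str        # the data item collected
    purpose: str           # why it is collected
    justification: str     # the documented rationale for the decision
    approved_by: str
    approved_on: date

inventory = [
    CollectedItem("email_address", "order confirmation",
                  "required to send receipts", "privacy officer",
                  date(2023, 4, 1)),
    CollectedItem("birth_month", "birthday discount campaign",
                  "day and year not needed for a monthly offer",
                  "privacy officer", date(2023, 4, 1)),
]

# Years later, the rationale for each item is still on record:
for item in inventory:
    print(f"{item.field_name}: {item.purpose} ({item.justification})")
```

Keeping the justification alongside the item means the record answers the "why" question without relying on anyone's memory.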
Data subjects who provide personal information bear some responsibility as well. Data subjects should be aware of the information they provide to an organization or government and withhold any information they believe is unnecessary for the organization to fulfill the intended purpose. For example, most e-commerce sites should not require a data subject’s date of birth to complete transactions (the sale of products prohibited for minors is one exception; applying for a credit card is another). If an e-commerce site requests a date of birth, a data subject should avoid providing it unless some clear purpose is stated that the data subject agrees with.
Collecting Only Required Records Organizations that acquire personal information in bulk (such as purchasing from a data broker) should collect only those records required to fulfill the intended business purpose. The collection of unneeded records brings only liability to an organization, since unneeded records provide no value or benefit. If, for example, an intruder breaks in and steals this information, the organization that collected the data may need to make reparations to all persons whose records were collected.
Organizations that purchase data about citizens in bulk must ensure that they obtain only the records they need to meet their business objectives. Sometimes, only large datasets are available when just certain records are needed. In such cases, organizations should remove unneeded records as soon as it’s practical to do so.
On a record-by-record basis, organizations should offer data subjects choices that limit what is collected. One example is e-commerce: many online shopping sites permit customers to “check out as a guest,” where buyers provide only enough information to complete the purchase, instead of being required to create a persistent user account that entails collecting additional information such as a credit card number, billing address, shipping address, and other details. For customers who opt for guest checkout, the organization collects only what is necessary to complete the transaction.
Discarding Data When No Longer Needed Organizations collecting and retaining personal data should understand their uses of personal data fields to determine whether long-term retention is appropriate. For instance, an e-commerce web site accepting a credit card payment may collect the CVV (card verification value) from the customer and discard it as soon as the transaction has been approved or rejected. Or a web site providing services to adults who are 21 or older may require evidence of the customer’s age before the customer can conduct business. This practice may involve the uploading of a government-issued ID or some other identification means. Once the organization has verified that the subject is at least 21 years old, the organization can discard the PII collected to prove the subject’s age and simply indicate that their age has been verified.
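The CVV example can be sketched as follows. The payment call here is a hypothetical placeholder, not a real gateway API; the point is simply that the CVV is used for the authorization decision and never reaches the stored order record:

```python
def authorize_payment(card_number: str, cvv: str) -> bool:
    """Hypothetical stand-in for a payment gateway call."""
    return len(cvv) in (3, 4)      # placeholder approval logic only

def process_order(order: dict) -> dict:
    cvv = order.pop("cvv")         # remove the CVV from the order record
    order["approved"] = authorize_payment(order["card_number"], cvv)
    del cvv                        # nothing retained after the decision
    return order                   # the stored record contains no CVV

record = process_order({"card_number": "4111111111111111", "cvv": "123"})
print("cvv" in record, record["approved"])
```

The same pattern applies to age verification: the uploaded ID is inspected, a boolean "age verified" flag is stored, and the ID itself is discarded.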
Minimizing Access Data minimization is all about risk reduction by limiting the amount of data available for various functions. In risk-speak, data minimization reduces the impact of improper data usage or a breach of personal information. An organization can achieve effective data minimization through access controls in several ways:
• Reduce access volume. Organizations can limit the number of records that a worker can access, extract, or download in bulk operations. In B2C (business-to-consumer) organizations, few persons in the organization genuinely need access to the entire customer database, and access can be limited by various means.
• Reduce the number of personnel with data access. Organizations can limit access to customer data to those workers whose jobs require it.
• Reduce access to sensitive fields. Organizations can limit the data fields accessible by their workers. For instance, if a worker doesn’t need to access customers’ birthdays or full credit card numbers, their ability to see these or other sensitive fields should be reduced.
Data masking can be used to protect the contents of personal information from personnel who do not need to see it. A typical example is the display of credit card or social insurance numbers. Although an information system may store the full contents of these values, some of the characters can be masked so that personnel cannot see them. Most often, programs will display only the last four digits of a credit card or social insurance number.
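A masking routine of the kind just described can be a few lines of code. This is a minimal sketch; production systems typically apply masking in the presentation layer or database, not in application code:

```python
def mask_pan(pan: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last few digits of a card or account number."""
    digits = pan.replace(" ", "").replace("-", "")
    return mask_char * (len(digits) - visible) + digits[-visible:]

print(mask_pan("4111 1111 1111 1111"))   # only the last four digits remain visible
print(mask_pan("123-45-6789"))           # works for SSN-style numbers as well
```

Note that masking is a display control only: the full value is still stored, so it complements rather than replaces access controls on the underlying field.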
Minimizing Storage Organizations can significantly enhance the security of sensitive and personal information by limiting where and how workers can store such data. When workers work with downloads or extracts from business applications, organizations can enact controls that limit where that data can be stored. Primarily, organizations should limit the storage of personal information about their customers, constituents, and employees to organization-managed systems.
Organizations can enact policies that state that all sensitive and personal information must be stored only on organization file servers, and not on laptop or desktop computers, mobile devices, removable storage devices, or personal cloud-based storage services. Further, sensitive and personal information can be blocked from being sent via company or personal e-mail. These controls are typically implemented with DLP controls, but web content filtering systems and CASB capabilities can supplement DLP solutions.
The ultimate objective is to limit the storage of personal information to locations permitted by policy and controlled through tooling. Privacy and security professionals need to tread carefully, however, to prevent disruption of sanctioned business processes. This topic is explored further in the section “Developing and Running Data Monitoring Operations,” later in this chapter.
Minimizing Availability A common approach to minimizing the availability of information is to migrate data on widely accessed systems to archival systems that few personnel can access. For example, suppose a regional hospital’s patient care and billing systems contain medical records and billing records for the past 15 years. To reduce the risk of exposure, the hospital migrates all records more than two years old to an archival system that few hospital personnel can access. This permits the hospital to comply with minimum data retention requirements while reducing risks associated with hospital personnel having access to large volumes of medical and financial information. Further, since the data archival system is accessible only from internal networks and by few personnel, there is a correspondingly lower risk of a break-in by intruders.
Organizations can implement additional controls to further protect the archival system and reduce risk, including DLP, end user behavior analytics (EUBA), and NetFlow.
Minimization Through Retention Practices Organizations are accustomed to retaining data for very long periods, often in perpetuity. For generations, the risks associated with long-term data retention have been quite low. Digital transformation has changed all of that: datasets that contain more details about data subjects (more fields with sensitive information, and more records) are considered high-value targets by cybercriminal organizations. Privacy professionals often call highly sensitive information “toxic data,” since its theft can have dire consequences on the organization.
Although the accumulation of sensitive data may bring value to the organization, it increases liability as well: the theft of a large trove of sensitive data will incur greater costs than the theft of a smaller dataset. For example, suppose each of two similar e-commerce organizations has about 5 million customers. One of the organizations keeps only two years’ worth of transaction data and moves dormant customer data to an offline storage system. The other organization keeps all customer data online, even for customers who have not patronized the organization for years. If each organization’s customer database was stolen, the organization that reduced its customer database size would incur fewer costs than the organization that kept all of its customer data online.
The approach to data retention involves the development of a data retention schedule, a chart that specifies the minimum and maximum periods that specific types or sets of data will be retained by the organization. A data retention schedule is considered policy, and organization departments are expected to comply. Security and privacy personnel may perform reviews or audits periodically to determine whether the organization complies with its data retention policy, and corrective actions may result when violations are found.
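A data retention schedule can be represented as a simple policy table that systems consult when deciding whether a record may, or must, be purged. The categories and periods below are illustrative assumptions only, not recommended retention values:

```python
from datetime import date

# Illustrative schedule: category -> (minimum, maximum) retention in days
RETENTION_SCHEDULE = {
    "customer_transaction": (365 * 2, 365 * 7),
    "marketing_contact":    (0,       365 * 2),
    "security_log":         (365,     365 * 3),
}

def purge_eligible(category: str, created: date, today: date) -> bool:
    """A record may be purged once its minimum retention period has elapsed."""
    minimum_days, _ = RETENTION_SCHEDULE[category]
    return (today - created).days >= minimum_days

def purge_required(category: str, created: date, today: date) -> bool:
    """A record must be purged once its maximum retention period has elapsed."""
    _, maximum_days = RETENTION_SCHEDULE[category]
    return (today - created).days > maximum_days

today = date(2024, 1, 1)
print(purge_eligible("marketing_contact", date(2023, 6, 1), today))  # True
print(purge_required("marketing_contact", date(2021, 6, 1), today))  # True
```

Expressing the schedule as data rather than scattered logic makes it auditable: reviewers can compare the table directly against the written retention policy.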
Enacting the purging of older records is not always easy. Several challenges may present themselves:
• Database referential integrity The design of relational databases may make the prospect of removing records a bit tricky. A record (or row) cannot be removed if a row in another table references it through a foreign key; the referencing row must first be modified or removed. This restriction is known as referential integrity. It is typically a problem in older databases that were not designed with data retention in mind.
• Commingling of data Some storage media cannot be modified once created. For instance, magnetic tape is an “all-or-nothing” medium; it is impossible to remove specific data from a magnetic tape while retaining other data. If, for example, a system backed up to magnetic tape contains specific data that must be purged after two years, along with other data that must be retained for ten years, storing all of this data on magnetic tape creates a conflict, and the organization will not be able to conform to both retention requirements.
• Unstructured data Because data can be extracted and further manipulated on users’ workstations, it can become difficult to know whether individual workbooks contain information that has exceeded its retention period. The date stamp on a workbook does not indicate the transaction dates of rows in the workbook, which could be recent or quite old. Variations in the ways that data can be represented in workbooks make it infeasible to enforce transaction-level data retention effectively in unstructured file stores.
• Third parties Organizations outsourcing business applications to third parties (through platform as a service or software as a service models) may find that one or more of the third parties cannot remove older records from their systems. They may have referential integrity issues in their databases, or they may simply lack the tools to remove older records from selected customers’ databases. This kind of situation often arises when an organization, after selecting and using third-party applications, enacts data retention policy only to discover that one or more third parties cannot comply.
• E-mail In some organizations, workers send sensitive and personal information to one another via e-mail. Searching for and removing specific e-mail messages that contain sensitive information may not be feasible; this problem is similar to the unstructured data problem discussed earlier. Removing all older e-mail messages may be a viable approach. Still, the organization needs to fully understand the nature of the organization’s use of e-mail so that purging older e-mail messages does not introduce unintended consequences such as the destruction of other records that should be retained for longer periods.
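The referential integrity challenge described in the list above can be demonstrated with a small in-memory database. The table names here are illustrative; note that SQLite enforces foreign keys only when the pragma is enabled:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
db.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id))""")
db.execute("INSERT INTO customer VALUES (1)")
db.execute("INSERT INTO orders VALUES (10, 1)")

try:
    db.execute("DELETE FROM customer WHERE id = 1")   # an orders row points here
except sqlite3.IntegrityError as e:
    print("blocked:", e)                              # purge attempt is refused

db.execute("DELETE FROM orders WHERE customer_id = 1")  # remove referencing rows
db.execute("DELETE FROM customer WHERE id = 1")         # now the purge succeeds
```

This is why retention-driven purges of older databases often require deleting (or reassigning) dependent rows first, in the correct order.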
Data retention does not require an “all-or-nothing” approach that requires the organization to delete all older records. Instead, organizations can define storage locations for business records that can be tightly managed and establish a generic retention schedule for other storage locations. Additionally, records that have reached their expiration date can be pseudonymized or anonymized, thereby removing personal information from records but retaining other aspects of the records for historical purposes. Pseudonymization and anonymization are discussed in the next subsection.
Minimization Through De-identification Depending upon the purpose of acquiring personal information, organizations can consider de-identification as a method for minimizing the amount of PII that they retain. For instance, if an organization acquires personal information for statistical purposes, it can pseudonymize or anonymize the records to remove their association with specific persons, while still providing statistical value.
From the perspective of privacy, correctly implemented de-identification is as effective as the outright removal of records, and the organization continues to derive analytical value from the de-identified data. For instance, after a database of user transactions has been de-identified, the organization won’t know precisely who performed individual transactions, but it will still be able to understand the details of sales trends and other big-picture insights.
Two primary techniques are used in de-identification: pseudonymization and anonymization. While these techniques differ, the results are similar.
Pseudonymization Pseudonymization is the substitution of data in sensitive data fields with alternate values, dissociating data records from specific persons. Article 4 of the GDPR defines pseudonymization as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.” Here, specific identifying fields such as name, address, phone number, e-mail address, and financial account numbers are removed and replaced with pseudonyms. Pseudonymization is generally a reversible substitution technique: field values that identify actual persons are replaced with pseudonym values. Here are some examples:
• Peter Gregory becomes Qoem Rebnurvo
• 118 Elm Street becomes 539 Tlo Uepv
The substitution technique permits software to function correctly, while the substitutions eliminate the record’s association with the actual person. Pseudonymization differs from anonymization in that pseudonymized data may still enable an individual’s records to be singled out and linked across different datasets.
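Pseudonymization as defined above requires the re-identification key to be kept separately from the pseudonymized data. A minimal sketch follows; the in-memory dictionary here stands in for what would, in practice, be a separately secured mapping store:

```python
import secrets

class Pseudonymizer:
    """Replace identifying values with random tokens; keep the mapping apart."""
    def __init__(self):
        self._mapping = {}        # must live in a separately secured store

    def pseudonymize(self, value: str) -> str:
        token = "pn_" + secrets.token_hex(8)
        self._mapping[token] = value
        return token

    def reidentify(self, token: str) -> str:
        return self._mapping[token]   # possible only with access to the mapping

p = Pseudonymizer()
record = {"name": p.pseudonymize("Peter Gregory"), "amount": 49.95}
print(record["name"].startswith("pn_"), p.reidentify(record["name"]))
```

Note one design choice: this sketch issues a fresh token on every call, so the same person receives different pseudonyms in different records. Systems that must keep a person's records linkable would instead issue one stable token per value.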
Anonymization Anonymization is the process of irreversibly altering or removing sensitive data fields from records so that an individual can no longer be identified directly or indirectly. ISO 25237 (Health Informatics – Pseudonymization) defines anonymization as any “process by which personal data is irreversibly altered in such a way that a data subject can no longer be identified directly or indirectly, either by the data controller alone or in collaboration with any other party.” PII fields can be removed or hashed so that the data cannot be associated with specific data subjects. Anonymization can involve a simple removal technique in which data fields that could associate a record with a specific person are removed. Here are some examples:
• Peter Gregory becomes (blanks)
• 118 Elm Street becomes (blanks)
• An e-mail address becomes (blanks)
Note that anonymization may cause software to behave in unexpected ways. Further, database management systems may resist anonymization, because this could endanger referential integrity. For anonymization by removing field data to work correctly, it may be necessary to copy records in a database to a separate database whose structure lacks the removed fields.
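Anonymization by removal can be sketched as copying records into a reduced structure that simply lacks the identifying fields (the field names and values below are illustrative). Note that one-way hashing of identifiers is sometimes used instead of removal, but an unsalted hash of a guessable value such as an e-mail address can often be reversed by trial, so removal is the safer default:

```python
IDENTIFYING_FIELDS = {"name", "address", "email"}

def anonymize(record: dict) -> dict:
    """Irreversibly drop identifying fields; keep only analytical fields."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}

record = {"name": "Peter Gregory", "address": "118 Elm Street",
          "email": "peter@example.com", "purchase_total": 84.50}
print(anonymize(record))
```

Copying into a separate, reduced structure mirrors the advice above about databases: anonymizing in place can break referential integrity, whereas an export to an analytics dataset that never had the identifying columns avoids the conflict.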
In the context of data privacy, data quality or data accuracy is a gauge of the care that an organization places on the fidelity of its stores of personal data. Since the reality of data usage includes personal information being passed from organization to organization, the task of maintaining the accuracy of PII is an important one. Privacy leaders building and implementing privacy programs need to determine where in an organization’s business processes and information systems the means to ensure data quality and accuracy reside.
In addition, the organization must ensure that, at a minimum, the quality and accuracy of stored data complies with the requirements of applicable privacy laws. Article 5(1)(d) of the GDPR, for example, reads, “Personal data shall be accurate and, where necessary, kept up to date; every reasonable step must be taken to ensure that personal data that are inaccurate, having regard to the purposes for which they are processed, are erased or rectified without delay (‘accuracy’).” Once just a good idea, data accuracy is now required by law.
Data quality and accuracy are more than just the completeness and accuracy of data fields for data subjects; they also include whether records for specific data subjects should even reside in an organization’s database at all. Data subjects’ information sometimes ends up in an organization’s database simply by accident, often because of a matching error. Here are some examples:
• Matches by name Some organization databases key off a subject’s name only, resulting in snafus of every sort. In a real-life instance, this book’s author and another person of the same name were in the Seattle job market at the same time, applying for some of the same jobs. Communications between companies and the two applicants were frequently crossed up.
• Matches by characteristic Some organizations use ancillary information to associate people. In a real-life instance from decades ago, parking tickets in Reno, Nevada, were entered into a computer system; if there was no license plate on the offending vehicle, the word “none” was entered. After a citizen ordered a vanity plate that read, “NONE,” he was charged tens of thousands of dollars in unpaid parking tickets that had been issued over a period of many years (all predating the existence of the issued vanity plate).
• Data entry errors These are certainly the most common reason that things get fouled up. With literally billions of people using e-mail today, countless errors occur because e-mail addresses are miskeyed, resulting in messages being sent to the wrong persons. This book’s author regularly receives e-mails intended for a physician located in the US Northeast, as well as e-mails intended for the owner of a private jet aircraft maintenance company in Southeast Asia. This problem goes way beyond e-mail addresses: miskeying dates of birth and other personal characteristics can result in communications and records being crossed up, as well as many intended actions not being carried out because some of the information is incomplete or incorrect.
Prior to the introduction of modern privacy laws such as GDPR, many private sector organizations had little reason to care about the accuracy of personal information, unless the accuracy had a direct monetary impact on them. For instance, an automobile manufacturer’s database of vehicle owners is surely going to have incorrect mailing addresses for customers who move and do not inform the manufacturer. The result is twofold: marketing materials sent by mail no longer reach the customer, and neither do safety recall notices, for which the manufacturer bears a per-vehicle repair cost.
The efforts undertaken to build and maintain data inventories do not end when it is known where all data is stored, who the owners are, and what access permissions are granted. An essential aspect of data inventory is the knowledge of two additional data characteristics: data flow and data usage.
Understanding data flow requires a deeper study of the information systems where data resides to understand how data arrives in the system and where data is sent from the system. Building a complete picture requires interviews with business users about the business process, as well as with IT personnel at the application, system, and network layers. The term “picture” is used deliberately: it’s often useful to build a visual schematic of data flows, as this can help privacy and security professionals, as well as business leaders, better understand the usage of personal information in an organization. A data flow diagram (DFD), like the one shown in Figure 2-1, is a visual depiction of the flow of information between systems.
Figure 2-1 A high-level DFD depicts general data flow between IT applications.
In this effort, privacy professionals need to understand that data flow implies usage, and data usage implies flow. Where one is found, the other is usually found as well.
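A DFD’s content can also be captured in machine-readable form. The sketch below models flows as simple records (all system and data names are hypothetical) and answers the kind of question a privacy analyst asks, such as which systems receive a given category of personal data:

```python
# Minimal machine-readable data flow inventory. System names and data
# categories are hypothetical, for illustration only.
from collections import defaultdict

# Each flow: (source system, destination system, data categories carried)
FLOWS = [
    ("web_storefront", "order_db", {"name", "address", "payment_card"}),
    ("order_db", "fulfillment", {"name", "address"}),
    ("order_db", "analytics", {"order_history"}),
    ("web_storefront", "marketing_crm", {"email"}),
]

def systems_receiving(category):
    """Return every system that receives a given data category."""
    receivers = set()
    for _, dest, categories in FLOWS:
        if category in categories:
            receivers.add(dest)
    return receivers

def outbound_map():
    """Build a source -> destinations map, the skeleton of a DFD."""
    graph = defaultdict(set)
    for src, dest, _ in FLOWS:
        graph[src].add(dest)
    return graph
```

An inventory like this can be rendered into a visual DFD with standard graphing tools, keeping the diagram and the underlying records in sync.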
Discovering Data Flow and Usage
Prior to the passage of privacy laws, including the GDPR and the CCPA, many organizations simply had no idea of the extent of the movement of their data. In many organizations, a common first step in working toward GDPR compliance was the development of a data inventory, along with the creation of data flow diagrams and the discovery of data usage. Amazing as it sounds, in many organizations, nobody was responsible for knowing these things.
The next step in most organizations is the development of data governance, providing management with visibility and control over the storage and use of personal information. It’s as though there were no real rules for the management of personal information prior to GDPR.
Mainstream organizations have access to advanced data management and data analytics capabilities that can provide them with additional insight into their businesses. Indeed, monetization is a primary impetus for an organization to mine its own data to improve how it can exploit its customers’ buying preferences to increase revenues. Data analytics techniques have other purposes as well, including the discovery of new potential customers and improved insight into the use of an organization’s products and services.
Marketing departments in organizations frequently embark on targeted marketing and advertising, whether it’s by e-mail, postal mail, online ads, or other means. To reach their target markets with the right message at the right time, organizations often purchase lists of targeted individuals from data brokers and merge that data into their marketing databases. Data aggregation is the practice of combining databases to enrich available data. For instance, suppose a marketing department wants to send out flyers about a new line of luxury vehicles to wealthier persons (who are more likely to buy than those with lower incomes), and it purchases data from a data broker that includes household income and other details. The organization merges this information into its database and then selects those wealthy persons as targets for their campaign.
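The merge at the heart of data aggregation can be sketched as follows; the customer records, broker fields, and income threshold are all hypothetical, and a real marketing database would involve far more careful record matching:

```python
# Illustrative sketch of data aggregation: enriching a marketing database
# with purchased broker data keyed on a shared identifier. All record and
# field names are hypothetical.
customers = {
    "c1001": {"name": "A. Nguyen", "city": "Tacoma"},
    "c1002": {"name": "B. Ortiz", "city": "Reno"},
}

broker_data = {
    "c1001": {"household_income": 210_000, "home_owner": True},
    "c1003": {"household_income": 95_000, "home_owner": False},
}

def enrich(customers, broker_data):
    """Merge broker attributes into matching customer records."""
    merged = {}
    for cid, record in customers.items():
        merged[cid] = {**record, **broker_data.get(cid, {})}
    return merged

def select_targets(merged, min_income):
    """Select enriched records meeting a campaign threshold."""
    return [cid for cid, rec in merged.items()
            if rec.get("household_income", 0) >= min_income]
```

Note that the enrichment retains every purchased field, which is exactly how the data sprawl discussed later in this section begins.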
This sort of activity occurs far more frequently than most people realize. There exists an entire industry of organizations with vast dossiers on virtually all adults in the United States and many other countries. This data is traded, bought, sold, merged, sorted, culled, updated, and recirculated in an endless cycle. Most of this occurs in companies most people have never heard of until a breach occurs—and, even then, personal notices are rarely sent to affected parties.
Returning to the main point of data aggregation and embellishment, another activity that frequently takes place is this: organizations seeking to aggregate customer data purchase additional data from data brokers in order to add specific data to their databases. Sometimes they receive additional data fields that are also retained, resulting in the organization having more details about its customers and prospects than it really wants or needs. This phenomenon is an example of data sprawl.
Aggregation works in other ways. Organizations with large customer and prospect databases can purchase data from data brokers in attempts to keep their data up to date. For instance, a motor vehicle manufacturer can purchase data from data brokers in an attempt to obtain up-to-date mailing addresses for its customers so that its safety recalls will actually reach these people.
Citizens are most concerned about the potential for data aggregation among various government agencies. One can only imagine the abuses that could occur if data from one agency were accessible by malevolent persons in other agencies. To this end, citizens need to be aware of whether privacy laws apply to government agencies as well as private businesses. For example, while the GDPR applies both to government agencies as well as private businesses, the CCPA applies only to private businesses, and government agencies are exempt.
It is said that, similar to information security, privacy is “everyone’s job.” This means that the procedures and practices followed by all persons involved in the processing of personal information include steps to ensure that personal information is used only in officially sanctioned ways. The main function of the privacy office is to ensure this ongoing outcome.
An organization establishing a privacy office needs to define the scope of its responsibilities. In smaller organizations, the scope would typically include all business operations in all locations. In larger organizations, particularly those with a presence in one or more countries with strict privacy laws (such as the GDPR), the organization may appoint local privacy personnel in each local country. This can help better align local business operations with local laws and requirements.
Before the privacy office can begin enforcing privacy-related activities in an organization, it must first identify and document requirements that define the specifics regarding the collection, protection, and use of personal information.
Culture and Values At the risk of implying a “motherhood and apple pie” sentiment, it’s necessary to start with an understanding of the organization’s culture and stated values. Culture and values define the personality and uniqueness of an organization; the way that the organization values its assets, including the personal information of its customers, constituents, and employees, should be reflected in its policies and requirements.
Applicable Regulations All regulations that apply to the organization need to be identified. This includes industry-specific regulations such as GLBA and HIPAA, as well as geographically related regulations such as PIPEDA, CCPA, CPRA, and GDPR. These and other regulations are cited and described earlier in this chapter.
Privacy regulations are developing and changing at high velocity. Organizations need to have an established system in place to keep them informed about new and changing regulations on the topics of information privacy and cybersecurity. Cybersecurity professionals and lawyers, for example, have particular industry news sources; at present, the best sources appear to be newsletters and paid subscriptions from legal sources that keep their subscribers up to date on new laws and related developments. It is recommended that organizations keep an official inventory of applicable laws and regulations related to cybersecurity and privacy; depending upon the organization’s industry sector, this inventory may extend to other industry-specific topics as well.
Legal Interpretation It’s a wise practice to employ internal or outside legal counsel to provide an interpretation of privacy and cybersecurity laws. Legal counsel experienced in these fields should first provide guidance on the applicability of these laws. For laws deemed applicable, legal counsel should guide the organization on the meaning of applicable portions and how they should be implemented.
As the de facto risk officer, an organization’s legal counsel is responsible for identifying legal and regulatory requirements and risks. It is in this capacity that legal counsel determines which laws are applicable and what the organization should do to comply with them. Like individuals in other professions, legal counsel will often confer with their industry peers and outside experts to get an idea of the consensus of opinion on the applicability and compliance approach to new and existing laws.
Cybersecurity Policies, Requirements, and Regulations As is often cited in this book, it’s impossible to implement privacy successfully without also implementing effective cybersecurity. For the protective aspect of information privacy, organizations also need to identify their cybersecurity practices, policies, requirements, and applicable regulations and do their best to implement them in the form of cyber-risk management and cybersecurity operations. If the protective side of privacy is unable to succeed, the proper handling side of privacy will be in danger of failure.
The protection of personal data is one of the primary responsibilities of an organization’s information security function. In most organizations, this function is separate from the privacy office or privacy operations. Generally speaking,
• Data protection operations are generally built upon a framework of security controls. Security control frameworks are discussed in Chapter 1.
• Data protection technologies are typically operated by the IT department, although in some organizations, a separate security operations function will manage these. The technology of data protection is discussed fully in CISM Certified Information Security Manager All-In-One Exam Guide.
• Decisions regarding the ongoing development of security and privacy controls are a part of the larger risk management life cycle, described in Chapter 3.
Monitoring the use of personal data is at the core of many organizations’ privacy programs. Since privacy is concerned with the protection and use of personal information, privacy operations are uniquely different from operational security processes.
To determine whether personnel are complying with privacy and data classification and handling policies, organizations will conduct data discovery scans of their data storage systems. Typically performed by automated DLP scanning tools, these scans generally target file shares and other structured and unstructured repositories and employ rules to identify the presence of specific types of information.
Examples of scan targets include account numbers, credit card numbers, medical records, and government-issued identification numbers, in an attempt to discover whether files containing personal information have been stored on file shares in violation of policy. Scans can also include nonpersonal information such as source code, financial information, and intellectual property.
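A simplified version of the rule-based matching that such tools perform might look like the following; the patterns are illustrative only, and commercial DLP products use far more sophisticated detection:

```python
import re

# Hedged sketch of a rule-based discovery scan of the kind DLP tools
# perform. Patterns are deliberately simplified.
RULES = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "payment_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def luhn_valid(number):
    """Luhn checksum, used to cut false positives on card-number matches."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def scan(text):
    """Return (rule, matched text) findings for a scanned document."""
    findings = []
    for rule, pattern in RULES.items():
        for match in pattern.finditer(text):
            if rule == "payment_card" and not luhn_valid(match.group()):
                continue  # pattern matched but checksum failed
            findings.append((rule, match.group()))
    return findings
```

The checksum step illustrates why discovery rules are more than bare pattern matches: without it, every 13-to-16-digit string on a file share would generate an alert.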
Upon receiving the results of discovery scans, security or privacy analysts will investigate the presence of these files and attempt to determine why those files are there, who put them there, and when. Where such files are used as a part of sanctioned business processes, security or privacy analysts will confirm that access rights comply with access policies, including least privilege and need-to-know principles. Where such files are not a part of legitimate business processes, corrective action should be taken to prevent such security or privacy issues from recurring.
Data discovery often identifies undocumented procedures as well as improper behaviors. Over time, corrective actions will gradually lift the maturity of related business processes and inform staff of proper data handling policies and procedures.
Information systems can be supplemented with DLP tooling that will monitor the movement of sensitive and personal information in real time. Monitoring agents placed in key information systems can detect the creation, movement, and deletion of specific information and generate alerts that are sent to security or privacy personnel for investigation and follow-up.
Following are examples of data movement monitoring:
• E-mail Agents on e-mail servers and endpoints can detect sensitive information in the contents of incoming or outgoing e-mail.
• Endpoint storage Agents on endpoints can detect the local storage of information.
• File servers Agents on file servers and other storage systems can detect the creation and movement of information.
• USB storage Agents on endpoints can detect the movement of information to and from external USB storage devices.
• Internet ingress/egress Agents on endpoints and network ingress/egress points (including Internet connections) can monitor data movement.
In all of these cases (and more that are not mentioned here), alerts can be sent to security or privacy analysts who would investigate these events to determine whether the data movement is legitimate (in which case, alerts can be adjusted to reduce the number of false positives) or whether corrective action is warranted.
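The tuning described above can be approximated with an allowlist of known-legitimate flows; the channel names and allowlist entries here are hypothetical:

```python
# Illustrative triage of DLP data movement alerts: known-legitimate flows
# are filtered out so analysts see only events needing investigation.
# Channels and allowlist entries are hypothetical.
ALLOWLIST = {
    ("email", "payroll@example.com"),   # sanctioned payroll exports
    ("usb", "backup-operator"),         # sanctioned offline backups
}

def triage(events):
    """Split (channel, actor, detail) events into legitimate and to-investigate."""
    legitimate, investigate = [], []
    for channel, actor, detail in events:
        if (channel, actor) in ALLOWLIST:
            legitimate.append((channel, actor, detail))
        else:
            investigate.append((channel, actor, detail))
    return legitimate, investigate
```

Each investigation that concludes a flow is legitimate grows the allowlist, which is the practical mechanism behind "adjusting alerts to reduce false positives."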
In addition to monitoring the movement of sensitive information, DLP monitoring agents can be configured to intervene and prevent the data movement that is attempted. When implementing these DLP systems, organizations often configure them initially to operate in passive monitoring mode to help them understand and distinguish legitimate business processes from activities that violate policy. Then, carefully, organizations can configure DLP agents to intervene in circumstances where data movement is a clear policy violation.
On end-user systems, in many cases, DLP agents can display a window to the end user that asks for confirmation of the intended data movement. Though the agents do not overtly block such data movement, asking for confirmation can remind users that some data movement may violate policy. Still, users can be empowered in some circumstances to confirm that the intended data movement is legitimate. This action will still produce an event or an alert that can be investigated to determine whether the user is abusing policy. If the movement is legitimate, privacy and security analysts can determine whether they want users to continue to confirm such data movement or whether the movement can be permitted without intervention.
DLP systems, whether used to perform discovery scans or monitor data movement, require a good deal of “tuning” to ensure that they do not interfere with legitimate business processes but properly alert personnel when potential violations of data classification or privacy policies occur. Because business processes typically change slowly over time, the task of tuning DLP is never finished and is an ongoing activity.
A data subject may send an inquiry regarding the presence and usage of their personal information. Their request may be general or quite specific. For instance, a data subject may ask whether any of their personal information is present in the organization’s systems. Or a data subject may ask about specific personal information, such as a home address.
Smaller organizations may provide only an inquiry form, an e-mail address, a telephone number, or a surface mail address where such inquiries may be sent. These organizations must train personnel to manage these inquiries properly and respond to data subjects within specific timeframes (which are sometimes spelled out in regulations). Personnel who handle the requests must have access to systems and applications containing personal information so that they may respond accurately.
Larger organizations automate inquiries in some cases. For instance, a data subject with an existing account on an organization’s systems can log in and click a link to learn how and where personal information is used. Often, such tools provide the means for data subjects to make changes to some of their information. This is discussed more fully in the next section.
Organizations must maintain a log of inquiries, including the subject’s name (or other identifying information), so that management can better understand the frequency of requests and the workload incurred. Privacy personnel will recognize that these logs themselves may also contain personal information.
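A minimal inquiry log might be structured as follows; the field names are hypothetical, and, as noted, the log itself holds personal information and must be protected accordingly:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Sketch of a data subject inquiry log. Field names are hypothetical.
# The log itself contains personal information and must be protected.
@dataclass
class Inquiry:
    subject_id: str            # identifying info -- itself personal data
    received: datetime
    kind: str                  # e.g., "general" or "specific"
    closed: Optional[datetime] = None

LOG = []

def record_inquiry(subject_id, kind):
    """Append a new inquiry with a UTC receipt timestamp."""
    entry = Inquiry(subject_id, datetime.now(timezone.utc), kind)
    LOG.append(entry)
    return entry

def open_inquiries():
    """Workload view for management: inquiries not yet answered."""
    return [e for e in LOG if e.closed is None]
```

From a log like this, management can derive request volumes and response timeliness, which feed directly into the privacy metrics discussed later in this chapter.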
In some circumstances, a data subject may ask for changes in personal information held by an organization. For instance, a data subject may change residences and need to update a mailing or shipping address, or a data subject may request a correction in the spelling of their name. Or they may make changes in a payment method, family status, or service provider such as insurance. Finally, sometimes personal information has been mistyped, so spelling and other corrections are needed.
Organizations are required to provide one or more means through which data subjects can request these corrections. Data subjects often can make these changes through self-service programs, but sometimes they must request that personnel in the organization make the changes on their behalf.
Privacy policies often provide one or more methods that can be used by data subjects to make these requests. Whether the means are automated or manual, organizations typically log these events as a part of routine systems and activity measurements. Like other mature business processes, this logging will sometimes compel management to make changes or improvements to systems and processes. For example, if the organization is receiving numerous requests that personnel must deal with manually, the organization may provide more self-service tools for data subjects to make some changes themselves.
To respond to data removal requests appropriately, organizations must understand the full range of legal obligations regarding the use and retention of specific information. For example, laws regarding the financial records of public companies require that detailed financial records be kept for many years; requests by a current or former employee to have their data removed from these records would violate those financial recordkeeping laws: the organization would have to respectfully deny the request and give the reason for doing so. Also, some privacy laws provide specific exclusions, such as denying a request to remove one’s criminal history from criminal records.
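The decision logic can be sketched as a lookup against documented retention obligations; the record types and reasons below are illustrative only, not legal guidance:

```python
# Hedged sketch of deciding whether a deletion request can be honored,
# given legal retention obligations. Record types and reasons are
# illustrative, not legal advice.
RETENTION_RULES = {
    "financial_record": "retained per financial recordkeeping law",
    "criminal_record": "excluded from erasure by statute",
}

def evaluate_deletion(record_type):
    """Return (can_delete, reason) for a data removal request."""
    if record_type in RETENTION_RULES:
        return False, RETENTION_RULES[record_type]
    return True, "no retention obligation identified"
```

When the answer is no, the reason string supports the respectful denial the text describes: the data subject is told the request cannot be honored and why.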
Data subject requests should include the ability for persons to file a general complaint regarding the use of their personal information. At times, complaints will involve one or more provisions of applicable privacy laws, and at times they will involve other matters. Customers and constituents are not experts in privacy law and should not be expected to know their rights in detail. Organizations should effectively and respectfully respond to all such complaints, whatever their nature, and without being tossed back and forth between departments in internal “not my problem” handoffs.
In the context of data privacy, consent is a distinct action taken by a data subject to grant an organization permission to collect and/or process his or her personal information. There are several ways in which consent is given and obtained, including the following:
• Prior to data collection When a data subject is establishing a relationship with an organization, a part of the agreement may include the collection of consent for instances of data collection that will take place in the future.
• Consent obtained through a third party In some instances, it is not feasible for an organization to collect consent directly from a data subject. For example, an organization displaying advertising that is specifically chosen for and delivered to a data subject relies on the party that originally collected the data subject’s information to have obtained consent “for other uses,” including advertising. The advertiser will have been assured that consent has been obtained from all data subjects to whom it displays ad content.
Regardless of the method used to collect consent, it is obligatory for organizations to record the specific date, time, stipulations, and circumstances in which that consent was collected. It is important for organizations to be quite specific with regard to this collection. For instance, if an organization collects items of personal information for use in a specific context or event, and consent for the use of that information is for that context or event only, the organization cannot later use that information for other contexts or events.
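A purpose-limited consent record might be modeled as follows (field names are hypothetical); the key point is that every proposed use is checked against the purposes for which consent was actually given:

```python
from datetime import datetime, timezone

# Sketch of a purpose-limited consent record: the organization logs when
# and for what the consent was given, and checks purpose before each use.
# Field names are hypothetical.
class ConsentRecord:
    def __init__(self, subject_id, purposes):
        self.subject_id = subject_id
        self.purposes = set(purposes)
        self.collected_at = datetime.now(timezone.utc)  # when consent was given

    def permits(self, purpose):
        """Consent for one context does not extend to others."""
        return purpose in self.purposes
```

Under this model, consent collected for a single event cannot be silently reused for marketing: the `permits` check fails, and a new consent would have to be obtained.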
Some privacy laws, such as GDPR and CPRA, provide for the creation of government authorities that act in a supervisory capacity as a part of the enforcement of these laws. (Although this book does focus on organizations that will, from time to time, work with supervisory authorities, details on work performed by these authorities are beyond its scope.)
Organizations need to understand the nature of their relationships with any supervisory authorities and treat their requests not unlike data subject requests from customers and constituents: specific personnel should be identified and trained in response procedures. Unlike with DSRs, however, management should also be informed of supervisory authorities’ inquiries and requests upon receipt, in case special treatment is required.
To respond effectively to supervisory authority requests, organizations need to maintain complete and up-to-date business records, including documented processes and procedures and data flow diagrams. Supervisory authorities are more likely to want to dig deeper into companies that appear disorganized or out of control. Such inquiries may consume more time and result in “damage control” responses.
Working with supervisory authorities should be considered a two-way street. For instance, organizations may be required by law to notify authorities when certain types of privacy incidents and breaches occur. Again, these proceedings should be organized and consistent, giving authorities confidence in organizations’ ability to respond effectively.
Metrics are the means through which management can measure key processes and determine whether their strategies are working. Metrics are used in many operational processes, but in this discussion, metrics in the privacy governance context are the emphasis. In other words, there is a distinction between tactical privacy metrics and those that reveal the state of the overall privacy program.
Metrics and statistics are also used in operations to ensure that processes and machinery are operating correctly. Often, these metrics and statistics are not reported upward but are monitored by operations personnel to ensure that processes and systems are running as expected.
Privacy metrics are often used to observe technical privacy controls and processes to determine whether they are operating correctly. This helps management better understand the impact of past decisions and can help drive future decisions. Here are some examples of technical metrics:
• Number of personal information records received
• Number of personal information records purged
• Number of personal information records anonymized
• Number of personal information records accessed
• Number of subject data requests received
• Number of privacy impact assessments performed, and their results
• Number of workers trained in information privacy and information security
While useful, these metrics do not address the bigger picture of the effectiveness or alignment of an organization’s overall privacy program. They do not answer key questions that boards of directors and executive management often ask, such as the following:
• How much security is enough?
• How should security resources be invested and applied?
• What is the potential impact of a threat event?
• Are our privacy practices aligned with customer or constituent expectations?
These and other business-related questions can be addressed through the appropriate metrics, discussed in the remainder of this section.
Privacy strategists sometimes think about metrics in simple categorizations such as these:
• Key risk indicators (KRIs) These metrics are associated with the measurement of risk.
• Key goal indicators (KGIs) These metrics represent the attainment of strategic goals.
• Key performance indicators (KPIs) These metrics are used to show the efficiency or effectiveness of privacy- or security-related activities.
For metrics to be effective, they need to be measurable. A common way to ensure the quality and effectiveness of a metric is to use the SMART method. A metric that is SMART is specific, measurable, attainable, relevant, and timely.
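The SMART criteria (specific, measurable, attainable, relevant, and timely) can be applied as a simple completeness check on a metric definition, sketched below with hypothetical fields:

```python
# Illustrative completeness check of a metric definition against the
# SMART criteria. The definition dictionary and its fields are
# hypothetical.
SMART_FIELDS = {"specific", "measurable", "attainable", "relevant", "timely"}

def missing_smart_fields(metric_definition):
    """Return which SMART attributes a metric definition fails to satisfy."""
    satisfied = {k for k, v in metric_definition.items() if v}
    return SMART_FIELDS - satisfied
```

A metric definition that comes back with missing fields should be reworked before it is added to the program's reporting.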
Effective risk management is the culmination of the highest order activities in information privacy and security programs; these include risk analyses, the use of a risk ledger, formal risk treatment, and adjustments to the suite of privacy and security controls.
Although it is difficult to measure the success of a risk management program effectively and objectively, it is possible to take indirect measurements—much like measuring the shadow of a tree (and some applied trigonometry) to gauge its height. Thus, the best indicators of a successful risk management program would be improving trends in metrics involved with the following:
• The number of privacy impact assessments (PIAs) performed and their results
• Reduction in the number of privacy and security incidents
• Reduction in the impact of privacy and security incidents
• Reduction in the time to remediate privacy and security incidents
• Reduction in the time to remediate vulnerabilities
• Reduction in the number of new unmitigated risks
Regarding the reduction in the number of privacy and security incidents, a privacy and security program improving its maturity from low levels should first expect to see the number of incidents increase. This would be not because of lapses in privacy or security controls but because of the development of—and improvements in—mechanisms used to detect and report privacy and security incidents. If a tree falls in the forest, it will be heard if microphones are installed in key locations. Similarly, as a privacy and security program is improved and matures over time, the number of new risks will, at first, increase and then will later decrease.
Metrics concerning the various forms of data subject engagement help the organization understand the extent to which personal information is being collected, as well as communications of various sorts from data subjects. Some of the metrics that may be reported include
• Number of data collections
• Number of opt-ins and opt-outs
• Number of data subject requests, broken out by inquiries, requests for correction, and requests for deletion
• Amount of time spent processing data subject requests
Metrics in the realm of data governance and data management enable company management to understand whether the management of data, including personal information, is proceeding as expected. Examples of data governance metrics include
• Data retention activities, including purges and exceptions found
• Data usage activities, including approvals for new uses and policy violations
• Changes and activities regarding data sent to and received from other organizations
• Coverage of automated tools, and where the blind spots may be
• Changes in data inventory, particularly that which contains personal information
• Changes in the numbers and types of data collection and data input and output points
Privacy leaders schooled in the concepts and practices of process maturity will be familiar with the need to bring the processes in a privacy program (and the many related processes in IT, cybersecurity, and business units) to target maturity levels. When a business process achieves a given maturity level, management will be more confident that the process has an expected degree of consistency, an established track record, and perhaps even measurements to ensure continuous improvement of the process over time.
Capability maturity models are used to measure and plan the maturity level of a business process. Maturity models are discussed in detail in Chapter 1.
Metrics on the performance of privacy and information security provide measures of timeliness and effectiveness. Generally speaking, performance measurement metrics provide a view of tactical privacy and security processes and activities. As discussed earlier in this section, performance measurements are often the operational metrics that need to be transformed into executive-level metrics for those audiences.
Performance measurement metrics can include any of the following:
• Time to detect privacy and security incidents
• Time to remediate privacy and security incidents
• Time to provision user accounts
• Time to deprovision user accounts
• Time to respond to subject access requests
• Time to discover vulnerabilities
• Time to remediate vulnerabilities
Nearly every operational activity that is privacy- or security-related and measurable is a candidate for performance metrics.
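Several of the timing metrics above reduce to simple aggregate computations over incident records; the sketch below uses hypothetical per-incident timestamps expressed in hours from occurrence:

```python
from statistics import median

# Sketch of computing performance metrics from incident records. The
# records and their fields are hypothetical: hours elapsed from the
# incident's occurrence to its detection and to its remediation.
incidents = [
    {"detected_h": 2, "remediated_h": 30},
    {"detected_h": 6, "remediated_h": 54},
    {"detected_h": 1, "remediated_h": 10},
]

def time_to_detect(incidents):
    """Median hours from occurrence to detection."""
    return median(i["detected_h"] for i in incidents)

def time_to_remediate(incidents):
    """Median hours from detection to remediation."""
    return median(i["remediated_h"] - i["detected_h"] for i in incidents)
```

Medians are used here rather than means because a single long-running incident can otherwise dominate the figure; either choice should be stated alongside the metric.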
The objective of a business resilience program is the planned continuation of key business activities when challenged by interruption events such as natural disasters, including earthquakes, floods, and weather events, as well as man-made disasters, such as utility outages, riots, and fires.
The keystone of business resilience is the business impact analysis (BIA) that identifies the organization’s most critical business processes and the resources required to operate and support them.
Privacy leaders should be involved in creating BIAs and in an organization’s resilience program to ensure that contingency and emergency response plans do not run afoul of applicable privacy laws. These laws, and the regulators enforcing them, rarely give a free pass to an organization that fails to comply with applicable privacy laws just because it had a disaster event that required drastic action to continue business operations.
Business resilience metrics include these:
• Business impact analyses performed and their results
• Business contingency plans developed
• Business contingency training sessions held
• Privacy and security reviews of business contingency plans
• Audits of the business resilience program
• Reviews of the presence and effectiveness of business resilience programs in key supplier organizations
Larger organizations with multiple business units, geographic locations, privacy functions, or security functions (often as a result of mergers and acquisitions) may be experiencing issues related to overlaps or gaps in coverage or activities. For instance, an organization that recently acquired another company may have some duplication of effort in the asset management and risk management functions. In another example, local privacy personnel in a large, distributed organization may be performing privacy functions that are also being performed on their behalf from other personnel at headquarters.
Metrics in the category of convergence will be highly individualized, based on specific circumstances in an organization. Categories of metrics may include these:
• Gaps and overlaps in asset coverage
• Gaps and overlaps in data management tools coverage
• Consolidation of licenses for privacy and security tools
• Gaps or overlaps in skills, responsibilities, or coverage
Resource management metrics are similar to value delivery metrics: both convey an efficient use of resources in an organization’s privacy program. But because the emphasis here is on program efficiency, resource management metrics may be developed in these ways:
• Standardization of privacy-related processes—because consistency drives costs down
• Privacy and security involvement in every procurement and acquisition project
• Percentage of personal information records protected by privacy and security controls
Developing Metrics in Layers for Audience Relevance
When embarking on the quest for privacy and security metrics development, a common pitfall is the development of a one-dimensional metrics framework that publishes a single set of metrics to all audiences. For instance, a metrics program may publish figures on vulnerabilities discovered, vulnerabilities remediated, privacy incidents, security incidents, and internal audits and their exceptions. Publishing this or a similar set of metrics to various stakeholders will be of little value to some audiences and of no value to others.
A better approach is the development of operational metrics, which are usually easily discovered and measured. The next step is to transform those operational metrics into different metrics, stated in business terms, for particular business audiences. In a given organization, a privacy program may employ two, three, or more layers of metrics, usually related to one another and stated in relevant technical or business terms for each respective audience.
Though it may be a good starting point to ask business leaders what kinds of metrics they want to see, in many cases, privacy and security leaders will be asked what metrics they want to see. This can be a challenge at times, but by understanding the business, culture, compliance climate, and individuals involved at the stakeholder level, the privacy and security leader can start with a set of metrics that shows success in investments made or use metrics as a call to action for the leadership team.
The Web and mobile applications are the engines of commerce in many industries. Today, measuring business is all about measuring what happens in information systems. In their zeal for insight, some of these measurements intrude into people’s privacy. This section describes a variety of techniques used to track individual users’ activities, as well as ways in which tracking can be limited.
Privacy leaders need to revisit the organization’s visitor, customer, and employee tracking practices to ensure they are performing only the tracking necessary for business operations, and that all tracking mechanisms are compliant with applicable laws.
An old joke in the advertising business goes like this: “Did you hear about the marketer who could not sleep at night? He was worried that he was wasting half of his advertising budget, but he didn’t know which half.” Cue rim shot.
In the traditional advertising world, when companies purchased ad space on billboards and buses, at airports, and in radio and television commercials, they had no direct way of knowing whether their ads were influential, never mind which individuals were responding to those ads. The Internet and the Web have changed all of that. With ads served to individuals on their laptops, tablets, and smartphones, it is now fairly simple to distinguish individuals from one another, deliver data-driven targeted advertising, and know the outcomes with more certainty based on shared tracking data.
This leap in technology has resulted in citizens saying “Enough!” and in laws to curb this tracking and its uses and potential abuses. Indeed, the existence of this book, and your interest in it, is a result of tracking, plus the accumulation and abuse of personal information that has gone too far in many cases.
Numerous techniques and technologies are used to track the activities and locations of Internet-connected devices and their owners. Information systems log various types of events that give system owners better insight into how, how much, and by whom their systems are used. Some of this logging is highly detailed and often includes, directly or indirectly, the identities of the persons using these devices; this may be considered unnecessary and can represent an invasion of privacy.
Every endpoint—smartphone, tablet, laptop, or desktop computer, or a connected device such as a home surveillance camera, voice assistant, or printer—has an IP address. An IP address is a unique numeric value assigned to each device connected to a wired or wireless network.
Most web sites, as well as many mobile applications, log basic activities such as authentication and meaningful transactions. Because IP addresses on the public Internet are unique and provide approximate geographical location information (sometimes no better than an entire country), IP addresses are often a part of these log entries.
With techniques such as network address translation (NAT), a public IP address represents an organization’s network or a residential network, but not the individual devices within that network. This means that in a single network, separate individuals using their own devices and visiting the same web site will have the same public IP address associated with them. To the uninformed, these could appear as a single user, unless log entries include some other uniquely identifying information.
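The ambiguity that NAT creates in server logs can be illustrated with a short sketch. The log entries and field names here are hypothetical; the point is that counting by IP address alone conflates users behind the same gateway, while an additional identifier (here, a session ID) distinguishes them:

```python
# Hypothetical web server log entries: two users behind the same NAT
# gateway share the public IP address 203.0.113.10.
log_entries = [
    {"ip": "203.0.113.10", "session_id": "a1f3", "path": "/products"},
    {"ip": "203.0.113.10", "session_id": "9c7e", "path": "/products"},
    {"ip": "198.51.100.22", "session_id": "55d0", "path": "/home"},
]

# Counting by IP address alone undercounts users behind NAT.
unique_by_ip = len({e["ip"] for e in log_entries})

# Adding another uniquely identifying value disambiguates the users.
unique_by_session = len({e["session_id"] for e in log_entries})

print(unique_by_ip, unique_by_session)  # 2 distinct IPs, 3 distinct sessions
```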
Individual devices such as laptop computers, tablet computers, and smartphones have internal device identifiers that uniquely identify them. Device serial numbers are stamped on these devices and are also available electronically. Also, mobile phones have an IMEI (International Mobile Equipment Identity) number that mobile network service providers use to identify devices throughout the world. Some of the activity tracking performed by mobile network operators and Internet service providers includes identifiers like these. Often, these identifiers can be associated with their owners, giving network operators unique insight into the detailed usage of their devices.
Retail stores and other organizations use a variety of technologies to record the number of visitors who visit a store or office. Video surveillance is a mainstay, but with improved camera resolution, facial recognition is now possible.
Some organizations also implement Wi-Fi–based tracking technology, which not only counts visitors but also knows their exact location and whether they are unique or repeat visitors. Device Wi-Fi MAC addresses, which are unique for every device in the world, are the basis for visitor tracking.
Web tracking refers to general practices associated with measuring and observing users who visit web sites. Web site operators track individual user sessions on their sites for three primary reasons:
• Session integrity The nature and design of Internet protocols and multiuser applications require that each user’s session be uniquely identified. This is necessary to distinguish each user from every other. For instance, e-commerce sites need to identify individual user sessions properly and uniquely, so that each user is able to browse through and purchase products and services. This session integrity also gives each user a relative feeling of privacy, knowing that no other user is able to know what products they are viewing and purchasing.
• Usage statistics Web sites and applications want to accumulate analytics regarding their use: how many users are visiting (and at what times of the day, what days of the week, and so on), what pages are they viewing in what sequence, and how long are they remaining on individual pages. This information helps organizations design web sites and applications that are easier and simpler to use.
• Advertising tracking Advertising and its revenue fuel a significant portion of the Internet. Thus, advertisers and web site operators not only track the numbers of visitors but also uniquely identify visitors to distinguish one from another. The technologies in play here give rise to the “creepy factor”—for example, after a person does an Internet search for a specific thing, for days afterward, on every page she visits, she sees advertisements from various companies for those very things.
Cookies Cookies are small pieces of data that web sites create and store in a user’s browser. Generally, a cookie is used by the web site to distinguish users from one another and also to remember unique users’ preferences such as language and display or usage settings. Several types of cookies are used for various purposes:
• Session cookies These are used to identify a unique user’s session. A session cookie is assigned when a user logs in to a web site and is removed when the user logs off or closes the browser.
• Persistent cookies These cookies remain on a user’s browser after the session ends and are sometimes used to remember user preferences such as language, country, and landing page. Persistent cookies used for tracking are also known as advertising cookies because they are used to distinguish users from one another over time.
• First-party cookies These cookies are placed by the domain the user is visiting and identified with that domain. For instance, if a user is visiting www.company.com, a first-party cookie will be associated with the domain company.com.
• Third-party cookies These cookies are placed by the domain the user is visiting, where the cookie is associated with a different domain. For example, if a user is visiting www.company.com, that web server could attempt to place a cookie from www.cooltracking.com for advertising purposes.
• Super cookies These are cookies with an origin of a top-level domain such as .com or .co.uk. They are used for tracking users across many domains.
• Flash cookies The once-popular Adobe Flash program has a feature, the Local Shared Object, that functions much like a cookie.
• Zombie cookies These cookies are created by various means and designed to be difficult or impossible to detect or delete; they regenerate themselves when possible.
• HTML5 web storage The now popular HTML5 standard includes specifications for local storage of information. One such use mimics the function of cookies.
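The mechanics behind several of these cookie types come down to the attributes a server attaches when it sets the cookie. The following sketch uses Python’s standard `http.cookies` module to show how a server might construct a session cookie (discarded when the browser closes) and a persistent cookie (retained for a year); the cookie names and values are hypothetical:

```python
from http.cookies import SimpleCookie

# Sketch: how server-side code might emit cookies of different kinds.
cookie = SimpleCookie()

# A session cookie: no Max-Age or Expires attribute, so the browser
# discards it when the user closes the browser.
cookie["session_id"] = "a1f3c9"
cookie["session_id"]["httponly"] = True   # not readable by page scripts
cookie["session_id"]["secure"] = True     # sent only over HTTPS

# A persistent cookie: Max-Age keeps it on the browser for a year.
cookie["lang"] = "en-US"
cookie["lang"]["max-age"] = 60 * 60 * 24 * 365

# Render the Set-Cookie response headers the server would send.
headers = cookie.output()
print(headers)
```

Whether a cookie is first-party or third-party is not an attribute of the cookie itself; it depends on whether the domain setting the cookie matches the domain in the browser’s address bar.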
Web Beacons A web beacon is a technique used by web servers to track the viewing of web pages and e-mail messages. Web beacons generally take the form of a 1×1 pixel image that is essentially invisible to users. Because web servers log details of the downloading of every object, including images, web beacons can function much like cookies and can enable the collection of information, such as whether the recipient has opened an e-mail and whether it was opened by others (presumably after being forwarded). This is particularly true if a web site utilizes uniquely named web beacon image files that are each sent only to a specific user.
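The uniquely named beacon technique described above can be sketched as follows. All names and the URL scheme are hypothetical; the idea is that each recipient gets a beacon image with a unique token in its URL, so a later request for that image tells the server exactly who opened the message:

```python
import secrets

# Hypothetical registry mapping beacon tokens to e-mail recipients.
beacon_registry = {}

def beacon_url_for(recipient: str) -> str:
    """Generate a unique 1x1-pixel beacon URL for one recipient."""
    token = secrets.token_hex(8)
    beacon_registry[token] = recipient
    # The image would be embedded as <img src="..."> in the message body.
    return f"https://mail.example.com/img/{token}.gif"

url = beacon_url_for("alice@example.com")

# Later, when the web server logs a request for this image, the token in
# the URL reveals which recipient opened the message.
token = url.rsplit("/", 1)[1].removesuffix(".gif")
print(beacon_registry[token])
```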
Location On devices with GPS or similar location capabilities, web servers may request the device’s specific location via the user’s web browser. In modern operating systems, web sites are not permitted to obtain a device’s location unless the user specifically consents. These consents are generally persistent, and it may be difficult for a user to review which web sites she has approved to receive location information.
Device Information and Name When a user visits a web site, the site’s web servers can obtain a limited amount of information about the user’s device, including
• Device name For a personally owned device, usually the name that the user assigned to the device when she purchased it and set it up
• Browser The name and version of the browser used to visit the site
• Operating system The name and version of the operating system
• Viewport width The width of the device’s display in pixels
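Much of this device information arrives in the User-Agent header the browser sends with every request. A rough sketch of how a server might extract browser and operating system names follows; real User-Agent strings are messy, so production code typically relies on a dedicated parsing library rather than string matching like this:

```python
def describe_user_agent(ua: str) -> dict:
    """Roughly extract OS and browser names from a User-Agent header."""
    os_name = "Unknown"
    for marker, name in [("Windows NT", "Windows"), ("Mac OS X", "macOS"),
                         ("Android", "Android"), ("Linux", "Linux")]:
        if marker in ua:
            os_name = name
            break
    browser = "Unknown"
    # Order matters: Chrome's User-Agent string also contains "Safari".
    for marker in ("Firefox", "Edg", "Chrome", "Safari"):
        if marker in ua:
            browser = "Edge" if marker == "Edg" else marker
            break
    return {"os": os_name, "browser": browser}

ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/120.0 Safari/537.36")
result = describe_user_agent(ua)
print(result)  # {'os': 'Windows', 'browser': 'Chrome'}
```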
With location tracking enabled, some apps and web sites accumulate a detailed location history for users. This becomes a point of contention with citizens who believe that this constitutes overreach—perhaps this information could be used against them in some way. Many also believe that the manufacturers of mobile devices likewise accumulate detailed location history that includes an excessive amount of personal information that could lead to misuse or abuse.
A growing concern among privacy and security professionals is the increase in consumer devices and mobile apps that eavesdrop on their users in various ways. Many mobile apps access sensitive data on users’ mobile devices, often for no good reason, and many “smart” consumer devices collect more information from us than may seem reasonable. For instance, there is no reason for a camera or photo-editing app to access a user’s contact list; when installing such an app, users often “click through” the permission dialog without thinking about what they’re being asked to permit.
Similarly, some apps are able to sneak around a mobile device’s controls to obtain such information anyway. The operators of mobile app stores (primarily Apple and Google) do a pretty good job of preventing illicit eavesdropping, but some app developers are clever and find ways around the controls.
Some consumer devices are designed to eavesdrop on their customers, with only vague clues in the standard terms and conditions that indicate what is really going on. For instance, one brand of smart TVs advises that customers should not have conversations on sensitive topics in the presence of the television!
Applications can access and abuse users’ privacy and security in the following ways:
• Camera Many apps for both mobile devices and laptops request access to the device’s camera. For “honest” applications, it’s evident when the camera is being used, but apps could access the camera at other times as well. On many laptop computers, a small indicator light illuminates when the camera is in use, but not all mobile devices have this feature. (Personally, I am suspicious about whether new smart televisions have built-in cameras in their bezels.)
• Photos Many apps request access to stored photos on mobile devices. Although this is a legitimate need for photo-editing apps and social media apps for purposes of posting photos or updating profile pictures, users should pay attention to permission requests to access stored photos.
• Microphone Apps that need to record a user’s voice or other sounds will need to request access to a device’s microphone. This includes videoconferencing and audio calling applications. Others, such as health apps (for observing sleep, for example), may also request mic access. Remember that any voice-activated product has a built-in microphone that is essentially listening all the time (likely even when the product is switched off).
• Location Some applications need to know the location of the device in order to be useful to users. While mapping, navigation, and travel-related applications obviously require location services, others may not.
• Contacts Mobile device users will sometimes be asked if certain applications are allowed to access stored or cloud-based contact lists. Users should be especially careful with this permission, as unscrupulous vendors’ applications may harvest others’ contact information for marketing purposes.
• Voice assistant Many mobile devices and laptop computers are equipped with voice assistants such as Siri, Alexa, and Google Assistant. Users need to be aware of whether these voice assistants, when activated, are listening to and potentially uploading all speech that takes place within range of the device.
• Local and cloud storage Mobile device users should be wary of applications that request permission to access local and cloud-based storage. Miscreant apps may attempt to exfiltrate those contents for who knows what purposes.
• Social media accounts Many mobile apps provide “value add” services as an adjunct to popular social media services such as Facebook, Twitter, LinkedIn, and Instagram. Those apps request permission to log in to users’ social media accounts to provide their services. Sometimes this access is misused or abused, with more personal information being sent to these other services than most users would consider reasonable.
• Paste buffer (clipboard) The paste buffer (known as the clipboard) can, at times, contain highly sensitive information such as passwords, URLs, e-mail addresses, and phone numbers. On some mobile devices, apps have free access to the paste buffer, resulting in leakage of sensitive information.
Is Big Tech the New Big Brother?
Companies like Apple, Google, Microsoft, Facebook, Twitter, Samsung, Vizio, and scores of others have designed numerous high-tech products that are revolutionary in the ways in which they make our work lives and personal lives easier and richer. But in recent years, we’re learning of some of the practices that may represent overreach in their capabilities. Here are some examples:
• It was revealed that Google Nest thermostats are equipped with secret microphones.
• Samsung suggests that the owners of their smart televisions should take sensitive conversations away from the television lest voice-recognition software hear what’s said.
• Several brands of children’s smart toys, including toys from Mattel and Genesis Toys, eavesdrop on children and their families’ conversations—by design.
• Google keeps a detailed dossier about the specific movements of users who use Android devices or who use Google Maps and other navigation apps.
• Employees of Amazon’s Ring doorbell and ADT’s security systems have been caught eavesdropping on customers’ homes.
• Apple and Android smartphones have controls to restrict installed apps’ access to location, voice, contacts, and data usage, but the manufacturers themselves are often exempt from these restrictions.
• And, finally, voice assistants hear everything.
Consumers are growing wary of big tech and the potential for overreach. Privacy professionals have long been concerned. The pendulum of acceptable tracking versus intrusions into privacy continues to swing, even as new technologies reveal even more about the lives of the people using them.
Combined with advancements in mobile device and CCTV optical capabilities, facial recognition software is going mainstream. Many products from Apple, Microsoft, and others utilize facial recognition for logging in to mobile devices. Commercial facial recognition products are also enabling corporations and law enforcement to recognize people, such as wanted criminals.
Privacy rights advocates are rigorously opposing facial recognition capabilities in public places such as airports, shopping malls, and city streets out of concerns that it could be abused by authorities aspiring to create a surveillance state. Some cities, states, provinces, and countries are passing laws forbidding the use of facial recognition capabilities in public places, and some larger technology organizations are refusing to sell these capabilities to governments. Public-setting facial recognition cannot be “uninvented,” however, and it is likely to continue to be used secretively by corporations and governments despite regulations forbidding its use.
Aside from facial recognition, other forms of biometrics have been in use for years and even decades. Numerous companies manufacture fingerprint and palm scan readers for use on mobile devices, as well as for building and secure zone entrance control. Iris scanning is also fairly common, as high-resolution cameras can obtain a quality image from a few feet away.
Static signature recognition, which is the task of verifying whether a signed document is genuine, has been used for centuries and is still used in banking to confirm signatures on checks. More intrusive biometric techniques such as voice recognition, retinal scan, and dynamic handwriting scanning are no longer in common use.
Contact tracing has been used for disease control for decades. Historically, contact tracing has been a manual process consisting of interviews with confirmed case patients to learn about their recent contacts with other individuals. Being highly manual, it has not been the most efficient tool to assist in reducing the spread of disease. However, the proliferation of smartphones that can provide proximity information may prove useful for contact tracing. As a result of the COVID-19 pandemic, Apple and Google introduced support for “COVID-19 apps” that use a smartphone’s Bluetooth radio to notify a user who has come into close proximity with another user of such an app who has recorded a positive test for an infectious disease.
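A heavily simplified sketch of this style of proximity tracing follows. It is inspired by, but is not, the actual Apple/Google exposure notification protocol: each device derives short-lived rolling identifiers from a daily key and broadcasts them over Bluetooth, so listeners learn nothing about the person behind the identifiers. Only if the key’s owner later reports a positive test is the daily key published, allowing other devices to re-derive the identifiers and check for a match:

```python
import hashlib
import hmac
import os

def rolling_identifier(daily_key: bytes, interval: int) -> bytes:
    """Derive a short-lived broadcast identifier from a daily key."""
    msg = interval.to_bytes(4, "big")
    return hmac.new(daily_key, msg, hashlib.sha256).digest()[:16]

# A device generates a random daily key and broadcasts a new rolling
# identifier every interval; a nearby phone records what it hears.
daily_key = os.urandom(16)
heard = {rolling_identifier(daily_key, i) for i in range(10)}

# After the daily key is published (positive test reported), the nearby
# phone re-derives all of the day's identifiers and checks for overlap.
matches = sum(rolling_identifier(daily_key, i) in heard for i in range(144))
print(matches)  # 10 intervals matched: the devices were in proximity
```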
While health authorities view contact tracing as a valuable tool for infectious disease control, privacy advocates consider it an overly intrusive process that is subject to abuse by police states. Indeed, contact tracing can be used to discover associations between people who meet only face-to-face. Critics of contact tracing point to numerous “false positives” that would result. For instance, two hotel guests sleeping in adjacent hotel rooms might be identified as being in close proximity for several hours.
In mid-2020, Apple and Google included contact tracing as a standard feature of the iOS and Android operating systems; this feature is not activated by default and must be explicitly turned on (see Figure 2-2), or so it appears.
Figure 2-2 Contact tracing is built into Apple mobile devices. (Source: author)
Organizations that conduct part or all of their business through the use of computers need to enact a variety of controls to reduce the likelihood and impact of attacks. Indeed, this is the whole point of cybersecurity, as well as a substantial portion of information privacy. Some of the controls associated with cybersecurity involve the management and logging of activities on computing devices (laptops, desktops, tablets, and smartphones) used by its workers. After all, anomalous behavior on any of these devices may be signs of an attack. To detect and prevent such attacks, organizations must track and centrally log many types of user activities on computers, including the following:
• Web sites visited
• Files created, viewed, updated, transferred to other media, and deleted
• E-mail messages sent and received
• Contents of network communications
• Location of said devices (for device theft detection and remote data destruction)
Organizations with more mature cybersecurity programs will track and log most or all of these activities and use analytics of these records to detect anomalies that may be indications of security or privacy breaches.
Organizations undertaking such tracking and recording often include notices on these devices and in company policy, stating that such measures are taken in the name of data protection, and that any personal use of these devices (or networks) is subject to these practices, resulting in “no expectation of privacy.”
Web content filtering is used to prevent users from visiting web sites whose subject matter is not business related (such as weapons, gambling, and pornography sites), or to prevent users from visiting certain web sites that are known to have malicious content. When users visit these sites, malware or spyware may be installed on visitors’ computers. Many web content filtering systems log the web sites and web pages viewed by organization personnel, often associating web activity with specific workers by name. This log data can prove invaluable in a security or privacy breach investigation. Cloud access security brokers (CASBs) are implemented to prevent the use of unauthorized cloud services. The logging capabilities in these systems are frowned upon, or even illegal, in some countries.
To detect and prevent leakage of sensitive information, some organizations undertake a practice known as SSL (Secure Sockets Layer) decryption, in which encrypted network traffic is decrypted so that the contents of the traffic can be examined for evidence of a security or privacy breach. Because so much Internet traffic is encrypted, organizations lacking SSL decryption are blind to many types of threats.
Legal problems with SSL decryption arise when workers occasionally use organization-issued computers to conduct personal business, such as accessing personal e-mail, making personal purchases, accessing healthcare services, conducting personal banking, and so on. Though a small number of organizations prohibit and actively block all such personal uses, most permit a minimal amount of personal use and will make an effort not to decrypt traffic from sites believed to be low business risk that could transmit personal information; however, they warn workers that all activities, whether business or personal, are monitored for security and privacy purposes.
Some organizations “whitelist” the use of personal banking and other, similar, activities so that an employee’s personal use of organization-issued computers is not examined in some cases. Indeed, personal banking is an unlikely path for exfiltrating sensitive data and represents a low risk for security and privacy breaches.
Internal e-mail represents an ongoing conversation in most organizations. For this reason, many organizations continuously archive all e-mail communication on separate e-mail archive servers. If the organization receives a legal request for specific e-mail messages, search capabilities on e-mail archive servers streamline the data collection effort. Any personal uses of organization e-mail accounts are naturally going to be included in such archiving. Again, employees are generally cautioned through visible notices that monitoring is taking place.
Users of mobile devices and smart products have limited abilities to prevent tracking and eavesdropping. Various tracking prevention remedies are discussed in this section.
Most browsers permit users to reject third-party cookies. This will sometimes break the normal function of some web sites, depending upon their architecture. Some browsers enable users to permit or block cookies by specific domain, which gives users more granular control over cookie-based web tracking.
Web browsers on mobile devices and laptop computers enable users to remove all cookies. This will result in all logged-in sessions being effectively logged out, and any web site preferences such as preferred language or postal code will be removed.
The Do Not Track web browser setting can be used to request that web servers not track a user. Note that Do Not Track is a request that lacks specific controls for enforcement; web site operators must voluntarily implement features that result in the user’s visits not being tracked. Do Not Track has not been widely adopted by the industry, in part because of the lack of legal mandates for its use. Do Not Track is the web analog of the US Do-Not-Call legislation that was enacted in 2003 in response to the scourge of annoying telemarketing calls.
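Because Do Not Track is purely advisory, nothing happens unless the site operator writes code to honor it. A minimal server-side sketch (the function name and the dict-based header representation are illustrative) might look like this:

```python
def tracking_allowed(request_headers: dict) -> bool:
    """Honor Do Not Track: a browser with DNT enabled sends "DNT: 1"."""
    return request_headers.get("DNT") != "1"

# No DNT header: the site may track this visitor.
assert tracking_allowed({"User-Agent": "ExampleBrowser/1.0"})

# DNT header present: a cooperating site would skip tracking cookies,
# analytics beacons, and advertising identifiers for this visitor.
assert not tracking_allowed({"DNT": "1"})
```

The entire mechanism is this small, which is why, absent a legal mandate, most site operators simply ignore the header.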
Many browsers have a privacy mode, sometimes called incognito mode, in which web site visits are not recorded in a user’s browsing history. This may be useful on shared computers if a user does not want other users to know about their browsing history. Many people are unaware that privacy mode does not diminish or affect the full logging that web content filters, CASBs, and web sites themselves perform. Indeed, web sites make no distinction regarding privacy mode browsing in their activity logs.
Users who don’t want their locations to be tracked online can use a Tor browser, which employs network routing through the Tor network. The Tor network is designed to conceal the IP address and, thus, the physical location of the device using the Tor browser. Tor browsers also do not retain cookies or browsing history. Use of the Tor network is limited to the Tor browser. Users who want to anonymize their IP addresses for other programs turn to the use of private virtual private network (VPN) services.
Persons who are concerned with the protection of their network traffic or who want to anonymize their IP address can use a private VPN service. These services, available on mobile devices as well as laptop and desktop computers, enable users to “hide” behind a relatively anonymous IP address, which will help to conceal their location.
VPN services do not anonymize a user’s web browser. For instance, if an e-mail user logs in to his webmail service and then activates a VPN, his webmail session will probably continue uninterrupted, since the user’s identity is asserted through session cookies. That said, web sites with better security regimens may alert users or even block access if the users are seen to be logging in from faraway countries. In a similar vein, the online banking app on my smartphone blocks VPN function because the GPS location and the VPN IP address location contradict each other.
Users of mobile phones and other small devices can purchase Faraday bags, which are small pouches that include a metallic material that blocks radiofrequency (RF) signals. Placing a mobile device into a Faraday bag essentially causes it to “disappear” from cellular, Wi-Fi, and Bluetooth networks.
Faraday bags can also hold building access cards to prevent them from being cloned by attackers. Smaller versions are made for the key fobs used to lock, unlock, and remotely start automobiles. Because of the relatively poor security associated with key fobs, people concerned with automobile theft utilize these bags. Figure 2-3 shows Faraday bags for a mobile device and for a key fob. A disadvantage of a Faraday bag is that the mobile device is unavailable for any use while inside the bag.
Figure 2-3 Faraday bags protect mobile devices and key fobs from eavesdropping and tracking. (Source: author)
Privacy policies are statements that describe the collection and use of personal information, as well as actions that persons can take to inquire and make requests about their personal information. Organizations generally should have both external and internal privacy policies.
Privacy standards are prescriptive statements that inform the workforce of how privacy policies are to be carried out.
Privacy laws passed by governments at national, state, and provincial levels impose data collection, handling, and retention requirements on organizations that store or process personal information about natural persons.
Multinational organizations, as well as organizations doing business with citizens in many countries, need to be aware of the presence of and requirements imposed by international data-sharing agreements. These agreements between governments serve as implementation requirements that organizations must follow.
Within the context of data privacy and privacy regulations, organizations must identify specifically the legal basis under which they are collecting and/or processing personal information.
Controls are statements that define required outcomes. Controls are often implemented through policies, procedures, mechanisms, systems, and other measures designed to reduce risk.
For a privacy program to be effective, organizations must have a complete and accurate inventory of all personal information. Although an inventory of structured information (data residing in application database management systems) will remain fairly static, the transient nature of unstructured data creates additional challenges.
A data classification policy is a formal and intentional way for an organization to define levels of importance or sensitivity to information. A typical data classification policy will define two or more (but rarely more than five) data classification levels.
Static and dynamic data loss prevention (DLP) systems can greatly aid in the effort to provide visibility and even control over the use of personal and other sensitive information.
Data use limitation is the privacy-oriented concept that organizations should use personal information only for the purposes required to perform their services.
Data minimization refers to the practice of collecting and retaining only those specific data elements necessary to perform agreed-upon functions. Those elements should be discarded as soon as they are no longer needed. Only authorized personnel should have access to this information.
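Data minimization is most effective when enforced in code at the point of collection. The sketch below (field names and the purpose label are hypothetical) keeps only the elements needed for a given business function, so unneeded data is never retained in the first place:

```python
# Hypothetical allow-list: the fields each business purpose actually needs.
ALLOWED_FIELDS = {
    "order_shipping": {"name", "street", "city", "postal_code"},
}

def minimize(record: dict, purpose: str) -> dict:
    """Retain only the fields permitted for the stated purpose."""
    allowed = ALLOWED_FIELDS[purpose]
    return {k: v for k, v in record.items() if k in allowed}

submitted = {"name": "Alice", "street": "1 Main St", "city": "Springfield",
             "postal_code": "12345", "birthdate": "1980-01-01"}

stored = minimize(submitted, "order_shipping")
print(sorted(stored))  # birthdate was never retained
```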
De-identification is the means through which personally identifiable information (PII) is removed from business records. Two primary techniques of de-identification are anonymization and pseudonymization.
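The difference between the two techniques can be shown in a short sketch (key and field names are hypothetical). Pseudonymization replaces a direct identifier with a keyed token that can be re-linked to the person only by whoever holds the secret key; anonymization removes the identifier outright, making re-identification by this route impossible:

```python
import hashlib
import hmac

# Hypothetical secret; in practice this would live in a key vault.
SECRET_KEY = b"keep-this-in-a-vault"

def pseudonymize(record: dict) -> dict:
    """Replace the e-mail address with a stable keyed token."""
    token = hmac.new(SECRET_KEY, record["email"].encode(),
                     hashlib.sha256).hexdigest()
    return {**record, "email": token}

def anonymize(record: dict) -> dict:
    """Remove the identifier entirely; it cannot be recovered."""
    return {k: v for k, v in record.items() if k != "email"}

record = {"email": "alice@example.com", "purchase_total": 59.90}

pseudo = pseudonymize(record)
print(pseudo["email"][:8])  # a stable token: same input yields same token
print(anonymize(record))    # the identifier is gone entirely
```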
Privacy laws place emphasis on the concepts of data quality and data accuracy; organizations need to enact processes to ensure these outcomes.
Periodic data discovery scanning enables an organization to monitor the presence of personal information in unstructured data stores.
Information systems can be supplemented with DLP tooling that will monitor the movement of sensitive and personal information in real time. Monitoring agents placed in key information systems can detect the creation, movement, and deletion of specific information and generate alerts that are sent to security or privacy personnel for investigation and follow-up.
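At its simplest, such detection is pattern matching over content in motion. The sketch below scans outbound text for strings shaped like US Social Security numbers; the pattern is illustrative only, as real DLP tools combine patterns, dictionaries, and document fingerprinting:

```python
import re

# Illustrative pattern: US Social Security numbers in NNN-NN-NNNN form.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def dlp_scan(text: str) -> list:
    """Return any SSN-shaped strings found in outbound content."""
    return SSN_PATTERN.findall(text)

outbound = "Attached is the report. Contact: 123-45-6789, desk phone 555-0100."
hits = dlp_scan(outbound)
print(hits)  # ['123-45-6789'] -> would generate an alert for investigation
```

Note that the phone number is not flagged: matching on shape alone is what keeps false positives manageable, and it is also why real tools layer additional context checks on top.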
A key part of an organization’s privacy program is the implementation of an intake function to receive inquiries and requests from data subjects. The incoming requests are known as data subject requests (DSRs). Requests will include inquiries about data usage, corrections, complaints, opt-ins, opt-outs, and requests for removal of personal information.
Organizations must identify relevant regulatory authorities and understand how they should interact with them.
Organizations must have a complete understanding of all of the methods of personal information input and intake, and ensure that consent is directly or indirectly collected from data subjects in every case.
Metrics are the means through which management can measure key processes and determine whether its strategies are working. Metrics are used in many operational processes, but the emphasis in this discussion is on metrics in the privacy governance context.
An organization’s privacy program may have one or more key risk indicators (KRIs), key goal indicators (KGIs), and key performance indicators (KPIs) that inform management of the effectiveness and progress of the program.
Privacy leaders need to revisit the organization’s visitor, customer, and employee tracking practices to ensure they are performing only the tracking necessary for business operations, and that all tracking mechanisms are compliant with applicable laws.
• Because privacy and data protection laws are being rapidly enacted and changed, organizations must devise a way to remain fully aware of new and changing laws and legal precedents.
• The CCPA/CPRA is expected to change significantly, and enforcement actions have yet to begin as of the writing of this book. Organizations subject to CCPA/CPRA should watch this law, its updates, enforcement actions, and resulting case law carefully.
• International data-sharing agreements have proven to be volatile; hence, organizations should monitor developments concerning these agreements and be able to respond accordingly.
• An IT organization supporting many applications and services will generally have some controls that are specific to each application. However, IT will also have controls that apply across all applications and services. These are usually called its IT general controls (ITGC).
• Although it is not always necessary for an organization to select an industry-standard control framework, it is advantageous to do so. Industry-standard control frameworks have been used in thousands of companies, and they are regularly updated to reflect changing business practices, emerging threats, and new technologies.
• Organizations with multiple control frameworks often seek a simpler organization of their controls. Often, organizations will “map” their control frameworks together, resulting in a single control framework that includes controls from each framework.
• Organizations with lower process maturity are more likely to use unstructured means for performing procedures and completing tasks. Often this will result in a greater use of e-mail for process workflow and a greater use of unstructured data stores for storing data. E-mail, file servers, and cloud storage services represent the majority of unstructured data in many organizations.
• Organizations should strive to keep their data classification schemes simple, so that the workforce will be more likely to understand and comply with them.
• Organizations are cautioned to proceed slowly with the implementation of DLP tools in intervention (blocking) mode so as not to disrupt sanctioned business processes and activities.
• Organizations implementing data retention will encounter obstacles such as database referential integrity, comingling of data on backup media, sensitive data in e-mail, and unstructured data on file stores and end-user devices.
• The challenge with de-identification is to ensure that data records can no longer be associated with natural persons while retaining the information’s value for other purposes.
• Organizations need to ensure, through privacy controls, that data aggregation is performed only when approved by management.
• Qualified legal counsel is needed to confirm the applicability of privacy laws and to interpret their meaning.
• For a privacy program to be effective, an organization needs to implement an effective cybersecurity program, which provides the data protection aspects of a privacy program.
• Privacy regulations are not always explicitly clear on which data fields are considered personal information. Legal counsel may be needed to clarify this, so that privacy operations can be sure to monitor effectively.
• Various privacy laws are based on potentially unique philosophies and treat consent somewhat differently. For instance, GDPR requires explicit consent prior to the collection and processing of personal information, meaning that data subjects must opt in for any collection and use of their personal data. On the other hand, CCPA permits collection and processing of personal information but requires that data subjects be able to opt out of all such processing.
• Privacy leaders need to understand the distinction between tactical privacy metrics and those that reveal the state and health of the overall privacy program.
• In some jurisdictions, a MAC address and/or IP address is considered an element of PII.
• Organizations in some jurisdictions must carefully plan any tracking of employee computer usage to ensure that they do not run afoul of privacy laws or works councils.
1. Privacy governance is most concerned with:
A. Privacy policy
B. Security policy
C. Privacy strategy
D. Security executive compensation
2. A privacy leader is reviewing a document that explains the purpose of the privacy program, along with roles and responsibilities and descriptions of business processes. What document is the privacy leader viewing?
A. Program charter
B. Privacy policy
C. Control framework
D. Audit results
3. An organization’s board of directors wants to see quarterly metrics on risk reduction. What would be the best metric for this purpose?
A. Number of data subject requests received
B. Viruses blocked by antivirus programs
C. Packets dropped by the firewall
D. Time to patch vulnerabilities on critical servers
4. Which of the following metrics is the best example of a leading indicator?
A. Average time to mitigate security incidents
B. Increase in the number of attacks blocked by the intrusion prevention system (IPS)
C. Increase in the number of attacks blocked by the firewall
D. Percentage of critical servers being patched within service level agreements (SLAs)
5. The term legitimate interest refers to what privacy activity?
A. The basis for a user access request
B. Whether data collection is allowed by law
C. The legal basis for processing personal information
D. An alternative to lawful processing of personal information
A. Define system specifications
B. Define workforce behavior expectations
C. Describe business processes
7. The primary factor related to the selection of a control framework is:
A. Industry vertical
B. Current process maturity level
C. Size of the organization
D. Compliance level
8. Which of the following is the best definition of control objectives?
A. Detailed statements of desired outcomes
B. High-level statements of desired outcomes
C. Elements of a control audit procedure
D. Mapping of controls to applicable laws
9. The applicability of the NIST Privacy Framework is:
A. Any organization that is building a privacy program
B. US government agencies
C. US government agencies and their key suppliers
D. Key suppliers to US government agencies
10. One primary difference between GDPR and CCPA is:
A. GDPR requires an opt-out while CCPA requires an opt-in.
B. Only GDPR asserts extraterritorial jurisdiction.
C. Only CCPA asserts extraterritorial jurisdiction.
D. GDPR requires an opt-in while CCPA requires an opt-out.
11. A privacy strategist has examined a business process and has determined that personnel who perform the process do so consistently, but there is no written process document. The maturity level of this process is:
A. Initial
B. Repeatable
C. Defined
D. Managed
12. After examining several business processes, a privacy strategist found that their individual maturity levels range from Repeatable to Optimizing. What is the best future state for these business processes?
A. All processes should be changed to Repeatable.
B. All processes should be changed to Optimizing.
C. There is insufficient information to determine the desired end states of these processes.
D. Processes that are Repeatable should be changed to Defined.
13. In an organization using HIPAA as its control framework, the conclusion of a recent risk assessment stipulates that additional controls not present in HIPAA but present in ISO/IEC 27001 should be enacted. What is the best course of action in this situation?
A. Adopt ISO/IEC 27001 as the new control framework.
B. Retain HIPAA as the control framework and update process documentation.
C. Add the required controls to the existing control framework.
D. Adopt NIST SP 800-53 as the new control framework.
14. The best sequence for implementing DLP is:
A. Static, detective dynamic, preventive dynamic
B. Detective dynamic, preventive dynamic, static
C. Administrative, prevention, detection
D. File shares, networks, end user devices
15. An organization has written a program that substitutes data subject names and various PII fields with other values. What specific practice does this represent?
1. C. Privacy governance is the mechanism through which a privacy strategy is established, controlled, and monitored. Long-term and other strategic decisions are made in the context of privacy governance.
2. A. A program charter generally includes an overall description of a program (in this case, a privacy program), strategic objectives, roles and responsibilities, and primary business processes.
3. D. The metric on time to patch critical servers will be the most meaningful metric for the board of directors. The other metrics, while potentially interesting at the operational level, do not convey business meaning to board members.
4. D. The metric of percentage of critical servers being patched within SLAs is the best leading indicator because it is a rough predictor of the probability of a future security incident. The other metrics are trailing indicators because they report on past incidents.
5. C. Legitimate interest refers to a legal basis for the collection and processing of personal information where the interests of the organization collecting and processing personal information are balanced with the data subject’s interests.
7. A. The most important factor influencing the selection of a control framework is the industry vertical. For example, a healthcare organization would likely select HIPAA as its primary control framework, whereas a retail organization may select PCI DSS.
8. B. Control objectives are high-level statements of desired outcomes. In a typical control framework, multiple detailed controls will be included in each control objective.
9. A. Any organization may use the NIST Privacy Framework as a guide for designing, building, and implementing a privacy program.
10. D. GDPR requires that organizations provide data subjects an opportunity to opt in to be included in the collection and use of personal information. CCPA requires organizations to provide an opportunity to opt out for data subjects who no longer want their personal data used by organizations.
11. B. A process that is performed consistently but is undocumented is generally considered to be Repeatable.
12. C. There are no rules that specify that the maturity levels of different processes need to be the same or at different values relative to one another. In this example, each process may already be at an appropriate level based on risk appetite, risk levels, and other considerations.
13. C. An organization that needs to implement new controls should do so within its existing control framework. It is not necessary to adopt an entirely new control framework when a few controls need to be added.
14. A. The best approach for implementing a DLP environment is to start with static DLP scanning, which detects PII on data stores. This is followed by dynamic DLP that works in detective mode to learn more about data movement. Finally, dynamic DLP in preventive mode will block disallowed data movement.
15. D. Pseudonymization is the practice of substituting fictitious values for actual values in data records, as a way of de-identifying those records.