Data Classification Policies

Data classification in its simplest form is a way to identify the value of data. You achieve this by placing a label on the data. Labeling data enables people to find it quickly and handle it properly. You classify data independently of the form it takes. In other words, data stored in a computer should be classified in the same way as data printed on a report.

There is a cost to classifying data. Classifying data takes time and can be a tedious process. This is because there are many data types and uses. It’s important not to overclassify. A data classification approach must clearly and simply represent how you want the data to be handled.

When Is Data Classified or Labeled?

Classifying all data in an organization may be impossible. There has been an explosion in the amount of unstructured data, logs, and other data retained in recent years. Trying to individually inspect and label terabytes of data is expensive, time consuming, and not productive.

Several approaches can reduce the time and effort needed to classify data:

  • Classify only the most important data, the data that represents the highest risk to the organization. Use a default classification for the remaining data.
  • Classify data by storage location or point of origin. For example, all data stored in the financial application database could be considered to be for internal use only and thus classified as “confidential.”
  • Classify data at time of creation or use; this technique relies on software that “hooks” into existing processes. For example, before an email is sent, a pop-up box requires the email content to be classified.
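The first two approaches can be combined in a simple lookup: label data by where it is stored, and fall back to a default label for everything else. The following is a minimal sketch only. The location prefixes, the labels, and the `classify_by_location` function are hypothetical names chosen for illustration.

```python
# Hypothetical mapping from storage location prefix to classification label.
LOCATION_RULES = {
    "db://finance/": "confidential",
    "share://hr/": "confidential",
    "web://public/": "public",
}

# Assumed default label for data that no rule covers.
DEFAULT_LABEL = "internal"


def classify_by_location(location: str) -> str:
    """Return the label of the longest matching location prefix, else the default."""
    matches = [prefix for prefix in LOCATION_RULES if location.startswith(prefix)]
    if not matches:
        return DEFAULT_LABEL
    return LOCATION_RULES[max(matches, key=len)]
```

With these assumed rules, anything under the finance database is labeled "confidential" without inspecting individual records, and data in an unlisted location receives the default label.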

Regardless of the approach, the reason for classifying the data must determine the way it is secured and handled. The point is not just to classify data. The point is to classify data in order to manage risk.

It is recommended that at minimum, you classify any data that is not public. This leads to which classification system to use. There are a variety of classification systems. The U.S. Department of Defense (DoD) utilizes confidential, secret, and top secret as the baseline classifications for data. The Department of Energy uses Unclassified Controlled Nuclear Information (UCNI), Formerly Restricted Data (FRD), and Restricted Data. Then there are additional categories such as National Security Information (NSI) and Critical Nuclear Weapon Design Information (CNWDI). DoD classification is discussed in more detail later in this chapter.

It is highly unlikely that a civilian company that is not a DoD contractor would require that level of classification, and such a scheme would be very difficult to manage. However, a simple two-level scheme of “confidential” and “not confidential” would be workable for any organization.

The Need for Data Classification

You can classify data for different purposes depending on the need of the organization. For example, the military would have a very different classification need than the local grocery market. Both handle data, but handling requirements differ greatly. The more sensitive the data, the more important it is to handle the information properly.

An organization classifies data to meet several needs. The three most common are to:

  • Protect information
  • Retain information
  • Recover information

Protecting Information

The need to protect information is often referred to as the security classification. An organization has to protect data when its disclosure could cause damage. Data classification drives what type of security you should use to protect the information. Data classification also helps define the authentication and authorization methods you should use to ensure the data does not fall into unauthorized hands.

It is always important to protect your confidential data. In fact, one major aspect of trade secret litigation is whether or not the data owner took reasonable steps to secure that information. I frequently work as an expert witness in such cases, and if you don’t take steps to secure confidential data, you may not be able to enforce trade secret protection.

Authentication is the process used to prove the identity of the person. Authorization is the process used to grant permission to the person. Both authentication and authorization control access to systems, applications, networks, and data. Authentication makes sure you know who is accessing the data. Authorization makes sure you know the level of access that is permitted—for example, “read” versus “update.” When an individual is said to be an authorized user, it means that he or she received formal approval to access the systems, applications, networks, and data.

All organizations have some form of data that requires protection. Any organization that has employees has sensitive personal information, which, by law, must be protected.

Retaining Information

An organization must determine how long to retain information. It’s not practical to retain data forever. When you purge and delete sensitive data, there’s less of a target for a future breach. Therefore, organizations should retain only data that is needed to conduct business. Data retention policies define the methods of retaining data as well as the duration.

You need to retain data for two major reasons: legal obligation and needs of the business. All organizations have some legal requirements to keep records, such as financial and tax records. Generally such records are retained for seven years in the United States. There are also business reasons to keep records, such as customer information, contracts, and sales records. TABLE 11-1 depicts a sample retention classification scheme.

TABLE 11-1 Data Classification for Retention of Information
A sample information retention classification scheme table.

There are records for which there is no legal or business reason to keep them. Many organizations require that such records be deleted at some point. Deleting this information helps the company cut down on storage costs and protects the information from accidental disclosure. The additional benefit of removing unneeded data is the reduction of legal liabilities. There is a general theory that unneeded data creates a liability for a company. The key concept is “what you don’t know can hurt you.” As stated previously, there is an explosion of data across many businesses. Much of this data is unstructured, such as emails, call center recordings, other transactions, and even social media postings. Understanding what’s contained in every data file is impossible. This means that an organization may not fully understand the legal obligation or liability associated with handling every data item. In short, the less data retained, the less unknown liability exists.

Storage can be expensive for an organization. A corporate setting can have thousands of employees generating huge volumes of data. Retaining this data takes up valuable resources to back up, recover, monitor, protect, and classify. If you delete unneeded data, these costs are avoided.

Given the volume of data produced, it is inevitable that sensitive data will show up where it’s not supposed to. A good example is email. A service agent might try to help a customer by email to resolve a payment problem. Despite the agent’s good intentions, the agent might include the customer’s personal financial information in the email. Once that data is in the email system, it’s difficult to remove. The person receiving the email may have designated others to view the mail. Backups of the desktop and mail system will also have copies of the personal information. Wherever that data resides or travels, the information must now be protected and handled appropriately.

NOTE

According to a legal memorandum by Ater Wynne, LLP, each person in a corporate setting produces about 736 megabytes annually of electronic data. That equates to a stack of books 30 feet tall. Additionally, it’s estimated that email accounts for 80 percent of corporate communications in the United States.

As discussed earlier, you can reduce the likelihood of accidental disclosure by routinely deleting data that is no longer needed for legal or business reasons. Classifying what’s important ensures that the right data is deleted. Without retention policies, vital records could be lost. The retention policy can use data classification to help define handling methods.
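A retention classification can drive routine deletion directly. The following is a minimal sketch under stated assumptions: the seven-year and three-year periods echo the examples in this section, while the class names, the one-year period, and the `is_due_for_deletion` function are hypothetical.

```python
from datetime import date

# Retention periods in years per retention class. The seven-year and three-year
# figures echo the examples in the text; the class names and the one-year
# figure are assumptions for illustration.
RETENTION_YEARS = {
    "financial_and_tax": 7,
    "brokerage_correspondence": 3,
    "routine": 1,
}


def is_due_for_deletion(retention_class: str, created: date, today: date) -> bool:
    """Return True once a record has outlived its retention period."""
    years = RETENTION_YEARS[retention_class]
    try:
        expires = created.replace(year=created.year + years)
    except ValueError:
        # Created on February 29 and the expiry year is not a leap year.
        expires = created.replace(year=created.year + years, day=28)
    return today >= expires
```

A scheduled job could call a check like this for each record and delete only what has expired, which is one way to show that deletion follows routine policy.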

It’s important to work with management in determining the retention policy. It’s also important to work with legal staff. The legal obligations can change depending on the business context. Assume a service agent with a securities brokerage wrote an email about a customer’s stock trade. This type of email correspondence must be retained by law. The Securities and Exchange Commission (SEC) Rule 17a-4 requires all customer correspondence to be retained for three years. This is to ensure a record is kept in case of an accusation of fraud or misrepresentation. The SEC rule also says the correspondence must be kept in a way that cannot be altered or overwritten. This means the retention policy must specify how the data is to be backed up. An example is a requirement that data should be kept on write-once optical drives. Regulations make data classification even more important in defining proper handling methods.

A retention policy can help protect a company during a lawsuit. The courts have held that no sanction will be applied to organizations operating in good faith. This is true even if they lost the records as a result of routine operations. “Good faith” is demonstrated through a retention policy that shows how data is routinely classified, retained, and deleted.

Recovering Information

The need to recover information also drives the need for data classification. In a disaster, information that is mission-critical needs to be recovered quickly. Properly classifying data allows the more critical data to be identified. This data can then be handled with specific recovery requirements in mind. For example, an organization may choose to mirror critical data. This allows for recovery within seconds. In comparison, it can take hours to recover data from a tape backup. TABLE 11-2 depicts a sample recovery classification scheme.

TABLE 11-2 Data Classification for Recovery of Information
A sample information recovery classification scheme table.

There are various approaches, sometimes called classification schemes, to classifying data. A good rule of thumb is to keep it simple! A dozen classes within each scheme for security, retention, and recovery would be confusing. Employees cannot remember elaborate classification schemes. It’s difficult to train employees on the subtle differences among so many classes. A good rule is to use five or fewer classes. Many organizations use three classifications. Some add a fourth classification to align better to their business model and mission. Although a fifth classification is rarer, this may be an indication that an organization has a high enough level of automation and a mature enough risk program to use the additional classification to better manage its data.

In a three-class scheme, the classes represent a lower and upper extreme combined with a practical middle ground. It’s also good to keep the class names short, concise, and memorable. Some classification requirements are influenced by specific legal requirements. In other cases, classification requirements will be driven by what the business is willing to pay for. For example, Table 11-2 indicates a recovery time of less than 30 minutes for critical data. This sample recovery scheme may not be appropriate for all organizations; for example, the scheme might be too expensive for an elementary school to implement. In contrast, a Wall Street brokerage firm might find 30 minutes inadequate.
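A recovery classification becomes useful when each class carries a recovery time objective that a backup method can be checked against. The following is a minimal sketch. Only the 30-minute objective for critical data comes from this section; the other class names, the objectives, and the restore times are assumptions for illustration.

```python
from datetime import timedelta

# Recovery time objectives per class. Only the 30-minute figure for critical
# data comes from the text; the other classes and values are assumptions.
RECOVERY_TIME_OBJECTIVE = {
    "critical": timedelta(minutes=30),
    "urgent": timedelta(hours=48),
    "non_vital": timedelta(days=30),
}

# Rough, assumed restore times for the two recovery methods mentioned earlier.
TYPICAL_RESTORE_TIME = {
    "mirror": timedelta(seconds=30),
    "tape_backup": timedelta(hours=6),
}


def meets_objective(recovery_class: str, method: str) -> bool:
    """Check whether a recovery method restores data within the class objective."""
    return TYPICAL_RESTORE_TIME[method] < RECOVERY_TIME_OBJECTIVE[recovery_class]
```

Under these assumed numbers, mirroring satisfies the critical class and tape backup does not, which is the kind of cost trade-off the business has to decide whether to pay for.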

Legal Classification Schemes

A legal classification scheme to label data is driven primarily by legal requirements. Such schemes are often adopted by organizations that have a significant regulatory oversight or have had a significant legal or privacy viewpoint driving the data classification program. Regardless of the reason the organization adopts this approach, it’s important that as legal requirements change, the data classifications change with them. For example, the definition of privacy has changed over the years. If the objective of the classification is to maintain individual privacy as legally defined, as the law changes, so must the classification. Consider an individual’s home address. In some states the address alone is considered private information. In other states a home address is considered private only when combined with an individual’s name. This changing legal landscape can affect how data is classified and handled.

Stanford University offers a good example of a legal classification scheme to label data. The University Privacy Officer is listed as a contact point for questions on the data classification. The Chief Information Security Officer is listed as the key contact point for how the classes of data should be protected. It is a common practice to have privacy and security departments team up to create and manage data classification schemes. Stanford University has adopted the following data classification scheme:

  • Prohibited information—Information is classified as Prohibited if law or regulation requires protection of the information.
  • Restricted information—Information is classified as Restricted if it would otherwise qualify as Prohibited, but it has been determined by the university that prohibiting information storage would significantly reduce faculty/staff/student effectiveness.
  • Confidential information—Information is classified as Confidential if it is not considered to be Prohibited or Restricted but is not generally available to the public.
  • Unrestricted information—Information is classified as Unrestricted if it is not considered to be Prohibited, Restricted, or Confidential.

You can quickly see how the legal value placed on data determines use of the Prohibited and Restricted classifications. Examples of Prohibited information are Social Security numbers, driver’s license numbers, and credit card numbers. Examples of Restricted information are health records and passport numbers.
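The four definitions above read naturally as an ordered decision, most restrictive first. The following is a minimal sketch of that ordering. The function and parameter names are hypothetical, and the sketch is an interpretation of the definitions as quoted here, not the university's own procedure.

```python
def classify_information(
    protection_required_by_law: bool,
    storage_exception_granted: bool,
    generally_available_to_public: bool,
) -> str:
    """Apply the four definitions in order, most restrictive first."""
    if protection_required_by_law:
        # Restricted is the carve-out: the data would be Prohibited, but the
        # institution has decided that banning its storage is impractical.
        return "Restricted" if storage_exception_granted else "Prohibited"
    if not generally_available_to_public:
        return "Confidential"
    return "Unrestricted"
```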

Military Classification Schemes

A security data classification reflects the criticality and sensitivity of the information. Criticality refers to how important the information is to achieving the organization’s mission. Sensitivity refers to the impact associated with unauthorized disclosure. A specific piece of data can be high on one scale but low on the other. The higher of the two scales typically drives the data classification. As data becomes more important, generally it requires stronger controls. The U.S. military classification scheme is used by several federal agencies.

The U.S. military classification scheme is defined in Executive Order (EO) 12356, National Security Information, which has since been superseded, most recently by EO 13526. There are three classification levels:

  • Top Secret data, the unauthorized disclosure of which would reasonably be expected to cause grave damage to national security
  • Secret data, the unauthorized disclosure of which would reasonably be expected to cause serious damage to national security
  • Confidential data, the unauthorized disclosure of which would reasonably be expected to cause damage to national security

Any military data that is considered “classified” must use one of these three classification levels. There is also unclassified data that is handled by government agencies. This type of data has two classification levels:

Sensitive but unclassified is sometimes called SBU. It’s also sometimes called For official use only (FOUO) in the United States. The term FOUO is used primarily within the U.S. Department of Defense (DoD). Some examples of SBU data are Internal Revenue Service tax returns, Social Security numbers, and law enforcement records.

The Information Security Oversight Office (ISOO) oversees the U.S. government’s classification program. The ISOO produces an annual report to the president summarizing the classification program from the prior year. The report outlines what data has been classified and declassified each year. The 2018 report stated that of all the classified data, 38.35 percent was Top Secret, 59.67 percent was Secret, and 1.98 percent was Confidential.

TIP

Looking at the distribution of classification is a good exercise. By understanding the organization, mission, and nature of data handled, one can draw a general opinion of whether there’s an over- or underclassification of highly sensitive data. Overclassification of the most important data could mean using very expensive means to protect data that is not so important. Underclassification means the most important data may not be adequately protected.
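If you have an inventory of labeled records, the distribution is easy to compute. The following is a minimal sketch; the `classification_distribution` function and its input format (one label per record) are assumptions for illustration.

```python
from collections import Counter


def classification_distribution(labels: list[str]) -> dict[str, float]:
    """Return the percentage of records carrying each classification label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 2) for label, n in counts.items()}
```

A result that puts most records in the highest class is a prompt to ask whether data is being overclassified.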

Declassifying data is important. It’s not practical to keep data classified forever. First, it’s better to focus limited resources on protecting a smaller amount of the most important data. Second, in democracies, we expect the government to be transparent. Unless there’s a compelling reason to keep a secret, the expectation is the information will be released to the public.

The government routinely declassifies data. Declassification is a term that means to change the classification to “unclassified.” The declassification of data is handled in one of three programs run by the ISOO:

These three programs declassified 19.8 million pages of information as far back as 2012. As you can see, the government has more and more data to protect. The amount could become overwhelming unless there are policies to reduce the amount of data the government protects.

Business Classification Schemes

The private sector, like the military, uses data classification to reflect the importance of the information. Unlike the government, the private sector has no single data classification scheme, and there is no one right approach to classifying data. Also like the military, data classification in business drives security and how the data will be handled.

Although there is no mandatory data classification scheme, there are norms for private industry. Earlier in this chapter, we introduced a flat system with only two levels. If you want more detail, the following four classifications are often used:

  • Highly sensitive
  • Sensitive
  • Internal
  • Public

Highly sensitive classification refers to data that is mission-critical. You use criticality and sensitivity to determine what data is mission-critical. This classification is also used to protect highly regulated data. This could include Social Security numbers and financial records. If this information is breached, it could represent considerable liability to the organization. Mission-critical data is information vital for the organization to achieve its core business. As such, an unauthorized breach creates substantial risk to the enterprise.

Access to highly sensitive data is limited. Organizations often apply enhanced security and monitoring. Monitoring can include detailed logging of when records are accessed. Additional security controls may be applied, such as encryption.

Sensitive classification refers to data that is important to the business but not vital to its mission. If information is breached, it could represent significant financial loss. However, the breach of the information would not cause critical damage to the organization. This data might include client lists, vendor information, and network diagrams.

Access to sensitive data is restricted and monitored. The monitoring may not be as rigorous as with highly sensitive data.

The key difference between highly sensitive and sensitive is the magnitude of the impact. Unauthorized exposure of highly sensitive data may put the business at risk. Unauthorized exposure of sensitive data may result in substantial financial loss, but the business will survive.

Internal classification refers to data not related to the core business. The data could be routine communications within the organization. The impact of unauthorized access to internal data is a disruption of operations and financial loss.

Access to internal data is restricted to employees. The information is widely available for them, but the data is not released to the public or individuals outside the company.

Public classification refers to data that has no negative impact on the business when released to the public. Access to public data is often achieved by placing the data on a public website or through press releases. The number of individuals who are permitted to make data public is limited.
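The four levels can be expressed as an ordered type so that access rules are checked in one place. The following is a minimal sketch. The rule that sensitive and highly sensitive data require an explicit grant is an assumption used to model "limited" and "restricted" access, and the names are hypothetical.

```python
from enum import IntEnum


class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    SENSITIVE = 2
    HIGHLY_SENSITIVE = 3


def may_access(label: Classification, is_employee: bool, has_explicit_grant: bool) -> bool:
    """Decide access from the label, employment status, and any explicit grant."""
    if label is Classification.PUBLIC:
        return True
    if label is Classification.INTERNAL:
        return is_employee
    # Assumption: both sensitive levels require an explicit grant on top of
    # employment; in practice they would differ in monitoring and other controls.
    return is_employee and has_explicit_grant
```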

Many laws and regulations require you to know where your data is. These laws require you to protect the data commensurate to the risk to your business. Data classification is an effective way of determining risk. The organization is at greater risk when mission-critical data is breached. By classifying the data, you are able to find it quickly and define proper controls.

Developing a Customized Classification Scheme

You can often create a customized classification scheme by altering an existing one. Federal agencies cannot customize a scheme. This is because their classification schemes are mandated by law. Many security frameworks provide guidance and requirements to develop a classification. Some sources of guidance and requirements include the International Organization for Standardization (ISO), Control Objectives for Information and related Technology (COBIT), and Payment Card Industry Data Security Standard (PCI DSS).

Sometimes customizing a classification scheme is minor. This might include modifying the label but not changing the underlying definition; for example, the “highly sensitive” classification could equate to private, restricted, or mission-critical. It’s not the name that matters but the definition. Classification names can vary depending on the organization and the perspective of the creator. “Private” classification in one organization could mean “highly sensitive” in another. In still another it might mean “sensitive.”

When developing a customized data classification scheme, keep to the basics. You should consider the following general guidelines:

  1. Determine the number of classification levels.
  2. Define each classification level.
  3. Name each classification level.
  4. Align the classification to specific handling requirements.
  5. Define the audit and reporting requirements.

You determine the number of classification levels by looking at how much you want to separate the data. One approach is to separate the data by aligning it to critical business processes. This helps you understand the business to better protect the assets. For example, a power plant may want to isolate its supervisory control and data acquisition (SCADA) systems. SCADA data helps run the facility. Special security and controls may be placed on these systems and data. This can be achieved by classifying the SCADA data differently than other types of data.

The definition of each classification level depends on how you want to express the impact of a breach. Federal agencies determine impact based on confidentiality, integrity, and availability. They assign a rating of low, moderate, or high impact to each of these. By applying a formula, they can determine an impact. Organizations outside the government have adopted similar approaches. TABLE 11-3 depicts the basic impact matrix described in National Institute of Standards and Technology (NIST) Federal Information Processing Standard (FIPS) 199.

TABLE 11-3 Classification for Security of Information
A basic impact matrix for the security of information.

FIPS 199 uses phrases such as “limited adverse effect” to denote low impact. For moderate and high impact, it uses “serious adverse effect” and “severe or catastrophic adverse effect.”
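This section does not spell out the formula used to combine the three ratings. One commonly used rule is the high-water mark, in which the overall impact is the highest of the three. The following is a minimal sketch of that rule; the class and method names are hypothetical, and an organization may combine the ratings differently.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class SecurityCategory:
    confidentiality: Impact
    integrity: Impact
    availability: Impact

    def high_water_mark(self) -> Impact:
        """Return the highest impact across the three security objectives."""
        return max(self.confidentiality, self.integrity, self.availability)
```

For example, data rated low for confidentiality and availability but moderate for integrity would be treated as moderate overall under this rule.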

In the business world, impact definitions closely align with measured business results. For example, low risks can be defined as causing “operations disruptions and minimal financial loss.” An organization understands these terms because a financial scale can be used. For example, a low impact may result when $1 to $20,000 is at risk; a moderate impact might be defined as $20,000 to $500,000; and a high-impact risk might be defined as $500,000 and above. The exact amounts vary depending on the size of the organization and its risk tolerance. Applying specific dollar amounts to impacts makes the definitions clearer.
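The sample dollar ranges translate directly into a small function. The following is a minimal sketch using the example thresholds from the paragraph above. Treating an amount of exactly $20,000 as moderate and exactly $500,000 as high is an assumption, because the sample ranges share their boundary values; the function name is hypothetical.

```python
def impact_from_exposure(dollars_at_risk: float) -> str:
    """Map a dollar amount at risk to an impact level using the sample ranges."""
    if dollars_at_risk < 0:
        raise ValueError("dollars_at_risk must not be negative")
    if dollars_at_risk < 20_000:
        return "low"
    if dollars_at_risk < 500_000:
        return "moderate"
    return "high"
```

An organization would replace the two thresholds with amounts that reflect its own size and risk tolerance.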

The name of each classification level is usually taken from the definition itself. The important point is to select a name that resonates within the organization. You may also consider using a name that peer organizations adopt. This can help facilitate the exchange of approaches within the industry. The name can reflect leadership’s view of risk, such as classifying data as “proprietary” versus “sensitive.”

Aligning the classification to specific handling requirements is a critical step. Once you determine the classification level, you must apply the appropriate security control requirements. Consider combining the levels where there’s little to no difference in security requirements.

Audit and reporting requirements depend on industry and regulatory requirements. Many organizations are subject to privacy law disclosures. You should consider these reporting requirements when classifying data. For example, sensitive data for audit and reporting requirements can be assigned a special classification. This classification can include additional logging and monitoring capability.

Classifying Your Data

You need to consider two primary issues when classifying data. One issue is data ownership, and the other is security controls. These two issues help you derive maximum value from the data classification effort.

The business is accountable for ensuring that data is protected. The business also defines handling requirements. IT is the custodian of the data. It’s up to the business to ensure adequate controls are funded and that they meet regulatory requirements. The COBIT framework recommends that a data owner be assigned. The data owner is the person accountable for defining all data handling requirements with the business. The data owner determines the level of protection and how the data is stored and accessed. Ultimately, the data owner must strike a balance between protection and usability. The data owner must consider both the business requirements and regulatory requirements.

The position of the data owner should be senior enough to be accountable. The data owner has a vested interest in making sure the data is accurate and properly secure. The data owner needs to understand the importance and value of the information to the business. He or she also needs to understand the ramifications that inaccurate data or unauthorized access has on the organization.

The data owner guides the IT department in defining controls and handling processes. The IT department designs, builds, and implements these controls. For example, if cardholder data is being collected, the data owner should be aware of PCI DSS standards. The IT department would advise the data owner on the technology requirements. Responsibility stays with the data owner to fund the technology. The duties and responsibilities of the data owner should be outlined in the security controls or in security policies.

Determining the security controls for each classification level is a core objective of data classification. It would make no sense to identify data as “highly sensitive” or “Top Secret,” and then allow broad access. The data owner and IT department determine what controls are appropriate. The following is a sampling of the security controls to be considered:

  • Authentication method
  • Encryption
  • Monitoring
  • Logging

It’s up to the business to ensure adequate controls are funded and that they meet regulatory requirements. But beyond regulatory requirements, there are some basic guidelines. It should be readily apparent that the more sensitive the data, the stronger the encryption, the more detailed the logging, and the more monitoring it needs.

Today, encryption is relatively easy to implement. Even Microsoft Windows has BitLocker for drive encryption and Encrypting File System (EFS) for individual file encryption. There is also a wide range of open source encryption solutions for files or entire drives. Communications on a network can be encrypted with Transport Layer Security (TLS) with minimal effort. Whatever controls you choose must be practical to implement. Otherwise, the security policies are viewed as unrealistic and may even be ignored. TABLE 11-4 depicts a simple approach to linking data classifications to security controls.

TABLE 11-4 Data Classification and Security Controls
A table depicts the link between the data classifications and the security controls.
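In practice, the link between classifications and controls can be kept as a single lookup that systems consult before granting access. The following is a minimal sketch. The control values are assumptions for illustration and are not taken from Table 11-4; the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    authentication: str
    encrypt_at_rest: bool
    monitoring: str
    logging: str


# Hypothetical baseline: controls get stronger as sensitivity increases.
CONTROLS_BY_CLASSIFICATION = {
    "public": Controls("none", False, "none", "none"),
    "internal": Controls("password", False, "periodic review", "access failures"),
    "sensitive": Controls("password", True, "periodic review", "all access"),
    "highly sensitive": Controls("multifactor", True, "real-time alerts", "all access, per record"),
}


def required_controls(classification: str) -> Controls:
    """Look up the minimum controls for a classification label."""
    return CONTROLS_BY_CLASSIFICATION[classification.lower()]
```

Keeping the mapping in one place makes it easier for the data owner and the IT department to review whether each level's controls are both adequate and affordable.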