Methods of data classification

In this section, we will work through the different methods of data classification, explaining each process and the main arguments for it, including what information each method yields about the classified data. Understanding these methods is essential for choosing the correct and most efficient approach for your classification system. There are two basic ways of classifying data: manually, through interaction with the user, and automatically, through classification software based on a set of rules.

Within these two approaches, the following four methods are usually used:

  • Based on the user's choice (manual):
    • Focus on the user's manual selection of a classification
    • Should be combined with an automatic process based on content and context that recommends the appropriate classification
    • Reclassification should be controlled and monitored
    • Relies on the user's knowledge when creating, editing, or reviewing sensitive information

Information gained: What is the content and context of a document?
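The interplay of manual choice, automatic recommendation, and monitored reclassification can be sketched as follows. This is a minimal illustration under our own assumptions, not a specific product's behavior: the label names in `LEVELS`, the `AuditLog` class, and the `apply_user_choice` function are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical labels, ordered from least to most sensitive.
LEVELS = ["Public", "Internal", "Confidential"]


@dataclass
class AuditLog:
    """Collects classification overrides so they can be reviewed later."""
    entries: list = field(default_factory=list)

    def record(self, user, recommended, chosen, reason):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "recommended": recommended,
            "chosen": chosen,
            "reason": reason,
        })


def apply_user_choice(recommended, chosen, user, log, reason=""):
    """Accept the user's label, but control and log choices below the recommendation."""
    if LEVELS.index(chosen) < LEVELS.index(recommended):
        if not reason:
            raise ValueError("a justification is required to classify below the recommendation")
        log.record(user, recommended, chosen, reason)
    return chosen
```

In this sketch the user always has the final say, but choosing a lower level than the recommendation requires a justification and leaves an audit entry.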

  • Based on the content (manual/automatic):
    • Focus on examining and interpreting sensitive content using regular expressions, metadata, or other options

Information gained: What is the content of a document?
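A content-based check with regular expressions might look like the following sketch. The patterns and label names are assumptions chosen only for illustration; a real rule set would be defined by your organization's classification policy.

```python
import re

# Hypothetical rules: each label maps to patterns that indicate it.
PATTERNS = {
    "Confidential": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # number shaped like a US social security number
    ],
    "Internal": [
        re.compile(r"\binternal use only\b", re.IGNORECASE),
    ],
}

# Labels are checked from most to least sensitive.
ORDER = ["Confidential", "Internal"]


def classify_by_content(text, default="Public"):
    """Return the most sensitive label whose patterns match the text."""
    for label in ORDER:
        if any(pattern.search(text) for pattern in PATTERNS[label]):
            return label
    return default
```

For example, `classify_by_content("For internal use only")` yields `"Internal"` under these assumed rules, while text that matches no pattern falls back to the default label.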

  • Based on the context (manual/automatic):
    • Focus on the application, location, or creator, as well as other variables and indicators of sensitive information

Information gained: Who accesses it? When will the data be accessed? Where is the data moving? How is the data used?
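Context-based classification ignores what a document says and looks only at the circumstances around it. The sketch below assumes a hypothetical `DocumentContext` record and made-up rules (department names and a share path) purely to show the idea.

```python
from dataclasses import dataclass


@dataclass
class DocumentContext:
    application: str         # program that produced the file
    location: str            # folder or share where the file is stored
    creator_department: str  # department of the person who created it


def classify_by_context(ctx):
    """Derive a label from context indicators alone (illustrative rules)."""
    if ctx.creator_department.lower() in {"hr", "finance"}:
        return "Confidential"
    if ctx.location.lower().startswith("/shares/projects/"):
        return "Internal"
    return "Public"
```

Because no content is read, a check like this is cheap, but it is only as good as the assumptions behind the rules, which is one reason to combine it with the other methods.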

  • Supported by machine learning (manual/automatic):
    • Focus on assigning documents to classes automatically by training on, and comparing against, reference sets of already classified documents

Information gained: The system learns to integrate further options into the classification or to improve its recognition of certain information.
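To make the idea of training on comparison sets concrete, here is a deliberately small sketch of a naive Bayes text classifier with add-one smoothing, written with the standard library only. It stands in for whatever model a real product uses; the function names and the tiny training set are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(samples):
    """samples: iterable of (text, label) pairs that are already classified."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical comparison set of already classified documents.
TRAINING = [
    ("salary payroll bonus employee", "Confidential"),
    ("employee contract salary review", "Confidential"),
    ("press release product launch", "Public"),
    ("public product announcement press", "Public"),
]
```

After `model = train(TRAINING)`, a call such as `predict("salary review for employee", *model)` returns the class whose training documents the new text most resembles. Adding more classified samples is how the recognition improves over time.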

Combining the classification methods brings together the strengths of the individual methods, as follows:

  • Metadata-based classification is exceptionally fast
  • Pattern matching is fast and reliable for classification criteria that can be expressed as regular expressions
  • Machine learning with linguistic-statistical methods is generally applicable and gives very good results with fuzzy criteria
  • Special classifiers, such as image recognition, can complement these methods

Success comes from using these methods in combination and continuously, rather than relying on a single method or a one-time classification run.
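One way to combine the methods is a cascade that tries the cheapest check first and falls through to more expensive ones. The sketch below is an assumption about how such a pipeline could be wired, not a prescribed design: the document is a plain dictionary, and `by_fallback_model` is only a stand-in for a trained model.

```python
import re
from typing import Callable, Optional

# A classifier takes a document and returns a label, or None if it cannot decide.
Classifier = Callable[[dict], Optional[str]]

SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def by_metadata(doc):
    """Fastest check: trust an existing label stored in the metadata."""
    return doc.get("metadata", {}).get("classification")


def by_pattern(doc):
    """Pattern matching on the text for criteria a regular expression can express."""
    return "Confidential" if SSN_LIKE.search(doc.get("text", "")) else None


def by_fallback_model(doc):
    """Stand-in for a trained model that handles fuzzy criteria."""
    return "Internal" if "draft" in doc.get("text", "").lower() else None


def classify(doc, classifiers, default="Public"):
    """Return the first label any classifier produces, in the given order."""
    for classifier in classifiers:
        label = classifier(doc)
        if label is not None:
            return label
    return default


PIPELINE = [by_metadata, by_pattern, by_fallback_model]
```

Calling `classify(doc, PIPELINE)` on each new or changed document gives the continuous use described above. A stricter variant could run every classifier and keep the most sensitive label instead of the first one.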

Now that we know the different classification methods, we can turn to the challenges organizations face, where data classification can fundamentally improve the management of information and the efficient use of security technology.
