Chapter 9

The Design of Directory Services

This chapter provides a glimpse of design theory. An understanding of how to proceed in the design of directory services is useful if you have to implement directory services from scratch or if you have to redesign or upgrade existing directory services. There are entire books that address the design and implementation of directory services. If you need an in-depth understanding of the design process, you should consult one of these. On the other hand, it is clear that you cannot put your project on hold while you read 800 pages or more to become a design specialist for directory services.

The goal of this chapter is to give you a brief overview of the principles guiding the design of directory services so that you can proceed further on your own. To achieve this goal, the chapter must fulfill two objectives: First, it will give you a basic understanding of how to proceed with the planning, design, and implementation of directory services. Second, it will show you where to get more information.

The first part of this chapter presents an introduction to the design process. You will see that the design process has no clear ending point. Rather, it is a cycle. After completing the “last” step, you are likely to go back to the first step and repeat all or some of the previous steps in an effort to refine the design.

After seeing an overview of the design of directory services, we will have a closer look at the individual steps involved in the design process.

Introduction

Setting up directory services from scratch would be a clean design problem, but it is not a likely scenario. Usually, the directory services are already in place, and your job is to optimize access to the existing infrastructure. The following is a typical scenario:

You have several applications using a database for user authentication. Let us assume that you have a standard RDBMS in place, but that some applications use a proprietary database for historical reasons. Furthermore, you need authentication for your intranet; you want to enable users to access static Web pages containing sensitive data; and you need authentication for intranet applications such as CGI (common gateway interface) scripts, PHP scripts, or application servers. Most of these applications need authentication as well as additional information about the user that has been authenticated, for example, the user name, the department the user is in, and perhaps some profiling information. Obviously, you need the same information regardless of which point you connect from within the enterprise. Consequently, the first step will be to provide an authentication mechanism implemented as a directory.

Once you have the directory services in place and your authentication mechanism fine-tuned, you will begin to extend it. Any authentication process holds information about persons, for example, UserID, name, and password. LDAP can hold much more information, so why not use the LDAP server to provide additional information about these persons? Examples of information needed include:

■  Phone and fax numbers of your employees

■  Physical location of employees, such as town, building, floor, and room number

■  Computer equipment that the employees are using

■  Computer systems that the employees have access to

■  Printers that the employees can use

As you begin thinking about useful information for different departments, this list will grow longer.

At this point, it is likely that you will recognize the need for a redesign. You will begin to think about questions like data replication, data distribution, and data security. And you will begin to think about what you should do with the “release zero” of your global directory services. Throw it away and design a new version from scratch? Extend your existing implementation? This chapter was written to help you decide how to resolve these and other questions about your particular situation.

Note that the design phase of directory services is the most important phase of all because a well-designed system will prevent problems from occurring later on. It is very difficult to recover from major design errors, and even very grave design errors may not become apparent until late in the implementation phase.

In contrast, with a good design in hand, implementation is straightforward. It is somewhat like writing a book: Once you have written a good outline, you have completed as much as half the work. So the more time you spend on the design phase, the less effort you will need to expend during the implementation phase. Of course, your time is not unlimited, and you cannot design forever. At some point, you will have to compromise between the time spent on design and the benefit you would gain from further refinement of your design. Your goal should be to make the design “good enough.” This is a question of experience, but the point is: Do not stint on the time spent during the design phase.

Directory Life Cycle

When considering the life cycle of directory services, you will encounter many familiar concepts from software-development and project-management courses. We will not explain these concepts here. Instead, we will concentrate only on the specific concepts that apply to directory projects. Now let us have a closer look at the individual design phases. The steps described here are the steps that, after some experience, I have personally found useful. Unfortunately, there is no standard recipe on this subject. You will find many articles, books, and courses commenting on the steps involved in planning, designing, and implementing directory services. Exhibit 1 shows the life cycle of a directory. Note that it is not unusual to take the results of one stage and then take a step or two backward to refine a previous step.

There are many other versions of the life cycle of directory services, with some steps having additional refinements or subdivisions. The best way to proceed in your particular situation depends on your specific working environment, for example, the number of persons working on the project. Consider the following as a work list that must be completed before setting up and running directory services in your environment. Remember, furthermore, that there are no clear boundaries between the individual phases. For example, the step of “Analysis of data to be stored in your directory and data sources available in your environment” could be part of a planning stage as well as a design stage.

■  Planning of directory services: Define the goal of the project, listing the objectives and producing a time plan.

■  Design of directory services: Design the data the directory will hold, the schema, the tree, topology, and security measures.

Image

Exhibit 1. Directory Life Cycle

■  Implement design of directory services: At the beginning of the project, this may only be a prototype to verify that the design meets your requirements. If it does, you proceed to the next step; if not, you jump to the point where the design needs to be improved.

■  Verify production and maintenance of directory services: In this step, you finally see if your deliverable actually meets the requirements under normal workload. This is also the moment where you begin to think about improvements. And so we are back at the first step.

Now you can break down your directory project into subprojects, knowing that at the end of each subproject you will define the requirements for the next one.

Planning of Directory Services

In the planning phase, you lay the foundation for your project. As in every project, you define the overall goal and the timeline for its realization. This chapter cannot get into the details of project management, but we will discuss some aspects that are of particular relevance to the implementation of a directory. Exhibit 2 shows an overview of the steps necessary in the planning phase. As you can see, there is no strict sequence in the necessary steps. Every step will result in the output of one or more documents.

Image

Exhibit 2. Activities in the Planning Phase

One can divide the planning phase into the following activities:

■  Define the overall goal of the project

■  List the benefits of the project

■  Define the objectives that must be met to achieve the project’s goal

■  Determine the target of the project

■  Analyze the actual situation

■  List the steps to perform

■  Identify the project plan

Goal of the Project

A directory is often a strategic project for the whole enterprise, so a proposal defining the project’s goal assumes a particular importance and is typically scrutinized closely by top management. Costs aside, they want to see a clear definition of two things:

1.  Goal of the directory services project

2.  Metrics to determine whether the goal of the project has been achieved

An unambiguous definition of the goal helps avoid misunderstandings between the project manager and the project sponsor. Perhaps more importantly, it helps the sponsor understand the purpose of the proposed project. The proposal should also clearly define some metric for evaluating whether the goal of the project has been achieved. The sponsor surely will want to know whether the money spent for this project was a good investment. With the help of the proposal document defining these two points, both the project sponsor and the project manager should arrive at a mutually agreeable definition of the project’s goal.

Benefits of the Project

Unless you are working in a pure research project, and very often even then, achieving the goal of the project will result in a benefit for your organization. This part of the planning activity explains the advantages of accomplishing the goal to management. In turn, management understands what it is buying when it agrees to invest the money required to reach the particular goal.

Objectives of the Project

To achieve the goal of the project, you need to define objectives. It is important to understand the difference between the goal and the objectives of a project. The goal is the end result achieved when the project is finished and is defined in a more general way than the objectives. Normally, a project has only one goal. To arrive at the goal, you break the project down into objectives. An objective is more specific than a goal, and a project has more than one objective. The main property of an objective is that it be measurable. The overall goal of the project is measurable in time, effort, and cost.

Because the individual objectives have to be measurable, you and the other actors involved have to agree about the appropriate metrics. This will help in avoiding misunderstandings about who has to do what and when.

The planning group will have to confer with a number of people with different roles. Again, the precise identity of those with whom you have to engage during this process depends on your environment. There are essentially three types of roles for the people involved in your project: project sponsor, management, and users. This does not mean that the sponsor and the user are different persons. On the contrary, it is very probable that the sponsor will also use the directory, and it is likely that management also assumes the role of sponsor.

■  It is a near certainty that your sponsor will want you to provide a compelling reason why she should spend money on directory services, so try to understand her perspective of what she expects your work should achieve. Note that this is not the place to address the technical aspects of the project. Indeed, the sponsor may not even understand what directory services are. Political and strategic goals should be your principal focus.

■  You will also be in contact with management. Again, you will have to understand what management wants to achieve with directory services. These goals are strategic; they should define how directory services fit into your environment and set the priorities of the objectives to be achieved.

■  Finally, and most importantly, you must speak with the future users of your services. You have to understand what they expect from directory services. Here you will learn the functional and technical goals. There are two types of users of directory services: technicians and end users. The technicians include the persons administering the directory server and those who are responsible for writing applications using the directory server. End users include the persons supplying data, such as human-resources staff entering new records, and the persons who connect with an application to get the information they need.

Target of the Project

The target of the project describes who in the end will benefit from the project. The target can also be called the end user. You describe the target to understand the importance of the whole project. Note that it is not simply a matter of counting the number of persons who will benefit from the result of the project. You will have to measure the improvement in the performance of the target users and then calculate the benefit the enterprise will gain from this improved performance.

Note that the target is not limited to physical persons. The project also will benefit applications. Thus, part of the planning phase is to understand what applications will use your directory data. Again, you will have to see where these applications are located. Gaining a clear picture of where your data could be needed is not an easy task, and it is essential to maintain close contact with the users. Sometimes it is up to you to discover what applications could benefit from your directory. The more potential consumers you find, the less likely it is that you will have surprises later on. And the greater the number of applications that will use your directory, the more your work will be appreciated. It is a good idea to make a list of potential application types and then check to see whether these applications are actually being used in your enterprise. Examples of candidates are mail systems, calendar systems, room booking, help-desk applications, phone books, and every type of service using authentication services.

Analysis of the Actual Situation

A further activity is the analysis of the actual situation. The overall goal defines where you want to be tomorrow; this step shows where you are today. The objectives mark the individual stations along the way.

Because this is a data-management project, the first thing to understand is the data to be managed. That is, you need to know which repositories you actually have in place. It may be that one or more directories are already in place. Indeed, this is often the case, and then you have to understand the quality of the data in those directories. This means that you need to understand whether this data can be used in the project, and you need a means of checking consistency between the individual data sources.

Analysis of the Data to Be Held in the Directory

Because directory services handle data, an important point you need to consider is what kind of data and what kind of information your directory should hold. Do not confuse this step with the data design you will make in the next step of the project. Here, you simply need to decide whether the directory is a suitable place for the data in question.

Data appropriate to be held in the directory includes the following:

■  Data that is accessed from more than one application

■  Data that is accessed from more than one physical location

■  Data read more frequently than it is written

■  Data that can be expressed in attribute form, for example, sn = Voglmaier

Data not suitable for the directory includes:

■  Large, unstructured data objects like images, video, or binaries

■  Frequently changing data

Data that does not suit the directory belongs in other data stores, such as relational databases, file systems, file servers, FTP servers, HTTP servers, etc.
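The criteria above can be turned into a simple checklist. The following Python sketch is only an illustration: the DataCandidate fields, the function name, and the threshold of ten reads per write are assumptions made for the example, not values taken from any standard.

```python
from dataclasses import dataclass


@dataclass
class DataCandidate:
    """One kind of data being considered for the directory (hypothetical model)."""
    name: str
    consuming_apps: int        # number of applications that read the data
    locations: int             # number of physical locations that access it
    reads_per_write: float     # rough ratio of reads to writes
    attribute_like: bool       # can be written in the form attribute = value
    large_unstructured: bool   # images, video, binaries, and the like


def directory_suitability(c: DataCandidate, min_read_ratio: float = 10.0) -> list[str]:
    """Return the reasons against storing the data; an empty list means it fits."""
    reasons = []
    if c.large_unstructured:
        reasons.append("large, unstructured object")
    if not c.attribute_like:
        reasons.append("cannot be expressed as attribute = value")
    if c.reads_per_write < min_read_ratio:  # assumed threshold, tune for your site
        reasons.append("changes too frequently")
    if c.consuming_apps < 2 and c.locations < 2:
        reasons.append("used by only one application at one location")
    return reasons
```

A surname that is read by several applications at several sites passes all checks, whereas a training video fails on more than one count and is better kept on a file server or an HTTP server.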

Steps to Perform

Using the objectives you have defined, your analysis of the actual situation, and your knowledge of the actors involved in the project, you will break each objective down into a series of individual steps to perform. As you can surmise from Exhibit 3, the steps to perform depend on the actual situations, the objectives, and the resources available. Feedback from the individual steps can affect nearly all activities discussed until now and can even change some of them. These steps are indeed the first confrontation with reality.

Image

Exhibit 3. Overview of the Directory-Design Process

Project Plan

The last action in the planning phase is the production of a project plan. You produce the plan using all of the information you have gathered. The most important part of the project plan is the Gantt chart. The Gantt chart assigns the steps to be performed by the different teams in the project and puts the steps in a timeline specifying a date when the actions must be completed.

What your project plan will look like depends on a number of factors, including the number of people participating in the individual work groups and the budget at your disposal. It will also depend on your environment (factors such as size of network, design of network, number of clients, number of servers, etc.) and the standards on your site. Furthermore, it will depend on the complexity of the directory you will develop, such as the amount of data, the type of data, how the data is connected, and similar factors. Finally, it will depend on which departments in your enterprise are involved in this project, the strategic importance of the directory, and also the size of your enterprise. A plan for a project involving one small department will be vastly different from a plan for a whole enterprise operating in many countries on different continents.

Once you have all the information at hand, you will have a very extensive document. At this point, it is worth considering whether, instead of handling a monolithic project, you should break the whole project down into a number of subprojects. Subprojects are much easier to handle and help you in a natural way to develop strategic milestones, thus keeping your timelines under control. Furthermore, they permit you to develop working prototypes quickly that you can use to show your sponsors how work is proceeding.

Design of Directory Services

Do you remember Chapter 3, where we spoke about the models standing behind LDAP? Directory services implement these models. It is the design process that breathes reality into the concepts contained in the models. Consequently, you will find the same conceptual structure that we encountered in the models presented in Chapter 3.

Exhibit 3 puts together the activities necessary for the design of a directory. The design steps are grouped together in their corresponding model context. Exhibit 3 proposes a way to proceed in the design process, as indicated by the large gray lines. Every design step is based on the result of the previous one. Note, however, that the individual design activities are not independent of each other. For example, the security design, despite being based on the outcome of the tree design, itself has an impact on the tree design. The dotted lines indicate dependencies between the individual phases in the design process.

The following subsections briefly treat each of the various design steps in the directory design process illustrated in Exhibit 3.

Data Design

This is the most important step in the design process because it deals with what the directory is all about: the data. In this step, you decide which data you will put into the directory. When designing the directory data, you will analyze a set of information about the data. The following points need to be clarified:

■  Who uses the data

■  Who produces the data

■  Who “owns” the data

■  Who should have access to the data

■  The type of the data

Schema Design

The schema defines the rules to obey when putting data into the directory. It defines which data can be put into the directory and the format of that data. It also defines how comparisons should be executed in the case of queries. Take, for example, the case of phone numbers. Note that 1-800-2939-2636 and 1 800 2939 2636 are the same phone number in two different notations. The schema defines what a phone number looks like and when two phone numbers have to be considered equal. In this example, the schema would define that a phone number can be divided into smaller pieces and that these pieces can be separated by hyphens or white spaces. It would furthermore specify that hyphens and spaces are ignored when comparing values.
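To make the comparison rule concrete, here is a minimal Python sketch of the idea. The function names are hypothetical, and in a real installation the directory server applies the matching rule itself (in LDAP, the standard telephoneNumberMatch rule behaves along these lines), so this code only mimics the behavior on the client side.

```python
def normalize_phone(value: str) -> str:
    """Drop the spaces and hyphens that the comparison treats as insignificant."""
    return "".join(ch for ch in value if ch not in " -")


def phones_match(a: str, b: str) -> bool:
    """Two phone numbers are equal if they agree once separators are removed."""
    return normalize_phone(a) == normalize_phone(b)
```

With this rule, phones_match("1-800-2939-2636", "1 800 2939 2636") is true, because both notations reduce to the same string of digits.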

Tree Design

In this step, you lay down the logical structure of your namespace. A directory tree helps in managing large amounts of data by dividing it into smaller units that can be connected to each other in a hierarchical structure. This type of structure helps in locating data in the directory and improves performance during search functions. The decisions you make in tree design will have an enormous impact on the maintenance of the directory. The next two steps, partitioning design and replication design, can benefit from a good tree design.

Partitioning Design

Directories often hold a lot of data and, consequently, may be subjected to a heavy load of user requests. You might consider dividing the directory into smaller, more manageable pieces, with each located on a server of its own. This type of design also helps bring the data closer to the consumer. For example, assume that you have two departments, one in the United States and one in Japan. Each department has its own records. Partitioning the data would allow you to put the Japanese data onto the LAN where the Japanese users are and the U.S. data onto the U.S. LAN. Both databases would be closer to the data consumer. Furthermore, this would give you the flexibility to impose different policies for data maintenance. The requirement could also allow the use of different administrative structures or different security standards for the different partitions.

Replication Design

The same considerations that might lead you to consider partitioning as a design strategy also apply to replication. Indeed, replication can be an alternative to partitioning. Replication addresses the availability of the directory in much the same way as partitioning. The objective could be to get the directory closer to the data consumer, but it could also be to increase throughput on heavily loaded machines. Replication also has the added value of keeping the directory available in the event of software or hardware failures. Replication provides a redundant data design and consequently requires mechanisms to keep the duplicated data in sync.

Security Design

Security design covers several issues. The first is to ensure that every user can do exactly what he or she is supposed to do. That means that you need to know exactly which actions each user may perform on what data. The next is to protect the conversation between client and server. If you set up replication or chaining, i.e., architectures that require conversations between servers, you have to protect these conversations as well. You must also think about the physical security of the LDAP data and configuration files. Therefore, you need to know exactly who has access to the server on which you are running your installation.

Let us now look at the various design activities in a bit more detail. As you have already noticed, these design activities are by no means independent of each other. Indeed, they are strongly interconnected. We will see these dependencies in the following sections.

As explained at the beginning of this chapter, there are a number of documents that fully describe additional procedures. At the end of this chapter, you will find a list of documents where you can find more information.

Data Design

In this step, you will reap the benefit of the work you did in the planning phase. This is where you put to use the information you have painstakingly gathered.

What you have now is a list of data that you will put into your directory. The data has one or more data sources and one or more applications using the data. Because the most important job of the directory is to service applications, it is important to know which applications need what data and where the data comes from. A useful approach is to produce a matrix for each application. The matrix should hold the data name, the data type, the information source, and the data source that has to be considered authoritative. The data type is meant as a human-language description — such as password, e-mail, phone number — and not a C-style type declaration such as int, char, etc. Exhibit 4 shows an example of such a matrix.

Image

Exhibit 4. Matrix for Data-Design Process
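As a rough illustration of such a matrix, the following Python sketch stores one row per data item and checks whether different applications name different authoritative sources for the same data. The class name, the function name, and the sample applications and sources are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """One row of the per-application matrix."""
    name: str                  # data name, e.g. "phone"
    data_type: str             # human-language description, not a C-style type
    information_source: str    # where the application gets the data today
    authoritative_source: str  # the source to trust when sources disagree


# Hypothetical matrices for two hypothetical applications.
EXAMPLE_MATRICES = {
    "phone book": [
        DataItem("name", "person name", "HR database", "HR database"),
        DataItem("phone", "phone number", "PBX export", "PBX export"),
    ],
    "help desk": [
        DataItem("phone", "phone number", "HR database", "HR database"),
    ],
}


def authority_conflicts(matrices: dict[str, list[DataItem]]) -> dict[str, set[str]]:
    """Return the data names for which applications name different authorities."""
    seen: dict[str, set[str]] = {}
    for rows in matrices.values():
        for row in rows:
            seen.setdefault(row.name, set()).add(row.authoritative_source)
    return {name: sources for name, sources in seen.items() if len(sources) > 1}
```

Running authority_conflicts on the example matrices reports "phone", because the two applications disagree about which source is authoritative. Such conflicts are best resolved with the data owners before the schema is designed.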

A mapping between the application and application owner as shown in Exhibit 5 is also very useful. You will then know who owns the data and who owns the application using this data. Both are partners who should be involved in the data design process.

Image

Exhibit 5. Mapping between Applications and Application Owner

Once you have documented all applications in this way, you can put this information together in another matrix describing what data the directory should hold. In this matrix, you will list the data name and data type along with the following information:

■  Applications that put the data in

■  Owner of the data, i.e., the entity that is responsible for maintaining the data

■  Whether the data is publicly readable

■  Whether a human user can modify personal data (e.g., password, “yes”; salary, “no”)

■  Who can update the data
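The consolidated matrix can be sketched in the same way. The Python fragment below is an assumption-laden illustration: the field names, the sample owners, and the rule that the data owner may always update the data are choices made for the example, not requirements of LDAP.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryDataItem:
    """One row of the consolidated matrix of data the directory should hold."""
    name: str
    data_type: str
    writing_apps: set[str]     # applications that put the data in
    owner: str                 # entity responsible for maintaining the data
    public_read: bool          # whether the data is publicly readable
    self_modifiable: bool      # whether users may modify their own value
    updaters: set[str] = field(default_factory=set)  # who else can update


def may_update(item: DirectoryDataItem, actor: str,
               acting_on_own_entry: bool = False) -> bool:
    """Decide whether an actor may change the data (assumed policy, see above)."""
    if acting_on_own_entry and item.self_modifiable:
        return True
    return actor == item.owner or actor in item.updaters


# Hypothetical rows mirroring the password and salary examples in the list.
PASSWORD = DirectoryDataItem("password", "password", {"login portal"},
                             "IT security", False, True, {"help desk"})
SALARY = DirectoryDataItem("salary", "salary", {"payroll"},
                           "HR", False, False, {"payroll"})
```

An employee may change his or her own password but not the salary entry, which only the owner and the listed updaters can modify. A table of this kind later feeds directly into the security design.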

Schema Design

Recall from Chapter 3 our discussion of the concept of a “directory schema” in the information model. The directory schema is a set of rules describing what can be stored in a directory and how information should be treated. This is achieved by the definition of attribute types and object classes. The attribute-type definitions explain how the data values are represented and how comparisons are made. An object class is a set of attributes that frequently corresponds to a real-life object. The object class further describes what attribute types are required and what attribute types are optional. It also gives a tool for retrieving a subset of an object class.
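The distinction between required and optional attribute types can be illustrated with the standard person object class, which requires sn and cn and allows userPassword, telephoneNumber, seeAlso, and description. The following Python sketch checks an entry against such a definition. The dictionary layout and the function name are assumptions, and the sketch compares attribute names exactly, whereas a real server treats them as case-insensitive and performs this check itself.

```python
# The standard "person" object class: sn and cn are required, the rest optional.
PERSON = {
    "must": {"sn", "cn"},
    "may": {"userPassword", "telephoneNumber", "seeAlso", "description"},
}


def check_entry(attributes: dict, object_class: dict) -> list[str]:
    """Return the schema violations of an entry; an empty list means it conforms."""
    present = set(attributes)
    problems = [f"missing required attribute: {name}"
                for name in sorted(object_class["must"] - present)]
    allowed = object_class["must"] | object_class["may"]
    problems += [f"attribute not allowed: {name}"
                 for name in sorted(present - allowed)]
    return problems
```

An entry holding only cn and mail would be rejected twice over: sn is missing, and mail is not part of person (it would require a class such as inetOrgPerson).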

Schema design has two goals: The first is the mapping of the data defined in the previous step (data design) into attributes. The second is to put the attributes together into object classes.

One possibility is to use an existing schema. If there is no existing schema to fit your requirements, you can try to extend an existing schema. The last possibility is to create a new schema for the application at hand. If you write a new schema or extend an existing schema, you should document what you have done and consider publishing the extensions you made.

You can get existing schemas from a number of sources. First, examine existing standards. You can get a standard schema from the IETF in the form of RFC 2256, “A Summary of the X.500(96) User Schema for Use with LDAPv3.” Vendor-supplied schemas are another possible resource. Nearly every LDAP implementation is shipped with a number of standard schemas. For example, OpenLDAP gives you a number of schemas that you will find in a configuration directory called “schema.” You should check to see if you can use one of these existing schemas. Your directory application also may be delivered with a schema of its own. Whatever the source, try to use a standard schema.

If your needs do not fit into an existing schema, you should consider modifying and extending one that comes close to meeting your needs. It is not recommended to modify existing object classes, because this could cause inconsistencies with your directory application and directory clients. You should instead subclass an existing class. For details about the process of subclassing, see Chapter 3, where the information model is explained.

The last possibility is to create your own schema. While the best approach is to use a standard schema or to extend an existing schema, there are situations where you really do need your own schema. However, be aware that you risk losing compatibility between your LDAP application and every other LDAP server. Your LDAP clients must also know about your proprietary schema.

Here are some considerations that you should take into account if you want to define new attribute types or new object classes:

■  Use a consistent naming scheme for all extensions. This helps future application programmers who may want to use your schema.

■  Prefix your new object classes or attribute-type names with your project name or organization name. Because the LDAP namespace is flat, this avoids name collisions.

■  Try to use meaningful names without making them too long. This avoids a lot of unnecessary typing and, even more importantly, typos.

■  Get official object identifiers for your new objects. You can learn more about this at http://www.alvestrand.no.
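Some of these naming considerations can be checked mechanically. The following Python sketch tests a proposed attribute-type or object-class name against a prefix and a length limit. The prefix "acme" used in the usage note and the limit of 32 characters are assumptions chosen for the example; the character rule follows the LDAP convention that a name starts with a letter and continues with letters, digits, or hyphens.

```python
import re

# An LDAP name (descriptor): a letter followed by letters, digits, or hyphens.
_DESCRIPTOR = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def check_schema_name(name: str, prefix: str, max_length: int = 32) -> list[str]:
    """Return the naming problems of a proposed schema element name."""
    problems = []
    if not _DESCRIPTOR.fullmatch(name):
        problems.append("not a valid LDAP descriptor")
    if not name.startswith(prefix):
        problems.append(f"does not start with prefix '{prefix}'")
    if len(name) > max_length:  # assumed house limit, not an LDAP restriction
        problems.append(f"longer than {max_length} characters")
    return problems
```

For a hypothetical organization using the prefix "acme", the name acmeBadgeNumber passes, while badgeNumber lacks the prefix and acme_badge contains an underscore, which is not allowed.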

Tree Design

In the previous step (schema design), you decided how to build object classes from attributes to map your data into the directory. In this step, you must structure your object classes to build up the directory information tree (DIT). The DIT defines the namespace where all the objects of your directory reside. The definition of the namespace helps the directory server to decide whether a client can get information from this directory or whether the client should contact another directory server.

In designing the tree, you have to make the following decisions:

■  Choose a suffix as a root for the DIT

■  Choose a branching policy

■  Choose a naming policy

We will look at each of these steps in detail, but first let us consider the importance of the namespace. The design of the DIT strongly influences the following points:

■  Data organization: A good tree layout facilitates organization of the data, helps your users to browse the tree, and returns queries faster.

■  Replication, partitioning, and access control: We will have to decide about replication and partitioning a bit later. However, we lay a foundation for these future decisions when we design the directory tree. In both data replication and data partitioning, the boundaries between the individual parts of the directory can only be at certain points within the namespace. The same is true for access control. It is easier to control access to a completely separated subtree than to have a tree where confidential data is mixed with public data. Furthermore, you may not want a user or an application even to see that there is additional private data available.

■  Maintenance: Changes in the tree are not infrequent. A well-organized tree will facilitate later changes.

■  Application support: The DIT must also support the directory applications using the directory.

Choosing a Root for the Directory Information Tree

The first step in designing the tree is to choose a root for the DIT. This root is also called a “directory suffix” or a “root distinguished name” (root DN). In most directory server implementations, it is possible to define more than one root DN in one directory server. During the installation, some implementations of directory servers configure more than one root DN. For example, the i-Planet directory server uses one root DN for the directory and one root DN for the administration of the directory server itself.

There are three different naming conventions you can use:

1.  The domainComponent attribute, where you split your domain name into its individual components:

dc = bmw, dc = de

2.  The domain name assigned to the enterprise:

o = bmw.de

3.  The traditional X.500 naming conventions:

o = bmw, c = de
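As a minimal sketch, the three conventions can be derived mechanically from the same enterprise data. The function name `root_dn_candidates` and its parameters are assumptions made for this illustration; they are not part of any directory product.

```python
def root_dn_candidates(domain: str, org: str, country: str) -> dict:
    """Return the three root DN spellings discussed above for one enterprise.

    Hypothetical helper for illustration only.
    """
    # Split the DNS name into its labels for the domainComponent form.
    labels = [label for label in domain.lower().split(".") if label]
    return {
        "domainComponent": ",".join(f"dc={label}" for label in labels),
        "domain name": f"o={domain.lower()}",
        "X.500": f"o={org.lower()},c={country.lower()}",
    }


candidates = root_dn_candidates("bmw.de", org="bmw", country="de")
for convention, dn in candidates.items():
    print(f"{convention}: {dn}")
```

Whichever form you choose, generating it in one place keeps the spelling consistent across configuration files and applications.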

The choice of a naming convention depends on your actual situation. You may be constrained in your choice if there are already directory-enabled applications using one of these possibilities. Choosing the last one is normally a good idea because it guarantees compatibility with the X.500 standard. LDAP works with any of the three naming conventions. However, regardless of the naming convention you choose, there are things that you should bear in mind when choosing a root DN:

■  The root DN has to be unique within the range of its visibility. If you are working inside the intranet of an enterprise, the name must be unique within the companywide intranet. If the directory services are meant for use over the Internet, the name must be unique throughout the worldwide Internet.

■  Once chosen, the root DN should not be changed. If the root DN changes, you have to change your whole directory tree and all applications using the directory services. In a large organization, this could take months.

■  The root DN should not be too long, and it should be easy to remember. Your users and your application engineers will be happy, and the applications will be less error prone.

When the directory server receives a query from a client, the server decides whether it is competent to answer the query. A well-designed DIT facilitates this decision. Now all the server has to do is take the DN from the request and see whether it exists within one of the namespaces defined by the root DNs. If it is beneath one of the root DNs, then the directory server knows that it holds information useful for the query. If it does not, the directory server’s response depends on its configuration. It could direct the client to another server, or it could simply answer with an error code.
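The check described above can be sketched as a component-wise suffix comparison. This is an illustration under simplifying assumptions, not the algorithm of any particular server: the DN parser ignores escaped commas and multivalued RDNs, and the second suffix, `o=admin`, is a hypothetical administration root.

```python
def parse_dn(dn):
    """Split a DN into normalized (attribute, value) pairs.

    Simplification: escaped commas and multivalued RDNs are not handled.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        pairs.append((attr.strip().lower(), value.strip().lower()))
    return pairs


def find_suffix(request_dn, served_suffixes):
    """Return the served suffix that contains request_dn, or None."""
    target = parse_dn(request_dn)
    for suffix in served_suffixes:
        tail = parse_dn(suffix)
        # Compare whole RDNs from the right, never raw substrings.
        if len(tail) <= len(target) and target[len(target) - len(tail):] == tail:
            return suffix
    return None


# Hypothetical server configuration with two root DNs.
SUFFIXES = ["o=bmw,c=de", "o=admin"]

print(find_suffix("uid=jdoe, ou=people, o=bmw, c=de", SUFFIXES))
print(find_suffix("uid=jdoe, o=other, c=us", SUFFIXES))
```

Comparing parsed components rather than raw strings matters: a plain string test would wrongly accept `o=xbmw,c=de` as lying beneath `o=bmw,c=de`. When the function returns `None`, the server would fall back to its configured behavior, a referral or an error code.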

Branching the Directory Tree

The question is, should you use a flat namespace or a hierarchical schema? It is a good idea to keep your tree as flat as possible, such as the completely flat schema depicted in Exhibit 17 in Chapter 3. This namespace design has several advantages. First of all, it keeps things simple. A complicated hierarchy is more difficult to understand and is consequently more difficult to use and to maintain than a flat configuration. The flatter the directory tree, the easier it is for application designers to use it. Another reason to keep the hierarchy flat is that a flat namespace reduces the probability that the tree will undergo changes later on. A flat namespace also tends to have shorter distinguished names (DNs), because each level of the hierarchy adds one more component to the DN of every entry beneath it. The shorter the DNs, the easier they are to remember.

So why would anyone ever use a hierarchical schema? There are several reasons, and we will encounter some of these arguments again in the later sections of this chapter.

■  Administrative purposes: If your enterprise is spread among several countries, then data maintenance, data backup, and statistics can be done locally. The local competence can impose branching decisions, as shown in Exhibit 18 in Chapter 3.

■  Partitioning decisions: When the amount of data is so high that you wish to distribute your data over more machines to improve performance, you need an opportune branch point to do so.

■  Replication decisions: You maintain your data centrally, but you want the option of replicating some of the data to local applications where the data is needed. The reason may be slow communication lines or dial-up connections.

■  Access control and security questions: Security issues are different for different types of data. Some data is appropriate for public consumption, and some data is proprietary and should be visible only to a defined group of people. You can differentiate security among data classes by branching up your tree.

The most important consideration is to design the tree so that there will be as few changes as possible. So avoid designing a tree based upon your actual organigram, if possible. Organigrams are subject to frequent changes, and after each reorganization you will have to redesign your namespace and change all applications accessing your directory server.
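A small sketch may make the point concrete. It builds the DN of the same person under two layouts: a flat one, where the department is only an attribute of the entry, and one that mirrors the organigram. The suffix, the `ou=people` container, and the `uid` naming attribute are assumptions chosen for the example.

```python
SUFFIX = "o=bmw,c=de"  # example suffix taken from the naming conventions above


def flat_dn(person):
    """DN in a flat layout: the department is stored only as an attribute."""
    return f"uid={person['uid']},ou=people,{SUFFIX}"


def org_chart_dn(person):
    """DN in a layout that mirrors the organigram."""
    return f"uid={person['uid']},ou={person['department']},{SUFFIX}"


before = {"uid": "jdoe", "department": "sales"}
after = {**before, "department": "marketing"}  # the person is reorganized

flat_changed = flat_dn(before) != flat_dn(after)
org_chart_changed = org_chart_dn(before) != org_chart_dn(after)

print("flat layout, DN changed:", flat_changed)
print("organigram layout, DN changed:", org_chart_changed)
```

In the flat layout the reorganization is an attribute update on one entry. In the organigram layout the entry has to be moved, and every application that stored the old DN has to be updated.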

Partitioning

The previous section explained how to put the entries together to build the directory tree. In the planning phase, you may only have a rough idea of how many of these entries your directory is going to hold, and this number can be considerable. If it is too high, performance will suffer. The remedy is to distribute the data on more than one directory server, and that is what this section is about.

As mentioned previously, partitioning design is closely related to replication design. Furthermore, both partitioning and replication can modify the decisions you made in designing the directory tree. The arguments for partitioning the directory and for replicating it are nearly the same. Another point related to partitioning and replication is physical design, but this is beyond the scope of this chapter.

As we have seen in Chapter 5, partitioning is the division of your directory into several parts, while replication copies a part or all of the directory onto one or more other servers. These parts can be on the same machine or on different computers on your network. Partitioning of the directory has several goals:

■  Scalability: Directories can contain a few thousand entries or several million. The ability to distribute the entries over several servers provides a greater degree of scalability.

■  Load balancing: Distributing the entries over several servers prevents overloads on the server as well as the network.

■  Local management: Moving the entries to the local networks where the entries are maintained reduces network traffic and enhances performance of the local directory.

Partitioning of the directory is called for when:

■  Number of entries is too high

■  Network traffic to the directory is too high

■  Not all of the data is equally used

■  Some line segments become overloaded

Number of Entries Is Too High

If the database gets too big to be held within a single partition, you should divide it. You may wonder, “What is ‘too big’?” First you have the computer, where the disk space is one limiting factor. Another limiting factor is the directory server. See the documentation delivered with your software about the upper limit of entries your server can hold. Another limitation is backup time. When it takes ten hours to back up your system and only 1 kB of data has changed, it may be worthwhile to consider partitioning the database.

Network Traffic to the Directory Is Too High

Even if both the hardware and the server can handle the amount of data, the traffic caused by the directory can penalize the part of the network where your directory server is installed. Depending on the access frequency of the individual parts, dividing the data over two subnetworks could help to increase network performance.

Not All of the Data Is Equally Used

Your directory may contain data that is used frequently by one application and infrequently by another. Partitioning of the directory may enable you to put the frequently used data on a directory server that is closer to the application that uses the data. Backup time is another consideration. Separating static data from data that is updated frequently could save you backup time, because you will reduce the frequency of backups for the nearly static data.

Some Line Segments Become Overloaded

If your company has a high-speed network distributed throughout the enterprise, overloaded line segments will not be an issue for you.

However, not everyone is so lucky. Consider the case of an international enterprise having offices in the United States, the United Kingdom, and Germany, with each office maintaining its personnel records in a central directory. Assume that the connection between two of these nodes is not a high-speed connection or, even worse, is a dial-up connection available only a few hours a day. In such cases, it may be convenient to partition the directory and create a separate directory for each office.

Partitioning and Namespace

As mentioned earlier, partitioning design and tree design are not independent design phases. You can partition the directory only at branch points, which means that you can only move entire subtrees to new partitions from the main directory. See Exhibit 6 for an example of valid partitions. The accounting subtree and the human resources subtree are split up from the original partition, and two new partitions are created. Exhibit 7 instead is an example of invalid partitioning. You cannot put the accounting branch point in a separate partition because “accounting” no longer contains the subtree people and, therefore, there is a hole between people and ldap_abc.
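The rule that a partition is always a whole subtree can be sketched in a few lines: each entry belongs to the deepest partition root above it. The DNs below are hypothetical stand-ins for the ldap_abc tree of the exhibits, and the DN handling is simplified (no escaped commas).

```python
def rdns(dn):
    """Split a DN into normalized RDN strings (escaped commas not handled)."""
    return [part.strip().lower() for part in dn.split(",")]


def is_under(dn, root):
    """True if dn is the root itself or lies beneath it."""
    d, r = rdns(dn), rdns(root)
    return len(r) <= len(d) and d[len(d) - len(r):] == r


def assign_partitions(entry_dns, partition_roots):
    """Map each entry to the deepest partition root above it."""
    assignment = {}
    for dn in entry_dns:
        matching = [root for root in partition_roots if is_under(dn, root)]
        if not matching:
            raise ValueError(f"{dn} is outside every partition")
        assignment[dn] = max(matching, key=lambda root: len(rdns(root)))
    return assignment


# Hypothetical DNs modeled on the ldap_abc example.
ROOTS = ["o=ldap_abc", "ou=accounting,o=ldap_abc", "ou=hr,o=ldap_abc"]
ENTRIES = [
    "ou=accounting,o=ldap_abc",
    "uid=anna,ou=people,ou=accounting,o=ldap_abc",
    "uid=ben,ou=people,ou=hr,o=ldap_abc",
    "ou=marketing,o=ldap_abc",
]

partitions = assign_partitions(ENTRIES, ROOTS)
for dn, root in partitions.items():
    print(f"{dn} -> {root}")
```

Because an entry is placed by the branch point above it, the accounting entry and the people subtree beneath it always land in the same partition, which rules out the kind of hole described for the invalid case.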

Replication

Like partitioning, replication design can affect the design of the directory tree. So after deciding on a replication strategy, you may need to rethink some of the decisions that you made when you first designed the directory tree. You may also need to rethink some decisions you made in partitioning. However, a word of warning: Currently there is no standard regarding replication. In all of the design phases, it is recommended that you consult the documentation shipped with the product you are using. In this step, it is absolutely required. This includes verifying that the software actually supports the features you need, e.g., multimaster replication, scheduling of replications, etc.

Exhibit 6. Organization of ldap_abc with Two Partitions, One for Human Resources and One for Accounting

Exhibit 7. Example of an Invalid Partition

The goals of replication are:

■  High availability: Replication of the directory increases the availability of your directory services.

■  High performance: Replication places directory services as close as possible to your clients, thus enhancing performance.

■  Load balancing: Distributing the entries over multiple servers prevents overloading of the server as well as the network.

■  Local management: Moving the entries to the local networks where the entries are maintained reduces network traffic and enhances performance of the local directory.

Replication of the directory is called for when:

■  A request is made for its partitioning

■  Network traffic to the directory is too high

■  Some line segments become overloaded

■  Network would benefit from a backup system to guarantee availability of the directory in the event of hardware/software failures

Network Traffic to the Directory Is Too High

Even if both the hardware and the server can handle the amount of data, the traffic caused by the directory can penalize the part of the network where your directory server is installed. Depending on the access frequency of the individual parts, placing replicas of the data in two subnetworks could help to increase network performance.

Some Line Segments Become Overloaded

If your company has a high-speed network distributed throughout the enterprise, overloaded line segments will not be an issue for you. However, not everyone is so lucky. Consider the case of an international enterprise having offices in the United States, the United Kingdom, and Germany, with each office maintaining its personnel records in a central directory. Assume that the connection between two of these nodes is not a high-speed connection or, even worse, is a dial-up connection available only a few hours a day. In such cases, it may be convenient to replicate the directory and create a separate directory for each office.

Replication and Namespace

Similar to partitioning, replication design and tree design are not independent design phases. Another similarity is that you can replicate the directory only at branch points. This means that you can only copy entire subtrees from the main directory to new replicas. See the previous section on partitioning for more details. Note that there is a lack of standards in replication, so you have to look at your server’s documentation for the unit of replication. For example, the SUN ONE directory server does not allow you to replicate a subtree because the smallest unit of replication is the database.

Security Design

Security design is an important issue in an Internet environment, but it is also important in an intranet environment. At first glance, the reason may not be obvious. However, when you store data about persons, you assume a legal responsibility to protect that data.

Attacks against your data stores are not limited to those that come from outside your enterprise. They can also be initiated from the inside. Such attacks are not necessarily the work of malicious hackers. Data can also be compromised by careless users who do not have much concern about security issues. Imagine an employee who, instead of querying the directory for just the data needed for the task at hand, prints out everything, reads the interesting data, and throws away the output. If this printed output is recycled without being shredded, you can imagine the potential security issues.

Even if the security aspects are sufficient for the intranet, at some point you may decide to make the directory available on the extranet also. If you already have a good security design in place, you will not have to change much if you decide to make the directory visible from the outside.

When designing the security aspects of the directory, you need to decide on a strategy to protect the data stored in the directory. To do this, you need to know who can access the data and how. Again, in this step you will base your work on preceding steps. You have to know who owns the data, who maintains the data, who should be authorized to update the data, and who should be authorized to consult the directory.

Security design covers three areas:

■  Authentication: Verifies that the user is the person she claims to be. This could range from simple authentication to complicated procedures involving the use of certificates.

■  Authorization: Ensures that the authenticated user can access only the data she is authorized to access.

■  Protection of the data: Guarantees that data traveling on the network cannot be read or modified. The level of security depends on your particular requirements. You can let all of the traffic on your network travel in the clear, or you can encrypt it.

Authentication

As mentioned previously, authentication is the activity that verifies who is trying to contact you. To decide which authentication scheme to put in place, you have to understand which type of data travels on the line. Therefore, you should take the information you produced in data design and add the sensitivity of the data, for example as a further matrix that maps each data element to its sensitivity. You could then use security levels, such as those listed below, in the authentication scheme for your directory server implementation. A frequent classification of levels is as follows:

■  No security: data in the public domain, accessible without credentials

■  Data not in the public domain that requires user credentials for access (intended as DN and password)

■  Sensitive data that requires user credentials plus encrypted communication

■  Highly sensitive data that requires user credentials, encrypted communication, plus use of certificates
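The four levels can be sketched as an ordered classification in which each level adds one requirement to the level below it. The class and field names are assumptions made for this illustration; a real directory server expresses these requirements through its own configuration mechanism.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical names for the four levels listed above."""
    PUBLIC = 0            # no credentials required
    INTERNAL = 1          # DN and password
    SENSITIVE = 2         # credentials plus encrypted communication
    HIGHLY_SENSITIVE = 3  # credentials, encryption, and a certificate


@dataclass(frozen=True)
class Connection:
    """What is known about a client connection (illustrative fields)."""
    has_credentials: bool = False
    is_encrypted: bool = False
    has_certificate: bool = False


def may_read(level, conn):
    """Each level adds one requirement to those of the level below it."""
    if level >= Sensitivity.INTERNAL and not conn.has_credentials:
        return False
    if level >= Sensitivity.SENSITIVE and not conn.is_encrypted:
        return False
    if level >= Sensitivity.HIGHLY_SENSITIVE and not conn.has_certificate:
        return False
    return True


anonymous = Connection()
password_only = Connection(has_credentials=True)
fully_secured = Connection(has_credentials=True, is_encrypted=True,
                           has_certificate=True)

print(may_read(Sensitivity.PUBLIC, anonymous))
print(may_read(Sensitivity.SENSITIVE, password_only))
print(may_read(Sensitivity.HIGHLY_SENSITIVE, fully_secured))
```

Mapping every data element of your data design to one of these levels gives you the sensitivity matrix in a form that can be checked mechanically.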

In this design phase, the implementation of the directory server you are using again comes into play. And again it may be that in this phase, you will discover that you cannot work with only one directory server. You may need to put some of the more sensitive data in a special partition, because your directory server cannot handle different security mechanisms on the same partition. For example, the implementation you are using may not allow you to use certificates for a subtree of your directory server only. Therefore, you have to put this subtree on a partition of its own. In this phase you may also be obliged to review the tree design. Otherwise, you could end up with an illegal partition.

Authorization

This is the step where you design who can do what with which data. Again you will pull out the matrix you planned at the very first step of data design. You could create a new matrix at this point that contains the data plus the corresponding access rights. At the beginning of this design phase, be as restrictive as possible. If you then find that the restrictions block some functionality that the directory server has to offer, you can relax them a little. Once you have documented your results, you should map your access matrix onto the directory server’s configuration mechanism.
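One way to write down such a matrix is as a table keyed by data element and role, with everything that is not listed denied. The attribute and role names below are hypothetical, and the sketch is a planning aid only; it is not the access control syntax of any directory server.

```python
# Hypothetical access matrix: (attribute, role) -> permitted actions.
# Anything that is not listed is denied, which is the restrictive default.
ACCESS_MATRIX = {
    ("telephoneNumber", "anyone"): {"read"},
    ("telephoneNumber", "self"): {"read", "write"},
    ("userPassword", "self"): {"write"},
    ("salary", "hr-admins"): {"read", "write"},
}


def is_allowed(attribute, role, action):
    """Deny by default; allow only what the matrix grants explicitly."""
    return action in ACCESS_MATRIX.get((attribute, role), set())


print(is_allowed("telephoneNumber", "anyone", "read"))
print(is_allowed("salary", "anyone", "read"))
print(is_allowed("userPassword", "self", "read"))
```

Starting from an empty matrix and adding rights only when an application demonstrably needs them follows the advice above to begin as restrictively as possible.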

Protection of the Data

Until now we have protected the conversation between client and server and configured who can do what, and yet the most important piece may still be unprotected: the physical files in the repository and the configuration files. All of these files have to be protected against unauthorized access. People frequently put complicated security mechanisms in place to protect the conversation between client and server but forget to protect physical access to the server and to its configuration files. To understand your requirements, you have to know who besides the administrators of the directory (if anyone) has access to the server the directory is installed on. In any case, it is good practice to configure a special user as the owner of the directory server, including the data and configuration files. Do not give any other user or group access rights to these files unless it is absolutely required. And again, document this step; the documentation will help you later to understand and remember why you made these decisions.
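On a POSIX system, the ownership rule can be verified with a short script. The following is a sketch under stated assumptions: it relies on numeric user IDs and permission bits as reported by `os.stat`, it does not look at access control lists, and the function name is hypothetical.

```python
import os
import stat


def permission_problems(path, expected_uid):
    """Return a list of ownership and permission problems for one file.

    POSIX-only sketch: it checks the owner and the group/other permission
    bits and ignores access control lists.
    """
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"owned by uid {info.st_uid}, expected {expected_uid}")
    # Any read, write, or execute bit for group or others is reported.
    if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append("group or other users have access")
    return problems
```

You would call the function for every data and configuration file of the directory server, passing the numeric ID of the dedicated owner, and treat any non-empty result as a finding to fix or to document.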

Conclusion

In this chapter you should have learned how to proceed in setting up a new directory server implementation. We did not discuss installation and configuration questions, but we did discuss the actions that precede these steps. This chapter, obviously, is far from complete because there is only so much space in this book. However, as promised at the beginning of this chapter, I will now show you where to research and find more details about directory server design.

First, I would recommend that you look at the Internet Web sites of the major players in the directory server arena. Two of them are Sun Microsystems and Novell. Sun delivers excellent documentation with its SUN ONE directory server. Following are a few publications of the SUN ONE server:

■  “Getting Started Guide”

■  “Deployment Guide”

■  “Installation and Tuning Guide”

■  “Administration Guide”

You can find these publications by going to http://www.sun.com, clicking on documentation, software products, SUN ONE. You can download these publications at no cost. Even if you do not use the SUN ONE server, you will learn much from them.

At the Novell site (http://www.novell.com), the product page (eDirectory) offers a number of good white papers.

Another link I would recommend that you look at is: http://www.kingsmountain.com/ldapRoadmap.shtml. Here you will get a number of useful and updated links to interesting resources.

The book Understanding and Deploying Directory Services is dedicated nearly entirely to the design of directory services and contains a number of case studies. (Tim Howes, Mark Smith, and Gordon Good, Understanding and Deploying Directory Services, Macmillan Network Architecture and Development Series).

RFCs are also a very good source of information. They are, however, not easy to read for newcomers. The RFCs worth mentioning are:

■  RFC1562: Naming Guidelines for the AARNet X.500 Directory Service

■  RFC1617: Naming and Structuring Guidelines for X.500 Directory Pilots

■  RFC1943: Building an X.500 Directory Service in the U.S.
