DISTRIBUTED DATA PROCESSING (STUDY OBJECTIVE 8)

Many small companies house all of their operations in a single building. For these companies there is usually no need to consider the physical location of their database; a company with a single building would simply store its data on a computer within that building. However, most midsized or large organizations have multiple locations, sometimes spread throughout the world. These organizations must decide where their data should be physically stored and at which locations the data should be processed. For a fast-food franchise like McDonald's Corp., for example, management could decide to maintain one database of prices for the food products that it sells. Should that price data be kept in one location, with all restaurant computer systems accessing that one database, or should prices be stored in regions or localities so that each location can charge different prices? This is only one example of the physical data storage problems facing large organizations. The location of data storage and the location of data processing can have a tremendous impact on the efficiency and effectiveness of the company.

THE REAL WORLD

McDonald's has restaurants, warehouses, and offices located throughout the world, yet its corporate headquarters is in Oak Brook, Illinois. If McDonald's management decided that all data, including prices, must be stored in a database at corporate headquarters, what would have to happen when you order a cheeseburger at a McDonald's in Los Angeles? The cash register system would have to read pricing data from the database in Oak Brook, Illinois. This would be inefficient for several reasons. First, every McDonald's restaurant around the world would be trying to read the same database simultaneously in order to fill customer orders. Each restaurant would need to be networked to that database in Illinois and would need to read price data quickly in order to process each sale. This would generate so much network traffic that it would very likely overwhelm the network and computer system. In addition, if prices were stored only at corporate headquarters, it would become more difficult for each location to set its own prices. Certainly, it would be much more efficient for McDonald's to maintain pricing data at the local restaurants or in regional centers.

Like McDonald's, all large organizations must make decisions about the data they maintain—decisions involving where data are physically stored and which locations process various data.

This question of where to store and process data is usually considered as a choice between two general approaches: centralized or distributed. Data can be stored in a central location, or it can be distributed across various locations. Similarly, the processing of data and transactions can occur only in a central location, or it can be distributed across the various locations. In the early days of computing, data were processed, and databases were stored and maintained, in a central location. These approaches are called centralized processing and centralized databases. However, in today's IT environment, most processing and databases are distributed. In distributed data processing (DDP) and distributed databases (DDB), the processing and the databases are dispersed to different locations of the organization. A distributed database is actually a collection of smaller databases dispersed across several computers on a computer network. The data are stored on different computers within the network, and the application programs access data from these different sites.
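As a rough illustration of the idea that a distributed database is a collection of smaller databases, the following Python sketch models each site's database as a simple dictionary. The site names, items, and prices are invented for illustration, and `lookup_price` and `all_prices` are hypothetical helper functions, not part of any real database product.

```python
from __future__ import annotations

# Hypothetical sites and prices, invented for illustration only.
# Each inner dictionary stands in for the smaller database held at one site.
SITE_DATABASES = {
    "los_angeles": {"cheeseburger": 2.49, "fries": 1.99},
    "chicago": {"cheeseburger": 2.29, "fries": 1.89},
}


def lookup_price(site: str, item: str) -> float:
    """Read one item's price from the database held at the given site."""
    local_db = SITE_DATABASES[site]  # only this site's data is touched
    return local_db[item]


def all_prices(item: str) -> dict[str, float]:
    """Collect one item's price from every site in the network."""
    return {site: db[item] for site, db in SITE_DATABASES.items()}
```

Under these assumptions, a cash register in Los Angeles would call `lookup_price("los_angeles", "cheeseburger")` and read only its own site's data, while an application at headquarters could call `all_prices("cheeseburger")` to gather the same item from every site.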

DDP AND DDB

Current IT systems use networks such as LANs and WANs extensively, enabling the easy distribution of processing and databases. Distributing the processing and data offers the following advantages:

  1. Reduced hardware cost. Distributed systems use networks of smaller computers rather than a single mainframe computer. This configuration is much less costly to purchase and maintain.
  2. Improved responsiveness. Access is faster, since data can be located at the site of the greatest demand for that data. Processing speed is improved, since the processing workload is spread over several computers.
  3. Easier incremental growth. As the organization grows or requires additional computing resources, new sites can be added quickly and easily. Adding smaller, networked computers is easier and less costly than adding a new mainframe computer.
  4. Increased user control and user involvement. If data and processing are distributed locally, the local users have more control over the data. This control also allows users to be more involved in the maintenance of the data, and users are therefore more satisfied.
  5. Automatic integrated backup. When data and processing are distributed across several computers, the failure of any single site is not as harmful. Other computers within the network can take on extra processing or data storage to make up for the loss of any single site.
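The fifth advantage, automatic integrated backup, can be sketched in Python as follows. This is a minimal illustration that assumes each site keeps its own copy of the price data, which the discussion above implies but does not spell out. The `Site` class and the `read_price` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One computer in the network holding a copy of the price data."""
    name: str
    prices: dict
    online: bool = True


def read_price(sites, item):
    """Return (site name, price) from the first site that is online.

    `sites` is listed in order of preference, nearest site first.
    """
    for site in sites:
        if site.online:
            return site.name, site.prices[item]
    raise RuntimeError("no site is available to answer the request")
```

If the preferred site is marked offline, the request is answered by the next site in the list, so the loss of a single site does not stop processing. An error is raised only when every site is unavailable.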

However, it is important to recognize that there are also disadvantages to the use of DDP and DDB, namely, increased difficulty in managing, controlling, and maintaining the integrity of the data. A large database that is stored, maintained, and accessed at a central location is much easier to manage and control. To see why this is true, think of a large building with only a single door. Access to items stored in the building can be controlled by placing security at that single door. However, every door that is added to the building affords another opportunity for someone to gain unauthorized access. Therefore, every door represents a point at which security must be enhanced. The same is true of distributed systems, in which several sites within the organization can access the databases. The increased number of sites accessing the data creates a greater need for security and control of the database.

In addition, when data are located at several sites, concurrency control becomes a problem: every copy of the data must be kept consistent whenever a change is made. Think about the McDonald's pricing situation presented earlier in this section. If McDonald's decides to increase the price of cheeseburgers by 10 percent, that pricing change has to be made at every locality that maintains pricing data. Notice that this price change would be much easier to implement if there were only one centralized price database. The price could be changed in this single centralized database, and the price change would immediately be seen by all those who use that database. These disadvantages do not cause organizations to avoid the use of DDP and DDB, but they do require greater attention to security and control issues. Organizations that use DDP and DDB must have stronger controls in place to ensure the security of the data and to maintain concurrency control.
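The price-change example can be sketched in Python to show why distributed copies need extra control. This is only an illustration under stated assumptions: each site's database is modeled as a dictionary with a hypothetical `version` number that records the latest price change it has applied, and the site names, prices, and function names are invented for this example.

```python
from decimal import Decimal, ROUND_HALF_UP


def increase(price, percent):
    """Raise a Decimal price by a whole-number percentage, rounded to cents."""
    raised = price * (100 + percent) / 100
    return raised.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def apply_price_change(site_db, item, percent, new_version):
    """Apply one price change to one site's database and record its version."""
    site_db["prices"][item] = increase(site_db["prices"][item], percent)
    site_db["version"] = new_version


def stale_sites(sites, current_version):
    """Return the names of sites that have not applied the latest change."""
    return sorted(
        name for name, db in sites.items() if db["version"] < current_version
    )


if __name__ == "__main__":
    # Hypothetical sites, each with its own local price for the same item.
    sites = {
        "los_angeles": {"version": 1, "prices": {"cheeseburger": Decimal("2.00")}},
        "chicago": {"version": 1, "prices": {"cheeseburger": Decimal("1.80")}},
    }
    # The 10 percent increase (version 2) reaches only one site at first.
    apply_price_change(sites["los_angeles"], "cheeseburger", 10, 2)
    print(stale_sites(sites, 2))
```

With a single centralized database there would be one copy to update and nothing to check. With distributed copies, a control such as the `stale_sites` check is needed to detect any locality that is still charging the old price.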

There are also management issues that are more difficult to control in DDP and DDB. If local users have more control over the systems, there is a greater chance that local sites will have incompatible hardware, systems, or data. For example, a local site may buy hardware that is incompatible with the larger network system of the organization. Management can lessen these problems by enforcing policies regarding the purchase and use of hardware and software, and through tighter management of the databases.

To database users, the question of how or where data are stored is becoming less important. What matters more is whether the data are easily accessible and whether they can easily be analyzed. As described in Chapter 2, many companies are moving some or all of their databases to cloud storage. Another name for this is Database as a Service, or DaaS. The next section explains cloud databases.
