Understanding your workload
This chapter describes the various workload types that an application can generate. This characterization is useful for understanding performance documents and reports and for categorizing the workloads in your installation.
The information in this chapter is not specific to the IBM System Storage DS8900F. You can apply it generally to other flash or disk storage systems.
This chapter includes the following topics:
General workload types
Database workload
Application workload
Profiling workloads in the design phase
Understanding your workload type
5.1 General workload types
The correct understanding of the existing or planned workload is the key element of the entire planning and sizing process. Understanding the workload means having the description of the workload pattern:
Expected or existing number of IOPS
Size of the I/O requests
Read and write ratio
Random and sequential access ratio
General purpose of the application and the workload
You might also collect the following information:
Expected cache hit ratio
The number of requests that are serviced from cache. This ratio is important for read requests because write requests always go to cache first. If 100 out of 1000 requests are serviced from cache, you have a 10% cache hit ratio. The higher this parameter, the lower the overall response time.
Expected seek ratio
This parameter does not apply to flash drives, which have no disk arms by design.
The percentage of I/O requests for which the disk arm must move from its current location. Moving the disk arm takes much more time than waiting for the disk rotation, because the disk rotates anyway and is fast enough. By not moving the disk arm, a whole track or cylinder can be read, which generally means a large amount of data. This parameter mostly reflects how disk systems worked long ago; today it serves as an indicator of how random a workload is. A random workload shows a value close to 100%.
Expected write efficiency
The write efficiency is a number that represents how many times a block is written to before it is destaged to the flash module or disk. Real applications, especially databases, update information repeatedly: write, read again, change, and write again with the changes. So the data for a single flash module or disk block can be updated several times in cache before it is written to the flash module or disk. A value of 0% means that a destage is assumed for every write operation, which is the characteristic of a pure random small-block write workload pattern and is unlikely. A value of 50% means that a destage occurs after the track is written to twice. A value of 100% is also unlikely because it means that writes go to one track only and are never destaged to disk. The short sketch after this list illustrates the arithmetic for this parameter and for the cache hit ratio.
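As an illustration of the arithmetic behind the cache hit ratio and the write efficiency, the following minimal Python sketch derives both values from hypothetical counters. The counter names and values are made up for the example; real numbers come from your monitoring tools.

# Hypothetical counters that are collected over a monitoring interval
read_requests = 1000        # total read requests
read_cache_hits = 100       # read requests that are serviced from cache
write_requests = 400        # total write requests (all of them go to cache first)
destage_operations = 200    # writes that are destaged to the flash module or disk

# Cache hit ratio: the share of read requests that are serviced from cache
read_hit_ratio = read_cache_hits / read_requests              # 0.10, that is, 10%

# Write efficiency: the share of writes that are absorbed in cache before a destage
write_efficiency = 1 - (destage_operations / write_requests)  # 0.50, that is, 50%

print(f"Read cache hit ratio: {read_hit_ratio:.0%}")
print(f"Write efficiency: {write_efficiency:.0%}")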
In general, you describe the workload in these terms. The following sections cover the details and describe the different workload types.
5.1.1 Typical Online Transaction Processing workload
This workload is characterized by mostly random access of small I/O records (less than or equal to 16 KB) with a mix of 70% reads and 30% writes. It is also characterized by low read-hit ratios in the flash or disk system cache (less than 20%). This workload might be representative of various online applications, for example, the SAP NetWeaver application or many database applications. This type of workload is considered typical because it is the basis for most benchmark and performance tests. However, the following online transaction processing (OLTP) patterns are also widespread:
90% read, 8 - 16 - 32 - 64 KB blocks, 30% sequential, 30 - 40% cache hit
80% read, 16 - 32 - 64 KB blocks, 40 - 50% sequential, 50 - 60% cache hit
A 100% random access workload is rare, which you must remember when you size the disk system.
5.1.2 Microsoft Exchange Server Workload
This type of workload can be similar to an OLTP workload, but it has a different read/write balance. It is characterized by many write operations, up to 60%, with a high percentage of random access. Also, the I/O size can be large, with blocks up to 128 KB, which is explained by the nature of the application: it acts as both a database and a data warehouse.
The Microsoft Jetstress tool can help you with the implementation of storage systems with Microsoft Exchange Server 2019. This tool provides the ability to simulate and verify the performance and stability of storage subsystems before putting them into production environments. Microsoft Exchange documentation can be found at:
See also the Microsoft Exchange Solution Reviewed Program (ESRP) Storage, which is primarily designed for the testing of third-party storage solutions. Refer to the following Microsoft Exchange article:
5.1.3 Sequential Workload
Sequential access is one of the original workload types for data storage. Tape is the best example of sequential access: it transfers several large blocks in one read operation and uses buffers first. Sequential access does not change for flash modules or disks. A sequential workload lends itself to prefetching data into cache because blocks are accessed one after another, so the storage system can read many blocks ahead and reduce the load on the flash modules or disks. Sequential write requests also work well because the storage system can optimize the access to the flash modules or disks and write several tracks or cylinders at a time. Block sizes are typically 256 KB or more, and response time is high, but it does not matter: a sequential workload is about bandwidth, not response time. The following environments and applications are likely sequential workloads:
Backup/restore applications
Database log files
Batch processing
File servers
Web servers
Media streaming applications
Graphical software
5.1.4 Batch Jobs Workload
Batch workloads have several common characteristics:
Mixture of random database access, skip-sequential, pure sequential, and sorting
Large block sizes, up to 128 - 256 KB
High volume of both write and read activity
The same volume extents might be accessed for reads and writes at the same time
Large data transfers and high path utilizations
Batch workloads are often constrained to operate within a particular window of time when online operation is restricted or shut down. Poor or improved performance is often not recognized unless it affects this window.
 
Plan when to run batch jobs: Plan all batch workload activity for the end of the day or for a slower time of day. Normal activity can be negatively affected by the batch activity.
5.1.5 Sort Jobs Workload
Most sorting applications are characterized by large transfers for input, output, and work data sets. DFSORT is IBM's high-performance sort, merge, copy, analysis and reporting product and is an optional feature of z/OS.
For more information about z/OS DFSORT, see the following:
5.1.6 Read-intensive Cache Friendly and Unfriendly Workloads
Use the cache hit ratio to estimate whether a read workload is cache friendly. If the ratio is more than 50%, the workload is cache friendly. For example, if two I/O’s access the same data and one of them is serviced from cache, that is a 50% cache hit ratio.
It is not easy to divide known workload types into cache friendly and cache unfriendly. An application can change its behavior several times during the day. When users work with data, it is cache friendly. When batch processing or reporting starts, it is not cache friendly. A highly random access pattern generally indicates a cache-unfriendly workload. However, if the amount of data that is accessed randomly is not large, 10% for example, it can be placed entirely into the disk system cache and becomes cache friendly.
Sequential workloads are always cache friendly because of the prefetch algorithms in the DS8900 storage system. A sequential workload is easy to prefetch: you know that the next 10 or 100 blocks will be accessed, and you can read them in advance. Random workloads are different. However, there are no purely random workloads in real applications, and some access patterns can be predicted. The DS8900 storage systems use the following powerful caching algorithms to deal with cache-unfriendly workloads:
Sequential Prefetching in Adaptive Replacement Cache (SARC)
Adaptive Multi-stream Prefetching (AMP)
Intelligent Write Caching (IWC)
The write workload is always cache friendly because every write request goes to cache first, and the application gets its reply when the request is placed into cache. The back end takes at least twice as long to service a write request as a read request, and the application must always wait for the write acknowledgment, which is why cache is used for every write request.
To learn more about the DS8900 caching algorithms, see Chapter 2.2.1, “DS8000 caching algorithms” on page 17.
Table 5-1 on page 111 provides a summary of the characteristics of the various types of workloads.
Table 5-1 Workload types
Each entry lists the workload type, its characteristics, and representative processes.

Sequential read: Sequential 128 - 1024 KB blocks; read hit ratio 100%. Representative: database backups, batch processing, media streaming.
Sequential write: Sequential 128 - 1024 KB blocks; write ratio 100%. Representative: database restores and loads, batch processing, file access.
z/OS cache uniform: Random 4 KB record; R/W ratio 3.4; read hit ratio 84%. Representative: average database, CICS/VSAM, IBM IMS.
z/OS cache standard: Random 4 KB record; R/W ratio 3.0; read hit ratio 78%. Representative of typical database conditions.
z/OS cache friendly: Random 4 KB record; R/W ratio 5.0; read hit ratio 92%. Representative: interactive, existing software.
z/OS cache hostile: Random 4 KB record; R/W ratio 2.0; read hit ratio 40%. Representative: Db2 logging.
Open read-intensive: Random 4 - 8 - 16 - 32 KB record; read percentage 70 - 90%; hit ratio 28%, 1%. Representative: databases (Oracle and Db2), large DB inquiry, decision support, warehousing.
Open standard: Random 4 - 8 - 16 - 32 KB record; read percentage 70%; hit ratio 50%. Representative: OLTP, General Parallel File System (IBM GPFS).
5.2 Database Workload
The database workload is not determined by the database itself. It depends on the application that is written for the database and the type of work that this application performs. Within the same database, the workload can be an OLTP workload or a data warehousing workload. A reference to a database in this section mostly means Db2 and Oracle databases, but this section can apply to other databases. For more information, see Chapter 11, “Database for IBM z/OS performance” on page 265.
The database environment is often difficult to typify because I/O characteristics differ greatly. A database query has a high read content and is of a sequential nature. It also can be random, depending on the query type and data structure. Transaction environments are more random in behavior and are sometimes cache unfriendly. At other times, they have good hit ratios. You can implement several enhancements in databases, such as sequential prefetch and the exploitation of I/O priority queuing, that affect the I/O characteristics. Users must understand the unique characteristics of their database capabilities before generalizing the performance.
5.2.1 Database query workload
Database query is a common type of database workload. This term includes transaction processing, which is typically random, and sequential data reading, writing, and updating. A query can have the following properties:
High read content
Mix of write and read content
Random access, and sequential access
Small or large transfer size
A well-tuned database keeps the characteristics of its queries closer to a sequential read workload. A database can use all available caching algorithms, both its own and those of the storage system. This function, which caches the data that is most likely to be accessed, provides performance improvements for most database queries.
5.2.2 Database logging workload
The logging system is an important part of the database. It is the main component to preserve the data integrity and provide a transaction mechanism. There are several types of log files in the database:
Online transaction logs: This type of log is used to restore the database to its last consistent state when the latest transaction fails. A transaction can be complex and require several steps to complete. Each step means changes in the data. Because data in the database must be in a consistent state, an incomplete transaction must be rolled back to the initial state of the data. Online transaction logs have a rotation mechanism that creates and uses several small files, each used for about an hour.
Archive transaction logs: This type is used to restore a database state up to the specified date. Typically, it is used with incremental or differential backups. For example, if you identify a data error that occurred a couple of days ago, you can restore the data back to its prior condition with only the archive logs. This type of log uses a rotation mechanism also.
The workload pattern for logging is mostly sequential writes. The block size is about 64 KB. Reads are rare and might not need to be considered. The write capability and the location of the online transaction logs are most important because the entire performance of the database depends on the writes to the online transaction logs. If the database is very write intensive, consider RAID-10. If the configuration has several extent pool pairs in its logical layout, also consider physically separating the log files from the flash modules or disks on which the data and index files are.
5.2.3 Database transaction environment workload
Database transaction workloads have these characteristics:
Low to moderate read hits, depending on the size of the database buffers
Cache unfriendly for certain applications
Deferred writes that cause low write-hit ratios, which means that cached write data is rarely required for reading
Deferred write chains with multiple locate-record commands in chain
Low read/write ratio because of reads that are satisfied in a large database buffer pool
High random-read access values, which are cache unfriendly
The enhanced prefetch cache algorithms, together with the high storage back-end bandwidth, provide high system throughput and high transaction rates for database transaction-based workloads.
A database can benefit from using a large amount of server memory for the large buffer pool. For example, the database large buffer pool, when managed correctly, can avoid a large percentage of the accesses to flash or disk. Depending on the application and the size of the buffer pool, this large buffer pool can convert poor cache hit ratios into synchronous reads in Db2. You can spread data across several RAID arrays to increase the throughput even if all accesses are read misses. Db2 administrators often require that table spaces and their indexes are placed on separate volumes. This configuration improves both availability and performance.
5.2.4 Database utilities workload
Database utilities, such as loads, reorganizations, copies, and recovers, generate high read and write sequential and sometimes random operations. This type of workload takes advantage of the sequential bandwidth performance of the back-end storage connection, such as the PCI-Express bus for the device adapter (DA) pairs, and the use of higher RPM (15 K) disk drives or flash drives and Easy Tier automatic mode enabled on both.
5.3 Application workload
This section categorizes various types of common applications according to their I/O behavior. There are four typical categories:
Need for high throughput. These applications need more bandwidth (the more, the better). Transfers are large, read-only I/O’s that are sequential access. These applications use database management systems (DBMSs); however, random DBMS access might also exist.
Need for high throughput and a mix of read/write (R/W), similar to the first category (large transfer sizes). In addition to 100% read operations, this category mixes reads and writes in 70/30 and 50/50 ratios. The DBMS is typically sequential, but random and 100% write operations also exist.
Need for high I/O rate and throughput. This category requires both performance characteristics of IOPS and megabytes per second (MBps). Depending on the application, the profile is typically sequential access, medium to large transfer sizes (16 KB, 32 KB, and 64 KB), and 100/0, 0/100, and 50/50 R/W ratios.
Need for high I/O rate. With many users and applications that run simultaneously, this category can consist of a combination of small to medium-sized transfers (4 KB, 8 KB, 16 KB, and 32 KB), 50/50 and 70/30 R/W ratios, and a random DBMS.
 
Synchronous activities: Certain applications have synchronous activities, such as locking database tables during an online backup, or logging activities. These types of applications are highly sensitive to any increase in flash module or disk drive response time and must be handled with care.
Table 5-2 summarizes these workload categories and common applications.
Table 5-2 Application workload types
Each entry lists the application, its workload category, read/write ratio, I/O size, and access type.

General file serving (category 4): R/W ratio expect 50/50; I/O size 64 - 256 KB; sequential mostly because of good file system caching.
Online transaction processing (category 4): R/W ratio 50/50 or 70/30; I/O size 4 KB, 8 KB, or 16 KB; random mostly for writes and reads, bad cache hits.
Batch update (category 4): R/W ratio expect 50/50; I/O size 16 KB, 32 KB, 64 KB, or 128 KB; almost 50/50 mix of sequential and random, moderate cache hits.
Data mining (category 1): R/W ratio 90/10; I/O size 32 KB, 64 KB, or larger; mainly sequential, some random.
Video on demand (category 1): R/W ratio 100/0; I/O size 256 KB and larger; sequential, good caching.
Data warehousing (category 2): R/W ratio 90/10, 70/30, or 50/50; I/O size 64 KB or larger; mainly sequential, rarely random, good caching.
Engineering and scientific (category 2): R/W ratio 100/0, 0/100, 70/30, or 50/50; I/O size 64 KB or larger; sequential mostly, good caching.
Digital video editing (category 3): R/W ratio 100/0, 0/100, or 50/50; I/O size 128 KB, 256 - 1024 KB; sequential, good caching.
Image processing (category 3): R/W ratio 100/0, 0/100, or 50/50; I/O size 64 KB, 128 KB; sequential, good caching.
Backup, restore (category 1): R/W ratio 100/0 or 0/100; I/O size 256 - 1024 KB; sequential, good caching.
5.3.1 General file serving
This application type consists of many users who run many different applications, all with varying file access sizes and mixtures of read/write ratios, all occurring simultaneously. Applications can include file servers, LAN storage, and even internet/intranet servers. There is no standard profile. General file serving fits this application type because its profile covers almost all transfer sizes and R/W ratios.
5.3.2 Online Transaction Processing
This application category typically has many users, all accessing the same storage system and a common set of files. The file access typically is under the control of a DBMS, and each user might work on the same or unrelated activities. The I/O requests are typically spread across many files; therefore, the file sizes are typically small and randomly accessed. A typical application consists of a network file server or a storage system that is accessed by a sales department that enters order information.
5.3.3 Data mining
Databases are the repository of most data, and every time that information is needed, a database is accessed. Data mining is the process of extracting valid, previously unknown, and ultimately comprehensible information from large databases to make crucial business decisions. This application category consists of several operations, each of which is supported by various techniques, such as rule induction, neural networks, conceptual clustering, and association discovery. In these applications, the DBMS extracts large sequential or possibly random files, depending on the DBMS access algorithms.
5.3.4 Video on Demand
Video on demand consists of video playback that can be used to broadcast quality video for either satellite transmission or a commercial application, such as in-room movies. Fortunately for the storage industry, the data rates that are needed for this type of transfer were reduced dramatically by developments in data compression. A broadcast-quality video stream, for example, full HD video, might need only about 4 - 5 Mbps of bandwidth to serve a single user. These advancements reduce the need for higher-speed interfaces, and such streams can be serviced with current interfaces. However, these applications demand numerous concurrent users that interactively access multiple files within the same storage system. This requirement changed the environment of video applications because a storage system is now specified by the number of video streams that it can service simultaneously. In this application, the DBMS extracts large sequential files.
5.3.5 Data Warehousing
A data warehouse supports information processing by providing a solid platform of integrated, historical data from which to perform analysis. A data warehouse organizes and stores the data that is needed for informational and analytical processing over a long historical period. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data that is used to support the management decision-making process. A data warehouse is always a physically separate store of data that spans a spectrum of time, and many relationships exist in the data warehouse.
An example of a data warehouse is one that is designed around a financial institution and its functions, such as loans, savings, bank cards, and trusts. In this application, there are three kinds of operations: initial loading of the data, access to the data, and updating of the data. However, because of the fundamental characteristics of a warehouse, these operations can occur simultaneously. At times, this application can perform 100% reads when accessing the warehouse, 70% reads and 30% writes when accessing data while record updating occurs simultaneously, or even 50% reads and 50% writes when the user load is heavy. The data within the warehouse is a series of snapshots, and after a snapshot of the data is made, the data in the warehouse does not change. Therefore, there is typically a higher read ratio when using the data warehouse.
5.3.6 Engineering and Scientific Applications
The engineering and scientific arena includes hundreds of applications. Typical applications are computer-assisted design (CAD), Finite Element Analysis, simulations and modeling, and large-scale physics applications. Transfers can consist of 1 GB of data for 16 users. Other transfers might require 20 GB of data and hundreds of users. The engineering and scientific areas of business are more concerned with the manipulation of spatial data and series data. This application typically goes beyond standard relational DBMSs, which manipulate only flat (two-dimensional) data. Spatial or multi-dimensional issues and the ability to handle complex data types are commonplace in engineering and scientific applications.
Object-Relational DBMSs (ORDBMSs) are being developed, and they offer traditional relational DBMS features and support complex data types. Objects can be stored and manipulated, and complex queries at the database level can be run. Object data is data about real objects, including information about their location, geometry, and topology. Location describes their position, geometry relates to their shape, and topology includes their relationship to other objects. These applications essentially have an identical profile to that of the data warehouse application.
5.3.7 Digital Video Editing
Digital video editing is popular in the movie industry. The idea that a film editor can load entire feature films onto flash storage systems and interactively edit and immediately replay the edited clips has become a reality. This application combines the ability to store huge volumes of digital audio and video data onto relatively affordable storage devices to process a feature film.
Depending on the host and operating system that are used for this application, transfers are typically medium to large, and access is always sequential. Image processing consists of moving huge image files for editing: the user regularly moves huge high-resolution images between the storage device and the host system. These applications serve many desktop publishing and workstation environments. Editing sessions can include loading large files of up to 16 MB into host memory, where users edit, render, modify, and store data onto the storage system. High interface transfer rates are needed for these applications, or the users waste huge amounts of time waiting to see results. If the interface can move data to and from the storage device at over 32 MBps, an entire 16 MB image can be stored and retrieved in less than 1 second. The need for throughput is all-important for these applications, and with the additional load of many users, I/O operations per second (IOPS) are also a major requirement.
5.4 Profiling workloads in the design phase
Assessing the I/O profile before you build and deploy the application requires methods of evaluating the workload profile without measurement data. In these cases, as a preferred practice, use a combination of general rules based on application type and the development of an application I/O profile by the application architect or the performance architect. The following examples are basic examples that are designed to provide an idea of how to approach workload profiling in the design phase.
For general rules for application types, see Table 5-1 on page 111.
The following requirements apply to developing an application I/O profile:
User population
Determining the user population requires understanding the total number of potential users, which for an online banking application might represent the total number of customers. From this total population, you must derive the active population that represents the average number of persons that use the application at any specific time, which is derived from experiences with other similar applications.
In Table 5-3, 1% of the total population is used. From the average population, you can estimate the peak. The peak workload is some multiplier of the average and is typically derived based on experience with similar applications. In this example, we use a multiple of 3. A short sketch after Table 5-3 shows the arithmetic.
Table 5-3 User population
Total potential users: 50000
Average active users: 500
Peak active users: 1500
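The following minimal Python sketch shows the derivation behind Table 5-3. The 1% active ratio and the peak multiplier of 3 are the assumptions that are stated above, not fixed rules.

total_potential_users = 50000
active_ratio = 0.01       # assumption: 1% of the total population is active on average
peak_multiplier = 3       # assumption: peak load is three times the average

average_active_users = int(total_potential_users * active_ratio)   # 500
peak_active_users = average_active_users * peak_multiplier         # 1500

print(average_active_users, peak_active_users)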
Transaction distribution
Table 5-4 breaks down the number of times that key application transactions are run by the average user and how much I/O is generated per transaction. Detailed application and database knowledge is required to identify the number of I/O’s and the type of I/O’s per transaction. The following information is a sample.
Table 5-4 Transaction distribution
Transaction | Iterations per user | I/O’s | I/O type
Look up savings account | 1 | 4 | Random read
Look up checking account | 1 | 4 | Random read
Transfer money to checking | 0.5 | 4 reads/4 writes | Random read/write
Configure new bill payee | 0.5 | 4 reads/4 writes | Random read/write
Submit payment | 1 | 4 writes | Random write
Look up payment history | 1 | 24 reads | Random read
Logical I/O profile
An I/O profile is created by combining the user population and the transaction distribution. Table 5-5 provides an example of a logical I/O profile.
Table 5-5 Logical I/O profile from user population and transaction profiles
Transaction | Iterations per user | I/O’s | I/O type | Average user I/O’s | Peak user I/O’s
Look up savings account | 1 | 4 | Random read (RR) | 2000 | 6000
Look up checking account | 1 | 4 | RR | 2000 | 6000
Transfer money to checking | 0.5 | 4 reads/4 writes | RR, random write (RW) | 1000, 1000 | 3000 R/W
Configure new bill payee | 0.5 | 4 reads/4 writes | RR, RW | 1000, 1000 | 3000 R/W
Submit payment | 1 | 4 writes | RW | 2000 | 6000 R/W
Look up payment history | 1 | 24 reads | RR | 12000 | 36000
Physical I/O profile
The physical I/O profile is based on the logical I/O profile, with the assumption that the database provides cache hits for 90% of the read I/O’s. All write I/O’s are assumed to require a physical I/O. This assumption results in a read miss ratio of (1 - 0.9) = 0.1, or 10%. Table 5-6 is an example, and every application has different characteristics; a short worked sketch after the table recomputes two of its rows.
Table 5-6 Physical I/O profile
Transaction | Average user logical I/O’s | Average active users physical I/O’s | Peak active users physical I/O’s
Look up savings account | 2000 | 200 RR | 600 RR
Look up checking account | 2000 | 200 RR | 600 RR
Transfer money to checking | 1000, 1000 | 100 RR, 1000 RW | 300 RR, 3000 RW
Configure new bill payee | 1000, 1000 | 100 RR, 1000 RW | 300 RR, 3000 RW
Submit payment | 2000 | 200 RR | 600 RR
Look up payment history | 12000 | 1200 RR | 3600 RR
Totals | 20000 R, and 2000 W | 2000 RR, 2000 RW | 6000 RR, 6000 RW
As you can see in Table 5-6, to meet the peak workloads, you must design an I/O subsystem to support 6000 random reads/sec and 6000 random writes/sec:
Physical I/O’s: The number of physical I/O’s per second from the host perspective
RR: Random read I/O’s
RW: Random write I/O’s
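The following minimal Python sketch shows how the logical profile in Table 5-5 turns into the physical profile in Table 5-6 by recomputing two of its rows. It assumes, as stated above, that the database buffers absorb 90% of the logical reads and that every write eventually becomes a physical write. It illustrates the method only; it is not a sizing tool.

average_active_users = 500
peak_active_users = 1500
read_miss_ratio = 0.1      # assumption: the database buffers absorb 90% of logical reads (1 - 0.9)

def physical_ios(users, iterations, logical_reads, logical_writes):
    """Physical random reads and writes that one transaction type generates."""
    physical_reads = users * iterations * logical_reads * read_miss_ratio
    physical_writes = users * iterations * logical_writes   # every write is destaged
    return physical_reads, physical_writes

# "Transfer money to checking": 0.5 iterations per user, 4 reads and 4 writes
print(physical_ios(average_active_users, 0.5, 4, 4))   # (100.0, 1000.0) -> 100 RR, 1000 RW
print(physical_ios(peak_active_users, 0.5, 4, 4))      # (300.0, 3000.0) -> 300 RR, 3000 RW

# "Look up payment history": 1 iteration per user, 24 reads
print(physical_ios(average_active_users, 1, 24, 0))    # (1200.0, 0) -> 1200 RR
print(physical_ios(peak_active_users, 1, 24, 0))       # (3600.0, 0) -> 3600 RR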
Determine the appropriate configuration to support your unique workload.
5.5 Understanding your workload type
To understand the workload, you need performance data from both the operating system and the disk system. Combined analysis of these two sets of performance data gives you the entire picture and helps you understand your workload. Analyzing them separately might not be accurate. This section describes various performance monitoring tools.
5.5.1 Monitoring the DS8000 workload
The following performance monitoring tools are available for the IBM System Storage DS8900:
IBM Spectrum Control
IBM Spectrum Control is the tool to monitor the workload on your DS8000 storage for a long period and collect historical data. This tool can also create reports and provide alerts. For more information, see 7.2.1, “IBM Spectrum Control and IBM Storage Insights Pro overview” on page 141.
IBM Storage Insights
IBM Storage Insights is a cloud-based service that enables quick deployment and allows optimization of storage and SAN resources by using proprietary analytics from IBM Research. For more information about IBM Storage Insights, refer to:
5.5.2 Monitoring the host workload
The following sections list the host-based performance measurement and reporting tools under the UNIX, Linux, Windows, IBM i, and z/OS environments.
Open Systems servers
This section lists the most common tools that are available on Open Systems servers to monitor the workload.
UNIX and Linux Open Systems servers
To get host information about I/O subsystems, processor activities, virtual memory, and the use of the physical memory, use the following common UNIX and Linux commands:
iostat
vmstat
sar
nmon
topas
filemon
These commands are standard tools that are available with most UNIX and UNIX-like (Linux) systems. Use iostat to obtain the data that you need to evaluate your host I/O levels. Specific monitoring tools are also available for AIX, Linux, Hewlett-Packard UNIX (HP-UX), and Oracle Solaris.
Microsoft Windows servers
Common Microsoft Windows Server monitoring tools include the Windows Performance Monitor (perfmon). Performance Monitor has the flexibility to customize the monitoring to capture various categories of Windows server system resources, including processor and memory. You can also monitor disk I/O by using perfmon.
To start Performance Monitor:
1. Open the Windows Start menu.
2. Search for Performance Monitor and click the perfmon app.
IBM i environment
IBM i provides a vast selection of performance tools that can be used in performance-related cases with external storage. Several of the tools, such as Collection services, are integrated in the IBM i system. Other tools are a part of an IBM i licensed product. The management of many IBM i performance tools is integrated into IBM Systems Director Navigator for IBM i, or into IBM iDoctor.
The IBM i tools, such as Performance Explorer and iDoctor, are used to analyze the hot data in IBM i and to size flash modules for this environment. Other tools, such as Job Watcher, are used mostly in solving performance problems, together with the tools for monitoring the DS8900 storage system such as Storage Insights.
For more information about the IBM i tools and their usage, see 9.4.1, “IBM i performance tools” on page 217.
z Systems environment
The z/OS systems have proven performance monitoring and management tools that are available to use for performance analysis. Resource Measurement Facility (RMF), a z/OS performance tool, collects performance data and reports it for the wanted interval. It also provides cache reports. The cache reports are similar to the disk-to-cache and cache-to-disk reports that are available in IBM Spectrum Control, except that the RMF cache reports are in text format. RMF collects the performance statistics of the DS8900 storage system that are related to the link or port and also to the rank and extent pool. The REPORTS(ESS) parameter in the RMF report generator produces the reports that are related to those resources.
The RMF Spreadsheet Reporter is an easy way to create Microsoft Excel charts that are based on RMF postprocessor reports. It converts your RMF data to spreadsheet format and generates representative charts for all performance-relevant areas, and is described here:
5.5.3 Modeling the workload and sizing the system
Workload modeling is used to predict the behavior of the system under the workload to identify the limits and potential bottlenecks, and to model the growth of the system and plan for the future.
IBM and IBM Business Partner specialists use the IBM Storage Modeler (StorM) tool for performance modeling of the workload and capacity planning on the systems.
IBM StorM can be used to help to plan the DS8900 hardware configuration. With IBM StorM, you model the DS8900 performance when migrating from another disk system or when making changes to an existing DS8000 configuration and the I/O workload. IBM StorM is for use with both z Systems and Open Systems server workloads. In addition, IBM StorM also models storage capacity requirements.
You can model the following major DS8000 components by using IBM StorM:
Supported DS8000 model: DS8884F, DS8884, DS8886F, DS8886, DS8888F, DS8882F, DS8980F, DS8910F and DS8950F models
Capacity sizing in IBM i
Importing data for performance assessments
Configuration capacity using flash modules and RAID type
Type and number of DS8000 host adapters (HAs)
Remote Copy options
When working with IBM StorM, always ensure that you input accurate and representative workload information because IBM StorM results depend on the input data that you provide. Also, carefully estimate the future demand growth that you input to IBM StorM for modeling projections. The hardware configuration decisions are based on these estimates.
For more information about using StorM, see Chapter 6.1, “IBM Storage Modeller” on page 126.
Sizing the system for the workload
With the performance data collected or estimated, and the model of the data that is created, you can size the planned system. Systems are sized with application-specific tools that are provided by the application vendors. There are several tools:
General storage sizing: StorM.
MS Exchange sizing: MS Exchange sizing tool. You can find more information about this tool at the following websites:
Oracle, Db2, SAP NetWeaver: IBM Techline and specific tools.
Workload testing
There are various reasons for conducting I/O load tests. They all start with a hypothesis and have defined performance requirements. The objective of the test is to determine whether the hypothesis is true or false. For example, a hypothesis might be that a DS8900 storage system with 18 flash arrays and 256 GB of cache can support 100,000 IOPS with a 70/30/50 workload and the following response time requirements:
Read response times: 95th percentile < 10 ms
Write response times: 95th percentile < 5 ms
To test, complete the following generic steps:
1. Define the hypothesis.
2. Simulate the workload by using an artificial or actual workload.
3. Measure the workload.
4. Compare the workload measurements with the objectives (a short sketch after these steps illustrates this comparison).
5. If the results support your hypothesis, publish the results and make recommendations. If the results do not support your hypothesis, determine why and make adjustments.
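The following minimal Python sketch illustrates step 4 for the hypothesis above: it compares measured response times against the 95th-percentile objectives. The sample values and the simple nearest-rank percentile helper are made up for the illustration; in practice, the measurements come from your monitoring tools.

# Hypothetical response-time samples in milliseconds, collected during the test run
read_response_ms = [2.1, 3.4, 1.8, 7.9, 4.2, 2.7, 9.6, 3.1, 2.2, 5.0]
write_response_ms = [0.8, 1.1, 0.9, 2.4, 1.6, 1.0, 3.8, 1.2, 0.7, 1.9]

READ_OBJECTIVE_MS = 10.0    # objective: read 95th percentile < 10 ms
WRITE_OBJECTIVE_MS = 5.0    # objective: write 95th percentile < 5 ms

def percentile(samples, pct):
    """Simple nearest-rank percentile, adequate for this illustration."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

read_p95 = percentile(read_response_ms, 95)
write_p95 = percentile(write_response_ms, 95)

print(f"Read 95th percentile: {read_p95:.1f} ms (objective < {READ_OBJECTIVE_MS} ms)")
print(f"Write 95th percentile: {write_p95:.1f} ms (objective < {WRITE_OBJECTIVE_MS} ms)")
print("Hypothesis supported" if read_p95 < READ_OBJECTIVE_MS and write_p95 < WRITE_OBJECTIVE_MS
      else "Hypothesis not supported: investigate and adjust")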
Microsoft Windows environment
The following example tests might be appropriate for a Windows environment:
Pre-deployment hardware validation. Ensure that the operating system, multipathing, and host bus adapter (HBA) drivers are at the current levels and supported. Before you deploy any solution and especially a complex solution, such as Microsoft cluster servers, ensure that the configuration is supported. Refer to the following interoperability website for more information:
Application-specific requirements. Often, you receive inquiries about the DS8000 storage system and Microsoft Exchange. To simulate MS Exchange 2019 storage I/O load on a server, use the MS Exchange Server Jetstress 2016 tool. For more information, refer to:
A universal workload generator and benchmark tool is Iometer (http://www.iometer.org). Iometer is both a workload generator (it performs I/O operations to stress the system) and a measurement tool (it examines and records the performance of its I/O operations and their effect on the system). It can be configured to emulate the disk or network I/O load of any program or benchmark, or it can be used to generate entirely synthetic I/O loads. It can generate and measure loads on single or multiple (networked) systems.
Iometer can be used for the following measurements and characterizations:
Performance of disk and network controllers
Bandwidth and latency capabilities of buses
Network throughput to attached drives
Shared bus performance
System-level hard disk drive performance
System-level network performance
You can use Iometer to configure these settings:
Read/write ratios
Sequential/random
Arrival rate and queue depth
Block Size
Number of concurrent streams
With these configuration settings, you can simulate and test most types of workloads. Specify the workload characteristics to reflect the workload in your environment.
Unix and Linux environment
The Unix and Linux dd command is a very useful tool to drive sequential read workloads or sequential write workloads against the DS8900 storage system.
Since 2019, the Subsystem Device Driver Device Specific Module (SDDDSM) and the Subsystem Device Driver Path Control Module (SDDPCM) are no longer supported for the DS8000. Users need to use native operating system device drivers for multipath support. For more information, refer to:
This section describes how to perform the following tasks by using Multipath I/O (MPIO) devices:
 – Determine the sequential read speed that an individual logical unit number (LUN) can provide in your environment.
 – Measure sequential read and write speeds for file systems.
The following MPIO commands are useful on AIX:
lsattr -El hdiskxx (lists the device attributes)
lsmpio -l hdiskxx (lists the available paths to a logical volume)
To test the sequential read speed of a rank, run the following command:
time dd if=/dev/rhdiskxx of=/dev/null bs=128k count=781
The rhdiskxx device is the character or raw device file for the logical unit number (LUN) that is presented to the operating system by Multipath I/O (MPIO). This command reads 100 MB off rhdiskxx and reports how long it takes in seconds. Divide 100 MB by the number of seconds that is reported to determine the MBps read speed.
Run the following command and start the nmon monitor or iostat -k 1 command in Linux:
dd if=/dev/rhdiskxx of=/dev/null bs=128k
Your nmon monitor (the e option) reports that the previous command imposed a sustained 100 MBps bandwidth with a blocksize of 128 KB on rhdiskxx. Notice the xfers/sec column; xfers/sec is IOPS. If your dd command did not end on its own because it reached the end of the disk, press Ctrl+c to stop the process. nmon then reports an idle status. Next, run the following dd command with a 4 KB blocksize and put it in the background:
dd if=/dev/rhdiskxx of=/dev/null bs=4k &
For this command, nmon reports a lower MBps but a higher IOPS, which is the nature of I/O as a function of blocksize. Run your dd sequential read command with bs=1024k and you see a high MBps but a reduced IOPS.
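The relationship that nmon shows between block size, MBps, and IOPS is simple arithmetic: throughput equals IOPS multiplied by the transfer size. The following minimal Python sketch illustrates it and also shows the MBps calculation for the earlier time dd example; the numbers are assumed values, not measurements.

def mbps(iops, block_size_kb):
    """Throughput in MBps for a given I/O rate and transfer size."""
    return iops * block_size_kb / 1024

# Assumed examples: larger blocks move more data per I/O, so fewer I/O's are needed
print(mbps(iops=800, block_size_kb=128))   # 100.0 MBps at 128 KB blocks
print(mbps(iops=6000, block_size_kb=4))    # about 23.4 MBps at 4 KB blocks, but much higher IOPS

# From the earlier "time dd ... count=781" example: roughly 100 MB was transferred
elapsed_seconds = 1.6                      # assumed value taken from the time command output
print(100 / elapsed_seconds)               # sequential read speed in MBps (62.5 here)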
The following commands perform sequential writes to your LUNs:
dd if=/dev/zero of=/dev/rhdiskxx bs=128k
dd if=/dev/zero of=/dev/rhdiskxx bs=1024k
time dd if=/dev/zero of=/dev/rhdiskxx bs=128k count=781
Try different block sizes, different raw hdisk devices and combinations of reads and writes. Run the commands against the block device (/dev/hdiskxx) and notice that block size does not affect performance.
Because the dd command generates a sequential workload, you still must generate the random workload. You can use a no-charge open source tool, such as Vdbench.
Vdbench is a disk and tape I/O workload generator for verifying data integrity and measuring the performance of direct-attached and network-connected storage on Windows, AIX, Linux, Solaris, OS X, and HP-UX. It uses workload profiles as the inputs for the workload modeling and has its own reporting system. All output is presented in HTML files as reports and can be analyzed later. For more information, refer to:
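For a quick impression of what a random-access pattern looks like, the following Python sketch issues small random reads against a test file and reports the achieved IOPS. It is only a toy illustration under stated assumptions (a single thread, one outstanding I/O, and a hypothetical /tmp/testfile path that must exist and span at least a few blocks); it does not bypass the operating system page cache, which is one reason purpose-built tools such as Vdbench or Iometer are preferred for real measurements.

import os
import random
import time

path = "/tmp/testfile"       # hypothetical test file; adjust for your environment
block_size = 4096            # 4 KB random reads
duration_seconds = 10

fd = os.open(path, os.O_RDONLY)
total_blocks = os.fstat(fd).st_size // block_size   # file must be larger than one block

ios = 0
start = time.time()
while time.time() - start < duration_seconds:
    offset = random.randrange(total_blocks) * block_size   # random, block-aligned offset
    os.pread(fd, block_size, offset)
    ios += 1
os.close(fd)

elapsed = time.time() - start
print(f"{ios / elapsed:.0f} random read IOPS at {block_size}-byte blocks")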
AIX Tuning Parameters
There are several disk and disk adapter kernel tunable parameters in the AIX operating system that are useful when used in conjunction with DS8000 storage servers.
Disk Adapter Outstanding-Requests Limit
Fibre Channel Adapter Outstanding-Requests Limit
Disk Drive Queue Depth
For full details see the following website:
 