When several workloads compete for CPU time on a large system, you can divide the CPUs into sets and bind each workload to a different set to constrain it. This section describes how this works and where it can be used effectively.
In the past, it was common to use several systems, one to run each workload. Modern computer systems are so powerful and scalable that it is more efficient to consolidate workloads onto fewer, larger systems. A new feature in the Solaris 2.6 operating environment allows a multiprocessor machine to be partitioned into processor sets, where each workload is constrained to use only the processors in one set. The Solaris 7 version adds some interrupt management capabilities.
Let's start by taking a look at the manual page for the Solaris 7 version of the psrset command.
Maintenance Commands                                      psrset(1M)

NAME
     psrset - creation and management of processor sets

SYNOPSIS
     psrset -c [ processor_id ... ]
     psrset -d processor_set_id
     psrset -a processor_set_id processor_id ...
     psrset -r processor_id ...
     psrset -p [ processor_id ... ]
     psrset -b processor_set_id pid ...
     psrset -u pid ...
     psrset -e processor_set_id command [argument(s)]
     psrset -f processor_set_id
     psrset -n processor_set_id
     psrset -q [ pid ... ]
     psrset [ -i ] [ processor_set_id ... ]

DESCRIPTION
     The psrset utility controls the management of processor sets. Processor sets allow the binding of processes to groups of processors, rather than just a single processor. There are two types of processor sets, those created by the user using the psrset command or the pset_create(2) system call, and those automatically created by the system. Processors assigned to user-created processor sets will run only LWPs that have been bound to that processor set, but system processor sets may run other LWPs as well.

     System-created processor sets will not always exist on a given machine. When they exist, they will generally represent particular characteristics of the underlying machine, such as groups of processors that can communicate more quickly with each other than with other processors in the system. These processor sets cannot be modified or removed, but processes may be bound to them.

OPTIONS
     The following options are supported:

     -a   Assigns the specified processors to the specified processor set.
     -b   Binds all the LWPs of the specified processes to the specified processor set.
     -c   Creates a new processor set.
     -d   Removes the specified processor set, releasing all processors and processes associated with it.
     -e   Executes the given command in the specified processor set.
     -f   Disables interrupts for all processors within the specified processor set.
     -i   Displays the type and processor assignments of the specified processor sets, or of all processor sets.
     -n   Enables interrupts for all processors within the specified processor set.
     -p   Displays the processor set assignments of the specified processors, or of all processors.
     -q   Displays the processor set bindings of the specified processes, or of all processes.
     -r   Removes the specified processors from the processor sets to which they are assigned.
     -u   Removes the processor set bindings of all LWPs of the specified processes.

USAGE
     The -a option assigns a list of processors to a processor set. Processor sets automatically created by the system cannot have processors assigned to them. However, processors belonging to system processor sets may be assigned to user-created processor sets. This option is restricted to use by the super-user.

     The -b option binds all of the LWPs of the specified processes to the specified processor set. LWPs bound to a processor set will be restricted to run only on the processors in that set unless they require resources available only on another processor. Processes may only be bound to non-empty processor sets, that is, processor sets that have had processors assigned to them.

     Bindings are inherited, so new LWPs and processes created by a bound LWP will have the same binding. Binding an interactive shell to a processor, for example, binds all commands executed by the shell.

     The -c option creates a processor set and displays the new processor set ID. If a list of processors is given, it also attempts to assign those processors to the processor set. If this succeeds, the processors will be idle until LWPs are bound to the processor set. This option is restricted to use by the super-user.

     The -d option removes a previously created processor set. Processor sets automatically created by the system cannot be removed. This option is restricted to use by the super-user.

     The -e option executes a command (with optional arguments) in the specified processor set. The command process and any child processes are executed only by processors in the processor set.

     The super-user may execute a command in any active processor set. Other users may only execute commands in system processor sets.

     The -f option disables interrupts for all possible processors in the specified processor set. See psradm(1M). If some processors in the set cannot have their interrupts disabled, the other processors will still have their interrupts disabled, and the command will report an error and return non-zero exit status. This option is restricted to use by the super-user.

     The -i option displays a list of processors assigned to each named processor set. If no argument is given, a list of all processor sets and the processors assigned to them is displayed. This is also the default operation if the psrset command is not given an option.

     The -n option enables interrupts for all processors in the specified processor set. See psradm(1M). This option is restricted to use by the super-user.

     The -p option displays the processor set assignments for the specified list of processors. If no argument is given, the processor set assignments for all processors in the system is given.

     The -q option displays the processor set bindings of the specified processes. If a process is composed of multiple LWPs, which have different bindings, the bindings of only one of the bound LWPs will be shown. If no argument is given, the processor set bindings of all processes in the system is displayed.

     The -r option removes a list of processors from their current processor sets. Processors that are removed will return to either the system processor set to which they previously belonged, or to the general pool of processors if they did not belong to a system processor set. This option is restricted to use by the super-user.

     Processors with LWPs bound to them using pbind(1M) cannot be assigned to or removed from processor sets.

     The -u option removes the processor set bindings from all the LWPs of the specified processes, allowing them to be executed on any on-line processor if they are not bound to individual processors through pbind.

     The super-user may bind or unbind any process to any active processor set. Other users may only bind or unbind processes to system processor sets. Furthermore, they may only bind or unbind processes for which they have permission to signal, that is, any process that has the same effective user ID as the user.
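To make the synopsis more concrete, the following Python sketch assembles the psrset command lines for a typical sequence: create a set, bind existing processes, start a command inside the set, then inspect the result. It only builds and prints argument lists and executes nothing. The helper names, CPU numbers, set ID, process ID, and startup script are hypothetical; in practice the set ID is whatever psrset -c reports when it creates the set.

```python
import shlex


def psrset_argv(option, *args):
    """Build the argument list for one psrset invocation (nothing is run)."""
    return ["psrset", option] + [str(a) for a in args]


def consolidation_plan(cpu_ids, set_id, pids, command):
    """Return the psrset invocations for a simple create/bind/run sequence.

    set_id is an assumption: it stands for the ID that `psrset -c`
    displays when the set is created.
    """
    return [
        psrset_argv("-c", *cpu_ids),          # create a set and assign CPUs to it
        psrset_argv("-b", set_id, *pids),     # bind all LWPs of existing processes
        psrset_argv("-e", set_id, *command),  # execute a new command inside the set
        psrset_argv("-q", *pids),             # display the bindings of those processes
        psrset_argv("-i", set_id),            # display the set's processor assignments
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for argv in consolidation_plan([4, 5, 6, 7], 1, [1234], ["/bin/sh", "start_db.sh"]):
        print(shlex.join(argv))
```

Because bindings are inherited, binding a startup shell or script this way is enough to place a whole application in the set.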
Initially, all CPUs belong to a default system processor set. You can create additional sets by taking CPUs away from the system set, but at least one CPU always remains in the system processor set. The kernel uses only the system set for normal operations (for example, NFS services run only on the system processor set), although interrupts are handled by processors regardless of which set they belong to.
If your workload mix includes some NFS service that needs to be constrained, processor sets are one way to accomplish that. In general, the system set should be as large as possible, perhaps shared with one of your regular workloads so that you don't starve the kernel of CPU time.
Sun has published a fully audited benchmark in which an online transaction processing TPC-C workload was run on the same machine at the same time as a data warehouse TPC-D workload. This was managed using processor sets. A 16-CPU Sun Enterprise 6000 was divided into an 8-CPU system processor set and an additional 8-CPU user-created set. A single copy of the IBM DB2 Universal Server database code was used to create two database instances on separate parts of the disk subsystem. When the benchmark was run, the continuous small TPC-C transactions ran at a constant rate, providing good response times to the online users. The large and varied TPC-D transactions were constrained and did not affect the online user response times. The overall throughput was less than it might have been if the idle time in each set had been used by the other workload, but consistency of steady state response times and throughput is a requirement for an audited TPC-C result, and it could not be achieved without using processor sets in this way.
The TPC-C summary is at:
http://www.tpc.org/results/individual_results/Sun/sun.ue6000.ibm.es.pdf
The TPC-D summary is at:
http://www.tpc.org/results/individual_results/Sun/sun.ue6000.ibm.d.es.pdf
The Solaris operating environment maintains a queue of jobs that are ready to run on a per-CPU basis. There is no single global run queue. Older versions of the Solaris operating environment implement processor binding using the pbind command and underlying system calls. A process is bound to a CPU with pbind, but the binding isn't exclusive: other work can also run on that CPU. With psrset, the binding is to a group of CPUs, and it is exclusive, so nothing else will be scheduled to run on that set. You can use pbind within any set to give a further level of control over resource usage.
The way psrset works is to create a kind of virtual machine for scheduling purposes. Once a process is bound to that set, all child processes are also bound to that set, so it is sufficient to bind a shell or startup script for an application. You must have root permissions to make bindings.
The system normally keeps a linked list of the online processors. Each processor has its own run queue. When a kernel thread is to be placed on a run queue, the kernel goes through various machinations to decide where the thread should be placed. Normally, this is the processor on which the thread last ran, but sometimes the thread changes processors (migrates) for various reasons (load balancing, for example).
With processor sets, you can split up the list of processors into disjoint subsets. When you create a processor set, you create a new list with the processors that are in the set. The processors are taken out of the normal list of processors that run everything not in the set. Processes assigned to the set run on the processors in the set's list and can migrate between them. Other processes and normal (non-interrupt) kernel threads cannot run on those processors; they no longer have access to them. It's as if the processors have been taken off-line. The exception is kernel threads that can be bound to a specific processor for one reason or another, but this is unusual.
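The list-splitting behavior described above can be modeled in a few lines of Python. This is a toy model for reasoning about the rules, not the kernel's implementation: the class and method names are invented for illustration, set IDs starting at 1 is an assumption, and interrupts, pbind, and bound kernel threads are ignored.

```python
class ProcessorPool:
    """Toy model of disjoint processor lists (illustrative names only)."""

    def __init__(self, cpu_ids):
        self.default = list(cpu_ids)  # the normal list that runs everything else
        self.sets = {}                # set ID -> list of CPUs in that set
        self.bindings = {}            # process ID -> set ID
        self._next_id = 1             # assumption: IDs are handed out from 1

    def create_set(self, cpu_ids):
        """Move CPUs out of the normal list into a new set; return its ID."""
        cpus = list(dict.fromkeys(cpu_ids))  # drop duplicates, keep order
        if any(c not in self.default for c in cpus):
            raise ValueError("CPU is not in the default list")
        if len(cpus) >= len(self.default):
            raise ValueError("at least one CPU must remain in the default list")
        for c in cpus:
            self.default.remove(c)
        set_id = self._next_id
        self._next_id += 1
        self.sets[set_id] = cpus
        return set_id

    def bind(self, pid, set_id):
        """Bind a process to a non-empty set."""
        if not self.sets.get(set_id):
            raise ValueError("processes may only be bound to non-empty sets")
        self.bindings[pid] = set_id

    def fork(self, parent_pid, child_pid):
        """Children inherit the parent's binding, if it has one."""
        if parent_pid in self.bindings:
            self.bindings[child_pid] = self.bindings[parent_pid]

    def eligible_cpus(self, pid):
        """CPUs this process may run on and migrate between."""
        set_id = self.bindings.get(pid)
        if set_id is None:
            return list(self.default)
        return list(self.sets[set_id])
```

In this model, a process bound to a set can only ever be placed on that set's CPUs, and an unbound process never sees them, which is the sense in which the set's processors look off-line to everything else.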
Interrupts are taken on whichever CPU normally takes that interrupt, but any subsequent activity will take place in the system processor set. Use the mpstat command to view the distribution of interrupts and load over all the CPUs.
% mpstat 5
CPU minf mjf xcal intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   58   8 1459  822  610 1306  171  242   96   30   609    6  67  27   0
  1   36   8 1750 1094  657 1100  151  238  104   28   717    6  76  18   0
  4   53   7 1518  951  759 1111  155  226   95   29   642    6  69  24   0
  5   25   7 1715 1067  765 1104  178  232  111   23   552    7  65  28   0
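To compare interrupt load across CPUs, one interval of mpstat output can be summarized with a short script. The Python sketch below parses the sample shown above, which is embedded as a string, and computes each CPU's share of the total interrupt count. The function names are illustrative, and the parser assumes a single interval of output in the column layout shown.

```python
SAMPLE = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 58 8 1459 822 610 1306 171 242 96 30 609 6 67 27 0
1 36 8 1750 1094 657 1100 151 238 104 28 717 6 76 18 0
4 53 7 1518 951 759 1111 155 226 95 29 642 6 69 24 0
5 25 7 1715 1067 765 1104 178 232 111 23 552 7 65 28 0
"""


def parse_mpstat(text):
    """Parse one interval of mpstat output into a list of per-CPU dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header = lines[0]
    # Skip header lines; every other line is one CPU's counters.
    rows = [fields for fields in lines if fields[0] != "CPU"]
    return [dict(zip(header, map(int, fields))) for fields in rows]


def interrupt_share(rows):
    """Return each CPU's fraction of the total interrupts in the sample."""
    total = sum(row["intr"] for row in rows)
    return {row["CPU"]: row["intr"] / total for row in rows}


if __name__ == "__main__":
    rows = parse_mpstat(SAMPLE)
    for cpu, share in interrupt_share(rows).items():
        print(f"CPU {cpu}: {share:.1%} of interrupts")
```

The same per-CPU dictionaries can be used to compare the usr, sys, wt, and idl columns between the CPUs inside a set and those outside it.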