FSS(4) Device and Network Interfaces FSS(4)
NAME
FSS - Fair share scheduler
DESCRIPTION
The fair share scheduler (FSS) guarantees application performance by
explicitly allocating shares of CPU resources to projects. A share
indicates a project's entitlement to available CPU resources. Because
shares are meaningful only in comparison with other project's shares,
the absolute quantity of shares is not important. Any number that is
in proportion with the desired CPU entitlement can be used.
The goals of the FSS scheduler differ from the traditional time-sharing
scheduling class (TS). In addition to scheduling individual LWPs, the
FSS scheduler schedules projects against each other, making it
impossible for any project to acquire more CPU cycles simply by running
more processes concurrently.
A project's entitlement is individually calculated by FSS independently
for each processor set if the project contains processes bound to them.
If a project is running on more than one processor set, it can have
different entitlements on every set. A project's entitlement is
defined as a ratio between the number of shares given to a project and
the sum of shares of all active projects running on the same processor
set. An active project is one that has at least one running or
runnable process. Entitlements are recomputed whenever any project
becomes active or inactive, or whenever the number of shares is
changed.
Processor sets represent virtual machines in the FSS scheduling class
and processes are scheduled independently in each processor set. That
is, processes compete with each other only if they are running on the
same processor set. When a processor set is destroyed, all processes
that were bound to it are moved to the default processor set, which
always exists. Empty processor sets (that is, sets without processors
in them) have no impact on the FSS scheduler behavior.
If a processor set contains a mix of TS/IA and FSS processes, the
fairness of the FSS scheduling class can be compromised because these
classes use the same range of priorities. Fairness is most
significantly affected if processes running in the TS scheduling class
are CPU-intensive and are bound to processors within the processor set.
As a result, you should avoid having processes from TS/IA and FSS
classes share the same processor set. RT and FSS processes use
disjoint priority ranges and therefore can share processor sets.
As projects execute, their CPU usage is accumulated over time. The FSS
scheduler periodically decays CPU usages of every project by
multiplying it with a decay factor, ensuring that more recent CPU usage
has greater weight when taken into account for scheduling. The FSS
scheduler continually adjusts priorities of all processes to make each
project's relative CPU usage converge with its entitlement.
While FSS is designed to fairly allocate cycles over a long-term time
period, it is possible that projects will not receive their allocated
shares worth of CPU cycles due to uneven demand. This makes one-shot,
instantaneous analysis of FSS performance data unreliable.
Note that share is not the same as utilization. A project may be
allocated 50% of the system, although on the average, it uses just 20%.
Shares serve to cap a project's CPU usage only when there is
competition from other projects running on the same processor set.
When there is no competition, utilization may be larger than
entitlement based on shares. Allocating a small share to a busy
project slows it down but does not prevent it from completing its work
if the system is not saturated.
The configuration of CPU shares is managed by the name server as a
property of the
project(5) database. In the following example, an
entry in the
/etc/project file sets the number of shares for project
x-files to 10:
x-files:100::::project.cpu-shares=(privileged,10,none)
Projects with undefined number of shares are given one share each.
This means that such projects are treated with equal importance.
Projects with 0 shares only run when there are no projects with non-
zero shares competing for the same processor set. The maximum number
of shares that can be assigned to one project is 65535.
You can use the
prctl(1) command to determine the current share
assignment for a given project:
$ prctl -n project.cpu-shares -i project x-files
or to change the amount of shares if you have root privileges:
# prctl -r -n project.cpu-shares -v 5 -i project x-files
See the
prctl(1) man page for additional information on how to modify
and examine resource controls associated with active processes, tasks,
or projects on the system. See
resource_controls(7) for a description
of the resource controls supported in the current release of the
Solaris operating system.
By default, project
system (project ID 0) includes all system daemons
started by initialization scripts and has an "unlimited" amount of
shares. That is, it is always scheduled first no matter how many
shares are given to other projects.
The following command sets FSS as the default scheduler for the system:
# dispadmin -d FSS
This change will take effect on the next reboot. Alternatively, you
can move processes from the time-share scheduling class (as well as the
special case of init) into the FSS class without changing your default
scheduling class and rebooting by becoming
root, and then using the
priocntl(1) command, as shown in the following example:
# priocntl -s -c FSS -i class TS
# priocntl -s -c FSS -i pid 1
CONFIGURING SCHEDULER WITH DISPADMIN
You can use the
dispadmin(8) command to examine and tune the FSS
scheduler's time quantum value. Time quantum is the amount of time
that a thread is allowed to run before it must relinquish the
processor. The following example dumps the current time quantum for
the fair share scheduler:
$ dispadmin -g -c FSS
#
# Fair Share Scheduler Configuration
#
RES=1000
#
# Time Quantum
#
QUANTUM=110
The value of the QUANTUM represents some fraction of a second with the
fractional value determined by the reciprocal value of RES. With the
default value of RES = 1000, the reciprocal of 1000 is .001, or
milliseconds. Thus, by default, the QUANTUM value represents the time
quantum in milliseconds.
If you change the RES value using
dispadmin(8) with the
-r option, you
also change the QUANTUM value. For example, instead of quantum of 110
with RES of 1000, a quantum of 11 with a RES of 100 results. The
fractional unit is different while the amount of time is the same.
You can use the
-s option to change the time quantum value. Note that
such changes are not preserved across reboot. Please refer to the
dispadmin(8) man page for additional information.
SEE ALSO
prctl(1),
priocntl(1),
priocntl(2),
project(5),
resource_controls(7),
dispadmin(8),
psrset(8) System Administration Guide: Virtualization Using the Solaris Operating Systemillumos December 17, 2019 illumos