Tuning the Task Scheduler
A list of the most important task scheduler sysctl tuning variables (located in
/proc/sys/kernel/) with a short description follows:
sched_child_runs_first
A freshly forked child runs before the parent continues execution. Setting this
parameter to 1 is beneficial for applications in which the child calls exec
immediately after fork. By contrast, make -j<NO_CPUS> performs better when
sched_child_runs_first is turned off. The default value is 0.
sched_compat_yield
Enables the aggressive yield behavior of the old O(1) scheduler. Java applications
that use synchronization extensively perform better with this value set to 1. Only
use it when you see a drop in performance. The default value is 0.
Expect applications that depend on the sched_yield() syscall behavior to perform
better with the value set to 1.
sched_migration_cost
Amount of time after the last execution that a task is considered to be “cache hot”
in migration decisions. A “hot” task is less likely to be migrated, so increasing this
variable reduces task migrations. The default value is 500000 (ns).
If the CPU idle time is higher than expected when there are runnable processes,
try reducing this value. If tasks bounce between CPUs or nodes too often, try
increasing it.
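Before changing any of these variables, it helps to record their current values. The following is a minimal Python sketch that reads a tunable from /proc/sys/kernel/. The helper name read_sched_tunable and its base_dir parameter are hypothetical, introduced only for illustration. Not every kernel exposes every variable, so the helper returns None instead of failing.

```python
from pathlib import Path

# Directory named in the text; individual files may be absent on some kernels.
SCHED_SYSCTL_DIR = Path("/proc/sys/kernel")


def read_sched_tunable(name, base_dir=SCHED_SYSCTL_DIR):
    """Return the integer value of a scheduler tunable, or None if unavailable.

    `base_dir` is a hypothetical parameter so the helper can be pointed at
    another directory (for example in tests).
    """
    path = Path(base_dir) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file, insufficient permissions, or non-numeric content.
        return None


if __name__ == "__main__":
    for name in ("sched_child_runs_first",
                 "sched_compat_yield",
                 "sched_migration_cost"):
        value = read_sched_tunable(name)
        print(name, "=", value if value is not None else "not available")
```

Comparing the values before and after a tuning change makes it easier to revert a setting that did not help.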
sched_latency_ns
Targeted preemption latency for CPU-bound tasks. Increasing this variable
increases a CPU-bound task's timeslice. A task's timeslice is its weighted fair
share of the scheduling period:
timeslice = scheduling period * (task's weight / total weight of tasks in the run queue)
The task's weight depends on the task's nice level and the scheduling policy. The
minimum task weight for a SCHED_OTHER task is 15, corresponding to nice 19.
The maximum task weight is 88761, corresponding to nice -20.
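The timeslice formula can be illustrated with a short Python sketch. The function name timeslice_ns and the 20 ms scheduling period are assumptions chosen for the example, not values from this guide. The weights 15 and 88761 are the ones given above. The weight 1024 is assumed here as the weight of a nice 0 task.

```python
def timeslice_ns(period_ns, task_weight, run_queue_weights):
    """Weighted fair share of the scheduling period for one task.

    `run_queue_weights` lists the weights of all tasks in the run queue,
    including the task whose slice is being computed.
    """
    total_weight = sum(run_queue_weights)
    if total_weight <= 0:
        raise ValueError("run queue must contain at least one positive weight")
    return period_ns * task_weight / total_weight


if __name__ == "__main__":
    period = 20_000_000  # illustrative scheduling period in ns (assumption)

    # Two tasks of equal weight split the period evenly.
    print(timeslice_ns(period, 1024, [1024, 1024]))

    # A nice 19 task (weight 15) competing with a nice -20 task (weight 88761)
    # receives only a tiny fraction of the period.
    weights = [15, 88761]
    for w in weights:
        print(w, timeslice_ns(period, w, weights))
```

The slices of all tasks in the run queue always add up to the scheduling period, which is why adding tasks shrinks each individual slice.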
Timeslices become smaller as the load increases. When the number of runnable
tasks exceeds sched_latency_ns/sched_min_granularity_ns, the scheduling
period becomes number_of_running_tasks * sched_min_granularity_ns.
Prior to that, the scheduling period is equal to sched_latency_ns.
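The load-dependent rule above can be sketched as a small Python function. The function name scheduling_period_ns and the sample values (20 ms latency, 4 ms minimum granularity) are assumptions for illustration only. Check the actual settings on the target system.

```python
def scheduling_period_ns(nr_running, latency_ns, min_granularity_ns):
    """Return the period that is divided among the runnable tasks.

    Up to latency_ns / min_granularity_ns tasks, the period stays at
    latency_ns. Beyond that it grows linearly so that each task still
    gets at least min_granularity_ns.
    """
    # Equivalent to nr_running > latency_ns / min_granularity_ns,
    # written without division to avoid rounding questions.
    if nr_running * min_granularity_ns > latency_ns:
        return nr_running * min_granularity_ns
    return latency_ns


if __name__ == "__main__":
    latency = 20_000_000         # illustrative value in ns (assumption)
    min_granularity = 4_000_000  # illustrative value in ns (assumption)
    for tasks in (1, 5, 6, 8):
        print(tasks, scheduling_period_ns(tasks, latency, min_granularity))
```

With these sample values the period stays at 20 ms for up to five runnable tasks and then grows by 4 ms for each additional task.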