CPU Scheduling and I/O Scheduling

All this talk of resource management could almost make us lose sight of the bigger picture, so let's quickly step back. It should be obvious at this point that container-based OS virtualization is not achieved by a single, easily implementable solution. Today's operating systems consist of a large number of subsystems, and several different methods are combined to make containers a reality. The introduction to beancounters explained how resource management is handled for each container. Memory is not the only thing a user needs on a system, however. Their processes require time on the CPU to run, and they perform I/O, using their allotted disk space and interacting with the rest of the system.

Multi-core CPUs have allowed us to drastically increase the number of tasks a CPU can handle, but even as we increase the number of cores the basic problem remains the same. A core can only take on one job at a time, so it must spend time switching between different tasks. Scheduling is therefore required to achieve a "fair" result for each of the containers. To this end, OpenVZ makes use of two scheduling systems: Fair-Share CPU Scheduling for the CPU and Completely Fair Queuing for I/O.

Fair-share CPU scheduling

This is a system for scheduling time slices on the CPU on a per-user or per-group basis, rather than the usual per-process approach. In OpenVZ, it is used with a fixed weight assigned to every container, giving the administrator full control over how much CPU time each one receives relative to the others.

CPU time is divided among containers as follows. In a situation where there are four containers and each one is given a weight of, say, 100, they all receive the same amount of CPU time. When one container is given a weight of 200, two others remain at 100, and one is scaled back to 50, the total weight is 450 and CPU time is handed out in proportion to each container's weight: roughly 44%/22%/22%/11%.
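The arithmetic behind that example can be written out as a short Python sketch. It only illustrates the proportional split and is not OpenVZ's actual scheduler code; the function name and the container IDs are made up for the example.

```python
def cpu_shares(weights):
    """Return each container's fraction of CPU time, proportional to its weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical container IDs mapped to the weights from the example above.
weights = {"ct101": 200, "ct102": 100, "ct103": 100, "ct104": 50}

for name, share in cpu_shares(weights).items():
    print(f"{name}: {share:.1%}")
```

With a total weight of 450, the container weighted 200 ends up with 200/450 of the CPU time, or about 44.4%, and the container weighted 50 with about 11.1%. These shares only matter under contention; the sketch says nothing about what happens to CPU time a container leaves unused.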

Once CPU time has been fairly distributed over the different containers, standard Linux process priorities apply when selecting which process within a container gets which time slice.

I/O: Completely Fair Queuing

Just like the CPU, I/O devices like the hard drive are forced to take one job at a time. The list of pending jobs can pile up over time, as the CPU does not need to wait for the I/O device to finish an existing job before sending it a new one. If all jobs were simply carried out in the order in which the software requested them, there would be no way to control the amount of I/O time allotted to each process. For this reason, another scheduler is introduced, one that is not only used by OpenVZ but is integrated into most modern-day Linux distributions.

In CFQ, each process is given an I/O priority and its own queue for synchronous requests. Asynchronous requests from all processes are instead pooled into a small number of shared queues, one per priority level. As the queues fill up with requests to perform I/O, the amount of time available to each queue is proportional to its priority. When a queue's turn comes up, the I/O device works through it as quickly as possible until the time slice has passed, after which the queue has to wait for its turn again.
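The turn-taking described above can be illustrated with a toy Python model. This is a rough sketch and not the kernel's CFQ implementation: it assumes every request takes the same fixed time, uses a hypothetical integer weight as a stand-in for the I/O priority (a higher weight means a longer slice), and ignores the shared asynchronous queues.

```python
from collections import deque


def serve_round_robin(queues, weights, base_slice_ms, cost_ms):
    """Toy model: visit the queues in turn, giving each a slice scaled by its weight.

    queues        -- dict mapping a queue name to a deque of request labels
    weights       -- dict mapping a queue name to a positive integer weight
    base_slice_ms -- slice length for weight 1 (hypothetical tuning value)
    cost_ms       -- time one request occupies the device (fixed, a simplification)
    Returns the order in which requests were served. The deques are emptied.
    """
    served = []
    while any(queues.values()):
        for name, queue in queues.items():
            budget = base_slice_ms * weights[name]
            # Work through this queue until its slice is used up or it runs dry.
            # A request that starts inside the slice is allowed to finish.
            while queue and budget > 0:
                served.append(queue.popleft())
                budget -= cost_ms
    return served


queues = {
    "db": deque(["db1", "db2", "db3", "db4"]),
    "backup": deque(["bk1", "bk2", "bk3", "bk4"]),
}
order = serve_round_robin(queues, {"db": 2, "backup": 1}, base_slice_ms=10, cost_ms=10)
print(order)
```

In this run the "db" queue gets two requests through per turn while "backup" gets one, so the higher-weighted queue drains first even though both started with the same number of requests.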

File System

To keep the disk usage of each container in check, disk quotas are available at both the container level and the user and group levels. This gives the global administrator full control over the disk usage of each separate container, while each container's root user is still able to distribute quotas inside the container as they see fit.
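The effect of stacking the two quota levels can be sketched in a few lines of Python. This is a simplified illustration and not how the quota code is actually implemented: the function and parameter names are hypothetical, and real quotas also involve details such as soft limits and grace periods that are left out here.

```python
def can_write(nbytes, container_used, container_limit, user_used, user_limit):
    """Allow a write only if it fits both the container quota and the user quota."""
    fits_container = container_used + nbytes <= container_limit
    fits_user = user_used + nbytes <= user_limit
    return fits_container and fits_user


GIB = 1024 ** 3

# A user well under their own quota is still refused once the container is full.
print(can_write(2 * GIB, container_used=9 * GIB, container_limit=10 * GIB,
                user_used=1 * GIB, user_limit=5 * GIB))
```

The container-level limit set by the global administrator always acts as the outer bound, whatever limits the container's root user hands out inside it.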
