Multiprocessing in Linux
Mrs. Vanita Dnyandev Jadhav, (vdjadhav@coe.sveri.ac.in) Assistant Professor, CSE Dept, SVERI’s COE, Pandharpur
Symmetric Multiprocessing
The preceding discussion of multiprocessor systems has implicitly been in the context of symmetric multiprocessing (SMP) systems. In symmetric multiprocessing systems, all CPUs in the system can be used in the same way and are under the control and management of the same operating system. This is in contrast to asymmetric multiprocessing systems, which will be discussed shortly.
A symmetric multiprocessor system requires a multiprocessor-aware operating system. All mainstream operating systems today support multiprocessing, but this was not always the case.
Conceptually, the extension of a single-CPU operating system to a multiprocessor one is simple: the primary obligations are to schedule processes to run on each CPU and to provide communication and synchronization mechanisms between processes. Figure 1 illustrates the conceptual organization of a multiprocessor-aware operating system.

Fig.1: Organization of a Multiprocessor-Aware Operating System.
To be sure, scheduling processes on more than one CPU increases the complexity of the scheduling task: rather than choosing what to schedule, the scheduler must now also decide where to schedule a given process. Communication and synchronization mechanisms, however, were already largely present in all preemptive multitasking operating systems.
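To make the "where" decision concrete: Linux lets a process's CPU affinity be inspected and restricted. The following is a minimal Python sketch, offered as an illustration rather than as part of the chapter's own examples; os.sched_getaffinity and os.sched_setaffinity are Linux-only calls.

    import os

    # Ask the kernel which CPUs this process is currently allowed to run on.
    allowed = os.sched_getaffinity(0)   # 0 means "the calling process"
    print(f"Process {os.getpid()} may run on CPUs: {sorted(allowed)}")

    # Pin the process to CPU 0; the scheduler's "where" choice is now forced.
    os.sched_setaffinity(0, {0})
    print(f"Now restricted to CPUs: {sorted(os.sched_getaffinity(0))}")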
The greatest challenges in developing an SMP variant of an operating system lie in the details of the system software, with issues ranging from selecting the boot processor to the management of I/O devices. None of these issues are conceptually formidable, but they do involve substantial engineering effort.
Linux SMP Support
Multiprocessing support has been available in the Linux kernel since version 2.0, which was released in the mid-1990s. Over the 15 or so years since, performance and reliability have improved substantially. The main advances in major kernel upgrades have involved replacing coarse-grained kernel locks with finer-grained ones to reduce needless stalling. Today, Linux SMP is extremely robust and is used in personal and enterprise systems around the globe.
In Linux, you can easily determine how many CPU cores are present in the system with a command like this one:

    cat /proc/cpuinfo
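The same information is available programmatically. As a minimal sketch (not from the original text), Python can report the count directly:

    import os

    # Total CPUs the operating system reports for the machine.
    print("CPUs in system:", os.cpu_count())

    # On Linux, the CPUs this particular process is allowed to use
    # (can be fewer, e.g. under an affinity mask or a container limit).
    print("CPUs usable by this process:", len(os.sched_getaffinity(0)))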
Linux supports concurrent processes both within the kernel and among applications. There is an unfortunate mixture of terms in Linux: strictly speaking, Linux deals only with processes. Applications that take advantage of user-level threads, such as those provided by the pthreads POSIX threads package, do so within the context of a single process, and processes are the entities that are scheduled on CPUs. The exception is within the kernel, where kernel threads can be created and run in parallel. Confusing terminology aside, it is important to note that in Linux, user-level threads cannot be used to take advantage of multiple CPU cores in an SMP system; to do that, you must use multiple processes. The Python example presented earlier in this chapter shows how to use multiple processes within an application.
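The Python example from earlier in the chapter is not reproduced here, but a minimal sketch of the multiple-process approach, using the standard multiprocessing module, might look like this:

    import multiprocessing as mp
    import os

    def square(n):
        # Each call runs in a separate worker process, so the kernel is free
        # to schedule the workers on different CPU cores.
        print(f"PID {os.getpid()} computing {n}**2")
        return n * n

    if __name__ == "__main__":
        # Four worker processes; each is a full process the scheduler can
        # place on its own core, unlike user-level threads.
        with mp.Pool(processes=4) as pool:
            results = pool.map(square, range(8))
        print(results)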
Interprocess Communication
In Linux and other operating systems, the interprocess communication (IPC) mechanisms include the following (a short sketch using several of them appears after the list):
• Pipes: Most commonly used at the command line to connect the output of one process to the input of another. The same mechanism is available programmatically.
• FIFOs: Pipes with names; because a FIFO appears in the file system, even unrelated processes can use it.
• Signals: Notifications sent between processes, or from the kernel to a process, to indicate a state change or system condition.
• Message queues: Operate like pipes and FIFOs, except that they can have multiple readers and writers.
• Shared memory: Linux and other operating systems allow processes to share memory regions, either in memory or through the file system.
• Semaphores: A generic mechanism used to keep access to shared resources orderly and consistent.
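As a minimal sketch (an illustration, not part of the original text), Python's multiprocessing module wraps several of these mechanisms; the example below combines a pipe, a shared-memory value, and a semaphore:

    import multiprocessing as mp

    def worker(conn, counter, sem):
        # Pipe: receive a message from the parent process.
        msg = conn.recv()
        # Semaphore-guarded shared memory: update the shared counter safely.
        with sem:
            counter.value += len(msg)
        conn.send(f"processed {msg!r}")
        conn.close()

    if __name__ == "__main__":
        parent_conn, child_conn = mp.Pipe()   # pipe with two endpoints
        counter = mp.Value("i", 0)            # shared-memory integer
        sem = mp.Semaphore(1)                 # binary semaphore

        p = mp.Process(target=worker, args=(child_conn, counter, sem))
        p.start()
        parent_conn.send("hello")
        print(parent_conn.recv())             # -> processed 'hello'
        p.join()
        print("shared counter:", counter.value)  # -> 5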
Multicore Architectures
Conceptually at least, the obvious approach to increasing the amount of work performed per clock cycle is simply to clone a single CPU core multiple times on the chip. In the simplest case, each of these cores executes largely independently, sharing data through the memory system, usually through a cache coherency protocol. This design is a scaled-down version of the traditional multi-socket server symmetric multiprocessing systems that have been used to increase performance for decades, in some cases to extreme degrees.
However, multicore systems come in different guises, and it can be very difficult to define a core. A core in a mainstream high-end CPU generally includes a full complement of functional blocks, making it independent of the other cores on the chip, apart from interfacing logic, memory controllers, and so on that would be unlikely to count as cores in their own right. However, the line can be blurred. For example, AMD's Steamroller (high-power core) design, shown alongside the simpler Puma (low-power core) design in Figure 2, shares functional units between pairs of cores in a replicable unit termed a module. A single thread runs on each core in the traditional fashion while the hardware interleaves floating-point instructions onto the shared floating-point pipelines. The aim of such a design is to raise efficiency by improving the occupancy of the functional units.

Fig.2: The AMD Puma (left) and Steamroller (right) high-level designs (not drawn to the same scale).
Puma is a low-power design that follows a traditional approach to mapping functional units to cores. Steamroller combines two cores within a module, sharing its floating-point (FP) units.