36

I'm studying CPUs and I know how one reads a program from memory and executes its instructions. I also understand that an OS separates programs into processes, and then alternates between them so fast that you think they're running at the same time, when in fact each program runs alone on the CPU. But if the OS is also a bunch of code running on the CPU, how can it manage the processes?

I've been thinking, and the only explanation I could come up with is this: when the OS loads a program from external memory into RAM, it adds its own instructions in the middle of the original program's instructions, so that when the program is executed, it can call the OS and do some things. I believe there's an instruction that the OS adds to the program that allows the CPU to return to the OS code at some point. I also believe that when the OS loads a program, it checks for prohibited instructions (ones that would jump to forbidden addresses in memory) and eliminates them.

Am I thinking about this right? I'm not a CS student but a math student. If possible, I would like a good book about this, because I have not found one that explains how the OS can manage a process if the OS is also a bunch of code running on the CPU and can't run at the same time as the program. The books only say that the OS can manage things, but not how.

Revering Sumoda

7 Answers

37

No. The operating system does not mess around with the program's code injecting new code into it. That would have a number of disadvantages.

  1. It would be time-consuming, as the OS would have to scan through the entire executable making its changes. Normally, parts of the executable are only loaded as needed. Also, inserting is expensive, as you have to move a load of stuff out of the way.

  2. Because of the undecidability of the halting problem, it's impossible to know where to insert your "Jump back to the OS" instructions. For example, if the code includes something like while (true) {i++;}, you definitely need to insert a hook inside that loop but the condition on the loop (true, here) could be arbitrarily complicated so you can't decide how long it loops for. On the other hand, it would be very inefficient to insert hooks into every loop: for example, jumping back out to the OS during for (i=0; i<3; i++) {j=j+i;} would slow down the process a lot. And, for the same reason, you can't detect short loops to leave them alone.

  3. Because of the undecidability of the halting problem, it's impossible to know if the code injections changed the meaning of the program. For example, suppose you use function pointers in your C program. Injecting new code would move the locations of the functions so, when you called one through the pointer, you'd jump to the wrong place. If the programmer was sick enough to use computed jumps, those would fail, too.

  4. It would play merry hell with any anti-virus system, since it would change virus code, too, and muck up all your checksums.

You could get around the halting-problem problem by simulating the code and inserting hooks in any loop that executes more than a certain fixed number of times. However, that would require extremely expensive simulation of the whole program before it was allowed to execute.

Actually, if you wanted to inject code, the compiler would be the natural place to do it. That way, you'd only have to do it once but it still wouldn't work for the second and third reasons given above. (And somebody could write a compiler that didn't play along.)

There are three main ways that the OS regains control from processes.

  1. In co-operative (or non-preemptive) systems, there's a yield function that a process can call to give control back to the OS. Of course, if that's your only mechanism, you're reliant on the processes behaving nicely and a process that doesn't yield will hog the CPU until it terminates.

  2. To avoid that problem, a timer interrupt is used. CPUs allow the OS to register callbacks for all the different types of interrupts that the CPU implements. The OS uses this mechanism to register a callback for a timer interrupt that is fired periodically, which allows it to execute its own code.

  3. Every time a process tries to read from a file or interact with the hardware in any other way, it's asking the OS to do work for it. When the OS is asked to do something by a process, it can decide to put that process on hold and start running a different one. This might sound a bit Machiavellian but it's the right thing to do: disk I/O is slow so you may as well let process B run while process A is waiting for the spinning lumps of metal to move to the right place. Network I/O is even slower. Keyboard I/O is glacial because people are not gigahertz beings.
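Mechanisms 1 and 3 above can be sketched with a toy scheduler in Python. This is only an illustration under made-up assumptions, not how a real kernel is written: the "processes" are generators, and the request names `"yield"` and `"read"`, the function `run_cooperative` and the tick-based I/O delay are all hypothetical. A task that asks for slow I/O is parked, and the other task gets the CPU in the meantime.

```python
from collections import deque

def run_cooperative(tasks):
    """Toy scheduler: tasks are generators that hand control back by yielding.

    A task yields ("yield",) to give up the CPU voluntarily, or
    ("read", delay) to request I/O that completes `delay` ticks later.
    """
    ready = deque(tasks.items())   # (name, generator) pairs that can run now
    blocked = []                   # (wake_tick, name, generator) waiting for I/O
    trace = []                     # order in which tasks were given the CPU
    tick = 0
    while ready or blocked:
        # Wake every task whose simulated I/O has completed.
        still_blocked = []
        for wake, name, gen in blocked:
            if wake <= tick:
                ready.append((name, gen))
            else:
                still_blocked.append((wake, name, gen))
        blocked = still_blocked
        tick += 1
        if not ready:
            continue               # the CPU idles until some I/O finishes
        name, gen = ready.popleft()
        trace.append(name)
        try:
            request = next(gen)    # run the task until it calls "the OS"
        except StopIteration:
            continue               # the task has finished
        if request[0] == "read":
            blocked.append((tick + request[1], name, gen))
        else:
            ready.append((name, gen))
    return trace

def reader():
    yield ("read", 3)              # ask the OS for slow disk data
    yield ("yield",)

def worker():
    for _ in range(3):
        yield ("yield",)

trace = run_cooperative({"A": reader(), "B": worker()})
print(trace)
```

Note what this sketch cannot do: a task that never yields would hog `run_cooperative` forever, which is exactly why mechanism 2, the timer interrupt, is needed.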

David Richerby
13

While David Richerby's answer is a good one, it does sort of gloss over how modern operating systems halt existing programs. My answer should be accurate for the x86 or x86_64 architecture, which is the most common one for desktops and laptops. Other architectures should have similar methods of achieving this.

When the operating system is starting up, it sets up an interrupt table. Each entry of the table points to a bit of code inside the operating system. When an interrupt occurs, the CPU itself looks up the matching entry in this table and jumps to that code. There are various interrupts, such as division by zero, invalid opcode, and some that the operating system defines.

The OS-defined ones are how a user process talks to the kernel, for example when it wants to read from or write to the disk, or do something else that the kernel controls. An operating system will also set up a timer that raises an interrupt when it expires, so the running code is forcibly changed from the user program to the operating system kernel, and the kernel can do other things such as queue up other programs to run.

As far as I remember, when this happens the kernel has to save the state of the interrupted program (where it was executing and what was in its registers), and when the kernel has finished doing what it needs to do, it restores that state. Thus the program doesn't even know that it was interrupted.

The process can't change the interrupt table for two reasons. The first is that it runs in a protected environment, so if it tries to execute certain privileged instructions, the CPU will trigger another interrupt. The second is virtual memory. In real mode the interrupt table sits at physical addresses 0x0 to 0x3FF; in protected mode it sits wherever the kernel placed it, and only a privileged instruction can change that location. Either way, for user processes that memory is usually not mapped, and trying to access unmapped memory will trigger another interrupt. So without the privileged instructions and without the ability to write to physical RAM, the user process can't change the table.
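To make the table-plus-privilege idea concrete, here is a minimal Python model. It is a sketch under made-up assumptions, not real x86 behaviour: the class `ToyCPU`, the vector numbers and the `kernel_mode` flag are all hypothetical stand-ins for the hardware mechanisms described above. The "OS" fills in the table at boot, and a later attempt to overwrite an entry from user mode ends up in the OS's own fault handler instead.

```python
class ToyCPU:
    """Toy model of an interrupt table guarded by a privilege flag."""

    DIVIDE_ERROR = 0        # vector numbers are made up for this sketch
    PROTECTION_FAULT = 1

    def __init__(self):
        self.kernel_mode = True
        self.table = {}     # vector number -> handler installed by the "OS"
        self.log = []

    def set_handler(self, vector, handler):
        # Privileged operation: in user mode it faults instead of succeeding.
        if not self.kernel_mode:
            self.interrupt(self.PROTECTION_FAULT)
            return
        self.table[vector] = handler

    def interrupt(self, vector):
        # Enter kernel mode, run the handler, then return to the previous mode.
        previous_mode = self.kernel_mode
        self.kernel_mode = True
        self.table[vector](self)
        self.kernel_mode = previous_mode

    def divide(self, a, b):
        if b == 0:
            self.interrupt(self.DIVIDE_ERROR)
            return None
        return a // b

cpu = ToyCPU()
# "Boot": the OS fills in the table while the CPU is still in kernel mode.
cpu.set_handler(ToyCPU.DIVIDE_ERROR, lambda c: c.log.append("divide error"))
cpu.set_handler(ToyCPU.PROTECTION_FAULT, lambda c: c.log.append("protection fault"))

cpu.kernel_mode = False     # drop to user mode and run "user code"
cpu.divide(1, 0)            # ends up in the OS's divide handler
cpu.set_handler(ToyCPU.DIVIDE_ERROR, lambda c: c.log.append("hijacked"))
cpu.divide(1, 0)            # still the OS's handler, the overwrite was refused
print(cpu.log)
```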

Programmdude
5

The OS kernel gets control back from the running process through the handler for the timer (clock) interrupt, not by injecting code into the process.

You should read about interrupts to get more clarification about how they work and how OS kernels handle them and implement different features.

Ankur
3

There is a method similar to what you describe: co-operative multitasking. The OS does not insert instructions, but each program must be written to call OS functions which may choose to run another of the cooperative processes. This has an obvious disadvantage: one program that crashes or never calls the OS takes out the whole system. Windows 3.x worked like this for Windows applications; Windows NT, and 32-bit applications on Windows 95 and later, are scheduled pre-emptively.

Pre-emptive multitasking (the normal kind these days) relies on an external source of interrupts. Interrupts override the normal flow of control and usually save the registers out somewhere, so the CPU can do something else and then transparently resume the program. Of course, the operating system can change the "when you leave interrupts resume here" register, so it resumes inside a different process.

(Some systems do rewrite instructions in a limited way on program load, called "thunking", and the Transmeta processors dynamically recompiled code to their own instruction set.)

pjc50
3

Multi-tasking does not require anything like code injection. In an operating system like Windows, there is a component of operating system code called the scheduler which relies on a hardware interrupt triggered by a hardware timer. This is used by the operating system to switch between different programs and itself, making it all seem to our human perception to happen concurrently.

Basically, the operating system programs the hardware timer to go off every so often... perhaps 100 times a second. When the timer goes off, it generates a hardware interrupt - a signal which tells the CPU to stop what it's doing, save its state on the stack, change its mode to something more privileged, and execute the code it will find in a specially designated place in memory. That code happens to be part of the scheduler, which decides what should be done next. It might be to resume some other process, in which case it will have to perform what is known as a "context switch" - replacing the entirety of its current state (including virtual memory tables) with that of the other process. In returning to a process, it has to restore all of the context of that process, and then return from the interrupt - restoring its privilege mode and execution state from the stack, and continuing that process.

The "specially designated" place in memory does not have to be known by anything but the operating system. Implementations vary, but the gist of it is that the CPU will respond to various interrupts by performing a table lookup; the table's location is at a specific place in memory (determined by the hardware design of the CPU), the contents of the table are set by the operating system (generally at boot time), and the "type" of interrupt will determine which entry in the table is to be used as the "interrupt service routine".

None of this involves "code injection"... it is based on code contained in the operating system in co-operation with hardware features of the CPU and its supporting circuitry.
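The timer-driven switching described in this answer can be imitated with a toy machine in Python. Everything here is an assumption made for the sketch: the three-instruction "instruction set" (`inc`, `jmp`, `halt`), the function `run_machine`, and a timer that counts instructions instead of time. The point it shows is that a program equivalent to while (true) {i++;} gets pre-empted without any change to its code, and that its saved context lets it resume exactly where it stopped.

```python
def run_machine(programs, quantum, max_steps):
    """Round-robin toy machine; a timer "interrupt" fires every `quantum` instructions."""
    # Saved context per process: program counter, one register, finished flag.
    contexts = {name: {"pc": 0, "acc": 0, "done": False} for name in programs}
    order = list(programs)
    current = 0
    steps = 0
    while steps < max_steps and not all(c["done"] for c in contexts.values()):
        name = order[current]
        ctx = contexts[name]
        pc, acc = ctx["pc"], ctx["acc"]    # restore the context into the "CPU"
        timer = quantum                    # the OS programs the timer
        while timer > 0 and steps < max_steps and not ctx["done"]:
            op, arg = programs[name][pc]
            if op == "inc":
                acc += 1
                pc += 1
            elif op == "jmp":
                pc = arg
            elif op == "halt":
                ctx["done"] = True
            timer -= 1
            steps += 1
        ctx["pc"], ctx["acc"] = pc, acc    # "interrupt": save the context
        # Scheduler: pick the next process that has not finished.
        for _ in order:
            current = (current + 1) % len(order)
            if not contexts[order[current]]["done"]:
                break
    return contexts

programs = {
    # Equivalent of: while (true) { i++; }  It never yields and never halts.
    "spinner": [("inc", None), ("jmp", 0)],
    # Increments three times, then halts.
    "counter": [("inc", None), ("inc", None), ("inc", None), ("halt", None)],
}
result = run_machine(programs, quantum=2, max_steps=20)
print(result)
```

The `max_steps` limit only exists so the demonstration terminates; the spinner itself would run forever, yet the counter still finishes.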

Zenilogix
2

I think the closest real-world example to what you describe is one of the techniques used by VMware: full virtualization using binary translation.

VMware acts as a layer underneath one or more simultaneously executing operating systems on the same hardware.

Most of the instructions being executed (e.g. in ordinary applications) can be virtualized using the hardware, but an OS kernel itself makes use of instructions that cannot be virtualized, because if the machine code of the guest OS were executed unmodified it would "break out" of the control of the VMware host. For example, a guest OS would need to run in the most privileged protection ring, and set up the interrupt table. If it were allowed to do that, VMware would have lost control of the hardware.

VMware rewrites those instructions in the OS code before executing it, replacing them with jumps into VMware code that simulates the desired effect.

So this technique is somewhat analogous to what you describe.
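As a rough illustration of the rewriting step, here is a toy version in Python. It is only a sketch and says nothing about how VMware actually implements it: the instruction names `add`, `load_idt` and `call_hypervisor`, the functions `translate` and `execute`, and the `state` dictionary are all invented for the example. The privileged instruction is swapped for a call into the host's code, which applies the effect to a virtual copy of the hardware state.

```python
def translate(guest_code, privileged):
    """Replace each privileged instruction with a call into the toy hypervisor."""
    return [("call_hypervisor", instr) if instr[0] in privileged else instr
            for instr in guest_code]

def execute(code, state):
    """Run a list of toy instructions against `state`."""
    for instr in code:
        op = instr[0]
        if op == "add":
            state["acc"] += instr[1]
        elif op == "call_hypervisor":
            # The host emulates the effect on the guest's *virtual* hardware.
            original = instr[1]
            state["virtual_idt"] = original[1]
            state["emulated"].append(original[0])
        elif op == "load_idt":
            # Only reached if the guest code ran without translation.
            state["real_idt"] = instr[1]
    return state

guest = [("add", 2), ("load_idt", "guest_table"), ("add", 3)]
state = {"acc": 0, "real_idt": "host_table", "virtual_idt": None, "emulated": []}

translated = translate(guest, privileged={"load_idt"})
execute(translated, state)
print(translated)
print(state)
```

The ordinary `add` instructions pass through untouched, which mirrors the point above that only the instructions that cannot be virtualized need rewriting.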

2

There are a variety of cases in which an operating system might "inject code" into a program. The 68000-based versions of the Apple Macintosh system build a table of all segment entry points (located immediately preceding the static global variables, IIRC). When a program starts, each entry in the table consists of a trap instruction followed by the segment number and offset into the segment. If the trap is executed, the system will look at the words after the trap instruction to see what segment and offset is required, load the segment (if it isn't already), add the start address of the segment to the offset, and then replace the trap with a jump to that newly-computed address.

On older PC software, although this wasn't technically done by the "OS", it was common for code to be built with trap instructions instead of coprocessor math instructions. If no math coprocessor was installed, the trap handler would emulate it. If a coprocessor was installed, the first time a trap was taken the handler would replace the trap instruction with a coprocessor instruction; future executions of the same code would then use the coprocessor instruction directly.
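The patch-on-first-trap trick can be mimicked in a few lines of Python. This is a toy under stated assumptions, not real 8087 emulation: the opcode names `trap_fmul` and `fmul`, the function `run_with_traps` and the tuple encoding of instructions are all hypothetical. With a "coprocessor" present the trap is taken once and the instruction is rewritten in place; without one, every execution goes through the trap handler.

```python
def run_with_traps(code, has_coprocessor, passes):
    """Execute `code` `passes` times; traps patch themselves when possible."""
    traps_taken = 0
    result = None
    for _ in range(passes):
        for i, (op, a, b) in enumerate(code):
            if op == "trap_fmul":
                traps_taken += 1
                if has_coprocessor:
                    # Patch the instruction in place for future executions.
                    code[i] = ("fmul", a, b)
                # This time the handler produces the result itself.
                result = a * b
            elif op == "fmul":
                result = a * b     # direct "coprocessor" instruction
    return traps_taken, result

patched = [("trap_fmul", 1.5, 2.0)]
with_fpu = run_with_traps(patched, has_coprocessor=True, passes=3)

emulated = [("trap_fmul", 1.5, 2.0)]
without_fpu = run_with_traps(emulated, has_coprocessor=False, passes=3)

print(with_fpu, patched)
print(without_fpu, emulated)
```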

supercat