Functions, calls, returning values.
In part 2, we completed the analysis of the first line. We saw that a file descriptor was created in the filesystem and that its number was returned to us. Practically every function that operates on an already open file uses this number. Let me remind you that we’re reviewing the following piece of code:
int fd = open("/home/me/file1", O_RDONLY);
read(fd, buf, 10);
A careful observer will have noticed that the first line is, in fact, not finished yet. So far, only the open() function call itself has been covered. We still need to understand what happens during a function call and how the value the function produces gets back to the caller. Functions, or rather procedures, are concepts supported by the central processor itself. The assembly languages of practically all contemporary processors have commands for calling procedures and returning from them. For example, on the Intel x86 architecture there is the call command, and assemblers such as MASM provide the proc directive to mark where a procedure begins. It’s also worth noting the ret command, which has to be written at the end of a procedure.
The Von Neumann cycle suggests that the processor goes through the commands one by one and executes them. At the same time, there are commands that make the processor jump to another command and so break the sequential order. These are call, jmp, jne, jnz, and a number of others.
The call command deserves a close look, because once the procedure is over the processor has to come back to the place of the call and execute the command that follows it. The other commands in the example above pose no such problem. As we know, on the Intel x86 processor the address of the command to be executed next is kept in the IP register. So we’re faced with a problem: while the procedure executes, the value of the IP register has to be remembered so that the processor can later return to the command following call. Where can it be stored? The function/procedure call mechanism is the combination of resources that solves this problem and keeps the system working without interruption.
Many of you have probably heard of the call stack. The mechanism we’re reviewing is implemented through it. To understand why, we need to look at the essence of function calls. If the f() function calls the g() function, then f() can’t complete until g() is over. So the function call mechanism is, in a way, a Last-In-First-Out structure. The stack is a data structure with exactly this property, and this is why the function call mechanism is implemented through a stack and not a queue or a binary tree.
But what exactly is stored in that stack? From the reasoning above we can easily deduce that the return address, that is, the saved value of the IP register, has to be stored there. But the story doesn’t end here. The compiler and the operating system use the stack for a few other purposes as well. In particular, depending on the calling convention, the compiler uses the stack to pass function arguments and the returned value.
Besides those, the stack stores the local variables of functions. And this is not all. Processors usually provide push and pop commands for working with the stack; these are available to programmers and can be used for any number of reasons.
To keep all of this under control, the stack is divided into stack frames, each of which belongs to one called function. This means that when a function is called, the active section of the stack is temporarily “closed” and a new section is set up for the called function. When the called function completes, the caller’s section has to be restored.
The processor works with the stack through two registers: the base pointer (bp) and the stack pointer (sp). The first stores the address of the bottom of the active part of the stack, and the second stores the address of its top. So, to preserve and later recover the active part of the stack, it’s enough to save the value of the bp register. And the same stack is used to save it.
No matter how tangled the story seems, it has a distinct logic: there is a problem with functions and the order of their calls, and the simplest and most accessible solution is needed. Tangled as it may seem, this is the most convenient solution humanity has come up with. The reader is free to develop and present their own.
Finally, when the function is complete, its part of the stack is discarded, and the caller’s part is restored, we need to make the value returned by the called function accessible to the caller. Once again, the stack is a convenient place for it (although in practice, on x86, small values such as an int are usually handed back in a register such as eax).
Coming back to the code under review, it’s worth noting that a new variable, fd, is declared and assigned the descriptor number returned by open(), which has already been placed where the caller can reach it.
There are only a few minutes left until the end of the story. Events are speeding up, and the job of the story’s main character is becoming more and more complicated. But the resolution will inevitably come when the main character finally reads the first ten bytes of the long-cherished file.
To be continued…