In this tutorial, there will be several tasks running in parallel in Skelix, and each task use its own LDT, we are going to implement preemptively task switching.
At first let's clarify one opinion, running multiple tasks concurrently on one single processor(one core) is not going to happen, in reality, the processor switches tasks very fast, say 100 times per second, so it looks like several processes are running at the same time.
The 386 processor has hardware support for multitasking, but certainly it also can be accomplished manually, here is a program I wrote before, it switches the execution of several code snippets. We all know there is only one set of registers in every single CPU, and they are used by all processes, so when a running processes is stopped and before it is switched to another process, all the informations about this process which are registers and some other information relate to them should be stored, because another process is going to use them. This information called "context" in general, and the 386 processor use the task state segments (TSS) to store this information, which is at least 104 bytes long, and a TSS descriptor refers to it. TSS descriptors only appear in the GDT, that means a user task can not switch to any specific process by itself via LDT. When we switch processes in the hardware way, the processor stores all informations in the TSS automatically.
The TSS defines the state of the execution environment for a task. Now let's take a look at the TSS structure,
It is a big table, but it is not that scary if you look into it. For those "not used" fields, they are reserved by Intel. The 386 processor supports
nested task, that means when task 1 switch to task 2, then the "Back link" field of task 2 is set to the selector of task 1 and the NT(Nested Task) bit in its
EFLAGS
is set as well, so when task 2 returns, the processor knows which task is going to take control. We know that the 386
processor supports 4 different privileges, and TSS allows each privilege uses different stacks when task switching, so there are 4 different stacks
for one process can be used. We are going to talk about CR3
register in following tutorials. We are going to talk about
LDT in a short while. About I/O bitmap an T bit, we don't care about them at this moment.
Like other descriptors in GDT, the TSS also need a descriptor in GDT refer to it, here is its format,
General GDT descriptor,
We can compare it to a general segment descriptor, we can see the D(data or code segment) and X(not used) bits are set to zero, the AVL
bit is available to users. The type is 010B, the B(bit 41) is the busy bit of this TSS descriptor, because one process can only own one TSS,
so once it is executing, the B bit of its descriptor is set to indicate its TSS can not be used any more just in case of the
reentrant. A(access) is set to 1, but the TSS can neither be read nor written by processes even A bit is set to writable. Other field of the
descriptor have the same meaning compares to normal data and code segments, just one thing we have to know, the limit field of the descriptor must can
present a memory space that at least 104 bytes long which is the minimum length of TSS. There is another thing I have to mention, in
reality most OS use software context switching because hardware context switching just save the registers we mentioned above, but not include
FPU
, MMX
, SSE
etc. Since we are not going to use them in Skelix,
so it is safe to use TSS to do environment saving.
In this tutorial, DPL fields of TSS descriptors are going to be set to zero, so that only the kernel has the right to perform task switching.
Just like GDTR
and IDTR
, TSS descriptors also has its own specific register
for task switching, called TR
, it can be loaded to this register by instruction LTR
. And the selector
refers to TSS descriptor in GDT can not be load into any other segment registers, or an exception will occur.
There are several ways to perform task switching, and we are going to use a far jump to a TSS descriptor in GDT to achieve it. All those boring stuffs I mentioned above serve for this purpose. If you want to find out alternate ways to do it, you'd better read some code from wherever possible, because there are really only few books on this topic.
During task switching, the processor first stores the context of current task in the current TSS, then loads the task register of the new task, then loads the context of the new task from the new TSS with the new task's descriptor in GDT, finally executes the new task.
Let's take a look at the code, this is the TSS structure we are going to use,
06/include/task.h
struct
TSS_STRUCT {
unsigned
int
back_link;
unsigned
int
esp0, ss0;
unsigned
int
esp1, ss1;
unsigned
int
esp2, ss2;
unsigned
int
cr3;
unsigned
int
eip;
unsigned
int
eflags;
unsigned
int
eax,ecx,edx,ebx;
unsigned
int
esp, ebp;
unsigned
int
esi, edi;
unsigned
int
es, cs, ss, ds, fs, gs;
unsigned
int
ldt;
unsigned
int
trace_bitmap;
};
It is exactly the same as the TSS structure I mentioned above, but because this structure is going to be used by the processor, so it has to be 104 bytes long and no paddings in it, if you use IA-64 or other arches or other compilers, check documents youself.
For a working OS, those information are not enough, so anothing structure TASK_STRUCT
is used to wrap TSS structure.
#define TS_RUNNING 0
#define TS_RUNABLE 1
#define TS_STOPPED 2
struct
TASK_STRUCT {
struct
TSS_STRUCT tss;
unsigned
long
tss_entry;long
unsigned
long
ldt[2];long
unsigned
long
ldt_entry;long
int
state;
int
priority;
struct
TASK_STRUCT *next;
};
#define DEFAULT_LDT_CODE 0x00cffa000000ffffULL
#define DEFAULT_LDT_DATA 0x00cff2000000ffffULL
#define INITIAL_PRIO 200
This is all we need at this moment, those TS_*
stuff defines all states that a process can be at.
The tss_entry
in TASK_STRUCT
defines the descriptor of the tss field in this structure, I am going to explain it in a short while.
Two *ldt*
fields will be explained later on. state
field store the current state of this process,
which is one of those TS_*
. priority
indicates the sequence of the execution of processes in system,
and the new task will be given a initial priority INITIAL_PRIO
. All tasks in Skelix are managed as a link, the next
field defines the next task in the link.
Now let's look at an example, this is the TASK_STRUCT
structure for task 0, that is the first task in system, when kernel finishes all initialization, it becomes task 0.
static
unsigned
long
TASK0_STACK[256] = {0xf};
This is the stack used as the CPL0 stack for task 0. Whitout that 0xF, I encountered a problem during my compiling, if it is just initialized like static unsigned long TASK0_STACK[256];
then this memory area just vanishes (Shen Feng explains, without 0xF, TASK0_STACK[256]
would be an zero-initialized static data, and it would be in
.bss
section, and the object file only record its length and start address etc.. That's why a nonzero value has to be used to keep it in .data
section.)
struct
TASK_STRUCT TASK0 = {
/* tss */
{ /* back_link */
0,
/* esp0 ss0 */
(unsigned
)&TASK0_STACK+sizeof
TASK0_STACK, DATA_SEL,
Make esp0
points to the "bottom" of the stack. DATA_SEL
and CODE_SEL
appears in next few lines are defined at 06/include/kernel.h
, they are selectors of data and code segments in GDT.
/* esp1 ss1 esp2 ss2 */
0, 0, 0, 0,
/* cr3 */
0,
/* eip eflags */
0, 0,
/* eax ecx edx ebx */
0, 0, 0, 0,
/* esp ebp */
0, 0,
/* esi edi */
0, 0,
/* es cs ds */
USER_DATA_SEL, USER_CODE_SEL, USER_DATA_SEL,
/* ss fs gs */
USER_DATA_SEL, USER_DATA_SEL, USER_DATA_SEL,
/* ldt */
0x20,
/* trace_bitmap */
0x00000000},
/* tss_entry */
0,
/* idt[2] */
{DEFAULT_LDT_CODE, DEFAULT_LDT_DATA},
/* idt_entry */
0,
/* state */
TS_RUNNING,
/* priority */
INITIAL_PRIO,
/* next */
0,
};
Now we have the TSS, we are going to create a TSS descriptor, remember there are two reserved place in GDT.
06/bootsect.s
gdt:
.quad
0x0000000000000000 # null descriptor
.quad
0x00cf9a000000ffff # cs
.quad
0x00cf92000000ffff # ds
.quad
0x0000000000000000 # reserved for tss
.quad
0x0000000000000000 # reserved for ldt
The forth entry(0x3) are reserved for TSS of current running task, so a macro CURR_TASK_TSS = 3
has been defined as an index refers to this position in GDT.
We are going to let the current task uses this place, once it gives up the control, it saves its descriptor in its tss_entry
of TASK_STRUCT
.
When a new task takes over the control, it loads its own TSS descriptor from its TASK_STRUCT
to this place. In this way, we can allow unlimited tasks works in system.
Because there is a length limit about GDT, it can only have 8096 descriptors, actually, Linux has the limit of tasks for quite long time, I don't know why it has this limit,
because eliminating this limit seems to be quite simple.
unsigned
long
long
set_tss(unsigned
long
tss) {long
unsigned
long
tss_entry = 0x0080890000000067ULL;long
tss_entry |= ((tss)<<16) & 0xffffff0000ULL;
tss_entry |= ((tss)<<32) & 0xff00000000000000ULL;
return
gdt[CURR_TASK_TSS] = tss_entry;
}
set_tss
generates the TSS descriptor and put it into GDT, we can see it set the DPL of descriptor to 0, so only kernel can use this descriptor.
unsigned
long
long
get_tss(void
) {
return
gdt[CURR_TASK_TSS];
}
It has been a while since we use GDT, LDT should be fairly easy at this point. LDT is just like GDT, but it is local, every task can have its own LDT, we are going to let every task in Skelix has two descriptors in LDT, the first one is for the code segment, and the second one is for data and stack segments. Now let's go over the selector format,
we are going to let all tasks work at privilege level 3, that means RPL=11b, and TI=1 indicates this selector refers to a LDT. Unlike GDT, the first descriptor of LDT can be used by users, so the selector refers to code segment will be 0x7 , and the selector refers to data and stack segments will be 0xF.
The LDT field in TSS is the selector which refers to an descriptor in GDT, this descriptor describes the LDT. I know it sounds confusing, so let's make it clear, first we let every task owns a LDT, and this LDT resides in somewhere in memory(we do not concern about virtual memory at this stage), and we need a descriptor to describe this memory area which contains this LDT, and this descriptor will be put in GDT, and the selector refers to this descriptor will be put into the LDT field in TSS of this task.
The third descriptor of GDT is for current running task's TSS, and the forth descriptor is quite clear, it is for current task's LDT. So theirs selector will be 0x18 and 0x20 respectively. Like functions relate to TSS, we also have,
06/task.c
unsigned
long
long
set_ldt(unsigned
long
ldt) {long
unsigned
long
ldt_entry = 0x008082000000000fULL;long
ldt_entry |= ((ldt)<<16) & 0xffffff0000ULL;
ldt_entry |= ((ldt)<<32) & 0xff00000000000000ULL;
return
gdt[CURR_TASK_LDT] = ldt_entry;
}
It is similar to GDT entries except for the DPL is 3 instead 0.
unsigned
long
long
get_tss(void
) {
return
gdt[CURR_TASK_TSS];
}
At this moment we will let all tasks have the same LDT content, it means they share the same memory space, that is not a good idea, but I will give them separate memory space by using virtual memory, it will be introduced in later tutorial.
06/include/task.h
#define DEFAULT_LDT_CODE 0x00cffa000000ffffULL
#define DEFAULT_LDT_DATA 0x00cff2000000ffffULL
These are the LDT descriptors in tasks' LDT, they are almost the same as in GDT, except for the DPL field is set to 3.
I mentioned above, all tasks are linked as a single direction link structure, there are two important pointers, one is the next
field in TASK0
,
it is the head of the whole link, and a current
pointer is used for pointing to the current running task.
Before any tasks start, we define,
06/task.c
struct
TASK_STRUCT *current = &TASK0;
Then we take a look at how a new task is generated,
06/init.c
static
void
new_task(struct
TASK_STRUCT *task, unsigned
int
eip
,
unsigned
stack0, int
unsigned
stack3) {int
new_task
takes four arguments, the first one is the TASK_STRUCT
of new task, the second one is the start entry of the program,
the eip
field in TSS will be assigned to this value, so the processor can know where to start this program, the esp0
and esp3
arguments are for the ESP
pointer in privilege level 0 and 3, because their selectors have fixed value, 0x10 an 0xf, ss0 and ss are not needed.
memcpy(task, &TASK0, sizeof
(struct
TASK_STRUCT));
task->tss.esp0 = stack0;
task->tss.
= eip
;eip
task->tss.eflags = 0x3202;
task->tss.esp
= stack3;
task->priority = INITIAL_PRIO;
task->state = TS_STOPPED;
TASK0
are used as a template, we just modify some fields to suits our needs. We change new task state to TS_STOPPED
before the
task link is modified correctly. Otherwise, the link can be broken if another new_task
is invoked at the same time.
task->next = current->next;
current->next = task;
task->state = TS_RUNABLE;
}
Puts new task in task link.
extern
task1_run(void
);void
extern
task2_run(void
);void
They are the entries of new tasks, will be introduced in a short while.
static
long
task1_stack0[1024] = {0xf, };
static
long
task1_stack3[1024] = {0xf, };
static
long
task2_stack0[1024] = {0xf, };
static
long
task2_stack3[1024] = {0xf, };
Because we didn't have the memory management done in this tutorial, so we just set up several arrays as task stacks.
void
init(void
) {
char
wheel[] = {'\\', '|', '/', '-'};
int
i = 0;
struct
TASK_STRUCT task1;
struct
TASK_STRUCT task2;
idt_install();
pic_install();
kb_install();
timer_install(1000);
set_tss((unsigned
long
)&TASK0.tss);long
set_ldt((unsigned
long
)&TASK0.ldt);long
Loads the TASK0
's LDT and TSS descriptors to GDT.
__asm__
("ltrw
%%ax
\n\t"::"a"(TSS_SEL));
__asm__
("lldt
%%ax
\n\t"::"a"(LDT_SEL));
Like you might guess, there are two new registers for TSS and LDT like GDT and IDT, called TR
and LDTR
,
they are loaded by ltr
and lldt
. Before we trying to do any task manipulation, we have to load valid values to TR
and LDTR
, or you will get an general protection exception.
sti();
new_task(&task1,
(unsigned
int
)task1_run,
(unsigned
int
)task1_stack0+sizeof
task1_stack0,
(unsigned
int
)task1_stack3+sizeof
task1_stack3);
new_task(&task2,
(unsigned
int
)task2_run,
(unsigned
int
)task2_stack0+sizeof
task2_stack0,
(unsigned
int
)task2_stack3+sizeof
task2_stack3);
Creates two new tasks, please note that, before creating any new tasks, the interrupt had been enabled.
__asm__
("movl
%%esp
,%%eax
\n\t" \
"pushl
%%ecx
\n\t" \
"pushl
%%eax
\n\t" \
"pushfl
\n\t" \
"pushl
%%ebx
\n\t" \
"pushl
$1f\n\t" \
"iret
\n" \
"1:\tmovw %%cx
,%%ds
\n\t" \
"movw
%%cx
,%%es
\n\t" \
"movw
%%cx
,%%fs
\n\t" \
"movw
%%cx
,%%gs
" \
::"b"(USER_CODE_SEL),"c"(USER_DATA_SEL));
Now the kernel has became task TASK0
, by a long return it loaded correct EIP
and CS
and SS
and ESP
from TASK0
structure, the stack looks like,
+--------------------+
| LDT stack selector |
+--------------------+
| ESP |
+--------------------+
| EFLAGS |
+--------------------+
| LDT code selector |
+--------------------+
| EIP |
+--------------------+
then we load data segment selectors from LDT.
for
(;;) {
__asm__
("movb
%%al
, 0xb8000+160*24"::"a"(wheel[i]));
if
(i == sizeof
wheel)
i = 0;
else
++i;
}
}
Now all tasks are running at privilege level 3, so all other parts that have privilege level higher that that can not be accessed, it gives you some idea of protection. But in Skelix, I am not going to separate the kernel space from the whole memory, the memory protection will be achieved by the memory mapping of virtual memory. In later tutorial, the user tasks can invoke system functions via system calls.
Now we have three tasks, then which part will do the switching? Well, as you may guess, we have the timer interrupt in a certain interval, so it is the good and clearly choice. We have to change the code in timer interrupt like this,
06/timer.c
do_timer(void
) {void
struct
TASK_STRUCT *v = &TASK0;
int
x, y;
++timer_ticks;
get_cursor(&x, &y);
set_cursor(71, 24);
kprintf(KPL_DUMP, "%x", timer_ticks);
set_cursor(x, y);
outb(0x20, 0x20);
cli();
for
(; v; v=v->next) {
Traversing the task link, to change priorities of all tasks.
if
(v->state == TS_RUNNING) {
if
((v->priority+=30) <= 0)
v->priority = 0xffffffff;
} else
v->priority -= 10;
}
Well, the priority of all tasks will be changed during timer interrupt, the running task get higher numerical priority, that is the lower priority. And the waiting tasks get higher priority.
if
(! (timer_ticks%1))
scheduler();
We re-schedule all tasks in two timer ticking.
sti();
}
Let's check out the scheduler
,
scheduler(void
) {void
struct
TASK_STRUCT *v = &TASK0, *tmp = 0;
int
cp = current->priority;
for
(; v; v = v->next) {
if
((v->state==TS_RUNABLE) && (cp>v->priority)) {
tmp = v;
cp = v->priority;
}
}
Traversing the task link, to find the task which has the highest priority to run, that is the smallest number.
if
(tmp && (tmp!=current)) {
current->tss_entry = get_tss();
current->ldt_entry = get_ldt();
We save the TSS and LDT descriptors from GDT to TASK_STRUCT
, because some state bits in those descriptors might have been modified by the processor.
tmp->tss_entry = set_tss((unsigned
long
)((long
unsigned
int
)&tmp->tss));
tmp->ldt_entry = set_ldt((unsigned
long
)((long
unsigned
int
)&tmp->ldt));
current->state = TS_RUNABLE;
tmp->state = TS_RUNNING;
current = tmp;
Sets TSS and LDT descriptors of the task which is going to run to the GDT, and changes theirs state
fields.
__asm__
__volatile__
("ljmp
$" TSS_SEL_STR ", $0\n\t");
}
}
Skelix uses a far jump instruction to perform the task switching, the segment part is the TSS selector, and the offset part is simply ignored by the processor.
The TSS_SEL_STR
is defined as "$0x18", the quotation marks are included, so C can take all those string fragments into one string.
At last, let's check out the code of task1_run
and task2_run
, their entries are written in assembly instead C,
because I do not want to handle the stack change in C function calls. These two tasks actually call other C functions to do the real job.
task1_run:
call
do_task1
jmp
task1_run
task2_run:
call
do_task2
jmp
task2_run
At first, let's check if tasks work in privilege 3 properly, so we call kprintf
to print the current CS
that task is using, then let it go to busy idle or it will keep printing
it on the screen, and screen scrolls up too fast to see any result.
06/init.c
void
do_task1(void
) {
unsigned
int
cs;
__asm__
("movl
%%cs
, %%eax
":"=a"(cs));
kprintf(KPL_DUMP, "%x", cs);
for
(;;)
;
}
void
do_task2(void
) {
unsigned
int
cs;
__asm__
("movl
%%cs
, %%eax
":"=a"(cs));
kprintf(KPL_PANIC, "%x", cs);
for
(;;)
;
}
Don't forget to add new modules to KERNEL_OBJS
in Makefile.
06/Makefile
KERNEL_OBJS= load.o init.o isr.o timer.o libcc.o scr.o kb.o task.o kprintf.o exceptions.o
As we can see, first we are using LDT, and the CS
has been set to 0x7 correctly, another thing is as we make the kprintf
buffer global,
so actually all tasks are sharing same buffer, so output of two tasks might be mixed together when another task starts printting before former task finishes.
Now let them do something fancier, we let two tasks keep displaying different letters,
06/init.c
void
do_task1(void
) {
print_c('A', BLUE, WHITE);
}
void
do_task2(void
) {
print_c('B', GRAY, BROWN);
}
Feel free to use my code. Please contact me if you have any questions.