Goal not achieved: I failed to give each task a virtual 4GB memory space. This tutorial only enables the paging mechanism and triggers a page fault exception. Hugely disappointing.
386 processors give us facilities for isolating and protecting the memory space of each task and, saving the best for last, each task can address 4GB of memory.
Those facilities can be divided into two parts: segmentation and paging. We have used segmentation since the first tutorials; it lets tasks have separate code, data and stack segments. Paging allows memory to be mapped onto disk on demand, and it is what we are going to use in this tutorial.
Because we do not have 4GB of physical memory for each task, we have to use something else to virtualize the memory space; this is handled by the processor's paging mechanism. It divides the address space into pages (4KB is the size we are going to use), and each page can be stored either on disk or in memory. The OS tracks the state of those pages via the page directory and the page tables. The page directory stores information about page tables, and page tables store information about pages.
When paging is enabled, the processor translates an address given by a task into a physical address in the following steps:
It locates the descriptor in use in the GDT or LDT through the current selector, then performs the privilege and limit checks to ensure the access is allowed.
It adds the base address stored in the descriptor to get a linear address.
It divides the linear address by the page size to get the number of the page being used.
It checks whether this page is present; if not, a page fault exception occurs.
The exception handler gets a free page for storage, or loads the page from disk on demand.
Because it is an exception, the processor re-executes the instruction that caused it, and this time the page is present in memory.
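The segment part of the steps above can be tried out in plain C on a host machine. This is only a rough sketch under stated assumptions: `struct seg_desc`, `to_linear` and `page_number` are names made up for this illustration (not Skelix code), the descriptor is reduced to a flat base and limit, and the privilege check is left out.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Hypothetical, simplified segment descriptor: only base and limit. */
struct seg_desc {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset inside the segment */
};

/* Limit check, then add the segment base.
 * Returns 0 and stores the linear address in *linear,
 * or returns -1 if the offset lies outside the segment. */
int to_linear(const struct seg_desc *d, uint32_t offset, uint32_t *linear) {
    if (offset > d->limit)
        return -1;
    *linear = d->base + offset;
    return 0;
}

/* The page number is the linear address divided by the page size. */
uint32_t page_number(uint32_t linear) {
    return linear / PAGE_SIZE;
}
```

With a segment based at 0x00100000, offset 0x1234 becomes linear address 0x00101234, which lies in page 0x101.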
The data structures used by the processor are the page directory and the page table; both are arrays of 32-bit entries.
The first graph is the format of a page directory entry, the second one is the format of a page table entry. As you can see, their formats are quite similar. Let's take a look at the parts they have in common.
Bit 0 | P | Whether this page or page table is currently in physical memory. When P=0, the page is not in memory and a page fault exception occurs on access |
Bit 1 | R/W | Whether the page, or the pages (in the case of a page directory entry), are read-only (=0) or writable (=1) |
Bit 2 | U/S | Privilege of the page, or the pages (in the case of a page directory entry): at supervisor level (=0) only PL0-2 can access them; at user level (=1) every task can access them |
Bits 3, 4, (6), 7, 8 | X | Reserved by Intel, just set them to zero |
Bit 5 | A | Whether the page or pages have been accessed |
Bits 9-11 | User Defined | We are going to use bit 11 to indicate that a page is on disk when it is not present |
The page table address field of a page directory entry stores the most significant 20 bits of the physical address of the page table it points to. Because only 20 bits are stored, the page table must be 4KB aligned. Bits 31-12 of a page table entry store the most significant 20 bits of the physical address of the page it points to. With 20 bits we can address 2^20 = 1M pages, which spans 1M*4K = 4GB of memory. The D bit in a page table entry indicates whether the page's content has been changed. It is useful when swapping the page out to disk: if the page was originally loaded from disk and has not been modified, we can simply discard it instead of writing it back to disk.
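To make the entry layout concrete, here is a minimal sketch in C. The `PTE_*` names, `make_entry`, `is_present` and `needs_writeback` are assumptions made for this illustration and do not appear in the Skelix sources; `has_disk_copy` stands in for whatever bookkeeping tells us that the page was originally loaded from disk.

```c
#include <stdint.h>

/* Hypothetical names for the bits described in the table above. */
#define PTE_P       0x001u       /* bit 0: present */
#define PTE_RW      0x002u       /* bit 1: writable */
#define PTE_US      0x004u       /* bit 2: user level */
#define PTE_A       0x020u       /* bit 5: accessed */
#define PTE_D       0x040u       /* bit 6: dirty (page table entries only) */
#define PTE_ON_DISK 0x800u       /* bit 11: this tutorial's "on disk" flag */
#define PTE_ADDR    0xFFFFF000u  /* bits 31-12: 4KB-aligned physical address */

/* Combine a 4KB-aligned physical address with flag bits. */
uint32_t make_entry(uint32_t phys, uint32_t flags) {
    return (phys & PTE_ADDR) | (flags & ~PTE_ADDR);
}

int is_present(uint32_t entry) {
    return (entry & PTE_P) != 0;
}

/* Returns 1 if the page has to be written to disk before it is dropped. */
int needs_writeback(uint32_t pte, int has_disk_copy) {
    if (!has_disk_copy)
        return 1;                  /* nothing on disk yet, must save it */
    return (pte & PTE_D) != 0;     /* copy exists: write only if modified */
}
```

The value 7 used later in `mm_install` is simply `PTE_P|PTE_RW|PTE_US`, and 6 is the same entry without the present bit.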
To translate a linear address into a physical address, the linear address is divided into three parts:
Bits 31-22 | The index of the entry in the page directory, from which we get the physical address of the page table it points to |
Bits 21-12 | The index of the entry in the page table, from which we get the physical address of the page it points to |
Bits 11-0 | The offset within the page |
For example, take the linear address 0x3E837B0A. Its top 10 bits are 0x0FA, so it refers to entry 0x0FA of the page directory, which starts at, say, 0x0005C000. The top 20 bits of this entry give the address of the page table, say 0x0003F000. The next 10 bits of the linear address are 0x037, so we look at entry 0x037 of the page table that starts at 0x0003F000. The top 20 bits of that entry give the physical address of the page, say 0x0001B000. Finally, the low 12 bits of the linear address are 0xB0A, and we add them to the physical address of the page. So the physical address of 0x3E837B0A is 0x0001B000+0xB0A = 0x0001BB0A.
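The walk in the example can be reproduced with a small host-side simulation. This is a sketch only: `phys_mem`, `read32`, `write32`, `translate` and `setup_example` are hypothetical helpers, physical memory is faked with an array, and only the present bit is checked.

```c
#include <stdint.h>

#define PTE_P 0x1u

/* Simulated physical memory, just large enough for the example addresses. */
static uint32_t phys_mem[0x60000 / 4];

uint32_t read32(uint32_t phys)              { return phys_mem[phys / 4]; }
void     write32(uint32_t phys, uint32_t v) { phys_mem[phys / 4] = v; }

/* Walk page directory and page table starting from cr3.
 * Returns 0 and stores the result in *phys, or -1 when an entry
 * is not present (where the real processor raises a page fault). */
int translate(uint32_t cr3, uint32_t linear, uint32_t *phys) {
    uint32_t dir_idx = linear >> 22;             /* bits 31-22 */
    uint32_t tbl_idx = (linear >> 12) & 0x3FFu;  /* bits 21-12 */
    uint32_t offset  = linear & 0xFFFu;          /* bits 11-0  */
    uint32_t pde, pte;

    pde = read32((cr3 & 0xFFFFF000u) + dir_idx * 4);
    if (!(pde & PTE_P))
        return -1;
    pte = read32((pde & 0xFFFFF000u) + tbl_idx * 4);
    if (!(pte & PTE_P))
        return -1;
    *phys = (pte & 0xFFFFF000u) + offset;
    return 0;
}

/* Fill in the two entries used by the example in the text. */
void setup_example(void) {
    write32(0x0005C000u + 0x0FAu * 4, 0x0003F000u | PTE_P);
    write32(0x0003F000u + 0x037u * 4, 0x0001B000u | PTE_P);
}
```

After `setup_example()`, calling `translate(0x0005C000, 0x3E837B0A, &p)` succeeds and leaves 0x0001BB0A in `p`, while an address whose directory entry is empty, such as 0x00400000, returns -1.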
But there is one problem: how do we find where the page directory starts? The answer is a new register for the page directory, CR3. It stores the physical address of the page directory currently in use, so it is also called the PDBR (page directory base register).
The CR3 register must be loaded before paging is enabled. Its value can be changed by a MOV instruction, or it is loaded from the CR3 field of the TSS structure during a task switch.
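Whenever the processor accesses an entry whose present bit is clear, or a privilege check fails, the page fault exception handler executes. CR2 stores the linear address that caused the exception, and an error code is also pushed onto the stack. Its low three bits say whether the fault was caused by a page-level protection violation (bit 0 = 1) or a not-present page (bit 0 = 0), by a write (bit 1 = 1) or a read (bit 1 = 0), and whether the processor was executing in user (bit 2 = 1) or supervisor (bit 2 = 0) mode.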
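The three low bits of the error code are the same ones tested by `do_page_fault` further down. A minimal sketch of decoding them, where `struct pf_info` and `decode_pf_error` are names invented for this illustration:

```c
#include <stdint.h>

/* Hypothetical container for the decoded page fault error code. */
struct pf_info {
    int protection;  /* 1: page-level protection violation, 0: not-present page */
    int write;       /* 1: write access, 0: read access */
    int user;        /* 1: user mode, 0: supervisor mode */
};

struct pf_info decode_pf_error(uint32_t err) {
    struct pf_info info;
    info.protection = (err & 0x1u) != 0;  /* bit 0 */
    info.write      = (err & 0x2u) != 0;  /* bit 1 */
    info.user       = (err & 0x4u) != 0;  /* bit 2 */
    return info;
}
```

For example, an error code of 6 describes a user-mode write to a page that is not present.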
The exception handler usually performs the following steps:
Find a free page in memory, or load the page from disk.
Set the corresponding page directory and page table entries to the correct values.
Invalidate the TLBs.
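The handler in this tutorial only prints a message, so the following is merely a host-side sketch of what the three steps could look like. Everything in it is an assumption for illustration: a single simulated page table covering 0-4MB, a tiny pool of `NFRAMES` frames, and a counter `tlb_flushes` standing in for the reload of CR3. Loading from disk is not modelled.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NENTRIES  1024u
#define NFRAMES   16u                  /* tiny simulated physical memory */

static uint32_t page_table[NENTRIES];  /* one table: covers 0-4MB */
static uint8_t  frame_used[NFRAMES];
static unsigned tlb_flushes;           /* stands in for reloading CR3 */

/* Returns 0 if the fault was resolved, -1 if it cannot be handled. */
int handle_fault(uint32_t fault_addr) {
    uint32_t idx = fault_addr / PAGE_SIZE;
    uint32_t f;

    if (idx >= NENTRIES)
        return -1;                     /* outside the single table */

    /* Find a free frame. */
    for (f = 0; f < NFRAMES && frame_used[f]; ++f)
        ;
    if (f == NFRAMES)
        return -1;                     /* no memory left */
    frame_used[f] = 1;

    /* Point the entry at the frame: present, r/w, user. */
    page_table[idx] = (f * PAGE_SIZE) | 7u;

    /* Invalidate the TLBs (only counted here). */
    ++tlb_flushes;
    return 0;
}
```

A fault at 0x00003123 maps page 3 to the first free frame; a fault at or above 4MB is rejected because this sketch has only one page table.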
The processor stores the most recently used page directory and page table entries in caches called Translation Lookaside Buffers (TLBs), so the page directory and page tables are only accessed when those entries are not in the TLBs. Whenever we modify the contents of the page directory or a page table, we have to invalidate the TLBs so that they abandon the old contents. The TLBs are otherwise transparent to us, so we invalidate them by MOVing a new value into CR3 or through a task switch.
Let's take a look at the code. Some constants are defined first:
08/include/kernel.h
#define PAGE_DIR ((HD0_ADDR+HD0_SIZE+(4*1024)-1) & 0xfffff000)
Let's put the page directory right after IDT. The page directory must be 4KB aligned, which is why the address is rounded up to the next 4KB boundary.
#define PAGE_SIZE (4*1024)
#define PAGE_TABLE (PAGE_DIR+PAGE_SIZE)
The page table comes right after the page directory.
#define MEMORY_RANGE (4*1024*1024)
We use 4MB of memory in Skelix.
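The expression in `PAGE_DIR` is the usual "round up to a 4KB boundary" trick. A small sketch of it as a function, where `page_align_up` is a name made up for this illustration:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Round addr up to the next 4KB boundary (unchanged if already aligned). */
uint32_t page_align_up(uint32_t addr) {
    return (addr + PAGE_SIZE - 1) & 0xFFFFF000u;
}
```

Adding `PAGE_SIZE - 1` pushes any unaligned address past the next boundary, and the mask then clears the low 12 bits. An address that is already aligned stays where it is.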
08/mm.c
static char mmap[MEMORY_RANGE/PAGE_SIZE] = {PG_REVERSED, };
This is the map of physical memory, one entry per page.
void
mm_install(void) {
    unsigned int *page_dir = ((unsigned int *)PAGE_DIR);
    unsigned int *page_table = ((unsigned int *)PAGE_TABLE);
    unsigned int address = 0;
    int i;
    for (i=0; i<MEMORY_RANGE/PAGE_SIZE; ++i) {
        /* attribute set to: user, r/w, present (7 = U/S|R/W|P) */
        page_table[i] = address|7;
        address += PAGE_SIZE;
    };
Initializes all page table entries for 0-4MB, so that every linear address maps to the same physical address.
    page_dir[0] = (PAGE_TABLE|7);
Because one page directory entry covers 4MB of memory, we only need to set up the first entry in the page directory.
    for (i=1; i<1024; ++i)
        page_dir[i] = 6;
The remaining 1023 page directory entries are marked not present. Together, the 1024 entries can refer to a 4GB memory space.
    /* set lower 1MB memory to used */
    for (i=(1*1024*1024)/PAGE_SIZE-1; i>=0; --i)
        mmap[i] = PG_REVERSED;
Because the kernel uses the lower 1MB of memory, we mark those pages as reserved so that they cannot be swapped out; that keeps them always present in memory.
    __asm__ (
        "movl   %%eax, %%cr3\n\t"
        "movl   %%cr0, %%eax\n\t"
        "orl    $0x80000000, %%eax\n\t"
        "movl   %%eax, %%cr0"::"a"(PAGE_DIR));
}
By setting bit 31 of CR0, we enable paging. Easy, right?
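Since the real `mm_install` writes to CR3 and CR0, it cannot be run outside the kernel. As a rough check of what the two loops set up, here is a host-side sketch that fills simulated tables the same way and then translates through them. The `sim_` names and the `table_phys` parameter are assumptions made for this illustration.

```c
#include <stdint.h>

#define PAGE_SIZE     4096u
#define MEMORY_RANGE  (4u * 1024u * 1024u)

static uint32_t sim_page_dir[1024];
static uint32_t sim_page_table[MEMORY_RANGE / PAGE_SIZE];

/* Same loops as mm_install, minus the CR3/CR0 writes.
 * table_phys is the (hypothetical) physical address of the page table. */
void sim_mm_install(uint32_t table_phys) {
    uint32_t address = 0;
    unsigned i;

    for (i = 0; i < MEMORY_RANGE / PAGE_SIZE; ++i) {
        sim_page_table[i] = address | 7u;   /* user, r/w, present */
        address += PAGE_SIZE;
    }
    sim_page_dir[0] = table_phys | 7u;
    for (i = 1; i < 1024; ++i)
        sim_page_dir[i] = 6u;               /* not present */
}

/* Translate through the single simulated table; -1 means page fault. */
int sim_translate(uint32_t linear, uint32_t *phys) {
    uint32_t pte;

    if (!(sim_page_dir[linear >> 22] & 1u))
        return -1;
    pte = sim_page_table[(linear >> 12) & 0x3FFu];
    if (!(pte & 1u))
        return -1;
    *phys = (pte & 0xFFFFF000u) + (linear & 0xFFFu);
    return 0;
}
```

Every address below 4MB translates to itself, which is why the kernel keeps running after paging is switched on, and the first address at 4MB already hits a not-present directory entry.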
We can easily find a free page in memory by searching the mmap array.
unsigned int
alloc_page(int type) {
    int i;
    for (i=(sizeof mmap)-1; i>=0 && mmap[i]; --i)
        ;
    if (i < 0) {
        kprintf(KPL_PANIC, "NO MEMORY LEFT");
        halt();
    }
    mmap[i] = type;
    return i;
}
void *
page2mem(unsigned int nr) {
    return (void *)(nr * PAGE_SIZE);
}
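The allocator can also be exercised on a host machine. The sketch below mirrors `alloc_page` but returns -1 instead of halting. The `sim_` names and the numeric values of the `PG_*_SIM` constants are assumptions; the real values of `PG_REVERSED`, `PG_NORMAL` and `PG_TASK` are not shown in this tutorial.

```c
#include <stddef.h>

#define PAGE_SIZE        4096u
#define NPAGES           1024u   /* 4MB / 4KB */
#define PG_RESERVED_SIM  1       /* assumed value: any non-zero means "used" */
#define PG_NORMAL_SIM    2       /* assumed value */

static char sim_mmap[NPAGES];    /* 0 means the page is free */

/* Mark the pages of the lower 1MB as used, as mm_install does. */
void sim_reserve_low_1mb(void) {
    size_t i;
    for (i = 0; i < (1024u * 1024u) / PAGE_SIZE; ++i)
        sim_mmap[i] = PG_RESERVED_SIM;
}

/* Scan from the top page downwards.
 * Returns the page number, or -1 if no page is free. */
int sim_alloc_page(int type) {
    int i;
    for (i = (int)NPAGES - 1; i >= 0 && sim_mmap[i]; --i)
        ;
    if (i < 0)
        return -1;
    sim_mmap[i] = (char)type;
    return i;
}
```

Pages are handed out from the top of the 4MB range downwards, so the first allocation returns page 1023 (address 0x3FF000), and 768 pages can be allocated before the reserved lower 1MB is reached.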
void
do_page_fault(enum KP_LEVEL kl,
              unsigned int ret_ip, unsigned int ss, unsigned int gs,
              unsigned int fs, unsigned int es, unsigned int ds,
              unsigned int edi, unsigned int esi, unsigned int ebp,
              unsigned int esp, unsigned int ebx, unsigned int edx,
              unsigned int ecx, unsigned int eax, unsigned int isr_nr,
              unsigned int err, unsigned int eip, unsigned int cs,
              unsigned int eflags, unsigned int old_esp, unsigned int old_ss) {
    unsigned int cr2, cr3;
    /* silence "unused parameter" warnings */
    (void)ret_ip; (void)ss; (void)gs; (void)fs; (void)es;
    (void)ds; (void)edi; (void)esi; (void)ebp; (void)esp;
    (void)ebx; (void)edx; (void)ecx; (void)eax;
    (void)isr_nr; (void)eip; (void)cs; (void)eflags;
    (void)old_esp; (void)old_ss; (void)kl;
    __asm__ ("movl %%cr2, %%eax":"=a"(cr2));
    __asm__ ("movl %%cr3, %%eax":"=a"(cr3));
    kprintf(KPL_PANIC, "\n The fault at %x cr3:%x was caused by a %s. "
            "The accessing cause of the fault was a %s, when the "
            "processor was executing in %s mode, page %x is free\n",
            cr2, cr3,
            (err&0x1)?"page-level protection violation":"not-present page",
            (err&0x2)?"write":"read",
            (err&0x4)?"user":"supervisor",
            alloc_page(PG_NORMAL));
}
This exception handler does nothing but print information about the exception.
Now we can allocate memory dynamically. new_task is changed as follows:
static void
new_task(unsigned int eip) {
    struct TASK_STRUCT *task = page2mem(alloc_page(PG_TASK));
    memcpy(&(task->tss), &(TASK0.tss), sizeof(struct TSS_STRUCT));
    task->tss.esp0 = (unsigned int)task + PAGE_SIZE;
    task->tss.eip = eip;
    task->tss.eflags = 0x3202;
    task->tss.esp = (unsigned int)page2mem(alloc_page(PG_TASK))+PAGE_SIZE;
    task->priority = INITIAL_PRIO;
    task->ldt[0] = DEFAULT_LDT_CODE;
    task->ldt[1] = DEFAULT_LDT_DATA;
    task->next = current->next;
    current->next = task;
    task->state = TS_RUNABLE;
}
Now, let's add mm_install to 08/init.c, and don't forget to modify the corresponding line in 08/exceptions.c. Then try to access a memory address beyond 4MB.
08/init.c
idt_install();
pic_install();
mm_install(); /* &&&&& Here it is */
kb_install();
08/exceptions.c
void
page_fault(void) {
    __asm__ ("pushl %%eax; call do_page_fault"::"a"(KPL_PANIC));
    halt();
}
Finally, add mm.o to KERNEL_OBJS in Makefile.
08/Makefile
KERNEL_OBJS= load.o init.o isr.o timer.o libcc.o scr.o kb.o task.o kprintf.o hd.o \
exceptions.o fs.o mm.o
Feel free to use my code. Please contact me if you have any questions.