My Goal

Not achieved. I failed to give each task its own virtual 4GB memory space; I only enabled the paging mechanism and triggered the page fault exception. Hugely disappointing.



Paging

386 processors provide facilities for isolating and protecting each task's memory space and, last but not least, for letting each task address 4GB of memory.

Those facilities fall into two parts: segmentation and paging. We have used segmentation since the first tutorials; it lets tasks have separate code, data and stack segments. Paging allows memory to be mapped onto disk on demand, and it is what we are going to use in this tutorial.

Because we do not have 4GB of physical memory for each task, we need something else to virtualize the memory space; this is handled by the processor's paging mechanism. It divides the address space into pages (4KB is the size we are going to use), and each page can be stored either on disk or in memory. The OS tracks the state of those pages through the page directory and the page tables: the page directory stores information about page tables, and each page table stores information about pages.

When paging is enabled, the processor translates every address issued by a task into a physical address. The data structures it uses for this are the page directory and the page table; both are arrays of 32-bit entries.

Page table entry format

The first diagram shows the format of a page directory entry, the second that of a page table entry. As you can see, the two formats are quite similar. Let's take a look at the fields they have in common.

Bit 0 (P): Present. Set if this page or page table is currently in physical memory. When P=0, the page is not in memory and accessing it raises a page fault exception.
Bit 1 (R/W): Whether the page, or the pages of the table (for a page directory entry), is read-only (=0) or writable (=1).
Bit 2 (U/S): Privilege of the page or pages (for a page directory entry). At supervisor level (=0) only code running at PL0-2 can access them; at user level (=1) every task can access them.
Bits 3, 4, (6), 7, 8 (X): Reserved by Intel; just set them to zero. Bit 6 is reserved only in a page directory entry; in a page table entry it is the D bit.
Bit 5 (A): Accessed. Set if the page or pages have been accessed.
Bits 9-11 (user defined): Available to the OS. We are going to use bit 11 to indicate that a page is on disk when it is not present.

In a page directory entry, the page table's physical address field (bits 31-12) stores the most significant 20 bits of the address of the page table it points to. Because only 20 bits are stored, the page table must be 4KB aligned. Likewise, bits 31-12 of a page table entry store the most significant 20 bits of the address of the page it points to. Twenty bits can name 2^20 = 1M pages, which spans 1M * 4KB = 4GB of memory. The D (dirty) bit in a page table entry indicates whether the page's content has been modified. It is useful when swapping the page out to disk: if the page was originally loaded from disk and has not been modified, we can simply discard it instead of writing it back.
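The entry layout described above can be captured in a handful of C masks. This is only an illustrative sketch: the macro and function names (PTE_PRESENT, make_entry and so on) are hypothetical and do not appear in the Skelix source, which simply ORs literal values such as 7 into the entries.

```c
#include <stdint.h>

/* Hypothetical names for the bits described in the text. */
#define PTE_PRESENT    0x001u      /* bit 0:  P                           */
#define PTE_WRITABLE   0x002u      /* bit 1:  R/W                         */
#define PTE_USER       0x004u      /* bit 2:  U/S                         */
#define PTE_ACCESSED   0x020u      /* bit 5:  A                           */
#define PTE_DIRTY      0x040u      /* bit 6:  D, page table entries only  */
#define PTE_ON_DISK    0x800u      /* bit 11: OS-defined "on disk" flag   */
#define PTE_ADDR_MASK  0xFFFFF000u /* bits 31-12: 4KB-aligned address     */

/* Combine a 4KB-aligned physical address with flag bits. */
static uint32_t make_entry(uint32_t phys_addr, uint32_t flags)
{
    return (phys_addr & PTE_ADDR_MASK) | (flags & ~PTE_ADDR_MASK);
}

/* Physical address of the page table or page an entry points to. */
static uint32_t entry_address(uint32_t entry)
{
    return entry & PTE_ADDR_MASK;
}

/* Non-zero if the page was modified and would have to be written
 * back to disk before its frame can be reused. */
static int needs_writeback(uint32_t pte)
{
    return (pte & PTE_DIRTY) != 0;
}
```

For instance, make_entry(0x0003F000, PTE_PRESENT | PTE_WRITABLE | PTE_USER) yields 0x0003F007, the same kind of value the tutorial code builds with `|7`.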

To translate a logical address into a physical address, the logical address is divided into three parts:

Bits 31-22: Index of the entry in the page directory; that entry gives the physical address of the page table.
Bits 21-12: Index of the entry in the page table; that entry gives the physical address of the page.
Bits 11-0: Offset within the page.

For example, take the logical address 0x3E837B0A. Its top 10 bits are 0x0FA, so it refers to entry 0x0FA of the page directory, which starts at, say, 0x0005C000. The top 20 bits of that entry give the address of the page table, say 0x0003F000. The next 10 bits of the logical address are 0x037, so we look at entry 0x037 of the page table at 0x0003F000; its top 20 bits give the physical address of the page, say 0x0001B000. Finally, the low 12 bits of the logical address, 0xB0A, are the offset within the page, so the physical address of 0x3E837B0A is 0x0001B000 + 0xB0A = 0x0001BB0A.
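The split used in this example is plain shifting and masking. The following sketch reproduces it in C; the struct and function names are made up for illustration and are not part of Skelix.

```c
#include <stdint.h>

/* The three fields of a 32-bit address under 4KB paging. */
struct addr_parts {
    uint32_t dir_index;   /* bits 31-22: entry in the page directory */
    uint32_t table_index; /* bits 21-12: entry in the page table     */
    uint32_t offset;      /* bits 11-0:  offset within the page      */
};

static struct addr_parts split_address(uint32_t addr)
{
    struct addr_parts p;
    p.dir_index   = addr >> 22;
    p.table_index = (addr >> 12) & 0x3FFu;
    p.offset      = addr & 0xFFFu;
    return p;
}

/* Final step of the walk: page base address plus offset. */
static uint32_t physical_address(uint32_t page_base, uint32_t offset)
{
    return (page_base & 0xFFFFF000u) | (offset & 0xFFFu);
}
```

Applied to 0x3E837B0A, split_address gives 0x0FA, 0x037 and 0xB0A, and physical_address(0x0001B000, 0xB0A) gives 0x0001BB0A, matching the walk-through.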

One question remains: how does the processor find the page directory in the first place? There is a register for this, CR3, which stores the physical address of the page directory currently in use; it is therefore also called the page directory base register (PDBR).

page translation

The CR3 register must be loaded before paging is enabled. Its value can be changed with a MOV instruction, or it is loaded from the CR3 field of the TSS during a task switch.

Whenever the processor accesses an entry whose present bit is clear, or a privilege check fails, the page fault exception handler is executed. CR2 holds the address that caused the exception, and an error code is also pushed onto the stack. It has the following format:

error code for page fault
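The three low bits of the error code can be decoded as in the sketch below. The bit meanings are the ones the tutorial's own do_page_fault tests (err&0x1, err&0x2, err&0x4); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of the page fault error code (names are illustrative). */
struct pf_cause {
    int protection_violation; /* bit 0: 1 = page-level protection violation,
                                        0 = not-present page */
    int write;                /* bit 1: 1 = write access, 0 = read access */
    int user_mode;            /* bit 2: 1 = user mode, 0 = supervisor mode */
};

static struct pf_cause decode_pf_error(uint32_t err)
{
    struct pf_cause c;
    c.protection_violation = (err & 0x1u) != 0;
    c.write                = (err & 0x2u) != 0;
    c.user_mode            = (err & 0x4u) != 0;
    return c;
}
```

An error code of 6, for example, decodes as a write from user mode to a page that is not present.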

The exception handler usually performs the following steps,

The processor keeps the most recently used page directory and page table entries in caches called Translation Lookaside Buffers (TLBs), so the page directory and page tables are only read from memory when the needed entries are not in the TLBs. Whenever we modify the page directory or a page table, we have to invalidate the TLBs so that they drop the stale entries. The TLBs are not directly accessible to software, so we invalidate them by writing CR3 with a MOV (reloading the same value is enough) or through a task switch.

Let's take a look at the code. First, some constants are defined.

08/include/kernel.h

#define PAGE_DIR        ((HD0_ADDR+HD0_SIZE+(4*1024)-1) & 0xfffff000)

We put the page directory right after the HD0 area (HD0_ADDR+HD0_SIZE), rounded up to the next 4KB boundary, because the page directory must be 4KB aligned.

#define PAGE_SIZE       (4*1024)

#define PAGE_TABLE      (PAGE_DIR+PAGE_SIZE)

The page table comes right after the page directory.

#define MEMORY_RANGE    (4*1024*1024)

Skelix uses 4MB of memory.
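The rounding in the PAGE_DIR definition is the usual "add size minus one, then mask" idiom. As a small sketch of what it computes (align_up_4k and SKETCH_PAGE_SIZE are hypothetical names, not part of kernel.h):

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Round addr up to the next multiple of 4KB; aligned values are unchanged. */
static uint32_t align_up_4k(uint32_t addr)
{
    return (addr + SKETCH_PAGE_SIZE - 1u) & ~(SKETCH_PAGE_SIZE - 1u);
}
```

So an address such as 0x00012345 becomes 0x00013000, while 0x00013000 stays where it is.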

08/mm.c

static char mmap[MEMORY_RANGE/PAGE_SIZE] = {PG_REVERSED, };

This array is the map of physical memory: one byte per 4KB page, recording what the page is used for.

void

mm_install(void) {

    unsigned int *page_dir = ((unsigned int *)PAGE_DIR);

    unsigned int *page_table = ((unsigned int *)PAGE_TABLE);

    unsigned int address = 0;

    int i;

    for(i=0; i<MEMORY_RANGE/PAGE_SIZE; ++i) {

        /* attributes (7): user, r/w, present */

        page_table[i] = address|7;

        address += PAGE_SIZE;

    }

This initializes the page table entries for 0-4MB, mapping each page to the physical page at the same address.

    page_dir[0] = (PAGE_TABLE|7);

Because one page directory entry covers 4MB of memory, we only need to set up the first entry in the page directory.

    for (i=1; i<1024; ++i)

        page_dir[i] = 6;

The remaining 1023 page directory entries are marked not present (6 = user, r/w, P=0). All 1024 entries together can refer to a 4GB address space.

    /* set lower 1MB memory to used */

    for (i=(1*1024*1024)/PAGE_SIZE-1; i>=0; --i)

        mmap[i] = PG_REVERSED;

Because the kernel uses the lower 1MB of memory, we mark those pages as reserved (PG_REVERSED in the code) so that they cannot be swapped out and are always present in memory.

    __asm__ (

        "movl    %%eax,    %%cr3\n\t"

        "movl    %%cr0,    %%eax\n\t"

        "orl     $0x80000000,%%eax\n\t"

        "movl    %%eax,    %%cr0"::"a"(PAGE_DIR));

}

Setting bit 31 (PG) of CR0 enables paging. Easy, right?
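To see what the tables built by mm_install mean, here is a hosted simulation that can be compiled and run as an ordinary program. It fills two arrays the same way and walks them in software. Everything prefixed with demo_ or DEMO_ is hypothetical, including the pretend table address; the real code writes to physical memory at PAGE_DIR and PAGE_TABLE and lets the processor do the walk.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE  4096u
#define DEMO_ENTRIES    1024u
#define DEMO_TABLE_ADDR 0x00001000u  /* pretend physical address of the table */

static uint32_t demo_page_dir[DEMO_ENTRIES];
static uint32_t demo_page_table[DEMO_ENTRIES];

/* Same layout as mm_install: map 0-4MB one-to-one, leave the rest absent. */
static void demo_mm_install(void)
{
    uint32_t address = 0;
    uint32_t i;

    for (i = 0; i < DEMO_ENTRIES; ++i) {
        demo_page_table[i] = address | 7u;    /* user, r/w, present */
        address += DEMO_PAGE_SIZE;
    }
    demo_page_dir[0] = DEMO_TABLE_ADDR | 7u;
    for (i = 1; i < DEMO_ENTRIES; ++i)
        demo_page_dir[i] = 6u;                /* user, r/w, not present */
}

/* Software page walk. Returns 1 and stores the physical address in *phys,
 * or returns 0 where the processor would raise a page fault. */
static int demo_translate(uint32_t addr, uint32_t *phys)
{
    uint32_t pde = demo_page_dir[addr >> 22];
    uint32_t pte;

    if (!(pde & 1u))
        return 0;
    /* Only one page table exists in this sketch, so the address stored
     * in the directory entry must be that table's. */
    if ((pde & 0xFFFFF000u) != DEMO_TABLE_ADDR)
        return 0;
    pte = demo_page_table[(addr >> 12) & 0x3FFu];
    if (!(pte & 1u))
        return 0;
    *phys = (pte & 0xFFFFF000u) | (addr & 0xFFFu);
    return 1;
}
```

With these tables every address below 4MB translates to itself, and the first address at 4MB (0x00400000) hits a not-present directory entry, which is the page fault this tutorial triggers.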

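We can easily find a free page in memory by searching the mmap array. alloc_page scans from the top of memory downwards and returns the page number, which page2mem converts to an address.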

unsigned int

alloc_page(int type) {

    int i;

 

    for (i=(sizeof mmap)-1; i>=0 && mmap[i]; --i)

        ;

    if (i < 0) {

        kprintf(KPL_PANIC, "NO MEMORY LEFT");

        halt();

    }

    mmap[i] = type;

    return i;

}

 

void *

page2mem(unsigned int nr) {

    return (void *)(nr * PAGE_SIZE);

}
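The behaviour of alloc_page and page2mem can be tried out in a hosted program. The sketch below mirrors them under a few assumptions: the sim_ names and the state values are hypothetical (the real PG_* constants are defined elsewhere in Skelix), and running out of memory returns -1 instead of halting.

```c
#include <stdint.h>

#define SIM_PAGE_SIZE 4096u
#define SIM_NUM_PAGES 1024   /* 4MB / 4KB, as in the tutorial */

/* Hypothetical page states; 0 means free, as alloc_page assumes. */
enum { SIM_PG_FREE = 0, SIM_PG_RESERVED = 1, SIM_PG_NORMAL = 2 };

static char sim_mmap[SIM_NUM_PAGES];

/* Mark the lower 1MB (256 pages) as reserved, as mm_install does. */
static void sim_reserve_low_memory(void)
{
    int i;
    for (i = 0; i < (int)((1024u * 1024u) / SIM_PAGE_SIZE); ++i)
        sim_mmap[i] = SIM_PG_RESERVED;
}

/* Scan from the top for a free page; return its number, or -1 if none. */
static int sim_alloc_page(int type)
{
    int i;
    for (i = SIM_NUM_PAGES - 1; i >= 0 && sim_mmap[i] != SIM_PG_FREE; --i)
        ;
    if (i < 0)
        return -1;
    sim_mmap[i] = (char)type;
    return i;
}

/* Page number to address, like page2mem. */
static uint32_t sim_page_address(int page_nr)
{
    return (uint32_t)page_nr * SIM_PAGE_SIZE;
}
```

After the low 1MB is reserved, the first allocation returns page 1023, whose address is 0x003FF000, and 768 pages can be handed out in total before the allocator runs dry.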

 

void

do_page_fault(enum KP_LEVEL kl,

              unsigned int ret_ip, unsigned int ss, unsigned int gs,

              unsigned int fs, unsigned int es, unsigned int ds, 

              unsigned int edi, unsigned int esi, unsigned int ebp,

              unsigned int esp, unsigned int ebx, unsigned int edx, 

              unsigned int ecx, unsigned int eax, unsigned int isr_nr, 

              unsigned int err, unsigned int eip, unsigned int cs, 

              unsigned int eflags,unsigned int old_esp, unsigned int old_ss) {

    unsigned int cr2, cr3;

    (void)ret_ip; (void)ss; (void)gs; (void)fs; (void)es; 

    (void)ds; (void)edi; (void)esi; (void)ebp; (void)esp; 

    (void) ebx; (void)edx; (void)ecx; (void)eax; 

    (void)isr_nr; (void)eip; (void)cs; (void)eflags; 

    (void)old_esp; (void)old_ss; (void)kl;

    __asm__ ("movl %%cr2, %%eax":"=a"(cr2));

    __asm__ ("movl %%cr3, %%eax":"=a"(cr3));

    kprintf(KPL_PANIC, "\n  The fault at %x cr3:%x was caused by a %s. "

            "The accessing cause of the fault was a %s, when the "

            "processor was executing in %s mode, page %x is free\n", 

            cr2, cr3,

            (err&0x1)?"page-level protection violation":"not-present page", 

            (err&0x2)?"write":"read", 

            (err&0x4)?"user":"supervisor",

            alloc_page(PG_NORMAL));

}

This exception handler does nothing but print information about the exception.

With that in place we can allocate memory dynamically. new_task changes as follows:

static void

new_task(unsigned int eip) {

    struct TASK_STRUCT *task = page2mem(alloc_page(PG_TASK));

    memcpy(&(task->tss), &(TASK0.tss), sizeof(struct TSS_STRUCT));

 

    task->tss.esp0 = (unsigned int)task + PAGE_SIZE;

    task->tss.eip = eip;

    task->tss.eflags = 0x3202;

    task->tss.esp = (unsigned int)page2mem(alloc_page(PG_TASK))+PAGE_SIZE;

 

    task->priority = INITIAL_PRIO;

    task->ldt[0] = DEFAULT_LDT_CODE;

    task->ldt[1] = DEFAULT_LDT_DATA;

 

    task->next = current->next;

    current->next = task;

    task->state = TS_RUNABLE;

}

Now let's add mm_install to 08/init.c, and don't forget to modify the corresponding line in 08/exceptions.c. Then try to access a memory address beyond 4MB.

08/init.c

    idt_install();

    pic_install();

    mm_install();      /* <-- here it is */

    kb_install();

08/exceptions.c

void

page_fault(void) {

    __asm__ ("pushl    %%eax;call    do_page_fault"::"a"(KPL_PANIC));

    halt();

}

Finally, add mm.o to KERNEL_OBJS in Makefile.

08/Makefile

KERNEL_OBJS= load.o init.o isr.o timer.o libcc.o scr.o kb.o task.o kprintf.o hd.o \

        exceptions.o fs.o mm.o

[Screenshots: build process of tutorial08; page fault exception]


Feel free to use my code. Please contact me if you have any questions.