LDD3 Chapter 15: Memory Mapping and DMA

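This chapter is divided into three parts:

The first part covers the implementation of the mmap system call, which maps device memory directly into a user process's address space. Not every device needs it, but for those that do it can improve performance significantly.
The second part covers crossing the boundary in the other direction, with the kernel directly accessing user-space memory pages. Some drivers need this capability; in many cases the kernel performs this kind of mapping without the driver's involvement.
The third part covers direct memory access (DMA) I/O operations, which give peripherals direct access to system memory.

The focus is on the main features of the Linux memory management implementation, not on operating system memory management theory.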

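1.1 Address Types

Linux is a virtual memory system, meaning that the addresses seen by user programs do not directly correspond to the physical addresses used by the hardware.

Virtual memory is a layer of indirection. It lets programs running on the system allocate more memory than is physically available; even a single process can have a virtual address space larger than the system's physical memory.

Kernel code is not always explicit about which type of address is being used in a given situation, so the programmer must be careful.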

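User virtual addresses: the regular addresses seen by user-space programs. They are either 32 or 64 bits wide, depending on the underlying hardware architecture.
Physical addresses: used between the processor and system memory.
Bus addresses: used between peripheral buses and memory. They are often the same as the physical addresses used by the processor, but that is not necessarily the case. Some architectures provide an I/O memory management unit (IOMMU) that remaps addresses between a bus and main memory. When DMA is used, programming the IOMMU is an extra step that must be performed.
Kernel logical addresses: these make up the normal address space of the kernel. On most architectures a logical address and its associated physical address differ only by a constant offset. kmalloc returns kernel logical addresses.
Kernel virtual addresses: like kernel logical addresses, these map a kernel-space address to a physical address. The difference is that a kernel virtual address does not necessarily have the linear, one-to-one mapping to a physical address that characterizes the logical address space.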

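Given a logical address, the macro __pa() (defined in <asm/page.h>) returns the corresponding physical address.

The macro __va() maps a physical address back to a logical address, but only for low-memory pages.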

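1.2 Physical Addresses and Pages

Physical memory is divided into discrete units called pages. Much of the system's internal handling of memory is done on a per-page basis.

Page size varies from one architecture to the next, although most systems currently use 4096-byte pages. The constant PAGE_SIZE (defined in <asm/page.h>) gives the page size on any given architecture.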

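1.3 High and Low Memory

On many 32-bit systems, no more than 4 GB can be addressed. On x86, the kernel splits the 4 GB virtual address space between user space and kernel space.

A typical split is 3 GB for user space and 1 GB for kernel space. The kernel cannot directly manipulate memory that is not mapped into its own address space, so the most physical memory it can address directly is the size of the kernel portion of the virtual address space, minus the space taken by the kernel code itself.

Low memory: memory for which logical addresses exist in kernel space.

High memory: memory for which logical addresses do not exist, because it lies beyond the address range set aside for kernel virtual addresses.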

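1.4 The Memory Map and struct page

Historically, the kernel used logical addresses to refer to pages of physical memory. With high-memory support that no longer works, because logical addresses are not available for high memory. Kernel functions that deal with memory therefore increasingly use pointers to struct page instead. LDD3 gives <linux/mm.h> as the header; the listing below, with its SLUB and kmemcheck fields, comes from a later 2.6 kernel, where the definition lives in <linux/mm_types.h>.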

/*
 * Each physical page in the system has a struct page associated with
 * it to keep track of whatever it is we are using the page for at the
 * moment. Note that we have no way to track which tasks are using
 * a page, though if it is a pagecache page, rmap structures can tell us
 * who is mapping it.
 */
struct page {
    unsigned long flags;        /* Atomic flags, some possibly
                     * updated asynchronously */
    atomic_t _count;        /* Usage count, see below. */
    union {
        /*
         * Count of ptes mapped in
         * mms, to show when page is
         * mapped & limit reverse map
         * searches.
         *
         * Used also for tail pages
         * refcounting instead of
         * _count. Tail pages cannot
         * be mapped and keeping the
         * tail page _count zero at
         * all times guarantees
         * get_page_unless_zero() will
         * never succeed on tail
         * pages.
         */
        atomic_t _mapcount;

        struct {        /* SLUB */
            u16 inuse;
            u16 objects;
        };
    };
    union {
        struct {
        unsigned long private;      /* Mapping-private opaque data:
                         * usually used for buffer_heads
                         * if PagePrivate set; used for
                         * swp_entry_t if PageSwapCache;
                         * indicates order in the buddy
                         * system if PG_buddy is set.
                         */
        struct address_space *mapping;  /* If low bit clear, points to
                         * inode address_space, or NULL.
                         * If page mapped as anonymous
                         * memory, low bit is set, and
                         * it points to anon_vma object:
                         * see PAGE_MAPPING_ANON below.
                         */
        };
#if USE_SPLIT_PTLOCKS
        spinlock_t ptl;
#endif
        struct kmem_cache *slab;    /* SLUB: Pointer to slab */
        struct page *first_page;    /* Compound tail pages */
    };
    union {
        pgoff_t index;      /* Our offset within mapping. */
        void *freelist;     /* SLUB: freelist req. slab lock */
    };
    struct list_head lru;       /* Pageout list, eg. active_list
                     * protected by zone->lru_lock !
                     */
    /*
     * On machines where all RAM is mapped into kernel address space,
     * we can simply calculate the virtual address. On machines with
     * highmem some memory is mapped into kernel virtual memory
     * dynamically, so we need a place to store that address.
     * Note that this field could be 16 bits on x86 ... ;)
     *
     * Architectures with slow multiplication can define
     * WANT_PAGE_VIRTUAL in asm/page.h
     */
#if defined(WANT_PAGE_VIRTUAL)
    void *virtual;          /* Kernel virtual address (NULL if
                       not kmapped, ie. highmem) */
#endif /* WANT_PAGE_VIRTUAL */
#ifdef CONFIG_WANT_PAGE_DEBUG_FLAGS
    unsigned long debug_flags;  /* Use atomic bitops on this */
#endif

#ifdef CONFIG_KMEMCHECK
    /*
     * kmemcheck wants to track the status of each byte in a page; this
     * is a pointer to such a status block. NULL if not tracked.
     */
    void *shadow;
#endif
};

struct page