writel __raw_writel mb()/rmb()/wmb()

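The implementation of writel came up in a mailing-list discussion. The function lives at the operating-system level: with memory protection in place, it writes a value to a register or a memory address.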

arch/alpha/kernel/io.c contains:

void writel(u32 b, volatile void __iomem *addr)
{
	__raw_writel(b, addr);
	mb();
}

So writel should write a value to an address. I wanted to know the details of the underlying implementation, so I traced the code further down into __raw_writel(b, addr):

void __raw_writel(u32 b, volatile void __iomem *addr)
{
	IO_CONCAT(__IO_PREFIX,writel)(b, addr);
}

Tracing IO_CONCAT next, the corresponding io.h defines it as follows:

#define IO_CONCAT(a,b) _IO_CONCAT(a,b)
#define _IO_CONCAT(a,b) a ## _ ## b

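I asked about this code a few days ago: ## is the preprocessor's token-pasting operator, so the macro joins its two arguments into a single identifier with an underscore between them.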
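To make the pasting concrete, here is a minimal user-space sketch. The names DEMO_CONCAT, DEMO_CONCAT_, DEMO_PREFIX, demo_reg and demo_paste are hypothetical stand-ins chosen for illustration; only the two-level macro shape mirrors the kernel definitions quoted above. The extra level exists so that the prefix macro is expanded before pasting; a single-level macro would paste the literal token DEMO_PREFIX instead.

```c
#include <assert.h>
#include <stdint.h>

/* Two-level concatenation: the outer macro expands its arguments,
 * the inner macro pastes the already-expanded tokens together. */
#define DEMO_CONCAT(a, b)   DEMO_CONCAT_(a, b)
#define DEMO_CONCAT_(a, b)  a ## _ ## b

/* Hypothetical prefix, playing the role of __IO_PREFIX. */
#define DEMO_PREFIX apecs

/* Ordinary variable standing in for a device register. */
static uint32_t demo_reg;

/* The preprocessor turns the name below into apecs_writel. */
static void DEMO_CONCAT(DEMO_PREFIX, writel)(uint32_t b, volatile uint32_t *addr)
{
    assert(addr != 0);
    *addr = b;
}

/* Calls the generated function by its pasted name and reads the value back. */
uint32_t demo_paste(uint32_t value)
{
    apecs_writel(value, &demo_reg);
    return demo_reg;
}
```

Calling demo_paste(0x1234) goes through apecs_writel even though that name never appears literally in a definition, which is also why a plain text search for it comes up empty.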

Tracing __IO_PREFIX gives this definition:

#undef __IO_PREFIX
#define __IO_PREFIX apecs

That is as far as I got; beyond this point I was lost. My questions are as follows:

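1. How does the data actually get written to the address? I extracted these pieces on their own, ran them through the preprocessor, and after macro expansion got this: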

void __raw_writel(u32 b, volatile void __iomem *addr)
{
	apecs_writel(b, addr);
}

But I could not find any apecs_writel function in the kernel. Please help explain.

For the first question,
you should refer to the files "arch/alpha/kernel/machvec_impl.h",
"~/machvec.h", "~/io.c", "~/io.h" and "~/core_**.h".

As you found in your analysis, in machvec_impl.h and machvec.h the
three macros DO_CIA_IO, IO and IO_LITE connect the ** architecture
name to the writel symbol and initialize the function pointers.
So the actual implementation of writel comes down to initializing the
alpha_machine_vector structure and defining the function that the
relevant function pointer invokes to perform the low-level write.

.mv_writel =CAT(low,_writel),<---IO(CIA,cia)<-->cia_writel(b, addr); <---

|
writel(b, addr)-->__raw_writel(b, addr);--->cia_writel(b,addr)---------------
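The dispatch in the diagram can be sketched in user space as follows. This is only an illustration under stated assumptions: struct demo_machine_vector, DEMO_CAT, DEMO_IO and the demo_* functions are hypothetical, cut-down stand-ins, not the kernel's alpha_machine_vector or its IO macros, and both back ends simply store to memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for a machine vector. */
struct demo_machine_vector {
    const char *name;
    void (*mv_writel)(uint32_t b, volatile uint32_t *addr);
};

/* Two chipset-specific back ends; here both just perform the store. */
static void cia_writel(uint32_t b, volatile uint32_t *addr)   { *addr = b; }
static void apecs_writel(uint32_t b, volatile uint32_t *addr) { *addr = b; }

/* Paste the lower-case chipset name onto _writel to pick the back end. */
#define DEMO_CAT(x, y)   x ## y
#define DEMO_IO(UP, low) .name = #UP, .mv_writel = DEMO_CAT(low, _writel)

static const struct demo_machine_vector cia_mv   = { DEMO_IO(CIA, cia) };
static const struct demo_machine_vector apecs_mv = { DEMO_IO(APECS, apecs) };

static const struct demo_machine_vector *active_mv = &cia_mv;

/* Select which vector later writes are routed through. */
void demo_use_cia(void)   { active_mv = &cia_mv; }
void demo_use_apecs(void) { active_mv = &apecs_mv; }

const char *demo_active_name(void) { return active_mv->name; }

/* The generic entry point only forwards through the function pointer. */
void demo_raw_writel(uint32_t b, volatile uint32_t *addr)
{
    assert(active_mv->mv_writel != 0);
    active_mv->mv_writel(b, addr);
}
```

The caller never names cia_writel or apecs_writel directly; the pointer stored in the selected vector decides which one runs.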

For the second question,
mb()--->__asm__ __volatile__("mb": : :"memory");
So it is a memory barrier on the Alpha architecture: it ensures that
memory operations issued before it complete before those that follow.
It is related to barrier() on the x86 and ARM platforms, although
barrier() by itself only constrains the compiler.

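Keep reading the code and see which header file is included right after __IO_PREFIX is defined; the answer is in that header. For your apecs case, look at the following snippets (linux-2.6.28-rc4):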

arch/alpha/include/asm/core_apecs.h
------------------------------------------
#undef __IO_PREFIX
#define __IO_PREFIX apecs
#define apecs_trivial_io_bw 0
#define apecs_trivial_io_lq 0
#define apecs_trivial_rw_bw 2
#define apecs_trivial_rw_lq 1
#define apecs_trivial_iounmap 1
#include <asm/io_trivial.h>
------------------------------------------

arch/alpha/include/asm/io_trivial.h
------------------------------------------
__EXTERN_INLINE void
IO_CONCAT(__IO_PREFIX,writel)(u32 b, volatile void __iomem *a)
{
*(volatile u32 __force *)a = b;
}

So in the end the data is written by *(volatile u32 __force *)a = b;

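Without an OS and without an MMU, when the development board runs bare metal, a single statement is all we need (the volatile qualifier keeps the compiler from dropping or merging the store):

*(volatile unsigned long *)addr = value;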
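As a rough illustration of that single statement, the sketch below wraps it in a pair of accessors. Everything here is an assumption made for the example: demo_write32, demo_read32 and fake_register are hypothetical names, and an ordinary variable stands in for a device register so the code can run on a normal host.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value at a raw address through a volatile pointer,
 * so the compiler must emit exactly this store. */
static inline void demo_write32(uintptr_t addr, uint32_t value)
{
    assert(addr % sizeof(uint32_t) == 0);   /* require natural alignment */
    *(volatile uint32_t *)addr = value;
}

/* Load a 32-bit value from a raw address through a volatile pointer. */
static inline uint32_t demo_read32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* Ordinary memory pretending to be a device register. */
static uint32_t fake_register;

/* Write a value to the fake register and read it back. */
uint32_t demo_roundtrip(uint32_t value)
{
    demo_write32((uintptr_t)&fake_register, value);
    return demo_read32((uintptr_t)&fake_register);
}
```

On real hardware the address would come from the chip's memory map rather than from a variable.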

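Memory barriers mainly address two problems: compiler optimization and out-of-order execution by the CPU.
When the compiler optimizes, the assembly it generates may not follow the order of the C source. Where the program must run strictly in C source order, the compiler has to be told explicitly not to reorder; on Linux this is done with the barrier() macro. It relies on the volatile keyword and the "memory" clobber: the former tells the compiler not to optimize away or move the barrier() statement itself, and the latter tells the compiler that the assembly may change values in memory, so it must use the new values from memory rather than stale values held in registers.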
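A compiler barrier of that shape can be sketched as below, assuming GCC or Clang (the inline-assembly syntax is a compiler extension). demo_barrier, demo_data, demo_flag and demo_publish are hypothetical names for this example; the empty asm emits no instruction and says nothing to the CPU.

```c
#include <assert.h>

/* Empty asm statement: "volatile" keeps the compiler from deleting it,
 * and the "memory" clobber makes the compiler assume memory may have
 * changed, so it cannot keep values cached in registers or move memory
 * accesses across this point. */
#define demo_barrier() __asm__ __volatile__("" : : : "memory")

static int demo_data;
static int demo_flag;

/* Store the data, then the flag, in source order as far as the
 * compiler is concerned; returns the data re-read from memory. */
int demo_publish(int value)
{
    assert(value >= 0);
    demo_data = value;
    demo_barrier();     /* the compiler may not sink the store above */
    demo_flag = 1;
    demo_barrier();     /* force demo_data to be reloaded below */
    return demo_data;
}
```

This only constrains the compiler; ordering as seen by other CPUs or devices still needs a hardware barrier.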
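Likewise, the CPU executes out of order to improve performance, so instructions do not necessarily run in the order we see in the assembly. Linux uses the mb() family of macros to guarantee ordering. On x86 this is done with the mfence/lfence instructions (introduced with the Pentium 4; earlier x86 CPUs do not have them) and with x86 instructions that have serializing properties (there are many: the lock prefix that Linux uses in its implementation, I/O instructions, instructions that operate on control, system and debug registers, iret, and so on). Put simply, if mb()/rmb()/wmb() is inserted at some point in the program, the memory accesses before the macro are guaranteed to take effect before those after it, which serializes them. wmb() can be implemented much like barrier() because on x86 ordinary stores are not reordered with other stores.
On RISC platforms, by contrast, this serialization is done explicitly by the programmer with dedicated instructions, issued wherever they are needed, rather than through the many implicitly serializing instructions found on x86 (such as lock-prefixed instructions). That is why people who work on RISC platforms usually find serialization easier to understand.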
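For a portable user-space view of the same idea, the sketch below uses C11 fences. This is an analogy, not the kernel API: atomic_thread_fence with release and acquire ordering plays roughly the roles of wmb() and rmb() here, and demo_payload, demo_ready, demo_produce and demo_consume are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

static int demo_payload;        /* ordinary data being handed over */
static atomic_int demo_ready;   /* flag that announces the data */

/* Writer side: the data store must become visible before the flag. */
void demo_produce(int value)
{
    demo_payload = value;
    atomic_thread_fence(memory_order_release);   /* roughly wmb() */
    atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out once the flag has been seen. */
int demo_consume(int *out)
{
    assert(out != 0);
    if (!atomic_load_explicit(&demo_ready, memory_order_relaxed))
        return 0;
    atomic_thread_fence(memory_order_acquire);   /* roughly rmb() */
    *out = demo_payload;
    return 1;
}
```

With the two fences paired like this, a reader on another thread that sees the flag set is also guaranteed to see the payload written before it.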

Original article: http://blog.chinaunix.net/u/6071/showart_2049460.html