Where does the CPU model name in /proc/cpuinfo come from?

Posted on December 17, 2020 at 21:20, filed under Linux

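A small thing piqued my curiosity today. A colleague reported that our virtual CPUs have a low clock frequency and that this hurts performance. I asked how they had checked the frequency. The answer was simple: look at the Model name: field in the lscpu output: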

[root@]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
...
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz
Stepping:              7
CPU MHz:               2394.374
BogoMIPS:              4788.74
Hypervisor vendor:     KVM
Virtualization type:   full
...

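This machine's Model name: is Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz, and the 2.40GHz after the @ sign is the CPU's base frequency. I wrote about this earlier in "Revisiting CPU power management (how to achieve stable all-core turbo?)": our production machines actually run at the turbo frequency. So the frequency after the @ does not reflect the real running frequency.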

So here is the question: where is this Model name actually read from?

With some background you might guess right away that it is read with the cpuid instruction. But there is no rush, so let's trace it step by step.

First, lscpu reads most of its data from /proc/cpuinfo, so let's look at that file:

[root@~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz
stepping        : 7
microcode       : 0x500002c
cpu MHz         : 3199.951
cache size      : 36608 KB
physical id     : 0
siblings        : 48
core id         : 0
cpu cores       : 24
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
...
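As a rough sketch of what a tool does with this file, the Python below picks the first model name entry out of /proc/cpuinfo-style text and extracts the frequency advertised after the @. The function names parse_model_name and advertised_ghz are made up for illustration, and the parsing assumes the x86 "key : value" layout shown above; it is not how lscpu itself is implemented.

```python
import re
from typing import Optional

# A few lines copied from the /proc/cpuinfo output above (tab before the colon).
SAMPLE = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "model name\t: Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz\n"
    "cpu MHz\t\t: 3199.951\n"
)


def parse_model_name(cpuinfo_text: str) -> Optional[str]:
    """Return the first 'model name' value in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


def advertised_ghz(model_name: str) -> Optional[float]:
    """Extract the frequency that follows '@' in a brand string, if any."""
    match = re.search(r"@\s*(\d+(?:\.\d+)?)\s*GHz", model_name)
    return float(match.group(1)) if match else None


name = parse_model_name(SAMPLE)
print(name)
print(advertised_ghz(name))
```

On a Linux machine, passing open("/proc/cpuinfo").read() instead of SAMPLE should give the local CPU's values. Note that the number after the @ is just text inside the brand string, which is why it can differ from the cpu MHz field.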

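Sure enough, model name is there too. So where does this file come from? In general, files under /proc are synthesized by the kernel, so we can go straight to the kernel code behind this file: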

extern const struct seq_operations cpuinfo_op;
static int cpuinfo_open(struct inode *inode, struct file *file)
{
	arch_freq_prepare_all();
	return seq_open(file, &cpuinfo_op);
}

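Next, follow cpuinfo_op to its implementation. Note that this implementation differs from platform to platform. We only need the x86 one, since x86 is what we are running on: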

static int show_cpuinfo(struct seq_file *m, void *v)
{
	struct cpuinfo_x86 *c = v;
	unsigned int cpu;
	int i;

	cpu = c->cpu_index;
	seq_printf(m, "processor\t: %u\n"
		   "vendor_id\t: %s\n"
		   "cpu family\t: %d\n"
		   "model\t\t: %u\n"
		   "model name\t: %s\n",
		   cpu,
		   c->x86_vendor_id[0] ? c->x86_vendor_id : "unknown",
		   c->x86,
		   c->x86_model,
		   c->x86_model_id[0] ? c->x86_model_id : "unknown");
...
}

The rest of the function can be ignored. The key point is that model name is read from x86_model_id, so all we need to do is find out who fills in x86_model_id.

With the help of code search tools this is easy to find. The code is at arch/x86/kernel/cpu/common.c#L646:

static void get_model_name(struct cpuinfo_x86 *c)
{
	unsigned int *v;
	char *p, *q, *s;

	if (c->extended_cpuid_level < 0x80000004)
		return;

	v = (unsigned int *)c->x86_model_id;
	cpuid(0x80000002, &v[0], &v[1], &v[2], &v[3]);
	cpuid(0x80000003, &v[4], &v[5], &v[6], &v[7]);
	cpuid(0x80000004, &v[8], &v[9], &v[10], &v[11]);
	c->x86_model_id[48] = 0;

	/* Trim whitespace */
	p = q = s = &c->x86_model_id[0];

	while (*p == ' ')
		p++;

	while (*p) {
		/* Note the last non-whitespace index */
		if (!isspace(*p))
			s = q;

		*q++ = *p++;
	}

	*(s + 1) = '\0';
}
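The cleanup half of get_model_name() can be approximated in a few lines of Python. This is only a sketch: trim_model_id is a made-up name, the padded buffer is a hypothetical example rather than real CPU output, and corner cases (such as a buffer made up entirely of spaces) are not reproduced exactly.

```python
def trim_model_id(buf: bytes) -> str:
    """Approximate get_model_name()'s cleanup of the 48-byte brand buffer."""
    # The kernel writes a NUL at index 48 and then treats the buffer as a
    # C string, so everything from the first NUL onward is ignored.
    c_string = buf[:48].split(b"\x00", 1)[0]
    # Leading: only plain spaces are skipped. Trailing: any whitespace is cut.
    return c_string.lstrip(b" ").rstrip().decode("ascii", errors="replace")


# Hypothetical buffer with padding on both sides, NUL-filled to 48 bytes.
padded = b"   Example CPU @ 1.00GHz  ".ljust(48, b"\x00")
print(repr(trim_model_id(padded)))
```

The kernel does the same thing in place with the p, q and s pointers: p skips the leading spaces, q copies characters down to the start of the buffer, and s remembers the last non-whitespace position so the string can be terminated right after it.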

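So the guess was about right: the code uses the cpuid instruction and reads leaves 0x80000002 through 0x80000004. There is one small detail worth a closer look, though. x86_model_id is declared as char x86_model_id[64];, a char array, while cpuid is defined like this: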

static inline void cpuid(unsigned int op,
			 unsigned int *eax, unsigned int *ebx,
			 unsigned int *ecx, unsigned int *edx)
{
	*eax = op;
	*ecx = 0;
	__cpuid(eax, ebx, ecx, edx);
}

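The outputs are of type unsigned int *. As the names suggest, the EAX, EBX, ECX and EDX registers are all 32 bits wide. That is why the code first casts x86_model_id to (unsigned int *): each store then writes 4 bytes at a time. Three leaves times four registers times four bytes gives 48 bytes, which is why the terminating NUL is written at index 48.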
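To see the effect of that cast concretely, here is a small Python sketch that emulates storing four 32-bit registers into a byte buffer on a little-endian machine. The register values are the ones for leaf 0x80000002 from the raw cpuid dump further down in this post; the variable names are only for illustration.

```python
# EAX, EBX, ECX, EDX of leaf 0x80000002, taken from the raw dump in this post.
regs = (0x65746E49, 0x2952286C, 0x6F655820, 0x2952286E)

# Storing through an unsigned int * on x86 writes the low byte of each
# register first, which is what to_bytes(4, "little") emulates here.
chunk = b"".join(value.to_bytes(4, "little") for value in regs)

print(len(chunk))  # 16 bytes per leaf, so three leaves fill 48 bytes
print(chunk)
```

Each cpuid call in get_model_name() therefore fills 16 consecutive characters of the string, with v[0]..v[3], v[4]..v[7] and v[8]..v[11] covering the three leaves back to back.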

According to The CPUID Explorer, CPUID(0x80000002)..CPUID(0x80000004) is the Processor brand string.

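Having come this far, let's try it ourselves and see whether we can read and decode the string by hand. We use the cpuid command to dump the raw register contents: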

[root@]# cpuid --one-cpu --raw --leaf=0x80000002
CPU:
   0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x6f655820 edx=0x2952286e
[root@]# cpuid --one-cpu --raw --leaf=0x80000003
CPU:
   0x80000003 0x00: eax=0x6c6f4720 ebx=0x32362064 ecx=0x20523034 edx=0x20555043
[root@]# cpuid --one-cpu --raw --leaf=0x80000004
CPU:
   0x80000004 0x00: eax=0x2e322040 ebx=0x48473034 ecx=0x0000007a edx=0x00000000

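Note that x86 is little-endian: when a 32-bit register value is stored to memory, its least significant byte comes first. So 0x65746e49 ends up in memory as the bytes 49 6e 74 65. Let's concatenate the bytes one by one in that order, print the result, and see whether it is right: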

python3
>>> x = '\x49\x6e\x74\x65\x6c\x28\x52\x29\x20\x58\x65\x6f\x6e\x28\x52\x29\x20\x47\x6f\x6c\x64\x20\x36\x32\x34\x30\x52\x20\x43\x50\x55\x20\x40\x20\x32\x2e\x34\x30\x47\x48\x7a\x00'
>>> print(x)
Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz

It works!
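The byte splicing done by hand above can also be automated. The sketch below parses the three raw lines printed by the cpuid command earlier and rebuilds the brand string with struct. The function name decode_brand_string and the regular expression are my own choices, and the code assumes the registers appear in eax, ebx, ecx, edx order on each line, as they do in the output shown.

```python
import re
import struct

# The three raw lines from the cpuid output above.
RAW_DUMP = """\
   0x80000002 0x00: eax=0x65746e49 ebx=0x2952286c ecx=0x6f655820 edx=0x2952286e
   0x80000003 0x00: eax=0x6c6f4720 ebx=0x32362064 ecx=0x20523034 edx=0x20555043
   0x80000004 0x00: eax=0x2e322040 ebx=0x48473034 ecx=0x0000007a edx=0x00000000
"""

REG_RE = re.compile(r"e[abcd]x=0x([0-9a-fA-F]{8})")


def decode_brand_string(raw_dump: str) -> str:
    """Rebuild the brand string from raw lines for leaves 0x80000002..4."""
    # Collect the register values in the order they appear in the dump.
    values = [int(hex_digits, 16) for hex_digits in REG_RE.findall(raw_dump)]
    # "<NI" packs N little-endian unsigned 32-bit ints, low byte first.
    data = struct.pack(f"<{len(values)}I", *values)
    # Cut at the first NUL like a C string, then drop surrounding spaces.
    return data.split(b"\x00", 1)[0].decode("ascii").strip()


print(decode_brand_string(RAW_DUMP))
```

For the dump above this prints Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz, matching the model name field in /proc/cpuinfo.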