# mm3

## 进程VMA

每一个进程都有0GB\~3GB虚拟地址，内核虚拟地址只有一个3GB\~4GB，如下：

```bash
　　　　　　3GB~4GB　　　　　　　　　　　　　　内核空间
　　　  /     |    \
　　　/       |      \
---------------------------------------
0GB~3GB　　0GB~3GB　　0GB~3GB　　　　　　　用户空间

A进程　　　　B进程 　  　C进程
```

因此不同进程可以有相同虚拟地址，但是不同进程的页表不一样，所以相同虚拟地址映射到不同的物理地址中。

每一个进程的0GB\~3GB虚拟地址，只是用某几段区域，不会全部使用完；一般有text 段、data 段、BSS 段、Heap 段、Stack 段等，**每一段虚拟地址区域就是一个vma**。

将每一个进程的所有段通过链表方式进行连接，如下：

```c
struct task_struct {
    struct mm_struct *mm
};

struct mm_struct {
    struct vm_area_struct * mmap;        /* list of VMAs */
};

struct vm_area_struct {
    unsigned long vm_start;        /* Our start address within vm_mm. */
    unsigned long vm_end;        /* The first byte after our end address

    /* linked list of VM areas per task, sorted by address */
    struct vm_area_struct *vm_next;
};
```

查看一个进程vma的分布情况：

```bash
$ cat test.c 
#include <stdio.h>

void main(void)
{
    while(1);
}

$ pidof a.out 
1487

$ pmap 1487
## vma起始地址　　　vma大小　vma权限
0000000000400000      4K r-x--  /xxx/a.out  ## text 段，通过权限可知
0000000000600000      4K r----  /xxx/a.out  ## data 段
0000000000601000      4K rw---  /xxx/a.out  ## Heap 段
00007f27579f4000   1504K r-x--  /lib/libc-2.11.1.so
00007f2757b6c000   2048K -----  /lib/libc-2.11.1.so
00007f2757d6c000     16K r----  /lib/libc-2.11.1.so
00007f2757d70000      4K rw---  /lib/libc-2.11.1.so
00007f2757d71000     20K rw---    [ anon ]
00007f2757d76000    128K r-x--  /lib/ld-2.11.1.so
00007f2757f83000     12K rw---    [ anon ]
00007f2757f93000      8K rw---    [ anon ]
00007f2757f95000      4K r----  /lib/ld-2.11.1.so
00007f2757f96000      4K rw---  /lib/ld-2.11.1.so
00007f2757f97000      4K rw---    [ anon ]
00007fffffa74000     84K rw---    [ stack ]
00007fffffbff000      4K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total             3856K

$ cat /proc/1487/maps 
## vma起始地址-结束地址　vma权限
00400000-00401000    　r-xp 00000000 00:16 51646247  /xxx/a.out  ## text 段，通过权限可知
...

$ cat /proc/1487/smaps
## vma起始地址-结束地址　vma权限
00400000-00401000 　　　r-xp 00000000 00:16 51646247     /xxx/a.out  ## text 段，通过权限可知
Size:                  4 kB  ## VSS
Rss:                   4 kB  ## RSS
Pss:                   4 kB  ## PSS
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
Referenced:            4 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
...
```

## page fault的可能性

1. 访问没有VMA的非法区内存区域，应用程序出现segv
2. heap区域vma（R+W权限），指向页表是R权限；因此第一次写操作时，发生page fault，申请一页内存，页表权限是R+W，此过程称为**minor**
3. 代码段vma（R+X权限），执行W操作，权限不对，应用程序出现segv
4. 代码段 vma（R+X权限），执行X操作，如果页表不存在，需要申请页，读出代码段，有会硬盘I/O操作，此过程称为**major**

## 进程内存消耗的4个概念：VSS、RSS、PSS和USS

VSS：虚拟内存占用空间，VMA大小

RSS：实际内存占用空间，私有数据+共享内存

PSS：实际内存占用空间，私有数据+共享内存/共享数量

USS：实际内存占用空间，私有数据

查看VSS/RSS/PSS/USS占用大小，如下：

```bash
$ smem
  PID User     Command                         Swap      USS      PSS      RSS 
 3454 vernon   ./a.out                            0       84       89      356
or
$ smem --pie=command
or
$ smem --bar=command
```

## 内存泄露的界定方法？

原理：连续多点采样法，随着运行时间越久，进程使用内存越多

1. 先用free命令判断系统是否内存泄露？
2. 分别检查应用空间与内核空间是否内存泄露？

### 应用空间内存泄露检查步骤：

* 连续多点采样后，`USS`从`560 -> 608 -> 644`，判断应用空间内存泄露

```bash
$ cat leak-example.c 
void main(void)
{
    unsigned int *p1, *p2;
    while(1)
    {
        p1=malloc(4096*3);
        p1[0] = 0;
        p1[1024] = 1;
        p1[1024*2] = 2;

        p2=malloc(1024);
        p2[0] = 1;
        free(p2);
        sleep(1);
    }
}

$ gcc -g leak-example.c
$ ./a.out
$ smem -P a.out ## 主要观察USS大小
  PID User     Command                         Swap      USS      PSS      RSS 
 3896 vernon   ./a.out                            0      560      563      876 
 3900 vernon   /usr/bin/python /usr/bin/sm        0     3704     4071     6136 
$ smem -P a.out 
  PID User     Command                         Swap      USS      PSS      RSS 
 3896 vernon   ./a.out                            0      608      611      924 
 3902 vernon   /usr/bin/python /usr/bin/sm        0     3700     4067     6132 
$ smem -P a.out 
  PID User     Command                         Swap      USS      PSS      RSS 
 3896 vernon   ./a.out                            0      644      647      960 
 3904 vernon   /usr/bin/python /usr/bin/sm        0     3704     4071     6136
```

* 通过valgrind进行检测，能够精确定位哪一行源码发生内存泄露，如下：

```bash
# valgrind
$ valgrind --tool=memcheck --leak-check=yes ./a.out
==4760== Memcheck, a memory error detector
==4760== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==4760== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==4760== Command: ./a.out
==4760== 
^C==4760== 
==4760== HEAP SUMMARY:
==4760==     in use at exit: 86,016 bytes in 7 blocks
==4760==   total heap usage: 14 allocs, 7 frees, 93,184 bytes allocated
==4760== 
==4760== 73,728 bytes in 6 blocks are definitely lost in loss record 2 of 2
==4760==    at 0x4C274A8: malloc (vg_replace_malloc.c:236)
==4760==    by 0x4005C5: main (leak-example.c:6)  ## 此行源码发生内存泄露
==4760== 
==4760== LEAK SUMMARY:
==4760==    definitely lost: 73,728 bytes in 6 blocks
==4760==    indirectly lost: 0 bytes in 0 blocks
==4760==      possibly lost: 0 bytes in 0 blocks
==4760==    still reachable: 12,288 bytes in 1 blocks
==4760==         suppressed: 0 bytes in 0 blocks
==4760== Reachable blocks (those to which a pointer was found) are not shown.
==4760== To see them, rerun with: --leak-check=full --show-reachable=yes
==4760== 
==4760== For counts of detected and suppressed errors, rerun with: -v
==4760== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
```

或者通过asan进行检测，但是需要gcc版本大于等于8，能够精确定位哪一行源码发生内存泄露，如下：

```bash
# asan
$ cat lsan.c 
#include <unistd.h>
#include <sanitizer/lsan_interface.h>

void main(void)
{
    unsigned int *p1, *p2, i=0;
    while(1)
    {
        p1=malloc(4096*3);
        p1[0] = 0;
        p1[1024] = 1;
        p1[1024*2] = 2;

        p2=malloc(1024);
        p2[0] = 1;
        free(p2);
        usleep(10000);

        /* check memory leak by asan */
        if(i++==10)
            __lsan_do_leak_check(); ## 需要加上此行
    }
}

$ gcc -g -fsanitize=address ./lsan.c  ## 需要加上-fsanitize=address
$ ./a.out 
=================================================================
==15338==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 122880 byte(s) in 10 object(s) allocated from:
    #0 0x7f6cf112cbc8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dbc8)
    #1 0x55aece4f1225 in main lsan.c:9  ## 此行源码发生内存泄露
    #2 0x7f6cf0e540b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

SUMMARY: AddressSanitizer: 122880 byte(s) leaked in 10 allocation(s).
```

### 内核空间内存泄露检查步骤：

* 查看内核空间的内存实时情况，判断内核空间内存泄露

```bash
$ cat /proc/meminfo
$ cat /proc/buddyinfo
$ cat /proc/zoneinfo
$ cat /proc/slabinfo
```

* linux kernel启动中DEBUG\_KMEMLEAK功能，进行定位哪里出现内存泄露

```bash
# linux 2.6.34
$ make menuconfig
Kernel hacking  --->
    [ ] Kernel memory leak detector
or
# linux3.16
$ make menuconfig
Kernel hacking  --->
    Memory Debugging  --->
        [*] Kernel memory leak detector
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://vernon2gb.gitbook.io/notes/memory_management/mm03.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
