scheduler05

Test program

$ cat multithread.c 
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>     /* getpid() */
#include <sys/types.h>

/* Each thread prints its pid/tid, then spins forever to consume CPU. */
void *thread_fun(void *param)
{
    printf("thread pid:%d, tid:%lu\n", getpid(), pthread_self());
    while (1);
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <nthreads>\n", argv[0]);
        return 1;
    }

    int n = atoi(argv[1]);
    pthread_t tid[n];
    int i;

    printf("main pid:%d, tid:%lu\n", getpid(), pthread_self());

    for (i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, thread_fun, NULL);

    for (i = 0; i < n; i++)
        pthread_join(tid[i], NULL);

    return 0;
}
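The program must be linked against pthreads; the thread count is passed as the first argument:

$ gcc multithread.c -o a.out -pthread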

autogroup disabled

Disable the autogroup feature with the following command:

$ echo 0 > /proc/sys/kernel/sched_autogroup_enabled

Then run the following. One process gets 4 threads and another gets 8; the 4-thread process's CPU% to the 8-thread process's CPU% comes out 1 : 2, as shown below:

$ ./a.out 4
main pid:22619
$ ./a.out 8
main pid:22624
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22624 leopard   20   0   72256    848    768 S 254.0  0.0   1:45.08 a.out
22619 leopard   20   0   39472    792    712 S 123.2  0.0   0:57.32 a.out
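This follows from CFS's per-thread fair sharing: assuming a 4-CPU machine (≈400% in total, consistent with the numactl output further down), each of the 12 spinning threads gets roughly 400/12 ≈ 33%, so the 4-thread process accumulates ≈133% and the 8-thread process ≈267%, close to the 123.2% and 254.0% measured above.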

Now put the two processes into cgroups A and B respectively. Both groups default to a weight (cpu.shares) of 1024, so CFS splits the CPU% evenly between them:

$ cd /sys/fs/cgroup/cpu
$ mkdir A B
$ echo 22619 > A/cgroup.procs
$ echo 22624 > B/cgroup.procs
$ cat A/cpu.shares B/cpu.shares 
1024
1024
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22619 leopard   20   0   39472    792    712 S 197.3  0.0   8:56.36 a.out
22624 leopard   20   0   72256    848    768 S 193.7  0.0  16:33.36 a.out

Set group A's weight to 512; then A's weight : B's weight = A's CPU% : B's CPU% = 1 : 2:

$ echo 512 > A/cpu.shares 
$ cat A/cpu.shares B/cpu.shares 
512
1024
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22624 leopard   20   0   72256    848    768 S 254.5  0.0  33:26.35 a.out
22619 leopard   20   0   39472    792    712 S 127.2  0.0  25:13.92 a.out
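With the same ≈400% total, A receives 400 × 512/(512+1024) ≈ 133% and B receives 400 × 1024/(512+1024) ≈ 267%, matching the 127.2% and 254.5% shown above.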

Now cap group A's CPU% at 110% and group B's at 10%. The cap is cpu.cfs_quota_us / cpu.cfs_period_us; with the default period of 100000 µs, quotas of 110000 µs and 10000 µs give 110% and 10%:

$ echo 110000 > A/cpu.cfs_quota_us
$ echo 10000 > B/cpu.cfs_quota_us
$ cat {A,B}/cpu.cfs_quota_us
110000
10000
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
22619 leopard   20   0   39472    792    712 S 109.6  0.0  27:41.88 a.out
22624 leopard   20   0   72256    848    768 S  10.0  0.0  39:51.49 a.out

autogroup enabled

Enable the autogroup feature with the command below. Every session then automatically gets its own task group; on Ubuntu, a session is typically a bash shell.

$ echo 1 > /proc/sys/kernel/sched_autogroup_enabled

With autogroup enabled:

# session 1
$ echo $$ # bash pid
22316
$ ./a.out 4
main pid:23093

# session 2
$ echo $$ # bash pid
22380
$ ./a.out 8
main pid:23100

# session 3
# A program started inside a session is automatically placed in the cgroup created
# for that session, so bash and a.out share the same autogroup:
$ cat /proc/{23093,22316,23100,22380}/autogroup
/autogroup-763 nice 0
/autogroup-763 nice 0
/autogroup-764 nice 0
/autogroup-764 nice 0
# session ID = bash PID
$ ps -C a.out o pid,ppid,pgid,sid,comm
  PID  PPID  PGID   SID COMMAND 
23093 22316 23093 22316 a.out
23100 22380 23100 22380 a.out
# both autogroups default to a weight of 1024
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
23093 leopard   20   0   39472    884    804 S 193.3  0.0   0:15.16 a.out
23100 leopard   20   0   72256    880    800 S 192.0  0.0   0:09.02 a.out
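An autogroup's weight can be adjusted by writing a nice value (-20..19) to /proc/<pid>/autogroup, as documented in sched(7). For example (illustrative, reusing the PIDs above):

$ echo 10 > /proc/23093/autogroup
$ cat /proc/23093/autogroup
/autogroup-763 nice 10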

Android and cgroup

# foreground applications (apps)
cpu.shares = 1024
cpu.rt_period_us: 1000000 cpu.rt_runtime_us: 800000

# background applications (bg_non_interactive)
cpu.shares = 52
cpu.rt_period_us: 1000000 cpu.rt_runtime_us: 700000
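So when foreground and background groups contend for CPU, background applications get at most about 52/(1024+52) ≈ 5% of the CPU time.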

Docker and cgroup

Docker uses cgroups to allocate CPU resources to containers:

$ docker run --cpu-quota 25000 --cpu-period 10000 --cpu-shares 30 linuxep/lepv0.1
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3f39ca25d14d
$ cd /sys/fs/cgroup/cpu/docker/3f39c...
$ cat cpu.cfs_quota_us
25000
$ cat cpu.cfs_period_us
10000
$ cat cpu.shares
30
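Note that the quota (25000) is larger than the period (10000) here, so the container may consume up to 25000/10000 = 2.5 CPUs' worth of time per period; --cpu-shares (30) is only a relative weight and takes effect when containers compete for CPU.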

systemd and cgroup

By default no new cpu cgroup is created, so the unit's CPU usage is unrestricted; it appears only in systemd's own hierarchy under /, as shown below:

$ sudo systemd-run --unit=test --slice=test ./a.out 4
$ systemd-cgtop
Control Group             Tasks   %CPU   Memory  Input/s Output/s
/                          622    399.7   4.8G      -        -
/test.slice                 5     387.1    -        -        -
/test.slice/test.service    5      -       -        -        -
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
23927 root      20   0   39472    896    820 S 388.0  0.0   8:19.76 a.out

As shown below, this time a new cgroup is created, and its maximum CPU% is set to 25%:

$ sudo systemd-run  -p CPUQuota=25% --unit=test --slice=test ./a.out 4
Running as unit: test.service
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24177 root      20   0   39472    828    752 S  24.8  0.0   0:10.62 a.out
$ systemd-cgls
Control group /:
-.slice
├─test.slice
  └─test.service
     └─24177 /home/leopard/workplace/share/scheduler05-code/./a.out 4
$ systemd-cgtop
Control Group          Tasks   %CPU   Memory  Input/s Output/s
/                        624   75.0   4.8G      -        -
/test.slice                5   25.0     -       -        -
/test.slice/test.service   5   25.0     -       -        -

$ cd /sys/fs/cgroup/cpu/test.slice/test.service
$ cat cpu.cfs_period_us
100000
$ cat cpu.cfs_quota_us
25000

Change the new cgroup's weight to 600 and cap its memory at 500M:

$ cd /sys/fs/cgroup/cpu/test.slice/test.service
$ cat cpu.shares 
1024
$ sudo systemctl set-property --runtime test.service CPUShares=600 MemoryLimit=500M
$ cat cpu.shares 
600
$ cd /sys/fs/cgroup/memory/test.slice/test.service
$ cat memory.limit_in_bytes 
524288000

All of the settings above were applied from the command line; in practice they are configured in /lib/systemd/system/*.service unit files, so the services are loaded automatically at boot.
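For reference, a minimal sketch of such a unit file for the example above (the path and values are illustrative):

# /lib/systemd/system/test.service
[Unit]
Description=cgroup demo running a.out

[Service]
ExecStart=/home/leopard/workplace/share/scheduler05-code/a.out 4
CPUShares=600
CPUQuota=25%
MemoryLimit=500M

[Install]
WantedBy=multi-user.target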

cpuset and cgroup

cpusets are a Linux kernel mechanism for constraining which CPUs and memory nodes a process, or a set of processes, may use.

The root cpuset contains all of the system's CPUs and Memory Nodes.

cpuset.cpus : list of CPUs in that cpuset

cpuset.mems : list of Memory Nodes in that cpuset

This is typically used on NUMA servers to restrict a process to particular CPUs and memory nodes:

$ numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 7772 MB
node 0 free: 2104 MB
node distances:
node   0 
  0:  10 

$ ./a.out 4
main pid:25239
$ cd /sys/fs/cgroup/cpuset
$ mkdir A
$ echo 0-1 > A/cpuset.cpus    # use only CPU0 and CPU1
$ echo 0 > A/cpuset.mems      # use only memory node 0
$ echo 25239 > A/cgroup.procs # run this process inside cgroup A
$ top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
25239 leopard   20   0   39472    872    792 S 200.0  0.0   3:11.67 a.out
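With the 4 spinning threads confined to 2 CPUs, top now shows 200% instead of ~400%. The placement can be double-checked through /proc (a quick sanity check, reusing the PID above):

$ grep -E 'Cpus_allowed_list|Mems_allowed_list' /proc/25239/status   # expect 0-1 and 0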
Was this helpful?