softLockup_hardLockup
0. Introduction
A soft lockup is a bug that causes the kernel to loop in kernel mode for more than 20 seconds without giving other tasks a chance to run. The current stack trace is displayed upon detection and, unless the kernel is configured to panic, the system stays locked up.
Example: holding a spinlock and spending too long in the critical section.
A hard lockup is a bug that causes the CPU to loop in kernel mode for more than 10 seconds without letting other interrupts run. The current stack trace is displayed upon detection and, unless the kernel is configured to panic, the system stays locked up.
Example: keeping local interrupts disabled for too long.
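The 20-second and 10-second figures are linked: according to the kernel's sysctl documentation, both derive from `watchdog_thresh` (default 10), and the soft lockup threshold is twice that value. A minimal sketch for checking this on a running system; the fallback to 10 when the file is missing is an assumption for machines where the lockup detector is not built in:

```shell
# Path of the tunable; present only when the lockup detector is built in.
thresh_file=/proc/sys/kernel/watchdog_thresh

if [ -r "$thresh_file" ]; then
    watchdog_thresh=$(cat "$thresh_file")
else
    # Assumption: fall back to the documented default of 10 seconds.
    watchdog_thresh=10
fi

# Hard lockup threshold is watchdog_thresh; soft lockup threshold is twice that.
hard_thresh=$watchdog_thresh
soft_thresh=$((watchdog_thresh * 2))

echo "hard lockup threshold: ${hard_thresh}s"
echo "soft lockup threshold: ${soft_thresh}s"
```

With the default value the soft lockup threshold is 20 seconds, which is consistent with the "stuck for 22s" report in the example log.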
1. Build the Linux kernel
Enable the soft lockup detector and the hard lockup detector
## based on Linux 5.4
$ make x86_64_defconfig
$ make menuconfig
Kernel hacking --->
Debug Lockups and Hangs --->
[*] Detect Soft Lockups ## CONFIG_SOFTLOCKUP_DETECTOR
[*] Panic (Reboot) On Soft Lockups ## CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC
[*] Detect Hard Lockups ## CONFIG_HARDLOCKUP_DETECTOR
[*] Panic (Reboot) On Hard Lockups ## CONFIG_BOOTPARAM_HARDLOCKUP_PANIC
$ make
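To confirm that the menuconfig choices landed in the generated configuration, the resulting `.config` can be checked from the shell. A small sketch; the helper name `check_lockup_config` is made up for illustration, and it only looks for the four option names listed in the comments above:

```shell
# check_lockup_config [path]: report whether each lockup option is set to y.
# Hypothetical helper, not part of the kernel tree.
check_lockup_config() {
    cfg=${1:-.config}
    missing=0
    for opt in SOFTLOCKUP_DETECTOR BOOTPARAM_SOFTLOCKUP_PANIC \
               HARDLOCKUP_DETECTOR BOOTPARAM_HARDLOCKUP_PANIC; do
        if grep -q "^CONFIG_${opt}=y\$" "$cfg"; then
            echo "CONFIG_${opt}: enabled"
        else
            echo "CONFIG_${opt}: NOT enabled"
            missing=1
        fi
    done
    # Non-zero exit status when at least one option is not enabled.
    return "$missing"
}

# Run it when a .config exists in the current directory (the kernel tree root).
if [ -f .config ]; then
    check_lockup_config .config || echo "some lockup options are missing" >&2
fi
```

For scripted builds, the kernel tree's `scripts/config --enable <OPTION>` followed by `make olddefconfig` can replace the interactive menuconfig step.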
2. Boot the Linux kernel
Inspect the soft/hard lockup settings
$ cd /proc/sys/kernel/
$ grep . soft*
soft_watchdog:1
softlockup_all_cpu_backtrace:0
softlockup_panic:1 ## on a soft lockup the stack trace is printed; this flag decides whether a panic is triggered as well
$ grep . hard*
hardlockup_all_cpu_backtrace:0
hardlockup_panic:1 ## on a hard lockup the stack trace is printed; this flag decides whether a panic is triggered as well
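These entries are ordinary sysctl files, so the panic behaviour can be changed at runtime without rebuilding, for example to keep the machine up and only log the backtrace. A hedged sketch; `set_lockup_panic` is an illustrative helper name, the optional directory argument exists only so the function can be tried against a scratch directory, and writing to the real files requires root:

```shell
# set_lockup_panic VALUE [DIR]: write VALUE (0 or 1) to both panic switches.
# Hypothetical helper; DIR defaults to the real sysctl directory.
set_lockup_panic() {
    value=$1
    dir=${2:-/proc/sys/kernel}
    for name in softlockup_panic hardlockup_panic; do
        if [ -w "$dir/$name" ]; then
            echo "$value" > "$dir/$name"
            echo "$name set to $value"
        else
            # File absent (detector not built in) or caller is not root.
            echo "skip $name (not present or not writable)" >&2
        fi
    done
}

# Example (as root): log the lockup but do not panic.
#   set_lockup_panic 0
```

The same switches are also reachable as `kernel.softlockup_panic` and `kernel.hardlockup_panic` through `sysctl -w`.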
3. Examples
## apply the soft lockup test patch
$ git am 0001-test-soft-lockup.patch
$ make
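The contents of `0001-test-soft-lockup.patch` are not shown here. Judging only from the log in this section (a module named `softLockup` whose `softlockup_init` prints its own name and never returns), an equivalent test module could be sketched as follows. Everything in it is an assumption rather than the author's patch: the out-of-tree layout, the scratch directory, the lock name `test_lock`, and the choice to spin under a spinlock (taken from the example in the introduction).

```shell
# Assumption: build as an out-of-tree module in a scratch directory.
moddir=${TMPDIR:-/tmp}/softLockup-test
mkdir -p "$moddir"

# Hypothetical module source: take a spinlock (which disables preemption)
# and spin forever with interrupts still enabled, so the watchdog timer
# keeps firing and can report the stuck CPU.
cat > "$moddir/softLockup.c" <<'EOF'
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/processor.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(test_lock);

static int __init softlockup_init(void)
{
    pr_info("softlockup_init\n");

    spin_lock(&test_lock);  /* preemption is off from here on */
    for (;;)
        cpu_relax();        /* never leaves the critical section */

    return 0;               /* not reached */
}
module_init(softlockup_init);

MODULE_LICENSE("GPL");
EOF

# Minimal kbuild file for the module.
cat > "$moddir/Makefile" <<'EOF'
obj-m += softLockup.o
EOF

echo "wrote $moddir/softLockup.c and $moddir/Makefile"
```

Running `make M=<that directory> modules` from the kernel tree would then be expected to produce `softLockup.ko`.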
## start QEMU, then load softLockup.ko
$ insmod softLockup.ko &
$
[ 53.602662] softlockup_init
[ 80.977020] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [insmod:174]
[ 80.977286] Modules linked in: softLockup(+)
[ 80.977823] CPU: 0 PID: 174 Comm: insmod Not tainted 5.4.0 #6
[ 80.977990] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 80.978769] RIP: 0010:softlockup_init+0x13/0x1000 [softLockup]
[ 80.979193] Code: Bad RIP value.
[ 80.979293] RSP: 0018:ffffbddb4024bcb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[ 80.979493] RAX: 000000000000000f RBX: 0000000000000000 RCX: 0000000000000000
[ 80.979649] RDX: 0000000000000000 RSI: ffff9ff2078163d8 RDI: ffff9ff2078163d8
[ 80.979815] RBP: ffffffffc0315000 R08: 000000000000017e R09: 000000000000001e
[ 80.979959] R10: ffff9ff204b71640 R11: ffffbddb4024bb68 R12: ffff9ff204b59290
[ 80.980112] R13: 0000000000000001 R14: 0000000000000001 R15: ffffbddb4024be88
[ 80.980304] FS: 00007f2510f89740(0000) GS:ffff9ff207800000(0000) knlGS:0000000000000000
[ 80.980494] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 80.980632] CR2: ffffffffc0314fe9 CR3: 00000000044a6000 CR4: 00000000000006f0
[ 80.980846] Call Trace:
[ 80.981536] do_one_initcall+0x41/0x1df
[ 80.981809] ? _cond_resched+0x10/0x40
[ 80.981907] ? kmem_cache_alloc_trace+0x36/0x1b0
[ 80.982055] do_init_module+0x56/0x1ee
[ 80.982158] load_module+0x1f84/0x2660
[ 80.982289] ? vfs_read+0x10e/0x130
[ 80.982384] ? __do_sys_finit_module+0xba/0xe0
[ 80.982494] __do_sys_finit_module+0xba/0xe0
[ 80.982604] do_syscall_64+0x43/0x120
[ 80.982692] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 80.982904] RIP: 0033:0x7f2510876839
[ 80.983099] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[ 80.983511] RSP: 002b:00007ffdd54b51d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 80.983690] RAX: ffffffffffffffda RBX: 000000000000005f RCX: 00007f2510876839
[ 80.983857] RDX: 0000000000000000 RSI: 0000557920abf260 RDI: 0000000000000003
[ 80.983997] RBP: 0000557920abf260 R08: 0000000000000000 R09: 00007f251076c9d0
[ 80.984157] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 80.984320] R13: 00007ffdd54b5f4e R14: 0000000000000000 R15: 0000000000000000
[ 80.984627] Kernel panic - not syncing: softlockup: hung tasks
[ 80.984861] CPU: 0 PID: 174 Comm: insmod Tainted: G L 5.4.0 #6
[ 80.985014] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 80.985236] Call Trace:
[ 80.985433] <IRQ>
[ 80.985508] dump_stack+0x50/0x6b
[ 80.985582] panic+0xf3/0x2c8
[ 80.985661] watchdog_timer_fn+0x237/0x240
[ 80.985768] ? softlockup_fn+0x40/0x40
[ 80.985852] __hrtimer_run_queues+0x103/0x280
[ 80.985956] hrtimer_interrupt+0xe0/0x240
[ 80.986060] smp_apic_timer_interrupt+0x5d/0x120
[ 80.986182] apic_timer_interrupt+0xf/0x20
[ 80.986317] </IRQ>
[ 80.986391] RIP: 0010:softlockup_init+0x13/0x1000 [softLockup]
[ 80.986533] Code: Bad RIP value.
[ 80.986612] RSP: 0018:ffffbddb4024bcb8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13
[ 80.986778] RAX: 000000000000000f RBX: 0000000000000000 RCX: 0000000000000000
[ 80.986914] RDX: 0000000000000000 RSI: ffff9ff2078163d8 RDI: ffff9ff2078163d8
[ 80.987062] RBP: ffffffffc0315000 R08: 000000000000017e R09: 000000000000001e
[ 80.987217] R10: ffff9ff204b71640 R11: ffffbddb4024bb68 R12: ffff9ff204b59290
[ 80.987360] R13: 0000000000000001 R14: 0000000000000001 R15: ffffbddb4024be88
[ 80.987550] ? 0xffffffffc0315000
[ 80.987645] ? softlockup_init+0x13/0x1000 [softLockup]
[ 80.987760] do_one_initcall+0x41/0x1df
[ 80.987854] ? _cond_resched+0x10/0x40
[ 80.987949] ? kmem_cache_alloc_trace+0x36/0x1b0
[ 80.988057] do_init_module+0x56/0x1ee
[ 80.988146] load_module+0x1f84/0x2660
[ 80.988226] ? vfs_read+0x10e/0x130
[ 80.988318] ? __do_sys_finit_module+0xba/0xe0
[ 80.988417] __do_sys_finit_module+0xba/0xe0
[ 80.988524] do_syscall_64+0x43/0x120
[ 80.988613] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 80.988732] RIP: 0033:0x7f2510876839
[ 80.988818] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[ 80.989219] RSP: 002b:00007ffdd54b51d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 80.989387] RAX: ffffffffffffffda RBX: 000000000000005f RCX: 00007f2510876839
[ 80.989545] RDX: 0000000000000000 RSI: 0000557920abf260 RDI: 0000000000000003
[ 80.989707] RBP: 0000557920abf260 R08: 0000000000000000 R09: 00007f251076c9d0
[ 80.989854] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 80.990007] R13: 00007ffdd54b5f4e R14: 0000000000000000 R15: 0000000000000000
[ 80.990534] Kernel Offset: 0xbe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 80.990933] ---[ end Kernel panic - not syncing: softlockup: hung tasks ]---
## apply the hard lockup test patch
$ git am 0001-test-hard-lockup.patch
$ make
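The contents of `0001-test-hard-lockup.patch` are not shown here. Following the introduction's example (local interrupts disabled for too long), a test module might look like the sketch below. The function name `hardlockup_init`, the scratch directory and the out-of-tree layout are assumptions, not the author's patch.

```shell
# Assumption: build as an out-of-tree module in a scratch directory.
moddir=${TMPDIR:-/tmp}/hardLockup-test
mkdir -p "$moddir"

# Hypothetical module source: disable local interrupts and spin forever.
# With maskable interrupts off, the timer interrupt can no longer run on
# this CPU, so only an NMI-driven detector can notice the lockup.
cat > "$moddir/hardLockup.c" <<'EOF'
#include <linux/init.h>
#include <linux/irqflags.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/processor.h>

static int __init hardlockup_init(void)
{
    pr_info("hardlockup_init\n");

    local_irq_disable();    /* maskable interrupts are off from here on */
    for (;;)
        cpu_relax();        /* interrupts are never re-enabled */

    return 0;               /* not reached */
}
module_init(hardlockup_init);

MODULE_LICENSE("GPL");
EOF

# Minimal kbuild file for the module.
cat > "$moddir/Makefile" <<'EOF'
obj-m += hardLockup.o
EOF

echo "wrote $moddir/hardLockup.c and $moddir/Makefile"
```

Running `make M=<that directory> modules` from the kernel tree would then be expected to produce `hardLockup.ko`.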
## start QEMU, then load hardLockup.ko
$ insmod hardLockup.ko &
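## the hard lockup panic does not trigger
## Reason: the hard lockup detector relies on NMIs (on x86 these normally come from the CPU's performance-monitoring hardware). The QEMU setup used here does not emulate that NMI source, which is available on real hardware, so this experiment cannot be run in QEMU.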
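Whether the hard lockup detector is actually armed can be checked from inside the guest before loading the module. A sketch, assuming the kernel log is readable via `dmesg` and that the detector logs boot messages containing `NMI watchdog` (the exact wording varies between kernel versions); `nmi_count` is an illustrative helper name:

```shell
# nmi_count [file]: sum the per-CPU counters on the NMI line of
# /proc/interrupts; prints 0 when there is no NMI line.
nmi_count() {
    awk '$1 == "NMI:" {
             n = 0
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) n += $i
             print n
             found = 1
         }
         END { if (!found) print 0 }' "${1:-/proc/interrupts}"
}

# Boot-time messages from the detector, if any.
dmesg 2>/dev/null | grep -i 'NMI watchdog' || echo "no NMI watchdog messages found"

# NMIs delivered since boot, summed over all CPUs.
if [ -r /proc/interrupts ]; then
    echo "NMIs since boot: $(nmi_count)"
fi
```

A count that stays at zero while the system is busy suggests that no NMI source is available to the detector, which would match the behaviour described above.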