From 34aaff40b42148b23dcde40152480e25c7d2d759 Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Wed, 14 Dec 2016 15:05:58 -0800 Subject: kdb: call vkdb_printf() from vprintk_default() only when wanted kdb_trap_printk allows to pass normal printk() messages to kdb via vkdb_printk(). For example, it is used to get backtrace using the classic show_stack(), see kdb_show_stack(). vkdb_printf() tries to avoid a potential infinite loop by disabling the trap. But this approach is racy, for example: CPU1 CPU2 vkdb_printf() // assume that kdb_trap_printk == 0 saved_trap_printk = kdb_trap_printk; kdb_trap_printk = 0; kdb_show_stack() kdb_trap_printk++; Problem1: Now, a nested printk() on CPU0 calls vkdb_printf() even when it should have been disabled. It will not cause a deadlock but... // using the outdated saved value: 0 kdb_trap_printk = saved_trap_printk; kdb_trap_printk--; Problem2: Now, kdb_trap_printk == -1 and will stay like this. It means that all messages will get passed to kdb from now on. This patch removes the racy saved_trap_printk handling. Instead, the recursion is prevented by a check for the locked CPU. The solution is still kind of racy. A non-related printk(), from another process, might get trapped by vkdb_printf(). And the wanted printk() might not get trapped because kdb_printf_cpu is assigned. But this problem existed even with the original code. A proper solution would be to get_cpu() before setting kdb_trap_printk and trap messages only from this CPU. I am not sure if it is worth the effort, though. In fact, the race is very theoretical. When kdb is running any of the commands that use kdb_trap_printk there is a single active CPU and the other CPUs should be in a holding pen inside kgdb_cpu_enter(). The only time this is violated is when there is a timeout waiting for the other CPUs to report to the holding pen. Finally, note that the situation is a bit schizophrenic. vkdb_printf() explicitly allows recursion but only from KDB code that calls kdb_printf() directly. On the other hand, the generic printk() recursion is not allowed because it might cause an infinite loop. This is why we could not hide the decision inside vkdb_printf() easily. Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.com Signed-off-by: Petr Mladek Cc: Daniel Thompson Cc: Jason Wessel Cc: Peter Zijlstra Cc: Sergey Senozhatsky Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/kdb.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/linux') diff --git a/include/linux/kdb.h b/include/linux/kdb.h index eb706188dc23..68bd88223417 100644 --- a/include/linux/kdb.h +++ b/include/linux/kdb.h @@ -161,6 +161,7 @@ enum kdb_msgsrc { }; extern int kdb_trap_printk; +extern int kdb_printf_cpu; extern __printf(2, 0) int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list args); extern __printf(1, 2) int kdb_printf(const char *, ...); -- cgit v1.2.3