<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/Documentation/admin-guide/sysctl/kernel.rst, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2023-01-24T06:24:41+00:00</updated>
<entry>
<title>panic: Introduce warn_limit</title>
<updated>2023-01-24T06:24:41+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2022-11-17T23:43:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f53b6dda4d9b6e4ba1af5efd6c6650784996b4e7'/>
<id>urn:sha1:f53b6dda4d9b6e4ba1af5efd6c6650784996b4e7</id>
<content type='text'>
commit 9fc9e278a5c0b708eeffaf47d6eb0c82aa74ed78 upstream.

Like oops_limit, add warn_limit for limiting the number of warnings when
panic_on_warn is not set.

Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: "Jason A. Donenfeld" &lt;Jason@zx2c4.com&gt;
Cc: Eric Biggers &lt;ebiggers@google.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: tangmeng &lt;tangmeng@uniontech.com&gt;
Cc: "Guilherme G. Piccoli" &lt;gpiccoli@igalia.com&gt;
Cc: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: linux-doc@vger.kernel.org
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20221117234328.594699-5-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>exit: Allow oops_limit to be disabled</title>
<updated>2023-01-24T06:24:41+00:00</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2022-12-02T20:59:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e0738725bbf64b4c24ea8da4bbc70904e5d3f5f8'/>
<id>urn:sha1:e0738725bbf64b4c24ea8da4bbc70904e5d3f5f8</id>
<content type='text'>
commit de92f65719cd672f4b48397540b9f9eff67eca40 upstream.

In preparation for keeping oops_limit logic in sync with warn_limit,
have oops_limit == 0 disable checking the Oops counter.

Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: "Jason A. Donenfeld" &lt;Jason@zx2c4.com&gt;
Cc: Eric Biggers &lt;ebiggers@google.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: linux-doc@vger.kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>exit: Put an upper limit on how often we can oops</title>
<updated>2023-01-24T06:24:41+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2022-11-17T23:43:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=767997ef5dc0de5f395b69f5f26eb8b01c65ae1a'/>
<id>urn:sha1:767997ef5dc0de5f395b69f5f26eb8b01c65ae1a</id>
<content type='text'>
commit d4ccd54d28d3c8598e2354acc13e28c060961dbb upstream.

Many Linux systems are configured to not panic on oops; but allowing an
attacker to oops the system **really** often can make even bugs that look
completely unexploitable exploitable (like NULL dereferences and such) if
each crash elevates a refcount by one or a lock is taken in read mode, and
this causes a counter to eventually overflow.

The most interesting counters for this are 32 bits wide (like open-coded
refcounts that don't use refcount_t). (The ldsem reader count on 32-bit
platforms is just 16 bits, but probably nobody cares about 32-bit platforms
that much nowadays.)

So let's panic the system if the kernel is constantly oopsing.

The speed of oopsing 2^32 times probably depends on several factors, like
how long the stack trace is and which unwinder you're using; an empirically
important one is whether your console is showing a graphical environment or
a text console that oopses will be printed to.
In a quick single-threaded benchmark, it looks like oopsing in a vfork()
child with a very short stack trace only takes ~510 microseconds per run
when a graphical console is active; but switching to a text console that
oopses are printed to slows it down around 87x, to ~45 milliseconds per
run.
(Adding more threads makes this faster, but the actual oops printing
happens under &amp;die_lock on x86, so you can maybe speed this up by a factor
of around 2 and then any further improvement gets eaten up by lock
contention.)

It looks like it would take around 8-12 days to overflow a 32-bit counter
with repeated oopsing on a multi-core X86 system running a graphical
environment; both me (in an X86 VM) and Seth (with a distro kernel on
normal hardware in a standard configuration) got numbers in that ballpark.

12 days aren't *that* short on a desktop system, and you'd likely need much
longer on a typical server system (assuming that people don't run graphical
desktop environments on their servers), and this is a *very* noisy and
violent approach to exploiting the kernel; and it also seems to take orders
of magnitude longer on some machines, probably because stuff like EFI
pstore will slow it down a ton if that's active.

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Link: https://lore.kernel.org/r/20221107201317.324457-1-jannh@google.com
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20221117234328.594699-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>x86/split_lock: Add sysctl to control the misery mode</title>
<updated>2022-12-31T12:31:55+00:00</updated>
<author>
<name>Guilherme G. Piccoli</name>
<email>gpiccoli@igalia.com</email>
</author>
<published>2022-10-24T20:02:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bb1878d741ffe848c05313b99e0938621e1df0ce'/>
<id>urn:sha1:bb1878d741ffe848c05313b99e0938621e1df0ce</id>
<content type='text'>
[ Upstream commit 727209376f4998bc84db1d5d8af15afea846a92b ]

Commit b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
changed the way the split lock detector works when in "warn" mode;
basically, it not only shows the warn message, but also intentionally
introduces a slowdown through sleeping plus serialization mechanism
on such task. Based on discussions in [0], seems the warning alone
wasn't enough motivation for userspace developers to fix their
applications.

This slowdown is enough to totally break some proprietary (aka.
unfixable) userspace[1].

Happens that originally the proposal in [0] was to add a new mode
which would warns + slowdown the "split locking" task, keeping the
old warn mode untouched. In the end, that idea was discarded and
the regular/default "warn" mode now slows down the applications. This
is quite aggressive with regards proprietary/legacy programs that
basically are unable to properly run in kernel with this change.
While it is understandable that a malicious application could DoS
by split locking, it seems unacceptable to regress old/proprietary
userspace programs through a default configuration that previously
worked. An example of such breakage was reported in [1].

Add a sysctl to allow controlling the "misery mode" behavior, as per
Thomas suggestion on [2]. This way, users running legacy and/or
proprietary software are allowed to still execute them with a decent
performance while still observing the warning messages on kernel log.

[0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@intel.com/
[1] https://github.com/doitsujin/dxvk/issues/2938
[2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/

[ dhansen: minor changelog tweaks, including clarifying the actual
  	   problem ]

Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Guilherme G. Piccoli &lt;gpiccoli@igalia.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Andre Almeida &lt;andrealmeid@igalia.com&gt;
Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-10-12T18:00:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-12T18:00:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=676cb4957396411fdb7aba906d5f950fc3de7cc9'/>
<id>urn:sha1:676cb4957396411fdb7aba906d5f950fc3de7cc9</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
</content>
</entry>
<entry>
<title>kernel/utsname_sysctl.c: print kernel arch</title>
<updated>2022-09-12T04:55:12+00:00</updated>
<author>
<name>Petr Vorel</name>
<email>pvorel@suse.cz</email>
</author>
<published>2022-09-01T19:44:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bfca3dd3d0680fc2fc7f659a152234afbac26e4d'/>
<id>urn:sha1:bfca3dd3d0680fc2fc7f659a152234afbac26e4d</id>
<content type='text'>
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch.

This helps people who debug kernel with initramfs with minimal environment
(i.e.  without coreutils or even busybox) or allow to open sysfs file
instead of run 'uname -m' in high level languages.

Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.cz
Signed-off-by: Petr Vorel &lt;pvorel@suse.cz&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: "Eric W . Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory tiering: rate limit NUMA migration throughput</title>
<updated>2022-09-12T03:25:54+00:00</updated>
<author>
<name>Huang Ying</name>
<email>ying.huang@intel.com</email>
</author>
<published>2022-07-13T08:39:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c6833e10008f976a173dd5abdf992e492cbc3bcf'/>
<id>urn:sha1:c6833e10008f976a173dd5abdf992e492cbc3bcf</id>
<content type='text'>
In NUMA balancing memory tiering mode, if there are hot pages in slow
memory node and cold pages in fast memory node, we need to promote/demote
hot/cold pages between the fast and cold memory nodes.

A choice is to promote/demote as fast as possible.  But the CPU cycles and
memory bandwidth consumed by the high promoting/demoting throughput will
hurt the latency of some workload because of accessing inflating and slow
memory bandwidth contention.

A way to resolve this issue is to restrict the max promoting/demoting
throughput.  It will take longer to finish the promoting/demoting.  But
the workload latency will be better.  This is implemented in this patch as
the page promotion rate limit mechanism.

The number of the candidate pages to be promoted to the fast memory node
via NUMA balancing is counted, if the count exceeds the limit specified by
the users, the NUMA balancing promotion will be stopped until the next
second.

A new sysctl knob kernel.numa_balancing_promote_rate_limit_MBps is added
for the users to specify the limit.

Link: https://lkml.kernel.org/r/20220713083954.34196-3-ying.huang@intel.com
Signed-off-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Reviewed-by: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Tested-by: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: osalvador &lt;osalvador@suse.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Wei Xu &lt;weixugc@google.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Zhong Jiang &lt;zhongjiang-ali@linux.alibaba.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux</title>
<updated>2022-08-06T23:38:17+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-06T23:38:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cae4199f9319f42534ee2e2e4aadf183b9bb7f73'/>
<id>urn:sha1:cae4199f9319f42534ee2e2e4aadf183b9bb7f73</id>
<content type='text'>
Pull powerpc updates from Michael Ellerman:

 - Add support for syscall stack randomization

 - Add support for atomic operations to the 32 &amp; 64-bit BPF JIT

 - Full support for KASAN on 64-bit Book3E

 - Add a watchdog driver for the new PowerVM hypervisor watchdog

 - Add a number of new selftests for the Power10 PMU support

 - Add a driver for the PowerVM Platform KeyStore

 - Increase the NMI watchdog timeout during live partition migration, to
   avoid timeouts due to increased memory access latency

 - Add support for using the 'linux,pci-domain' device tree property for
   PCI domain assignment

 - Many other small features and fixes

Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N.  Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.

* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
  powerpc/64e: Fix kexec build error
  EDAC/ppc_4xx: Include required of_irq header directly
  powerpc/pci: Fix PHB numbering when using opal-phbid
  powerpc/64: Init jump labels before parse_early_param()
  selftests/powerpc: Avoid GCC 12 uninitialised variable warning
  powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
  powerpc/xive: Fix refcount leak in xive_get_max_prio
  powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
  powerpc/perf: Include caps feature for power10 DD1 version
  powerpc: add support for syscall stack randomization
  powerpc: Move system_call_exception() to syscall.c
  powerpc/powernv: rename remaining rng powernv_ functions to pnv_
  powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
  powerpc/powernv: Avoid crashing if rng is NULL
  selftests/powerpc: Fix matrix multiply assist test
  powerpc/signal: Update comment for clarity
  powerpc: make facility_unavailable_exception 64s
  powerpc/platforms/83xx/suspend: Remove write-only global variable
  powerpc/platforms/83xx/suspend: Prevent unloading the driver
  powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
  ...
</content>
</entry>
<entry>
<title>powerpc/pseries/mobility: set NMI watchdog factor during an LPM</title>
<updated>2022-07-27T11:36:03+00:00</updated>
<author>
<name>Laurent Dufour</name>
<email>ldufour@linux.ibm.com</email>
</author>
<published>2022-07-13T15:47:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=118b1366930c8c833b8b36abef657f40d4e26610'/>
<id>urn:sha1:118b1366930c8c833b8b36abef657f40d4e26610</id>
<content type='text'>
During an LPM, while the memory transfer is in progress on the arrival
side, some latencies are generated when accessing not yet transferred
pages on the arrival side. Thus, the NMI watchdog may be triggered too
frequently, which increases the risk to hit an NMI interrupt in a bad
place in the kernel, leading to a kernel panic.

Disabling the Hard Lockup Watchdog until the memory transfer could be a
too strong work around, some users would want this timeout to be
eventually triggered if the system is hanging even during an LPM.

Introduce a new sysctl variable nmi_watchdog_factor. It allows to apply
a factor to the NMI watchdog timeout during an LPM. Just before the CPUs
are stopped for the switchover sequence, the NMI watchdog timer is set
to watchdog_thresh + factor%

A value of 0 has no effect. The default value is 200, meaning that the
NMI watchdog is set to 30s during LPM (based on a 10s watchdog_thresh
value). Once the memory transfer is achieved, the factor is reset to 0.

Setting this value to a high number is like disabling the NMI watchdog
during an LPM.

Signed-off-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Reviewed-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20220713154729.80789-5-ldufour@linux.ibm.com

</content>
</entry>
<entry>
<title>docs: admin-guide/sysctl: Fix rendering error</title>
<updated>2022-06-24T19:07:37+00:00</updated>
<author>
<name>Stephen Kitt</name>
<email>steve@sk2.org</email>
</author>
<published>2022-06-24T11:02:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=30fb8761419ebf1f26492a0bfaa5d795aa819179'/>
<id>urn:sha1:30fb8761419ebf1f26492a0bfaa5d795aa819179</id>
<content type='text'>
Text in ``literal`` markup must be separated by word separators, so text
like ``lowwater``% renders incorrectly.  Add the suggested "\ " after two
problematic occurrences.

Signed-off-by: Stephen Kitt &lt;steve@sk2.org&gt;
Link: https://lore.kernel.org/r/20220624110230.595740-1-steve@sk2.org
[jc: tweaked to use "\ "]
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</content>
</entry>
</feed>
