<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/arch, branch v5.10.5</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.5</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.5'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-01-06T13:56:55+00:00</updated>
<entry>
<title>s390: always clear kernel stack backchain before calling functions</title>
<updated>2021-01-06T13:56:55+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>hca@linux.ibm.com</email>
</author>
<published>2020-12-04T16:56:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=13f9eec229734b6952089b9bb315b2bd9c0f73b3'/>
<id>urn:sha1:13f9eec229734b6952089b9bb315b2bd9c0f73b3</id>
<content type='text'>
[ Upstream commit 9365965db0c7ca7fc81eee27c21d8522d7102c32 ]

Clear the kernel stack backchain before potentially calling the
lockdep trace_hardirqs_off/on functions. Without this walking the
kernel backchain, e.g. during a panic, might stop too early.

Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>um: ubd: Submit all data segments atomically</title>
<updated>2021-01-06T13:56:55+00:00</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@collabora.com</email>
</author>
<published>2020-11-22T04:13:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ef3b9ad967d0bdfb4d18dad4e11279fdbd3256fb'/>
<id>urn:sha1:ef3b9ad967d0bdfb4d18dad4e11279fdbd3256fb</id>
<content type='text'>
[ Upstream commit fc6b6a872dcd48c6f39c7975836d75113db67d37 ]

Internally, UBD treats each physical IO segment as a separate command to
be submitted in the execution pipe.  If the pipe returns a transient
error after a few segments have already been written, UBD will tell the
block layer to requeue the request, but there is no way to reclaim the
segments already submitted.  When a new attempt to dispatch the request
is done, those segments already submitted will get duplicated, causing
the WARN_ON below in the best case, and potentially data corruption.

In my system, running a UML instance with 2GB of RAM and a 50M UBD disk,
I can reproduce the WARN_ON by simply running mkfs.fvat against the
disk on a freshly booted system.

There are a few ways to around this, like reducing the pressure on
the pipe by reducing the queue depth, which almost eliminates the
occurrence of the problem, increasing the pipe buffer size on the host
system, or by limiting the request to one physical segment, which causes
the block layer to submit way more requests to resolve a single
operation.

Instead, this patch modifies the format of a UBD command, such that all
segments are sent through a single element in the communication pipe,
turning the command submission atomic from the point of view of the
block layer.  The new format has a variable size, depending on the
number of elements, and looks like this:

+------------+-----------+-----------+------------
| cmd_header | segment 0 | segment 1 | segment ...
+------------+-----------+-----------+------------

With this format, we push a pointer to cmd_header in the submission
pipe.

This has the advantage of reducing the memory footprint of executing a
single request, since it allow us to merge some fields in the header.
It is possible to reduce even further each segment memory footprint, by
merging bitmap_words and cow_offset, for instance, but this is not the
focus of this patch and is left as future work.  One issue with the
patch is that for a big number of segments, we now perform one big
memory allocation instead of multiple small ones, but I wasn't able to
trigger any real issues or -ENOMEM because of this change, that wouldn't
be reproduced otherwise.

This was tested using fio with the verify-crc32 option, and by running
an ext4 filesystem over this UBD device.

The original WARN_ON was:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141
refcount_t: underflow; use-after-free.
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346
Stack:
 6084eed0 6063dc77 00000009 6084ef60
 00000000 604b8d9f 6084eee0 6063dcbc
 6084ef40 6006ab8d e013d780 1c00000000
Call Trace:
 [&lt;600a0c1c&gt;] ? printk+0x0/0x94
 [&lt;6004a888&gt;] show_stack+0x13b/0x155
 [&lt;6063dc77&gt;] ? dump_stack_print_info+0xdf/0xe8
 [&lt;604b8d9f&gt;] ? refcount_warn_saturate+0x13f/0x141
 [&lt;6063dcbc&gt;] dump_stack+0x2a/0x2c
 [&lt;6006ab8d&gt;] __warn+0x107/0x134
 [&lt;6008da6c&gt;] ? wake_up_process+0x17/0x19
 [&lt;60487628&gt;] ? blk_queue_max_discard_sectors+0x0/0xd
 [&lt;6006b05f&gt;] warn_slowpath_fmt+0xd1/0xdf
 [&lt;6006af8e&gt;] ? warn_slowpath_fmt+0x0/0xdf
 [&lt;600acc14&gt;] ? raw_read_seqcount_begin.constprop.0+0x0/0x15
 [&lt;600619ae&gt;] ? os_nsecs+0x1d/0x2b
 [&lt;604b8d9f&gt;] refcount_warn_saturate+0x13f/0x141
 [&lt;6048bc8f&gt;] refcount_sub_and_test.constprop.0+0x2f/0x37
 [&lt;6048c8de&gt;] blk_mq_free_request+0xf1/0x10d
 [&lt;6048ca06&gt;] __blk_mq_end_request+0x10c/0x114
 [&lt;6005ac0f&gt;] ubd_intr+0xb5/0x169
 [&lt;600a1a37&gt;] __handle_irq_event_percpu+0x6b/0x17e
 [&lt;600a1b70&gt;] handle_irq_event_percpu+0x26/0x69
 [&lt;600a1bd9&gt;] handle_irq_event+0x26/0x34
 [&lt;600a1bb3&gt;] ? handle_irq_event+0x0/0x34
 [&lt;600a5186&gt;] ? unmask_irq+0x0/0x37
 [&lt;600a57e6&gt;] handle_edge_irq+0xbc/0xd6
 [&lt;600a131a&gt;] generic_handle_irq+0x21/0x29
 [&lt;60048f6e&gt;] do_IRQ+0x39/0x54
 [...]
---[ end trace c6e7444e55386c0f ]---

Cc: Christopher Obbard &lt;chris.obbard@collabora.com&gt;
Reported-by: Martyn Welch &lt;martyn@collabora.com&gt;
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@collabora.com&gt;
Tested-by: Christopher Obbard &lt;chris.obbard@collabora.com&gt;
Acked-by: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>um: random: Register random as hwrng-core device</title>
<updated>2021-01-06T13:56:55+00:00</updated>
<author>
<name>Christopher Obbard</name>
<email>chris.obbard@collabora.com</email>
</author>
<published>2020-10-27T15:30:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a8b49c4bdf8770008ab72fd4573bfd1d71ea71df'/>
<id>urn:sha1:a8b49c4bdf8770008ab72fd4573bfd1d71ea71df</id>
<content type='text'>
[ Upstream commit 72d3e093afae79611fa38f8f2cfab9a888fe66f2 ]

The UML random driver creates a dummy device under the guest,
/dev/hw_random. When this file is read from the guest, the driver
reads from the host machine's /dev/random, in-turn reading from
the host kernel's entropy pool. This entropy pool could have been
filled by a hardware random number generator or just the host
kernel's internal software entropy generator.

Currently the driver does not fill the guests kernel entropy pool,
this requires a userspace tool running inside the guest (like
rng-tools) to read from the dummy device provided by this driver,
which then would fill the guest's internal entropy pool.

This all seems quite pointless when we are already reading from an
entropy pool, so this patch aims to register the device as a hwrng
device using the hwrng-core framework. This not only improves and
cleans up the driver, but also fills the guest's entropy pool
without having to resort to using extra userspace tools in the guest.

This is typically a nuisance when booting a guest: the random pool
takes a long time (~200s) to build up enough entropy since the dummy
hwrng is not used to fill the guest's pool.

This port was originally attempted by Alexander Neville "dark" (in CC,
discussion in Link), but the conversation there stalled since the
handling of -EAGAIN errors were no removed and longer handled by the
driver. This patch attempts to use the existing method of error
handling but utilises the new hwrng core.

The issue can be noticed when booting a UML guest:

    [    2.560000] random: fast init done
    [  214.000000] random: crng init done

With the patch applied, filling the pool becomes a lot quicker:

    [    2.560000] random: fast init done
    [   12.000000] random: crng init done

Cc: Alexander Neville &lt;dark@volatile.bz&gt;
Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/
Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/
Cc: Sjoerd Simons &lt;sjoerd.simons@collabora.co.uk&gt;
Signed-off-by: Christopher Obbard &lt;chris.obbard@collabora.com&gt;
Acked-by: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>powerpc/64: irq replay remove decrementer overflow check</title>
<updated>2021-01-06T13:56:54+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2020-11-07T01:43:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b1e155ccc882cd54ca613965df5653860438b67a'/>
<id>urn:sha1:b1e155ccc882cd54ca613965df5653860438b67a</id>
<content type='text'>
[ Upstream commit 59d512e4374b2d8a6ad341475dc94c4a4bdec7d3 ]

This is way to catch some cases of decrementer overflow, when the
decrementer has underflowed an odd number of times, while MSR[EE] was
disabled.

With a typical small decrementer, a timer that fires when MSR[EE] is
disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and
8.6 seconds after the timer expires. In any case, the decrementer
interrupt would be taken at 8.6 seconds and the timer would be found at
that point.

So this check is for catching extreme latency events, and it prevents
those latencies from being a further few seconds long.  It's not obvious
this is a good tradeoff. This is already a watchdog magnitude event and
that situation is not improved a significantly with this check. For
large decrementers, it's useless.

Therefore remove this check, which avoids a mftb when enabling hard
disabled interrupts (e.g., when enabling after coming from hardware
interrupt handlers). Perhaps more importantly, it also removes the
clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay
which simplifies the code.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201107014336.2337337-1-npiggin@gmail.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()</title>
<updated>2021-01-06T13:56:53+00:00</updated>
<author>
<name>Qinglang Miao</name>
<email>miaoqinglang@huawei.com</email>
</author>
<published>2020-10-28T09:15:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=498d90690f24d13e11d961e8089e64f4e3aa0ff5'/>
<id>urn:sha1:498d90690f24d13e11d961e8089e64f4e3aa0ff5</id>
<content type='text'>
[ Upstream commit ffa1797040c5da391859a9556be7b735acbe1242 ]

I noticed that iounmap() of msgr_block_addr before return from
mpic_msgr_probe() in the error handling case is missing. So use
devm_ioremap() instead of just ioremap() when remapping the message
register block, so the mapping will be automatically released on
probe failure.

Signed-off-by: Qinglang Miao &lt;miaoqinglang@huawei.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm: memmap defer init doesn't work as expected</title>
<updated>2021-01-06T13:56:50+00:00</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-12-29T23:14:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=98b57685c26d8f41040ecf71e190250fb2eb2a0c'/>
<id>urn:sha1:98b57685c26d8f41040ecf71e190250fb2eb2a0c</id>
<content type='text'>
commit dc2da7b45ffe954a0090f5d0310ed7b0b37d2bd2 upstream.

VMware observed a performance regression during memmap init on their
platform, and bisected to commit 73a6e474cb376 ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") causing it.

Before the commit:

  [0.033176] Normal zone: 1445888 pages used for memmap
  [0.033176] Normal zone: 89391104 pages, LIFO batch:63
  [0.035851] ACPI: PM-Timer IO Port: 0x448

With commit

  [0.026874] Normal zone: 1445888 pages used for memmap
  [0.026875] Normal zone: 89391104 pages, LIFO batch:63
  [2.028450] ACPI: PM-Timer IO Port: 0x448

The root cause is the current memmap defer init doesn't work as expected.

Before, memmap_init_zone() was used to do memmap init of one whole zone,
to initialize all low zones of one numa node, but defer memmap init of
the last zone in that numa node.  However, since commit 73a6e474cb376,
function memmap_init() is adapted to iterater over memblock regions
inside one zone, then call memmap_init_zone() to do memmap init for each
region.

E.g, on VMware's system, the memory layout is as below, there are two
memory regions in node 2.  The current code will mistakenly initialize the
whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
iniatialize only one memmory section on the 2nd region [mem
0x10000000000-0x1033fffffff].  In fact, we only expect to see that there's
only one memory section's memmap initialized.  That's why more time is
costed at the time.

[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[    0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
[    0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
[    0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
[    0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
[    0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]

Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
down the real zone end pfn so that defer_init() can use it to judge
whether defer need be taken in zone wide.

Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
Fixes: commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Reported-by: Rahul Gopakumar &lt;gopakumarr@vmware.com&gt;
Reviewed-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>x86/CPU/AMD: Save AMD NodeId as cpu_die_id</title>
<updated>2020-12-30T10:54:29+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2020-11-09T21:06:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=700d098acec5271161606f3c0086b71695ea2ef8'/>
<id>urn:sha1:700d098acec5271161606f3c0086b71695ea2ef8</id>
<content type='text'>
[ Upstream commit 028c221ed1904af9ac3c5162ee98f48966de6b3d ]

AMD systems provide a "NodeId" value that represents a global ID
indicating to which "Node" a logical CPU belongs. The "Node" is a
physical structure equivalent to a Die, and it should not be confused
with logical structures like NUMA nodes. Logical nodes can be adjusted
based on firmware or other settings whereas the physical nodes/dies are
fixed based on hardware topology.

The NodeId value can be used when a physical ID is needed by software.

Save the AMD NodeId to struct cpuinfo_x86.cpu_die_id. Use the value
from CPUID or MSR as appropriate. Default to phys_proc_id otherwise.
Do so for both AMD and Hygon systems.

Drop the node_id parameter from cacheinfo_*_init_llc_id() as it is no
longer needed.

Update the x86 topology documentation.

Suggested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201109210659.754018-2-Yazen.Ghannam@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS"</title>
<updated>2020-12-30T10:54:29+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2020-12-14T17:33:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2bbb32065694dc5e66a028f91d69a6bf592be430'/>
<id>urn:sha1:2bbb32065694dc5e66a028f91d69a6bf592be430</id>
<content type='text'>
commit adab66b71abfe206a020f11e561f4df41f0b2aba upstream.

It was believed that metag was the only architecture that required the ring
buffer to keep 8 byte words aligned on 8 byte architectures, and with its
removal, it was assumed that the ring buffer code did not need to handle
this case. It appears that sparc64 also requires this.

The following was reported on a sparc64 boot up:

   kernel: futex hash table entries: 65536 (order: 9, 4194304 bytes, linear)
   kernel: Running postponed tracer tests:
   kernel: Testing tracer function:
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: Kernel unaligned access at TPC[552a24] trace_function+0x44/0x140
   kernel: Kernel unaligned access at TPC[552a20] trace_function+0x40/0x140
   kernel: PASSED

Need to put back the 64BIT aligned code for the ring buffer.

Link: https://lore.kernel.org/r/CADxRZqzXQRYgKc=y-KV=S_yHL+Y8Ay2mh5ezeZUnpRvg+syWKw@mail.gmail.com

Cc: stable@vger.kernel.org
Fixes: 86b3de60a0b6 ("ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS")
Reported-by: Anatoly Pugachev &lt;matorola@gmail.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>um: Fix time-travel mode</title>
<updated>2020-12-30T10:54:17+00:00</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2020-11-20T20:08:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ef82413937d1656ff1b74310bb5496b18df1520e'/>
<id>urn:sha1:ef82413937d1656ff1b74310bb5496b18df1520e</id>
<content type='text'>
commit ff9632d2a66512436d616ef4c380a0e73f748db1 upstream.

Since the time-travel rework, basic time-travel mode hasn't worked
properly, but there's no longer a need for this WARN_ON() so just
remove it and thereby fix things.

Cc: stable@vger.kernel.org
Fixes: 4b786e24ca80 ("um: time-travel: Rewrite as an event scheduler")
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>um: Remove use of asprinf in umid.c</title>
<updated>2020-12-30T10:54:17+00:00</updated>
<author>
<name>Anton Ivanov</name>
<email>anton.ivanov@cambridgegreys.com</email>
</author>
<published>2020-11-13T10:26:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c4b425322134e2ee2950bc1ca54a66511a1526a9'/>
<id>urn:sha1:c4b425322134e2ee2950bc1ca54a66511a1526a9</id>
<content type='text'>
commit 97be7ceaf7fea68104824b6aa874cff235333ac1 upstream.

asprintf is not compatible with the existing uml memory allocation
mechanism. Its use on the "user" side of UML results in a corrupt slab
state.

Fixes: 0d4e5ac7e780 ("um: remove uses of variable length arrays")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Ivanov &lt;anton.ivanov@cambridgegreys.com&gt;
Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
