<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux, branch v5.10.36</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.36</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.10.36'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-05-11T12:47:31+00:00</updated>
<entry>
<title>perf: Rework perf_event_exit_event()</title>
<updated>2021-05-11T12:47:31+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-04-08T10:35:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=94902ee2996a7f71471138093495df452dab87b6'/>
<id>urn:sha1:94902ee2996a7f71471138093495df452dab87b6</id>
<content type='text'>
[ Upstream commit ef54c1a476aef7eef26fe13ea10dc090952c00f8 ]

Make perf_event_exit_event() more robust, such that we can use it from
other contexts. Specifically the up and coming remove_on_exec.

For this to work we need to address a few issues. Remove_on_exec will
not destroy the entire context, so we cannot rely on TASK_TOMBSTONE to
disable event_function_call() and we thus have to use
perf_remove_from_context().

When using perf_remove_from_context(), there's two races to consider.
The first is against close(), where we can have concurrent tear-down
of the event. The second is against child_list iteration, which should
not find a half baked event.

To address this, teach perf_remove_from_context() to special case
!ctx-&gt;is_active and about DETACH_CHILD.

[ elver@google.com: fix racing parent/child exit in sync_child_event(). ]
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20210408103605.1676875-2-elver@google.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mfd: da9063: Support SMBus and I2C mode</title>
<updated>2021-05-11T12:47:31+00:00</updated>
<author>
<name>Hubert Streidl</name>
<email>hubert.streidl@de.bosch.com</email>
</author>
<published>2021-03-16T16:22:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=14c7e3f5bed5c7daa3f67cd1153b9d3618366f5a'/>
<id>urn:sha1:14c7e3f5bed5c7daa3f67cd1153b9d3618366f5a</id>
<content type='text'>
[ Upstream commit 586478bfc9f7e16504d6f64cf18bcbdf6fd0cbc9 ]

By default the PMIC DA9063 2-wire interface is SMBus compliant. This
means the PMIC will automatically reset the interface when the clock
signal ceases for more than the SMBus timeout of 35 ms.

If the I2C driver / device is not capable of creating atomic I2C
transactions, a context change can cause a ceasing of the clock signal.
This can happen if for example a real-time thread is scheduled. Then
the DA9063 in SMBus mode will reset the 2-wire interface. Subsequently
a write message could end up in the wrong register. This could cause
unpredictable system behavior.

The DA9063 PMIC also supports an I2C compliant mode for the 2-wire
interface. This mode does not reset the interface when the clock
signal ceases. Thus the problem depicted above does not occur.

This patch tests for the bus functionality "I2C_FUNC_I2C". It can
reasonably be assumed that the bus cannot obey SMBus timings if
this functionality is set. SMBus commands most probably are emulated
in this case which is prone to the latency issue described above.

This patch enables the I2C bus mode if I2C_FUNC_I2C is set or
otherwise keeps the default SMBus mode.

Signed-off-by: Hubert Streidl &lt;hubert.streidl@de.bosch.com&gt;
Signed-off-by: Mark Jonas &lt;mark.jonas@de.bosch.com&gt;
Reviewed-by: Wolfram Sang &lt;wsa+renesas@sang-engineering.com&gt;
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mfd: intel-m10-bmc: Fix the register access range</title>
<updated>2021-05-11T12:47:31+00:00</updated>
<author>
<name>Xu Yilun</name>
<email>yilun.xu@intel.com</email>
</author>
<published>2021-03-10T15:55:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7ec1dab6be76a1727e20797796d2e5802601520'/>
<id>urn:sha1:d7ec1dab6be76a1727e20797796d2e5802601520</id>
<content type='text'>
[ Upstream commit d9b326b2c3673f939941806146aee38e5c635fd0 ]

This patch fixes the max register address of MAX 10 BMC. The range
0x20000000 ~ 0x200000fc are for control registers of the QSPI flash
controller, which are not accessible to host.

Signed-off-by: Xu Yilun &lt;yilun.xu@intel.com&gt;
Reviewed-by: Tom Rix &lt;trix@redhat.com&gt;
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>power: supply: bq27xxx: fix power_avg for newer ICs</title>
<updated>2021-05-11T12:47:24+00:00</updated>
<author>
<name>Matthias Schiffer</name>
<email>matthias.schiffer@ew.tq-group.com</email>
</author>
<published>2021-03-03T09:54:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8ff0d8a0904bf38dd0f0c403e52e166b66c11e49'/>
<id>urn:sha1:8ff0d8a0904bf38dd0f0c403e52e166b66c11e49</id>
<content type='text'>
[ Upstream commit c4d57c22ac65bd503716062a06fad55a01569cac ]

On all newer bq27xxx ICs, the AveragePower register contains a signed
value; in addition to handling the raw value as unsigned, the driver
code also didn't convert it to µW as expected.

At least for the BQ28Z610, the reference manual incorrectly states that
the value is in units of 1mW and not 10mW. I have no way of knowing
whether the manuals of other supported ICs contain the same error, or if
there are models that actually use 1mW. At least, the new code shouldn't
be *less* correct than the old version for any device.

power_avg is removed from the cache structure, se we don't have to
extend it to store both a signed value and an error code. Always getting
an up-to-date value may be desirable anyways, as it avoids inconsistent
current and power readings when switching between charging and
discharging.

Signed-off-by: Matthias Schiffer &lt;matthias.schiffer@ew.tq-group.com&gt;
Signed-off-by: Sebastian Reichel &lt;sebastian.reichel@collabora.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mmc: core: Fix hanging on I/O during system suspend for removable cards</title>
<updated>2021-05-11T12:47:14+00:00</updated>
<author>
<name>Ulf Hansson</name>
<email>ulf.hansson@linaro.org</email>
</author>
<published>2021-03-10T15:29:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=44faf03f56b85bf523bda3d0db2121f2df7c722e'/>
<id>urn:sha1:44faf03f56b85bf523bda3d0db2121f2df7c722e</id>
<content type='text'>
commit 17a17bf50612e6048a9975450cf1bd30f93815b5 upstream.

The mmc core uses a PM notifier to temporarily during system suspend, turn
off the card detection mechanism for removal/insertion of (e)MMC/SD/SDIO
cards. Additionally, the notifier may be used to remove an SDIO card
entirely, if a corresponding SDIO functional driver don't have the system
suspend/resume callbacks assigned. This behaviour has been around for a
very long time.

However, a recent bug report tells us there are problems with this
approach. More precisely, when receiving the PM_SUSPEND_PREPARE
notification, we may end up hanging on I/O to be completed, thus also
preventing the system from getting suspended.

In the end what happens, is that the cancel_delayed_work_sync() in
mmc_pm_notify() ends up waiting for mmc_rescan() to complete - and since
mmc_rescan() wants to claim the host, it needs to wait for the I/O to be
completed first.

Typically, this problem is triggered in Android, if there is ongoing I/O
while the user decides to suspend, resume and then suspend the system
again. This due to that after the resume, an mmc_rescan() work gets punted
to the workqueue, which job is to verify that the card remains inserted
after the system has resumed.

To fix this problem, userspace needs to become frozen to suspend the I/O,
prior to turning off the card detection mechanism. Therefore, let's drop
the PM notifiers for mmc subsystem altogether and rely on the card
detection to be turned off/on as a part of the system_freezable_wq, that we
are already using.

Moreover, to allow and SDIO card to be removed during system suspend, let's
manage this from a -&gt;prepare() callback, assigned at the mmc_host_class
level. In this way, we can use the parent device (the mmc_host_class
device), to remove the card device that is the child, in the
device_prepare() phase.

Reported-by: Kiwoong Kim &lt;kwmad.kim@samsung.com&gt;
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt;
Reviewed-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Link: https://lore.kernel.org/r/20210310152900.149380-1-ulf.hansson@linaro.org
Reviewed-by: Kiwoong Kim &lt;kwmad.kim@samsung.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>swiotlb: add a IO_TLB_SIZE define</title>
<updated>2021-05-07T09:04:32+00:00</updated>
<author>
<name>Jianxiong Gao</name>
<email>jxgao@google.com</email>
</author>
<published>2021-04-29T17:33:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=22163a8ec863f98bc4a096ac08b2dfe9edc5ddf1'/>
<id>urn:sha1:22163a8ec863f98bc4a096ac08b2dfe9edc5ddf1</id>
<content type='text'>
commit: b5d7ccb7aac3895c2138fe0980a109116ce15eff

Add a new IO_TLB_SIZE define instead open coding it using
IO_TLB_SHIFT all over.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Jianxiong Gao &lt;jxgao@google.com&gt;
Tested-by: Jianxiong Gao &lt;jxgao@google.com&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Jianxiong Gao &lt;jxgao@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>driver core: add a min_align_mask field to struct device_dma_parameters</title>
<updated>2021-05-07T09:04:32+00:00</updated>
<author>
<name>Jianxiong Gao</name>
<email>jxgao@google.com</email>
</author>
<published>2021-04-29T17:33:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2e8b3b0b8e2d3b56a0b653980c62cbcb374ffedf'/>
<id>urn:sha1:2e8b3b0b8e2d3b56a0b653980c62cbcb374ffedf</id>
<content type='text'>
commit: 36950f2da1ea4cb683be174f6f581e25b2d33e71

Some devices rely on the address offset in a page to function
correctly (NVMe driver as an example). These devices may use
a different page size than the Linux kernel. The address offset
has to be preserved upon mapping, and in order to do so, we
need to record the page_offset_mask first.

Signed-off-by: Jianxiong Gao &lt;jxgao@google.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>capabilities: require CAP_SETFCAP to map uid 0</title>
<updated>2021-05-07T09:04:31+00:00</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge@hallyn.com</email>
</author>
<published>2021-04-20T13:43:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fb4c1c2e9fd1adb19452bbaa75f0f2bb2826ac0d'/>
<id>urn:sha1:fb4c1c2e9fd1adb19452bbaa75f0f2bb2826ac0d</id>
<content type='text'>
[ Upstream commit db2e718a47984b9d71ed890eb2ea36ecf150de18 ]

cap_setfcap is required to create file capabilities.

Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.

While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs.  File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0.  Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.

To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.

As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid.  In this case we do
not have the credential from before unshare, which was potentially more
restricted.  So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP.  Then we can use that during map_write().

With this patch:

1. Unprivileged user can still unshare -Ur

   ubuntu@caps:~$ unshare -Ur
   root@caps:~# logout

2. Root user can still unshare -Ur

   ubuntu@caps:~$ sudo bash
   root@caps:/home/ubuntu# unshare -Ur
   root@caps:/home/ubuntu# logout

3. Root user without CAP_SETFCAP cannot unshare -Ur:

   root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
   root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
   unable to set CAP_SETFCAP effective capability: Operation not permitted
   root@caps:/home/ubuntu# unshare -Ur
   unshare: write failed /proc/self/uid_map: Operation not permitted

Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities.  This approach can be seen at [1].

Background history: commit 95ebabde382 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps to be written to disk when the root uid would map
to the same uid in nested user namespaces.  This led to regressions for
various workloads.  For example, see [2].  Ultimately this is a valid
use-case we have to support meaning we had to revert this change in
3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1]
Link: https://github.com/containers/buildah/issues/3071 [2]
Signed-off-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Reviewed-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Tested-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Tested-by: Giuseppe Scrivano &lt;gscrivan@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix leakage of uninitialized bpf stack under speculation</title>
<updated>2021-05-07T09:04:31+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2021-04-29T15:19:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2fa15d61e4cbaaa1d1250e67b251ff96952fa614'/>
<id>urn:sha1:2fa15d61e4cbaaa1d1250e67b251ff96952fa614</id>
<content type='text'>
commit 801c6058d14a82179a7ee17a4b532cac6fad067f upstream.

The current implemented mechanisms to mitigate data disclosure under
speculation mainly address stack and map value oob access from the
speculative domain. However, Piotr discovered that uninitialized BPF
stack is not protected yet, and thus old data from the kernel stack,
potentially including addresses of kernel structures, could still be
extracted from that 512 bytes large window. The BPF stack is special
compared to map values since it's not zero initialized for every
program invocation, whereas map values /are/ zero initialized upon
their initial allocation and thus cannot leak any prior data in either
domain. In the non-speculative domain, the verifier ensures that every
stack slot read must have a prior stack slot write by the BPF program
to avoid such data leaking issue.

However, this is not enough: for example, when the pointer arithmetic
operation moves the stack pointer from the last valid stack offset to
the first valid offset, the sanitation logic allows for any intermediate
offsets during speculative execution, which could then be used to
extract any restricted stack content via side-channel.

Given for unprivileged stack pointer arithmetic the use of unknown
but bounded scalars is generally forbidden, we can simply turn the
register-based arithmetic operation into an immediate-based arithmetic
operation without the need for masking. This also gives the benefit
of reducing the needed instructions for the operation. Given after
the work in 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic
mask"), the aux-&gt;alu_limit already holds the final immediate value for
the offset register with the known scalar. Thus, a simple mov of the
immediate to AX register with using AX as the source for the original
instruction is sufficient and possible now in this case.

Reported-by: Piotr Krysiuk &lt;piotras@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Tested-by: Piotr Krysiuk &lt;piotras@gmail.com&gt;
Reviewed-by: Piotr Krysiuk &lt;piotras@gmail.com&gt;
Reviewed-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>bpf: Allow variable-offset stack access</title>
<updated>2021-04-28T11:40:00+00:00</updated>
<author>
<name>Andrei Matei</name>
<email>andreimatei1@gmail.com</email>
</author>
<published>2021-02-07T01:10:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f3c4b01689d392373301e6e60d1b02c5b4020afc'/>
<id>urn:sha1:f3c4b01689d392373301e6e60d1b02c5b4020afc</id>
<content type='text'>
[ Upstream commit 01f810ace9ed37255f27608a0864abebccf0aab3 ]

Before this patch, variable offset access to the stack was dissalowed
for regular instructions, but was allowed for "indirect" accesses (i.e.
helpers). This patch removes the restriction, allowing reading and
writing to the stack through stack pointers with variable offsets. This
makes stack-allocated buffers more usable in programs, and brings stack
pointers closer to other types of pointers.

The motivation is being able to use stack-allocated buffers for data
manipulation. When the stack size limit is sufficient, allocating
buffers on the stack is simpler than per-cpu arrays, or other
alternatives.

In unpriviledged programs, variable-offset reads and writes are
disallowed (they were already disallowed for the indirect access case)
because the speculative execution checking code doesn't support them.
Additionally, when writing through a variable-offset stack pointer, if
any pointers are in the accessible range, there's possilibities of later
leaking pointers because the write cannot be tracked precisely.

Writes with variable offset mark the whole range as initialized, even
though we don't know which stack slots are actually written. This is in
order to not reject future reads to these slots. Note that this doesn't
affect writes done through helpers; like before, helpers need the whole
stack range to be initialized to begin with.
All the stack slots are in range are considered scalars after the write;
variable-offset register spills are not tracked.

For reads, all the stack slots in the variable range needs to be
initialized (but see above about what writes do), otherwise the read is
rejected. All register spilled in stack slots that might be read are
marked as having been read, however reads through such pointers don't do
register filling; the target register will always be either a scalar or
a constant zero.

Signed-off-by: Andrei Matei &lt;andreimatei1@gmail.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20210207011027.676572-2-andreimatei1@gmail.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
