<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/xen/privcmd.c, branch v7.0.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.0.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-04-30T09:13:05+00:00</updated>
<entry>
<title>xen/privcmd: fix double free via VMA splitting</title>
<updated>2026-04-30T09:13:05+00:00</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2026-04-10T07:20:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=71bf829800758a6e3889096e4754ef47ba7fc850'/>
<id>urn:sha1:71bf829800758a6e3889096e4754ef47ba7fc850</id>
<content type='text'>
commit 24daca4fc07f3ff8cd0e3f629cd982187f48436a upstream.

privcmd_vm_ops defines .close (privcmd_close), but neither .may_split
nor .open. When userspace does a partial munmap() on a privcmd mapping,
the kernel splits the VMA via __split_vma(). Since may_split is NULL,
the split is allowed. vm_area_dup() copies vm_private_data (a pages
array allocated in alloc_empty_pages()) into the new VMA without any
fixup, because there is no .open callback.

Both VMAs now point to the same pages array. When the unmapped portion
is closed, privcmd_close() calls:
    - xen_unmap_domain_gfn_range()
    - xen_free_unpopulated_pages()
    - kvfree(pages)

The surviving VMA still holds the dangling pointer. When it is later
destroyed, the same sequence runs again, which leads to a double free.

Fix this issue by adding a .may_split callback denying the VMA split.

This is XSA-487 / CVE-2026-31787

Fixes: d71f513985c2 ("xen: privcmd: support autotranslated physmap guests.")
Reported-by: Atharva Vartak &lt;atharva.a.vartak@gmail.com&gt;
Suggested-by: Atharva Vartak &lt;atharva.a.vartak@gmail.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Reviewed-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>xen/privcmd: unregister xenstore notifier on module exit</title>
<updated>2026-03-26T07:57:51+00:00</updated>
<author>
<name>GuoHan Zhao</name>
<email>zhaoguohan@kylinos.cn</email>
</author>
<published>2026-03-25T12:02:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cd7e1fef5a1ca1c4fcd232211962ac2395601636'/>
<id>urn:sha1:cd7e1fef5a1ca1c4fcd232211962ac2395601636</id>
<content type='text'>
Commit 453b8fb68f36 ("xen/privcmd: restrict usage in
unprivileged domU") added a xenstore notifier to defer setting the
restriction target until Xenstore is ready.

XEN_PRIVCMD can be built as a module, but privcmd_exit() leaves that
notifier behind. Balance the notifier lifecycle by unregistering it on
module exit.

This is harmless even if xenstore was already ready at registration
time and the notifier was never queued on the chain.

Fixes: 453b8fb68f3641fe ("xen/privcmd: restrict usage in unprivileged domU")
Signed-off-by: GuoHan Zhao &lt;zhaoguohan@kylinos.cn&gt;
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Message-ID: &lt;20260325120246.252899-1-zhaoguohan@kylinos.cn&gt;
</content>
</entry>
<entry>
<title>xen/privcmd: add boot control for restricted usage in domU</title>
<updated>2026-03-20T11:06:01+00:00</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2025-10-14T11:28:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1613462be621ad5103ec338a7b0ca0746ec4e5f1'/>
<id>urn:sha1:1613462be621ad5103ec338a7b0ca0746ec4e5f1</id>
<content type='text'>
When running in an unprivileged domU under Xen, the privcmd driver
is restricted to allow only hypercalls against a target domain, for
which the current domU is acting as a device model.

Add a boot parameter "unrestricted" to allow all hypercalls (the
hypervisor will still refuse destructive hypercalls affecting other
guests).

Make this new parameter effective only in case the domU wasn't started
using secure boot, as otherwise hypercalls targeting the domU itself
might result in violating the secure boot functionality.

This is achieved by adding another lockdown reason, which can be
tested to not being set when applying the "unrestricted" option.

This is part of XSA-482

Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
---
V2:
- new patch
</content>
</entry>
<entry>
<title>xen/privcmd: restrict usage in unprivileged domU</title>
<updated>2026-03-20T11:05:41+00:00</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2025-10-09T14:54:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=453b8fb68f3641fea970db88b7d9a153ed2a37e8'/>
<id>urn:sha1:453b8fb68f3641fea970db88b7d9a153ed2a37e8</id>
<content type='text'>
The Xen privcmd driver allows to issue arbitrary hypercalls from
user space processes. This is normally no problem, as access is
usually limited to root and the hypervisor will deny any hypercalls
affecting other domains.

In case the guest is booted using secure boot, however, the privcmd
driver would be enabling a root user process to modify e.g. kernel
memory contents, thus breaking the secure boot feature.

The only known case where an unprivileged domU is really needing to
use the privcmd driver is the case when it is acting as the device
model for another guest. In this case all hypercalls issued via the
privcmd driver will target that other guest.

Fortunately the privcmd driver can already be locked down to allow
only hypercalls targeting a specific domain, but this mode can be
activated from user land only today.

The target domain can be obtained from Xenstore, so when not running
in dom0 restrict the privcmd driver to that target domain from the
beginning, resolving the potential problem of breaking secure boot.

This is XSA-482

Reported-by: Teddy Astie &lt;teddy.astie@vates.tech&gt;
Fixes: 1c5de1939c20 ("xen: add privcmd driver")
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
---
V2:
- defer reading from Xenstore if Xenstore isn't ready yet (Jan Beulich)
- wait in open() if target domain isn't known yet
- issue message in case no target domain found (Jan Beulich)
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>xen: privcmd: WQ_PERCPU added to alloc_workqueue users</title>
<updated>2026-01-12T10:28:46+00:00</updated>
<author>
<name>Marco Crivellari</name>
<email>marco.crivellari@suse.com</email>
</author>
<published>2025-11-06T15:58:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=378f1dc3d6472f9bd437f564daf5a1a2f32505e7'/>
<id>urn:sha1:378f1dc3d6472f9bd437f564daf5a1a2f32505e7</id>
<content type='text'>
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:

commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")

This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue()
to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Marco Crivellari &lt;marco.crivellari@suse.com&gt;
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Message-ID: &lt;20251106155831.306248-3-marco.crivellari@suse.com&gt;
</content>
</entry>
<entry>
<title>xen: replace XENFEAT_auto_translated_physmap with xen_pv_domain()</title>
<updated>2025-09-08T15:01:36+00:00</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2025-08-26T14:56:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f4283123fe1e6016296048d0fdcfce615047a13'/>
<id>urn:sha1:0f4283123fe1e6016296048d0fdcfce615047a13</id>
<content type='text'>
Instead of testing the XENFEAT_auto_translated_physmap feature, just
use !xen_pv_domain() which is equivalent.

This has the advantage that a kernel not built with CONFIG_XEN_PV
will be smaller due to dead code elimination.

Reviewed-by: Jason Andryuk &lt;jason.andryuk@amd.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Message-ID: &lt;20250826145608.10352-3-jgross@suse.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-11-18T20:24:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-18T20:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0f25f0e4efaeb68086f7e65c442f2d648b21736f'/>
<id>urn:sha1:0f25f0e4efaeb68086f7e65c442f2d648b21736f</id>
<content type='text'>
Pull 'struct fd' class updates from Al Viro:
 "The bulk of struct fd memory safety stuff

  Making sure that struct fd instances are destroyed in the same scope
  where they'd been created, getting rid of reassignments and passing
  them by reference, converting to CLASS(fd{,_pos,_raw}).

  We are getting very close to having the memory safety of that stuff
  trivial to verify"

* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
  deal with the last remaing boolean uses of fd_file()
  css_set_fork(): switch to CLASS(fd_raw, ...)
  memcg_write_event_control(): switch to CLASS(fd)
  assorted variants of irqfd setup: convert to CLASS(fd)
  do_pollfd(): convert to CLASS(fd)
  convert do_select()
  convert vfs_dedupe_file_range().
  convert cifs_ioctl_copychunk()
  convert media_request_get_by_fd()
  convert spu_run(2)
  switch spufs_calls_{get,put}() to CLASS() use
  convert cachestat(2)
  convert do_preadv()/do_pwritev()
  fdget(), more trivial conversions
  fdget(), trivial conversions
  privcmd_ioeventfd_assign(): don't open-code eventfd_ctx_fdget()
  o2hb_region_dev_store(): avoid goto around fdget()/fdput()
  introduce "fd_pos" class, convert fdget_pos() users to it.
  fdget_raw() users: switch to CLASS(fd_raw)
  convert vmsplice() to CLASS(fd)
  ...
</content>
</entry>
<entry>
<title>assorted variants of irqfd setup: convert to CLASS(fd)</title>
<updated>2024-11-03T06:28:07+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2024-07-20T05:48:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=66635b0776243ff567db08601546b7f26b67dd08'/>
<id>urn:sha1:66635b0776243ff567db08601546b7f26b67dd08</id>
<content type='text'>
in all of those failure exits prior to fdget() are plain returns and
the only thing done after fdput() is (on failure exits) a kfree(),
which can be done before fdput() just fine.

NOTE: in acrn_irqfd_assign() 'fail:' failure exit is wrong for
eventfd_ctx_fileget() failure (we only want fdput() there) and once
we stop doing that, it doesn't need to check if eventfd is NULL or
ERR_PTR(...) there.

NOTE: in privcmd we move fdget() up before the allocation - more
to the point, before the copy_from_user() attempt.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
