Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Pull kvm fix from Paolo Bonzini:
"Do not always honor guest PAT on CPUs that support self-snoop.
This triggers an issue in the bochsdrm driver, which used ioremap()
instead of ioremap_wc() to map the video RAM.
The revert lets video RAM use the WB memory type instead of the slower
UC memory type"
* tag 'for-linus-6.11' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop"
|
|
This reverts commit 377b2f359d1f71c75f8cc352b5c81f2210312d83.
This caused a regression with the bochsdrm driver, which used ioremap()
instead of ioremap_wc() to map the video RAM. After the commit, the
WB memory type is used without the IGNORE_PAT, resulting in the slower
UC memory type. In fact, UC is slow enough to basically cause guests
to not boot... but only on new processors such as Sapphire Rapids and
Cascade Lake. Coffee Lake for example works properly, though that might
also be an effect of being on a larger, more NUMA system.
The driver has been fixed but that does not help older guests. Until we
figure out whether Cascade Lake and newer processors are working as
intended, revert the commit. Long term we might add a quirk, but the
details depend on whether the processors are working as intended: for
example if they are, the quirk might reference bochs-compatible devices,
e.g. in the name and documentation, so that userspace can disable the
quirk by default and only leave it enabled if such a device is being
exposed to the guest.
If instead this is actually a bug in CLX+, then the actions we need to
take are different and depend on the actual cause of the bug.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- One Intel patch that I mistakenly merged into for-next despite it
belonging in fixes: add Arrow Lake-H/U ACPI ID so this Arrow Lake
chip probes.
- One fix making the CY895x0 reg cache work, which is good because it
makes the device work too.
* tag 'pinctrl-v6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: pinctrl-cy8c95x0: Fix regcache
pinctrl: meteorlake: Add Arrow Lake-H/U ACPI ID
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A few last-minute ASoC fixes and MAINTAINERS update.
All look small, obvious and nice-to-have fixes for 6.11-final"
* tag 'sound-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: meson: axg-card: fix 'use-after-free'
ASoC: codecs: avoid possible garbage value in peb2466_reg_read()
MAINTAINERS: update Pierre Bossart's email and role
ASoC: tas2781: fix to save the dsp bin file name into the correct array in case name_prefix is not NULL
ASoC: Intel: soc-acpi-intel-mtl-match: add missing empty item
ASoC: Intel: soc-acpi-intel-lnl-match: add missing empty item
|
|
Pull smb client fix from Steve French:
"Fix for packet signing of write"
* tag '6.11-rc7-SMB3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix signature miscalculation
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.11
A few last minute fixes, plus an update for Pierre's contact details and
status. It'd be good to get these into v6.11 (especially the
MAINTAINERS update) but it wouldn't be the end of the world if they
waited for the merge window, none of them are super remarkable and it's
just a question of timing that they're last minute.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fix from Bjorn Helgaas:
- Prevent a possible deadlock (reported by lockdep) when a driver
relinquishes a pci_dev, another driver claims it, and one uses
managed pcim_enable_device() and the other doesn't (Philipp Stanner)
* tag 'pci-v6.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Fix potential deadlock in pcim_intx()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few last minute fixes for v6.11, they're all individually
unremarkable and only last minute due to when they came in"
* tag 'spi-fix-v6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: nxp-fspi: fix the KASAN report out-of-bounds bug
spi: geni-qcom: Fix incorrect free_irq() sequence
spi: geni-qcom: Undo runtime PM changes at driver exit time
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul:
- Revert of earlier fix sent for non-continuous port map programming
which caused regression on Intel platforms
* tag 'soundwire-6.11-fixes_2' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps"
|
|
Pull drm fixes from Dave Airlie:
"Regular fixes pull, the amdgpu JPEG engine fixes are probably the
biggest, they look to block some register accessing, otherwise there
are just minor fixes and regression fixes all over.
nouveau had a regression report going back a few kernels that finally
got fixed, Not entirely happy with so many changes so late, but they
all seem quite benign apart from the jpeg one.
dma-buf/heaps:
- fix off by one in CMA heap fault handler
syncobj:
- fix syncobj leak in drm_syncobj_eventfd_ioctl
amdgpu:
- Avoid races between set_drr() functions and dc_state_destruct()
- Fix regerssion related to zpos
- Fix regression related to overlay cursor
- SMU 14.x updates
- JPEG fixes
- Silence an UBSAN warning
amdkfd:
- Fetch cacheline size from IP discovery
i915:
- Prevent a possible int overflow in wq offsets
xe:
- Remove a double include
- Fix null checks and UAF
- Fix access_ok check in user_fence_create
- Fix compat IS_DISPLAY_STEP() range
- OA fix
- Fixes in show_meminfo
nouveau:
- fix GP10x regression on boot
stm:
- add COMMON_CLK dep
rockchip:
- iommu api change
tegra:
- iommu api change"
* tag 'drm-fixes-2024-09-13' of https://gitlab.freedesktop.org/drm/kernel: (25 commits)
drm/xe/client: add missing bo locking in show_meminfo()
drm/xe/client: fix deadlock in show_meminfo()
drm/xe/oa: Enable Xe2+ PES disaggregation
drm/xe/display: fix compat IS_DISPLAY_STEP() range end
drm/xe: Fix access_ok check in user_fence_create
drm/xe: Fix possible UAF in guc_exec_queue_process_msg
drm/xe: Remove fence check from send_tlb_invalidation
drm/xe/gt: Remove double include
drm/amd/display: Add all planes on CRTC to state for overlay cursor
drm/amdgpu/atomfirmware: Silence UBSAN warning
drm/amd/amdgpu: apply command submission parser for JPEG v1
drm/amd/amdgpu: apply command submission parser for JPEG v2+
drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
drm/amd/pm: update the features set on smu v14.0.2/3
drm/amd/display: Do not reset planes based on crtc zpos_changed
drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()
drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()
drm/amdkfd: Add cache line size info
drm/tegra: Use iommu_paging_domain_alloc()
drm/rockchip: Use iommu_paging_domain_alloc()
...
|
|
The size of the mux stride was off by one, which could result in
invalid pin configuration on the device side or invalid state
readings on the software side.
While on it also update the code and:
- Increase the mux stride size to 16
- Align the virtual muxed regmap range to 16
- Start the regmap window at the selector
- Mark reserved registers as not-readable
Fixes: 8670de9fae49 ("pinctrl: cy8c95x0: Use regmap ranges")
Signed-off-by: Patrick Rudolph <patrick.rudolph@9elements.com>
Reported-by: Andy Shevchenko <andy@kernel.org>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/20240902072859.583490-1-patrick.rudolph@9elements.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into fixes
intel-pinctrl for v6.11-1
This includes a new ACPI ID that is added to the Intel Meteor Lake
driver to support recent Intel Arrow Lake hardware.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Remove a double include (Lucas)
- Fix null checks and UAF (Brost)
- Fix access_ok check in user_fence_create (Nirmoy)
- Fix compat IS_DISPLAY_STEP() range (Jani)
- OA fix (Ashutosh)
- Fixes in show_meminfo (Auld)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZuL-sORu54zfz1Lf@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
An off-by-one fix for the CMA DMA-buf heap, An init fix for nouveau, a
config dependency fix for stm, a syncobj leak fix, and two iommu fixes
for tegra and rockchip.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240912-phenomenal-upbeat-grouse-a26781@houat
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Prevent a possible int overflow in wq offsets [guc] (Nikita Zhandarovich)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZuKTN2XngNhBB3z3@linux
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.11-2024-09-11:
amdgpu:
- Avoid races between set_drr() functions and dc_state_destruct()
- Fix regerssion related to zpos
- Fix regression related to overlay cursor
- SMU 14.x updates
- JPEG fixes
- Silence an UBSAN warning
amdkfd:
- Fetch cacheline size from IP discovery
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911170528.838655-1-alexander.deucher@amd.com
|
|
Fix the calculation of packet signatures by adding the offset into a page
in the read or write data payload when hashing the pages from it.
Fixes: 39bc58203f04 ("cifs: Add a function to Hash the contents of an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One build fix for 32-bit arches using the Qualcomm PLL driver. It's
cheaper to use a comparison here instead of a division so we just do
that to fix the build"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: clk-alpha-pll: Simplify the zonda_pll_adjust_l_val()
|
|
Pull block fix from Jens Axboe:
"Just a single fix for a deadlock issue that can happen if someone
attempts to change the root disk IO scheduler with a module that
requires loading from disk.
Changing the scheduler freezes the queue while that operation is
happening, hence causing a deadlock"
* tag 'block-6.11-20240912' of git://git.kernel.dk/linux:
block: Prevent deadlocks when switching elevators
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
- Fix clearing status register bits for chips supporting older
PMBus versions
* tag 'hwmon-for-v6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pmbus) Conditionally clear individual status bits for pmbus rev >= 1.2
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"A fix for a NULL worker->pool deref bug which can be triggered when a
worker is created and then destroyed immediately"
* tag 'wq-for-6.11-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Clear worker->pool in the worker thread context
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- Two fixes for smp_processor_id() calls in preemptible sections: one
if the perf driver, and one in the fence.i prctl.
* tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
drivers: perf: Fix smp_processor_id() use in preemptible code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
There is a recently notified BT regression with no fix yet. I do not
think a fix will land in the next week.
Current release - regressions:
- core: tighten bad gso csum offset check in virtio_net_hdr
- netfilter: move nf flowtable bpf initialization in
nf_flow_table_module_init()
- eth: ice: stop calling pci_disable_device() as we use pcim
- eth: fou: fix null-ptr-deref in GRO.
Current release - new code bugs:
- hsr: prevent NULL pointer dereference in hsr_proxy_announce()
Previous releases - regressions:
- hsr: remove seqnr_lock
- netfilter: nft_socket: fix sk refcount leaks
- mptcp: pm: fix uaf in __timer_delete_sync
- phy: dp83822: fix NULL pointer dereference on DP83825 devices
- eth: revert "virtio_net: rx enable premapped mode by default"
- eth: octeontx2-af: Modify SMQ flush sequence to drop packets
Previous releases - always broken:
- eth: mlx5: fix bridge mode operations when there are no VFs
- eth: igb: Always call igb_xdp_ring_update_tail() under Tx lock"
* tag 'net-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
net: netfilter: move nf flowtable bpf initialization in nf_flow_table_module_init()
net: tighten bad gso csum offset check in virtio_net_hdr
netlink: specs: mptcp: fix port endianness
net: dpaa: Pad packets to ETH_ZLEN
mptcp: pm: Fix uaf in __timer_delete_sync
net: libwx: fix number of Rx and Tx descriptors
net: dsa: felix: ignore pending status of TAS module when it's disabled
net: hsr: prevent NULL pointer dereference in hsr_proxy_announce()
selftests: mptcp: include net_helper.sh file
selftests: mptcp: include lib.sh file
selftests: mptcp: join: restrict fullmesh endp on 1st sf
netfilter: nft_socket: make cgroupsv2 matching work with namespaces
netfilter: nft_socket: fix sk refcount leaks
MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
dt-bindings: net: tja11xx: fix the broken binding
selftests: net: csum: Fix checksums for packets with non-zero padding
net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
virtio_net: disable premapped mode by default
Revert "virtio_net: big mode skip the unmap check"
Revert "virtio_net: rx remove premapped failover code"
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- asus-wmi: Disable OOBE that interferes with backlight control
- panasonic-laptop: Two fixes to SINF array handling
* tag 'platform-drivers-x86-v6.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-wmi: Disable OOBE experience on Zenbook S 16
platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array
platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses
|
|
As Jann points out, PFN mappings are special, because unlike normal
memory mappings, there is no lifetime information associated with the
mapping - it is just a raw mapping of PFNs with no reference counting of
a 'struct page'.
That's all very much intentional, but it does mean that it's easy to
mess up the cleanup in case of errors. Yes, a failed mmap() will always
eventually clean up any partial mappings, but without any explicit
lifetime in the page table mapping itself, it's very easy to do the
error handling in the wrong order.
In particular, it's easy to mistakenly free the physical backing store
before the page tables are actually cleaned up and (temporarily) have
stale dangling PTE entries.
To make this situation less error-prone, just make sure that any partial
pfn mapping is torn down early, before any other error handling.
Reported-and-tested-by: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
bo_meminfo() wants to inspect bo state like tt and the ttm resource,
however this state can change at any point leading to stuff like NPD and
UAF, if the bo lock is not held. Grab the bo lock when calling
bo_meminfo(), ensuring we drop any spinlocks first. In the case of
object_idr we now also need to hold a ref.
v2 (MattB)
- Also add xe_bo_assert_held()
Fixes: 0845233388f8 ("drm/xe: Implement fdinfo memory stats printing")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911155527.178910-6-matthew.auld@intel.com
(cherry picked from commit 4f63d712fa104c3ebefcb289d1e733e86d8698c7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
There is a real deadlock as well as sleeping in atomic() bug in here, if
the bo put happens to be the last ref, since bo destruction wants to
grab the same spinlock and sleeping locks. Fix that by dropping the ref
using xe_bo_put_deferred(), and moving the final commit outside of the
lock. Dropping the lock around the put is tricky since the bo can go
out of scope and delete itself from the list, making it difficult to
navigate to the next list entry.
Fixes: 0845233388f8 ("drm/xe: Implement fdinfo memory stats printing")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2727
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240911155527.178910-5-matthew.auld@intel.com
(cherry picked from commit 0083b8e6f11d7662283a267d4ce7c966812ffd8a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Enable Xe2+ PES disaggregation (for OAG) to retrieve disaggregated metrics
when disaggregated data is needed. Userspace can select whether to receive
aggregated or disaggregated metrics via the particular OA configuration it
uses (programmed via DRM_XE_OBSERVATION_OP_ADD_CONFIG).
Bspec: 61101
Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240909165933.2638765-1-ashutosh.dixit@intel.com
Cc: stable@vger.kernel.org
(cherry picked from commit fb2551a0e93897aec7fb3d4f473ebc06b146d160)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
It's supposed to be an open range at the end like in i915. Fingers
crossed that nobody relies on this definition.
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe8743770694e429f6902491cdb306c97bdf701a.1724180287.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 453afb1a439994deeacb8d9ecbb48c1f2348ea0a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Check size of the data not size of the pointer.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407300421.IBkAja96-lkp@intel.com/
Fixes: ddeb7989a98f ("drm/xe: Validate user fence during creation")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Apoorva Singh <apoorva.singh@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240806110722.28661-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit e102b5ed6e283a144793cab8fcd95f61d0ddbadb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Store xe_device ahead of processing message as message can be free'd in
some cases.
v2:
- Including missing local changes
v3:
- Resend for CI
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202407231445.rpisd1vA-lkp@intel.com/
Fixes: 55ea73aacfb9 ("drm/xe: Build PM into GuC CT layer")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240724164341.1848954-1-matthew.brost@intel.com
(cherry picked from commit 1a394b4f504f33eac8c38b6f42ba025105c7e869)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
'fence' argument in send_tlb_invalidation cannot be NULL, remove
non-NULL check from send_tlb_invalidation.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202407231049.esig0Fkb-lkp@intel.com/
Fixes: 58bfe6674467 ("drm/xe: Drop xe_gt_tlb_invalidation_wait")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240723190714.1744653-1-matthew.brost@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 6482253e6e1ad1c3a76645a3899d3cfdb5b918cb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The header generated/xe_wa_oob.h is included twice. Remove one.
Fixes: 27cb2b7fec2a ("drm/xe/bmg: implement Wa_16023588340")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 3d122660dc70029d9cccb4e8670125f0affa959e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
nf_flow_table_module_init()
Move nf flowtable bpf initialization in nf_flow_table module load
routine since nf_flow_table_bpf is part of nf_flow_table module and not
nf_flow_table_inet one. This patch allows to avoid the following kernel
warning running the reproducer below:
$modprobe nf_flow_table_inet
$rmmod nf_flow_table_inet
$modprobe nf_flow_table_inet
modprobe: ERROR: could not insert 'nf_flow_table_inet': Invalid argument
[ 184.081501] ------------[ cut here ]------------
[ 184.081527] WARNING: CPU: 0 PID: 1362 at kernel/bpf/btf.c:8206 btf_populate_kfunc_set+0x23c/0x330
[ 184.081550] CPU: 0 UID: 0 PID: 1362 Comm: modprobe Kdump: loaded Not tainted 6.11.0-0.rc5.22.el10.x86_64 #1
[ 184.081553] Hardware name: Red Hat OpenStack Compute, BIOS 1.14.0-1.module+el8.4.0+8855+a9e237a9 04/01/2014
[ 184.081554] RIP: 0010:btf_populate_kfunc_set+0x23c/0x330
[ 184.081558] RSP: 0018:ff22cfb38071fc90 EFLAGS: 00010202
[ 184.081559] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
[ 184.081560] RDX: 000000000000006e RSI: ffffffff95c00000 RDI: ff13805543436350
[ 184.081561] RBP: ffffffffc0e22180 R08: ff13805543410808 R09: 000000000001ec00
[ 184.081562] R10: ff13805541c8113c R11: 0000000000000010 R12: ff13805541b83c00
[ 184.081563] R13: ff13805543410800 R14: 0000000000000001 R15: ffffffffc0e2259a
[ 184.081564] FS: 00007fa436c46740(0000) GS:ff1380557ba00000(0000) knlGS:0000000000000000
[ 184.081569] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 184.081570] CR2: 000055e7b3187000 CR3: 0000000100c48003 CR4: 0000000000771ef0
[ 184.081571] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 184.081572] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 184.081572] PKRU: 55555554
[ 184.081574] Call Trace:
[ 184.081575] <TASK>
[ 184.081578] ? show_trace_log_lvl+0x1b0/0x2f0
[ 184.081580] ? show_trace_log_lvl+0x1b0/0x2f0
[ 184.081582] ? __register_btf_kfunc_id_set+0x199/0x200
[ 184.081585] ? btf_populate_kfunc_set+0x23c/0x330
[ 184.081586] ? __warn.cold+0x93/0xed
[ 184.081590] ? btf_populate_kfunc_set+0x23c/0x330
[ 184.081592] ? report_bug+0xff/0x140
[ 184.081594] ? handle_bug+0x3a/0x70
[ 184.081596] ? exc_invalid_op+0x17/0x70
[ 184.081597] ? asm_exc_invalid_op+0x1a/0x20
[ 184.081601] ? btf_populate_kfunc_set+0x23c/0x330
[ 184.081602] __register_btf_kfunc_id_set+0x199/0x200
[ 184.081605] ? __pfx_nf_flow_inet_module_init+0x10/0x10 [nf_flow_table_inet]
[ 184.081607] do_one_initcall+0x58/0x300
[ 184.081611] do_init_module+0x60/0x230
[ 184.081614] __do_sys_init_module+0x17a/0x1b0
[ 184.081617] do_syscall_64+0x7d/0x160
[ 184.081620] ? __count_memcg_events+0x58/0xf0
[ 184.081623] ? handle_mm_fault+0x234/0x350
[ 184.081626] ? do_user_addr_fault+0x347/0x640
[ 184.081630] ? clear_bhb_loop+0x25/0x80
[ 184.081633] ? clear_bhb_loop+0x25/0x80
[ 184.081634] ? clear_bhb_loop+0x25/0x80
[ 184.081637] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 184.081639] RIP: 0033:0x7fa43652e4ce
[ 184.081647] RSP: 002b:00007ffe8213be18 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 184.081649] RAX: ffffffffffffffda RBX: 000055e7b3176c20 RCX: 00007fa43652e4ce
[ 184.081650] RDX: 000055e7737fde79 RSI: 0000000000003990 RDI: 000055e7b3185380
[ 184.081651] RBP: 000055e7737fde79 R08: 0000000000000007 R09: 000055e7b3179bd0
[ 184.081651] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000040000
[ 184.081652] R13: 000055e7b3176fa0 R14: 0000000000000000 R15: 000055e7b3179b80
Fixes: 391bb6594fd3 ("netfilter: Add bpf_xdp_flow_lookup kfunc")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240911-nf-flowtable-bpf-modprob-fix-v1-1-f9fc075aafc3@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following batch contains two fixes from Florian Westphal:
Patch #1 fixes a sk refcount leak in nft_socket on mismatch.
Patch #2 fixes cgroupsv2 matching from containers due to incorrect
level in subtree.
netfilter pull request 24-09-12
* tag 'nf-24-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_socket: make cgroupsv2 matching work with namespaces
netfilter: nft_socket: fix sk refcount leaks
====================
Link: https://patch.msgid.link/20240911222520.3606-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
25216afc9db5 ("PCI: Add managed pcim_intx()") moved the allocation step for
pci_intx()'s device resource from pcim_enable_device() to pcim_intx(). As
before, pcim_enable_device() sets pci_dev.is_managed to true; and it is
never set to false again.
Due to the lifecycle of a struct pci_dev, it can happen that a second
driver obtains the same pci_dev after a first driver ran. If one driver
uses pcim_enable_device() and the other doesn't, this causes the other
driver to run into managed pcim_intx(), which will try to allocate when
called for the first time.
Allocations might sleep, so calling pci_intx() while holding spinlocks
becomes then invalid, which causes lockdep warnings and could cause
deadlocks:
========================================================
WARNING: possible irq lock inversion dependency detected
6.11.0-rc6+ #59 Tainted: G W
--------------------------------------------------------
CPU 0/KVM/1537 just changed the state of lock:
ffffa0f0cff965f0 (&vdev->irqlock){-...}-{2:2}, at:
vfio_intx_handler+0x21/0xd0 [vfio_pci_core] but this lock took another,
HARDIRQ-unsafe lock in the past: (fs_reclaim){+.+.}-{0:0}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
local_irq_disable();
lock(&vdev->irqlock);
lock(fs_reclaim);
<Interrupt>
lock(&vdev->irqlock);
*** DEADLOCK ***
Have pcim_enable_device()'s release function, pcim_disable_device(), set
pci_dev.is_managed to false so that subsequent drivers using the same
struct pci_dev do not implicitly run into managed code.
Link: https://lore.kernel.org/r/20240905072556.11375-2-pstanner@redhat.com
Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Closes: https://lore.kernel.org/all/20240903094431.63551744.alex.williamson@redhat.com/
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
|
|
Marc Hartmayer reported:
[ 23.133876] Unable to handle kernel pointer dereference in virtual kernel address space
[ 23.133950] Failing address: 0000000000000000 TEID: 0000000000000483
[ 23.133954] Fault in home space mode while using kernel ASCE.
[ 23.133957] AS:000000001b8f0007 R3:0000000056cf4007 S:0000000056cf3800 P:000000000000003d
[ 23.134207] Oops: 0004 ilc:2 [#1] SMP
(snip)
[ 23.134516] Call Trace:
[ 23.134520] [<0000024e326caf28>] worker_thread+0x48/0x430
[ 23.134525] ([<0000024e326caf18>] worker_thread+0x38/0x430)
[ 23.134528] [<0000024e326d3a3e>] kthread+0x11e/0x130
[ 23.134533] [<0000024e3264b0dc>] __ret_from_fork+0x3c/0x60
[ 23.134536] [<0000024e333fb37a>] ret_from_fork+0xa/0x38
[ 23.134552] Last Breaking-Event-Address:
[ 23.134553] [<0000024e333f4c04>] mutex_unlock+0x24/0x30
[ 23.134562] Kernel panic - not syncing: Fatal exception: panic_on_oops
With debuging and analysis, worker_thread() accesses to the nullified
worker->pool when the newly created worker is destroyed before being
waken-up, in which case worker_thread() can see the result detach_worker()
reseting worker->pool to NULL at the begining.
Move the code "worker->pool = NULL;" out from detach_worker() to fix the
problem.
worker->pool had been designed to be constant for regular workers and
changeable for rescuer. To share attaching/detaching code for regular
and rescuer workers and to avoid worker->pool being accessed inadvertently
when the worker has been detached, worker->pool is reset to NULL when
detached no matter the worker is rescuer or not.
To maintain worker->pool being reset after detached, move the code
"worker->pool = NULL;" in the worker thread context after detached.
It is either be in the regular worker thread context after PF_WQ_WORKER
is cleared or in rescuer worker thread context with wq_pool_attach_mutex
held. So it is safe to do so.
Cc: Marc Hartmayer <mhartmay@linux.ibm.com>
Link: https://lore.kernel.org/lkml/87wmjj971b.fsf@linux.ibm.com/
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: f4b7b53c94af ("workqueue: Detach workers directly in idle_cull_fn()")
Cc: stable@vger.kernel.org # v6.11+
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The referenced commit drops bad input, but has false positives.
Tighten the check to avoid these.
The check detects illegal checksum offload requests, which produce
csum_start/csum_off beyond end of packet after segmentation.
But it is based on two incorrect assumptions:
1. virtio_net_hdr_to_skb with VIRTIO_NET_HDR_GSO_TCP[46] implies GSO.
True in callers that inject into the tx path, such as tap.
But false in callers that inject into rx, like virtio-net.
Here, the flags indicate GRO, and CHECKSUM_UNNECESSARY or
CHECKSUM_NONE without VIRTIO_NET_HDR_F_NEEDS_CSUM is normal.
2. TSO requires checksum offload, i.e., ip_summed == CHECKSUM_PARTIAL.
False, as tcp[46]_gso_segment will fix up csum_start and offset for
all other ip_summed by calling __tcp_v4_send_check.
Because of 2, we can limit the scope of the fix to virtio_net_hdr
that do try to set these fields, with a bogus value.
Link: https://lore.kernel.org/netdev/20240909094527.GA3048202@port70.net/
Fixes: 89add40066f9 ("net: drop bad gso csum_start and offset in virtio_net_hdr")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20240910213553.839926-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MPTCP port attribute is in host endianness, but was documented
as big-endian in the ynl specification.
Below are two examples from net/mptcp/pm_netlink.c showing that the
attribute is converted to/from host endianness for use with netlink.
Import from netlink:
addr->port = htons(nla_get_u16(tb[MPTCP_PM_ADDR_ATTR_PORT]))
Export to netlink:
nla_put_u16(skb, MPTCP_PM_ADDR_ATTR_PORT, ntohs(addr->port))
Where addr->port is defined as __be16.
No functional change intended.
Fixes: bc8aeb2045e2 ("Documentation: netlink: add a YAML spec for mptcp")
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240911091003.1112179-1-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When sending packets under 60 bytes, up to three bytes of the buffer
following the data may be leaked. Avoid this by extending all packets to
ETH_ZLEN, ensuring nothing is leaked in the padding. This bug can be
reproduced by running
$ ping -s 11 destination
Fixes: 9ad1a3749333 ("dpaa_eth: add support for DPAA Ethernet")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240910143144.1439910-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are two paths to access mptcp_pm_del_add_timer, result in a race
condition:
CPU1 CPU2
==== ====
net_rx_action
napi_poll netlink_sendmsg
__napi_poll netlink_unicast
process_backlog netlink_unicast_kernel
__netif_receive_skb genl_rcv
__netif_receive_skb_one_core netlink_rcv_skb
NF_HOOK genl_rcv_msg
ip_local_deliver_finish genl_family_rcv_msg
ip_protocol_deliver_rcu genl_family_rcv_msg_doit
tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit
tcp_v4_do_rcv mptcp_nl_remove_addrs_list
tcp_rcv_established mptcp_pm_remove_addrs_and_subflows
tcp_data_queue remove_anno_list_by_saddr
mptcp_incoming_options mptcp_pm_del_add_timer
mptcp_pm_del_add_timer kfree(entry)
In remove_anno_list_by_saddr(running on CPU2), after leaving the critical
zone protected by "pm.lock", the entry will be released, which leads to the
occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1).
Keeping a reference to add_timer inside the lock, and calling
sk_stop_timer_sync() with this reference, instead of "entry->add_timer".
Move list_del(&entry->list) to mptcp_pm_del_add_timer and inside the pm lock,
do not directly access any members of the entry outside the pm lock, which
can avoid similar "entry->x" uaf.
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/tencent_7142963A37944B4A74EF76CD66EA3C253609@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The number of transmit and receive descriptors must be a multiple of 128
due to the hardware limitation. If it is set to a multiple of 8 instead of
a multiple 128, the queues will easily be hung.
Cc: stable@vger.kernel.org
Fixes: 883b5984a5d2 ("net: wangxun: add ethtool_ops for ring parameters")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240910095629.570674-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The TAS module could not be configured when it's running in pending
status. We need disable the module and configure it again. However, the
pending status is not cleared after the module disabled. TC taprio set
will always return busy even it's disabled.
For example, a user uses tc-taprio to configure Qbv and a future
basetime. The TAS module will run in a pending status. There is no way
to reconfigure Qbv, it always returns busy.
Actually the TAS module can be reconfigured when it's disabled. So it
doesn't need to check the pending status if the TAS module is disabled.
After the patch, user can delete the tc taprio configuration to disable
Qbv and reconfigure it again.
Fixes: de143c0e274b ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Link: https://patch.msgid.link/20240906093550.29985-1-xiaoliang.yang_1@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the function hsr_proxy_annouance() added in the previous commit
5f703ce5c981 ("net: hsr: Send supervisory frames to HSR network
with ProxyNodeTable data"), the return value of the hsr_port_get_hsr()
function is not checked to be a NULL pointer, which causes a NULL
pointer dereference.
To solve this, we need to add code to check whether the return value
of hsr_port_get_hsr() is NULL.
Reported-by: syzbot+02a42d9b1bd395cbcab4@syzkaller.appspotmail.com
Fixes: 5f703ce5c981 ("net: hsr: Send supervisory frames to HSR network with ProxyNodeTable data")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Lukasz Majewski <lukma@denx.de>
Link: https://patch.msgid.link/20240907190341.162289-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
selftests: mptcp: misc. small fixes
Here are some various fixes for the MPTCP selftests.
Patch 1 fixes a recently modified test to continue to work as expected
on older kernels. This is a fix for a recent fix that can be backported
up to v5.15.
Patch 2 and 3 include dependences when exporting or installing the
tests. Two fixes for v6.11-rc1.
====================
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-0-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Similar to the previous commit, the net_helper.sh file from the parent
directory is used by the MPTCP selftests and it needs to be present when
running the tests.
This file then needs to be listed in the Makefile to be included when
exporting or installing the tests, e.g. with:
make -C tools/testing/selftests \
TARGETS=net/mptcp \
install INSTALL_PATH=$KSFT_INSTALL_PATH
cd $KSFT_INSTALL_PATH
./run_kselftest.sh -c net/mptcp
Fixes: 1af3bc912eac ("selftests: mptcp: lib: use wait_local_port_listen helper")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-3-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The lib.sh file from the parent directory is used by the MPTCP selftests
and it needs to be present when running the tests.
This file then needs to be listed in the Makefile to be included when
exporting or installing the tests, e.g. with:
make -C tools/testing/selftests \
TARGETS=net/mptcp \
install INSTALL_PATH=$KSFT_INSTALL_PATH
cd $KSFT_INSTALL_PATH
./run_kselftest.sh -c net/mptcp
Fixes: f265d3119a29 ("selftests: mptcp: lib: use setup/cleanup_ns helpers")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-2-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A new endpoint using the IP of the initial subflow has been recently
added to increase the code coverage. But it breaks the test when using
old kernels not having commit 86e39e04482b ("mptcp: keep track of local
endpoint still available for each msk"), e.g. on v5.15.
Similar to commit d4c81bbb8600 ("selftests: mptcp: join: support local
endpoint being tracked or not"), it is possible to add the new endpoint
conditionally, by checking if "mptcp_pm_subflow_check_next" is present
in kallsyms: this is not directly linked to the commit introducing this
symbol but for the parent one which is linked anyway. So we can know in
advance what will be the expected behaviour, and add the new endpoint
only when it makes sense to do so.
Fixes: 4878f9f8421f ("selftests: mptcp: join: validate fullmesh endp on 1st sf")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240910-net-selftests-mptcp-fix-install-v1-1-8f124aa9156d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When running in container environmment, /sys/fs/cgroup/ might not be
the real root node of the sk-attached cgroup.
Example:
In container:
% stat /sys//fs/cgroup/
Device: 0,21 Inode: 2214 ..
% stat /sys/fs/cgroup/foo
Device: 0,21 Inode: 2264 ..
The expectation would be for:
nft add rule .. socket cgroupv2 level 1 "foo" counter
to match traffic from a process that got added to "foo" via
"echo $pid > /sys/fs/cgroup/foo/cgroup.procs".
However, 'level 3' is needed to make this work.
Seen from initial namespace, the complete hierarchy is:
% stat /sys/fs/cgroup/system.slice/docker-.../foo
Device: 0,21 Inode: 2264 ..
i.e. hierarchy is
0 1 2 3
/ -> system.slice -> docker-1... -> foo
... but the container doesn't know that its "/" is the "docker-1.."
cgroup. Current code will retrieve the 'system.slice' cgroup node
and store its kn->id in the destination register, so compare with
2264 ("foo" cgroup id) will not match.
Fetch "/" cgroup from ->init() and add its level to the level we try to
extract. cgroup root-level is 0 for the init-namespace or the level
of the ancestor that is exposed as the cgroup root inside the container.
In the above case, cgrp->level of "/" resolved in the container is 2
(docker-1...scope/) and request for 'level 1' will get adjusted
to fetch the actual level (3).
v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it.
(kernel test robot)
Cc: cgroups@vger.kernel.org
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|