summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2022-12-31io_uring/net: introduce IORING_SEND_ZC_REPORT_USAGE flagStefan Metzmacher1-0/+18
commit e307e6698165ca6508ed42c69cb1be76c8eb6a3c upstream. It might be useful for applications to detect if a zero copy transfer with SEND[MSG]_ZC was actually possible or not. The application can fallback to plain SEND[MSG] in order to avoid the overhead of two cqes per request. Or it can generate a log message that could indicate to an administrator that no zero copy was possible and could explain degraded performance. Cc: stable@vger.kernel.org # 6.1 Link: https://lore.kernel.org/io-uring/fb6a7599-8a9b-15e5-9b64-6cd9d01c6ff4@gmail.com/T/#m2b0d9df94ce43b0e69e6c089bdff0ce6babbdfaa Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8945b01756d902f5d5b0667f20b957ad3f742e5e.1666895626.git.metze@samba.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31dt-bindings: clocks: imx8mp: Add ID for usb suspend clockLi Jun1-1/+2
commit 5c1f7f1090947d494c30042123e0ec846f696336 upstream. usb suspend clock has a gate shared with usb_root_clk. Fixes: 9c140d9926761 ("clk: imx: Add support for i.MX8MP clock driver") Cc: stable@vger.kernel.org # v5.19+ Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/1664549663-20364-1-git-send-email-jun.li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31ALSA: hda/hdmi: fix stream-id config keep-alive for rt suspendKai Vehmanen1-0/+1
[ Upstream commit ee0b089d660021792e4ab4dda191b097ce1e964f ] When the new style KAE keep-alive implementation is used on compatible Intel hardware, the clocks are maintained when codec is in D3. The generic code in hda_cleanup_all_streams() can however interfere with generation of audio samples in this mode, by setting the stream and channel ids to zero. To get full benefit of the keepalive, set the new no_stream_clean_at_suspend quirk bit on affected Intel hardware. When this bit is set, stream cleanup is skipped in hda_call_codec_suspend(). Special handling is needed for the case when system goes to suspend. The stream id programming can be lost in this case. This will also cause codec->cvt_setups to be out of sync. Handle this by implementing custom suspend/resume handlers. If keep-alive is active for any converter, set the quirk flags no_stream_clean_at_suspend and forced_resume. Upon resume, keepalive programming is restored if needed. Fixes: 15175a4f2bbb ("ALSA: hda/hdmi: add keep-alive support for ADL-P and DG2") Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20221209101822.3893675-4-kai.vehmanen@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31Bluetooth: Add quirk to disable MWS Transport ConfigurationSven Peter2-0/+13
[ Upstream commit ffcb0a445ec2d5753751437706aa0a7ea8351099 ] Broadcom 4378/4387 controllers found in Apple Silicon Macs claim to support getting MWS Transport Layer Configuration, < HCI Command: Read Local Supported... (0x04|0x0002) plen 0 > HCI Event: Command Complete (0x0e) plen 68 Read Local Supported Commands (0x04|0x0002) ncmd 1 Status: Success (0x00) [...] Get MWS Transport Layer Configuration (Octet 30 - Bit 3)] [...] , but then don't actually allow the required command: > HCI Event: Command Complete (0x0e) plen 15 Get MWS Transport Layer Configuration (0x05|0x000c) ncmd 1 Status: Command Disallowed (0x0c) Number of transports: 0 Baud rate list: 0 entries 00 00 00 00 00 00 00 00 00 00 Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31Bluetooth: Add quirk to disable extended scanningSven Peter2-1/+13
[ Upstream commit 392fca352c7a95e2828d49e7500e26d0c87ca265 ] Broadcom 4377 controllers found in Apple x86 Macs with the T2 chip claim to support extended scanning when querying supported states, < HCI Command: LE Read Supported St.. (0x08|0x001c) plen 0 > HCI Event: Command Complete (0x0e) plen 12 LE Read Supported States (0x08|0x001c) ncmd 1 Status: Success (0x00) States: 0x000003ffffffffff [...] LE Set Extended Scan Parameters (Octet 37 - Bit 5) LE Set Extended Scan Enable (Octet 37 - Bit 6) [...] , but then fail to actually implement the extended scanning: < HCI Command: LE Set Extended Sca.. (0x08|0x0041) plen 8 Own address type: Random (0x01) Filter policy: Accept all advertisement (0x00) PHYs: 0x01 Entry 0: LE 1M Type: Active (0x01) Interval: 11.250 msec (0x0012) Window: 11.250 msec (0x0012) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1 Status: Unknown HCI Command (0x01) Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31regulator: core: Use different devices for resource allocation and DT lookupChiYuan Huang1-1/+2
[ Upstream commit 8f3cbcd6b440032ebc7f7d48a1689dcc70a4eb98 ] Following by the below discussion, there's the potential UAF issue between regulator and mfd. https://lore.kernel.org/all/20221128143601.1698148-1-yangyingliang@huawei.com/ From the analysis of Yingliang CPU A |CPU B mt6370_probe() | devm_mfd_add_devices() | |mt6370_regulator_probe() | regulator_register() | //allocate init_data and add it to devres | regulator_of_get_init_data() i2c_unregister_device() | device_del() | devres_release_all() | // init_data is freed | release_nodes() | | // using init_data causes UAF | regulator_register() It's common to use mfd core to create child device for the regulator. In order to do the DT lookup for init data, the child that registered the regulator would pass its parent as the parameter. And this causes init data resource allocated to its parent, not itself. The issue happen when parent device is going to release and regulator core is still doing some operation of init data constraint for the regulator of child device. To fix it, this patch expand 'regulator_register' API to use the different devices for init data allocation and DT lookup. Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Link: https://lore.kernel.org/r/1670311341-32664-1-git-send-email-u0084500@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31media: dvbdev: adopts refcnt to avoid UAFLin Ma1-14/+17
[ Upstream commit 0fc044b2b5e2d05a1fa1fb0d7f270367a7855d79 ] dvb_unregister_device() is known that prone to use-after-free. That is, the cleanup from dvb_unregister_device() releases the dvb_device even if there are pointers stored in file->private_data still refer to it. This patch adds a reference counter into struct dvb_device and delays its deallocation until no pointer refers to the object. Link: https://lore.kernel.org/linux-media/20220807145952.10368-1-linma@zju.edu.cn Signed-off-by: Lin Ma <linma@zju.edu.cn> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31bpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not setYonghong Song1-1/+1
[ Upstream commit beb3d47d1d3d7185bb401af628ad32ee204a9526 ] With CONFIG_DEBUG_INFO_BTF not set, we hit the following compilation error, /.../kernel/bpf/verifier.c:8196:23: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds] if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) ^ ~~~~~~~~~~~~~~~~~~~~~~~ /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here BTF_ID_LIST(special_kfunc_list) ^ /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST' #define BTF_ID_LIST(name) static u32 __maybe_unused name[5]; ^ /.../kernel/bpf/verifier.c:8443:19: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds] btf_id == special_kfunc_list[KF_bpf_list_pop_back]; ^ ~~~~~~~~~~~~~~~~~~~~ /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here BTF_ID_LIST(special_kfunc_list) ^ /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST' #define BTF_ID_LIST(name) static u32 __maybe_unused name[5]; ... Fix the problem by increase the size of BTF_ID_LIST to 16 to avoid compilation error and also prevent potentially unintended issue due to out-of-bound access. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221123155759.2669749-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31mrp: introduce active flags to prevent UAF when applicant uninitSchspa Shi1-0/+1
[ Upstream commit ab0377803dafc58f1e22296708c1c28e309414d6 ] The caller of del_timer_sync must prevent restarting of the timer, If we have no this synchronization, there is a small probability that the cancellation will not be successful. And syzbot report the fellowing crash: ================================================================== BUG: KASAN: use-after-free in hlist_add_head include/linux/list.h:929 [inline] BUG: KASAN: use-after-free in enqueue_timer+0x18/0xa4 kernel/time/timer.c:605 Write at addr f9ff000024df6058 by task syz-fuzzer/2256 Pointer tag: [f9], memory tag: [fe] CPU: 1 PID: 2256 Comm: syz-fuzzer Not tainted 6.1.0-rc5-syzkaller-00008- ge01d50cbd6ee #0 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace.part.0+0xe0/0xf0 arch/arm64/kernel/stacktrace.c:156 dump_backtrace arch/arm64/kernel/stacktrace.c:162 [inline] show_stack+0x18/0x40 arch/arm64/kernel/stacktrace.c:163 __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x68/0x84 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x1a8/0x4a0 mm/kasan/report.c:395 kasan_report+0x94/0xb4 mm/kasan/report.c:495 __do_kernel_fault+0x164/0x1e0 arch/arm64/mm/fault.c:320 do_bad_area arch/arm64/mm/fault.c:473 [inline] do_tag_check_fault+0x78/0x8c arch/arm64/mm/fault.c:749 do_mem_abort+0x44/0x94 arch/arm64/mm/fault.c:825 el1_abort+0x40/0x60 arch/arm64/kernel/entry-common.c:367 el1h_64_sync_handler+0xd8/0xe4 arch/arm64/kernel/entry-common.c:427 el1h_64_sync+0x64/0x68 arch/arm64/kernel/entry.S:576 hlist_add_head include/linux/list.h:929 [inline] enqueue_timer+0x18/0xa4 kernel/time/timer.c:605 mod_timer+0x14/0x20 kernel/time/timer.c:1161 mrp_periodic_timer_arm net/802/mrp.c:614 [inline] mrp_periodic_timer+0xa0/0xc0 net/802/mrp.c:627 call_timer_fn.constprop.0+0x24/0x80 kernel/time/timer.c:1474 expire_timers+0x98/0xc4 kernel/time/timer.c:1519 To fix it, we can introduce a new active flags to make sure the timer will not restart. Reported-by: syzbot+6fd64001c20aa99e34a4@syzkaller.appspotmail.com Signed-off-by: Schspa Shi <schspa@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31net: add atomic_long_t to net_device_stats fieldsEric Dumazet2-26/+37
[ Upstream commit 6c1c5097781f563b70a81683ea6fdac21637573b ] Long standing KCSAN issues are caused by data-race around some dev->stats changes. Most performance critical paths already use per-cpu variables, or per-queue ones. It is reasonable (and more correct) to use atomic operations for the slow paths. This patch adds an union for each field of net_device_stats, so that we can convert paths that are not yet protected by a spinlock or a mutex. netdev_stats_to_stats64() no longer has an #if BITS_PER_LONG==64 Note that the memcpy() we were using on 64bit arches had no provision to avoid load-tearing, while atomic_long_read() is providing the needed protection at no cost. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31drm/edid: add a quirk for two LG monitors to get them to work on 10bpcHamza Mahfooz1-0/+6
[ Upstream commit aa193f7eff8ff753577351140b8af13b76cdc7c2 ] The LG 27GP950 and LG 27GN950 have visible display corruption when trying to use 10bpc modes. So, to fix this, cap their maximum DSC target bitrate to 15bpp. Suggested-by: Roman Li <roman.li@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31video: hyperv_fb: Avoid taking busy spinlock on panic pathGuilherme G. Piccoli1-0/+2
[ Upstream commit 1d044ca035dc22df0d3b39e56f2881071d9118bd ] The Hyper-V framebuffer code registers a panic notifier in order to try updating its fbdev if the kernel crashed. The notifier callback is straightforward, but it calls the vmbus_sendpacket() routine eventually, and such function takes a spinlock for the ring buffer operations. Panic path runs in atomic context, with local interrupts and preemption disabled, and all secondary CPUs shutdown. That said, taking a spinlock might cause a lockup if a secondary CPU was disabled with such lock taken. Fix it here by checking if the ring buffer spinlock is busy on Hyper-V framebuffer panic notifier; if so, bail-out avoiding the potential lockup scenario. Cc: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Tianyu Lan <Tianyu.Lan@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Tested-by: Fabio A M Martins <fabiomirmar@gmail.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220819221731.480795-10-gpiccoli@igalia.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31dmaengine: idxd: Fix crc_val field for completion recordFenghua Yu1-1/+1
[ Upstream commit dc901d98b1fe6e52ab81cd3e0879379168e06daa ] The crc_val in the completion record should be 64 bits and not 32 bits. Fixes: 4ac823e9cd85 ("dmaengine: idxd: fix delta_rec and crc size field for completion record") Reported-by: Nirav N Shah <nirav.n.shah@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20221111012715.2031481-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31include/uapi/linux/swab: Fix potentially missing __always_inlineMatt Redfearn1-1/+1
[ Upstream commit defbab270d45e32b068e7e73c3567232d745c60f ] Commit bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some byteswap operations") added __always_inline to swab functions and commit 283d75737837 ("uapi/linux/stddef.h: Provide __always_inline to userspace headers") added a definition of __always_inline for use in exported headers when the kernel's compiler.h is not available. However, since swab.h does not include stddef.h, if the header soup does not indirectly include it, the definition of __always_inline is missing, resulting in a compilation failure, which was observed compiling the perf tool using exported headers containing this commit: In file included from /usr/include/linux/byteorder/little_endian.h:12:0, from /usr/include/asm/byteorder.h:14, from tools/include/uapi/linux/perf_event.h:20, from perf.h:8, from builtin-bench.c:18: /usr/include/linux/swab.h:160:8: error: unknown type name `__always_inline' static __always_inline __u16 __swab16p(const __u16 *p) Fix this by replacing the inclusion of linux/compiler.h with linux/stddef.h to ensure that we pick up that definition if required, without relying on it's indirect inclusion. compiler.h is then included indirectly, via stddef.h. Fixes: 283d75737837 ("uapi/linux/stddef.h: Provide __always_inline to userspace headers") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Petr Vaněk <arkamar@atlas.cz> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31iio: adis: add '__adis_enable_irq()' implementationRamona Bolboaca1-1/+12
[ Upstream commit 99c05e4283a19a02a256f14100ca4ec3b2da3f62 ] Add '__adis_enable_irq()' implementation which is the unlocked version of 'adis_enable_irq()'. Call '__adis_enable_irq()' instead of 'adis_enable_irq()' from '__adis_intial_startup()' to keep the expected unlocked functionality. This fix is needed to remove a deadlock for all devices which are using 'adis_initial_startup()'. The deadlock occurs because the same mutex is acquired twice, without releasing it. The mutex is acquired once inside 'adis_initial_startup()', before calling '__adis_initial_startup()', and once inside 'adis_enable_irq()', which is called by '__adis_initial_startup()'. The deadlock is removed by calling '__adis_enable_irq()', instead of 'adis_enable_irq()' from within '__adis_initial_startup()'. Fixes: b600bd7eb3335 ("iio: adis: do not disabe IRQs in 'adis_init()'") Signed-off-by: Ramona Bolboaca <ramona.bolboaca@analog.com> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20221122082757.449452-2-ramona.bolboaca@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RDMA/hns: Fix incorrect sge nums calculationLuoyouming1-0/+15
[ Upstream commit 0c5e259b06a8efc69f929ad777ea49281bb58e37 ] The user usually configures the number of sge through the max_send_sge parameter when creating qp, and configures the maximum size of inline data that can be sent through max_inline_data. Inline uses sge to fill data to send. Expect the following: 1) When the sge space cannot hold inline data, the sge space needs to be expanded to accommodate all inline data 2) When the sge space is enough to accommodate inline data, the upper limit of inline data can be increased so that users can send larger inline data Currently case one is not implemented. When the inline data is larger than the sge space, an error of insufficient sge space occurs. This part of the code needs to be reimplemented according to the expected rules. The calculation method of sge num is modified to take the maximum value of max_send_sge and the sge for max_inline_data to solve this problem. Fixes: 05201e01be93 ("RDMA/hns: Refactor process of setting extended sge") Fixes: 30b707886aeb ("RDMA/hns: Support inline data in extented sge space for RC") Link: https://lore.kernel.org/r/20221108133847.2304539-3-xuhaoyue1@hisilicon.com Signed-off-by: Luoyouming <luoyouming@huawei.com> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31crypto: hisilicon/qm - add missing pci_dev_put() in q_num_set()Xiongfeng Wang1-2/+4
[ Upstream commit cc7710d0d4ebc6998f04035cde4f32c5ddbe9d7f ] pci_get_device() will increase the reference count for the returned pci_dev. We need to use pci_dev_put() to decrease the reference count before q_num_set() returns. Fixes: c8b4b477079d ("crypto: hisilicon - add HiSilicon HPRE accelerator") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31IB/mad: Don't call to function that might sleep while in atomic contextLeonid Ravich1-9/+4
[ Upstream commit 5c20311d76cbaeb7ed2ecf9c8b8322f8fc4a7ae3 ] Tracepoints are not allowed to sleep, as such the following splat is generated due to call to ib_query_pkey() in atomic context. WARNING: CPU: 0 PID: 1888000 at kernel/trace/ring_buffer.c:2492 rb_commit+0xc1/0x220 CPU: 0 PID: 1888000 Comm: kworker/u9:0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-305.3.1.el8.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module_el8.3.0+555+a55c8938 04/01/2014 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] RIP: 0010:rb_commit+0xc1/0x220 RSP: 0000:ffffa8ac80f9bca0 EFLAGS: 00010202 RAX: ffff8951c7c01300 RBX: ffff8951c7c14a00 RCX: 0000000000000246 RDX: ffff8951c707c000 RSI: ffff8951c707c57c RDI: ffff8951c7c14a00 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffff8951c7c01300 R11: 0000000000000001 R12: 0000000000000246 R13: 0000000000000000 R14: ffffffff964c70c0 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8951fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f20e8f39010 CR3: 000000002ca10005 CR4: 0000000000170ef0 Call Trace: ring_buffer_unlock_commit+0x1d/0xa0 trace_buffer_unlock_commit_regs+0x3b/0x1b0 trace_event_buffer_commit+0x67/0x1d0 trace_event_raw_event_ib_mad_recv_done_handler+0x11c/0x160 [ib_core] ib_mad_recv_done+0x48b/0xc10 [ib_core] ? trace_event_raw_event_cq_poll+0x6f/0xb0 [ib_core] __ib_process_cq+0x91/0x1c0 [ib_core] ib_cq_poll_work+0x26/0x80 [ib_core] process_one_work+0x1a7/0x360 ? create_worker+0x1a0/0x1a0 worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x116/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 ---[ end trace 78ba8509d3830a16 ]--- Fixes: 821bf1de45a1 ("IB/MAD: Add recv path trace point") Signed-off-by: Leonid Ravich <lravich@gmail.com> Link: https://lore.kernel.org/r/Y2t5feomyznrVj7V@leonid-Inspiron-3421 Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31f2fs: fix the assign logic of iocbMukesh Ojha1-15/+19
[ Upstream commit 0db18eec0d9a7ee525209e31e3ac2f673545b12f ] commit 18ae8d12991b ("f2fs: show more DIO information in tracepoint") introduces iocb field in 'f2fs_direct_IO_enter' trace event And it only assigns the pointer and later it accesses its field in trace print log. Unable to handle kernel paging request at virtual address ffffffc04cef3d30 Mem abort info: ESR = 0x96000007 EC = 0x25: DABT (current EL), IL = 32 bits pc : trace_raw_output_f2fs_direct_IO_enter+0x54/0xa4 lr : trace_raw_output_f2fs_direct_IO_enter+0x2c/0xa4 sp : ffffffc0443cbbd0 x29: ffffffc0443cbbf0 x28: ffffff8935b120d0 x27: ffffff8935b12108 x26: ffffff8935b120f0 x25: ffffff8935b12100 x24: ffffff8935b110c0 x23: ffffff8935b10000 x22: ffffff88859a936c x21: ffffff88859a936c x20: ffffff8935b110c0 x19: ffffff8935b10000 x18: ffffffc03b195060 x17: ffffff8935b11e76 x16: 00000000000000cc x15: ffffffef855c4f2c x14: 0000000000000001 x13: 000000000000004e x12: ffff0000ffffff00 x11: ffffffef86c350d0 x10: 00000000000010c0 x9 : 000000000fe0002c x8 : ffffffc04cef3d28 x7 : 7f7f7f7f7f7f7f7f x6 : 0000000002000000 x5 : ffffff8935b11e9a x4 : 0000000000006250 x3 : ffff0a00ffffff04 x2 : 0000000000000002 x1 : ffffffef86a0a31f x0 : ffffff8935b10000 Call trace: trace_raw_output_f2fs_direct_IO_enter+0x54/0xa4 print_trace_fmt+0x9c/0x138 print_trace_line+0x154/0x254 tracing_read_pipe+0x21c/0x380 vfs_read+0x108/0x3ac ksys_read+0x7c/0xec __arm64_sys_read+0x20/0x30 invoke_syscall+0x60/0x150 el0_svc_common.llvm.1237943816091755067+0xb8/0xf8 do_el0_svc+0x28/0xa0 Fix it by copying the required variables for printing and while at it fix the similar issue at some other places in the same file. Fixes: bd984c03097b ("f2fs: show more DIO information in tracepoint") Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31fortify: Do not cast to "unsigned char"Kees Cook1-1/+1
[ Upstream commit e9a40e1585d792751d3a122392695e5a53032809 ] Do not cast to "unsigned char", as this needlessly creates type problems when attempting builds without -Wno-pointer-sign[1]. The intent of the cast is to drop possible "const" types. [1] https://lore.kernel.org/lkml/CAHk-=wgz3Uba8w7kdXhsqR1qvfemYL+OFQdefJnkeqXG8qZ_pA@mail.gmail.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: 3009f891bb9f ("fortify: Allow strlen() and strnlen() to pass compile-time known lengths") Cc: linux-hardening@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31ipvs: use u64_stats_t for the per-cpu countersJulian Anastasov1-5/+5
[ Upstream commit 1dbd8d9a82e3f26b9d063292d47ece673f48fce2 ] Use the provided u64_stats_t type to avoid load/store tearing. Fixes: 316580b69d0a ("u64_stats: provide u64_stats_t type") Signed-off-by: Julian Anastasov <ja@ssi.bg> Cc: yunhong-cgl jiang <xintian1976@gmail.com> Cc: "dust.li" <dust.li@linux.alibaba.com> Reviewed-by: Jiri Wiesner <jwiesner@suse.de> Tested-by: Jiri Wiesner <jwiesner@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31bpf, sockmap: Fix missing BPF_F_INGRESS flag when using apply_bytesPengcheng Yang2-2/+3
[ Upstream commit a351d6087bf7d3d8440d58d3bf244ec64b89394a ] When redirecting, we use sk_msg_to_ingress() to get the BPF_F_INGRESS flag from the msg->flags. If apply_bytes is used and it is larger than the current data being processed, sk_psock_msg_verdict() will not be called when sendmsg() is called again. At this time, the msg->flags is 0, and we lost the BPF_F_INGRESS flag. So we need to save the BPF_F_INGRESS flag in sk_psock and use it when redirection. Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/1669718441-2654-3-git-send-email-yangpc@wangsu.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31clk: imx: rename video_pll1 to video_pllDario Binacchi1-4/+8
[ Upstream commit bedcf9d1dcf88ed38731f0ac9620e5a421e1e9d6 ] Unlike audio_pll1 and audio_pll2, there is no video_pll2. Further, the name used in the RM is video_pll. So, let's rename "video_pll1" to "video_pll" to be consistent with the RM and avoid misunderstandings. The IMX8MN_VIDEO_PLL1* constants have not been removed to ensure backward compatibility of the patch. No functional changes intended. Fixes: 96d6392b54dbb ("clk: imx: Add support for i.MX8MN clock driver") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Acked-by: Marco Felsch <m.felsch@pengutronix.de> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20221117113637.1978703-4-dario.binacchi@amarulasolutions.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31clk: imx8mn: rename vpu_pll to m7_alt_pllDario Binacchi1-4/+8
[ Upstream commit a429c60baefd95ab43a2ce7f25d5b2d7a2e431df ] The IMX8MN platform does not have any video processing unit (VPU), and indeed in the reference manual (document IMX8MNRM Rev 2, 07/2022) there is no occurrence of its pll. From an analysis of the code and the RM itself, I think vpu pll is used instead of m7 alternate pll, probably for copy and paste of code taken from modules of similar architectures. As an example for all, if we consider the second row of the "Clock Root" table of chapter 5 (Clocks and Power Management) of the RM: Clock Root offset Source Select (CCM_TARGET_ROOTn[MUX]) ... ... ... ARM_M7_CLK_ROOT 0x8080 000 - 24M_REF_CLK 001 - SYSTEM_PLL2_DIV5 010 - SYSTEM_PLL2_DIV4 011 - M7_ALT_PLL_CLK 100 - SYSTEM_PLL1_CLK 101 - AUDIO_PLL1_CLK 110 - VIDEO_PLL_CLK 111 - SYSTEM_PLL3_CLK ... ... ... but in the source code, the imx8mn_m7_sels clocks list contains vpu_pll for the source select bits 011b. So, let's rename "vpu_pll" to "m7_alt_pll" to be consistent with the RM. The IMX8MN_VPU_* constants have not been removed to ensure backward compatibility of the patch. No functional changes intended. Fixes: 96d6392b54dbb ("clk: imx: Add support for i.MX8MN clock driver") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Acked-by: Marco Felsch <m.felsch@pengutronix.de> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20221117113637.1978703-2-dario.binacchi@amarulasolutions.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31ALSA: seq: fix undefined behavior in bit shift for SNDRV_SEQ_FILTER_USE_EVENTBaisong Zhong1-4/+4
[ Upstream commit cf59e1e4c79bf741905484cdb13c130b53576a16 ] Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below: UBSAN: shift-out-of-bounds in sound/core/seq/seq_clientmgr.c:509:22 left shift of 1 by 31 places cannot be represented in type 'int' ... Call Trace: <TASK> dump_stack_lvl+0x8d/0xcf ubsan_epilogue+0xa/0x44 __ubsan_handle_shift_out_of_bounds+0x1e7/0x208 snd_seq_deliver_single_event.constprop.21+0x191/0x2f0 snd_seq_deliver_event+0x1a2/0x350 snd_seq_kernel_client_dispatch+0x8b/0xb0 snd_seq_client_notify_subscription+0x72/0xa0 snd_seq_ioctl_subscribe_port+0x128/0x160 snd_seq_kernel_client_ctl+0xce/0xf0 snd_seq_oss_create_client+0x109/0x15b alsa_seq_oss_init+0x11c/0x1aa do_one_initcall+0x80/0x440 kernel_init_freeable+0x370/0x3c3 kernel_init+0x1b/0x190 ret_from_fork+0x1f/0x30 </TASK> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com> Link: https://lore.kernel.org/r/20221121111630.3119259-1-zhongbaisong@huawei.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31ALSA: pcm: fix undefined behavior in bit shift for SNDRV_PCM_RATE_KNOTBaisong Zhong1-18/+18
[ Upstream commit b5172e62458f8e6ff359e5f096044a488db90ac5 ] Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below: UBSAN: shift-out-of-bounds in sound/core/pcm_native.c:2676:21 left shift of 1 by 31 places cannot be represented in type 'int' ... Call Trace: <TASK> dump_stack_lvl+0x8d/0xcf ubsan_epilogue+0xa/0x44 __ubsan_handle_shift_out_of_bounds+0x1e7/0x208 snd_pcm_open_substream+0x9f0/0xa90 snd_pcm_oss_open.part.26+0x313/0x670 snd_pcm_oss_open+0x30/0x40 soundcore_open+0x18b/0x2e0 chrdev_open+0xe2/0x270 do_dentry_open+0x2f7/0x620 path_openat+0xd66/0xe70 do_filp_open+0xe3/0x170 do_sys_openat2+0x357/0x4a0 do_sys_open+0x87/0xd0 do_syscall_64+0x34/0x80 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com> Link: https://lore.kernel.org/r/20221121110044.3115686-1-zhongbaisong@huawei.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31net, proc: Provide PROC_FS=n fallback for proc_create_net_single_write()David Howells1-0/+2
[ Upstream commit c3d96f690a790074b508fe183a41e36a00cd7ddd ] Provide a CONFIG_PROC_FS=n fallback for proc_create_net_single_write(). Also provide a fallback for proc_create_net_data_write(). Fixes: 564def71765c ("proc: Add a way to make network proc files writable") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31drm/ttm: fix undefined behavior in bit shift for TTM_TT_FLAG_PRIV_POPULATEDGaosheng Cui1-1/+1
[ Upstream commit 387659939c00156f8d6bab0fbc55b4eaf2b6bc5b ] Shifting signed 32-bit value by 31 bits is undefined, so changing significant bit to unsigned. The UBSAN warning calltrace like below: UBSAN: shift-out-of-bounds in ./include/drm/ttm/ttm_tt.h:122:26 left shift of 1 by 31 places cannot be represented in type 'int' Call Trace: <TASK> dump_stack_lvl+0x7d/0xa5 dump_stack+0x15/0x1b ubsan_epilogue+0xe/0x4e __ubsan_handle_shift_out_of_bounds+0x1e7/0x20c ttm_bo_move_memcpy+0x3b4/0x460 [ttm] bo_driver_move+0x32/0x40 [drm_vram_helper] ttm_bo_handle_move_mem+0x118/0x200 [ttm] ttm_bo_validate+0xfa/0x220 [ttm] drm_gem_vram_pin_locked+0x70/0x1b0 [drm_vram_helper] drm_gem_vram_pin+0x48/0xb0 [drm_vram_helper] drm_gem_vram_plane_helper_prepare_fb+0x53/0xe0 [drm_vram_helper] drm_gem_vram_simple_display_pipe_prepare_fb+0x26/0x30 [drm_vram_helper] drm_simple_kms_plane_prepare_fb+0x4d/0xe0 [drm_kms_helper] drm_atomic_helper_prepare_planes+0xda/0x210 [drm_kms_helper] drm_atomic_helper_commit+0xc3/0x1e0 [drm_kms_helper] drm_atomic_commit+0x9c/0x160 [drm] drm_client_modeset_commit_atomic+0x33a/0x380 [drm] drm_client_modeset_commit_locked+0x77/0x220 [drm] drm_client_modeset_commit+0x31/0x60 [drm] __drm_fb_helper_restore_fbdev_mode_unlocked+0xa7/0x170 [drm_kms_helper] drm_fb_helper_set_par+0x51/0x90 [drm_kms_helper] fbcon_init+0x316/0x790 visual_init+0x113/0x1d0 do_bind_con_driver+0x2a3/0x5c0 do_take_over_console+0xa9/0x270 do_fbcon_takeover+0xa1/0x170 do_fb_registered+0x2a8/0x340 fbcon_fb_registered+0x47/0xe0 register_framebuffer+0x294/0x4a0 __drm_fb_helper_initial_config_and_unlock+0x43c/0x880 [drm_kms_helper] drm_fb_helper_initial_config+0x52/0x80 [drm_kms_helper] drm_fbdev_client_hotplug+0x156/0x1b0 [drm_kms_helper] drm_fbdev_generic_setup+0xfc/0x290 [drm_kms_helper] bochs_pci_probe+0x6ca/0x772 [bochs] local_pci_probe+0x4d/0xb0 pci_device_probe+0x119/0x320 really_probe+0x181/0x550 __driver_probe_device+0xc6/0x220 driver_probe_device+0x32/0x100 __driver_attach+0x195/0x200 bus_for_each_dev+0xbb/0x120 driver_attach+0x27/0x30 bus_add_driver+0x22e/0x2f0 driver_register+0xa9/0x190 __pci_register_driver+0x90/0xa0 bochs_pci_driver_init+0x52/0x1000 [bochs] do_one_initcall+0x76/0x430 do_init_module+0x61/0x28a load_module+0x1f82/0x2e50 __do_sys_finit_module+0xf8/0x190 __x64_sys_finit_module+0x23/0x30 do_syscall_64+0x58/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Fixes: 3312be8f6fc8 ("drm/ttm: move populated state into page flags") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031113350.4180975-1-cuigaosheng1@huawei.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31soreuseport: Fix socket selection for SO_INCOMING_CPU.Kuniyuki Iwashima1-0/+2
[ Upstream commit b261eda84ec136240a9ca753389853a3a1bccca2 ] Kazuho Oku reported that setsockopt(SO_INCOMING_CPU) does not work with setsockopt(SO_REUSEPORT) since v4.6. With the combination of SO_REUSEPORT and SO_INCOMING_CPU, we could build a highly efficient server application. setsockopt(SO_INCOMING_CPU) associates a CPU with a TCP listener or UDP socket, and then incoming packets processed on the CPU will likely be distributed to the socket. Technically, a socket could even receive packets handled on another CPU if no sockets in the reuseport group have the same CPU receiving the flow. The logic exists in compute_score() so that a socket will get a higher score if it has the same CPU with the flow. However, the score gets ignored after the blamed two commits, which introduced a faster socket selection algorithm for SO_REUSEPORT. This patch introduces a counter of sockets with SO_INCOMING_CPU in a reuseport group to check if we should iterate all sockets to find a proper one. We increment the counter when * calling listen() if the socket has SO_INCOMING_CPU and SO_REUSEPORT * enabling SO_INCOMING_CPU if the socket is in a reuseport group Also, we decrement it when * detaching a socket out of the group to apply SO_INCOMING_CPU to migrated TCP requests * disabling SO_INCOMING_CPU if the socket is in a reuseport group When the counter reaches 0, we can get back to the O(1) selection algorithm. The overall changes are negligible for the non-SO_INCOMING_CPU case, and the only notable thing is that we have to update sk_incomnig_cpu under reuseport_lock. Otherwise, the race prevents transitioning to the O(n) algorithm and results in the wrong socket selection. cpu1 (setsockopt) cpu2 (listen) +-----------------+ +-------------+ lock_sock(sk1) lock_sock(sk2) reuseport_update_incoming_cpu(sk1, val) . | /* set CPU as 0 */ |- WRITE_ONCE(sk1->incoming_cpu, val) | | spin_lock_bh(&reuseport_lock) | reuseport_grow(sk2, reuse) | . | |- more_socks_size = reuse->max_socks * 2U; | |- if (more_socks_size > U16_MAX && | | reuse->num_closed_socks) | | . | | |- RCU_INIT_POINTER(sk1->sk_reuseport_cb, NULL); | | `- __reuseport_detach_closed_sock(sk1, reuse) | | . | | `- reuseport_put_incoming_cpu(sk1, reuse) | | . | | | /* Read shutdown()ed sk1's sk_incoming_cpu | | | * without lock_sock(). | | | */ | | `- if (sk1->sk_incoming_cpu >= 0) | | . | | | /* decrement not-yet-incremented | | | * count, which is never incremented. | | | */ | | `- __reuseport_put_incoming_cpu(reuse); | | | `- spin_lock_bh(&reuseport_lock) | |- spin_lock_bh(&reuseport_lock) | |- reuse = rcu_dereference_protected(sk1->sk_reuseport_cb, ...) |- if (!reuse) | . | | /* Cannot increment reuse->incoming_cpu. */ | `- goto out; | `- spin_unlock_bh(&reuseport_lock) Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection") Fixes: c125e80b8868 ("soreuseport: fast reuseport TCP socket selection") Reported-by: Kazuho Oku <kazuhooku@gmail.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31wifi: fix multi-link element subelement iterationJohannes Berg1-1/+1
[ Upstream commit 1177aaa7fe9373c762cd5bf5f5de8517bac989d5 ] The subelements obviously start after the common data, including the common multi-link element structure definition itself. This bug was possibly just hidden by the higher bits of the control being set to 0, so the iteration just found one bogus element and most of the code could continue anyway. Fixes: 0f48b8b88aa9 ("wifi: ieee80211: add definitions for multi-link element") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFDZhang Qilong1-1/+1
[ Upstream commit fd4e60bf0ef8eb9edcfa12dda39e8b6ee9060492 ] Commit ee62c6b2dc93 ("eventfd: change int to __u64 in eventfd_signal()") forgot to change int to __u64 in the CONFIG_EVENTFD=n stub function. Link: https://lkml.kernel.org/r/20221124140154.104680-1-zhangqilong3@huawei.com Fixes: ee62c6b2dc93 ("eventfd: change int to __u64 in eventfd_signal()") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Cc: Dylan Yudaken <dylany@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sha Zhengju <handai.szj@taobao.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31debugfs: fix error when writing negative value to atomic_t debugfs fileAkinobu Mita1-2/+17
[ Upstream commit d472cf797c4e268613dbce5ec9b95d0bcae19ecb ] The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), so we have to use a 64-bit value to write a negative value for a debugfs file created by debugfs_create_atomic_t(). This restores the previous behaviour by introducing DEFINE_DEBUGFS_ATTRIBUTE_SIGNED for a signed value. Link: https://lkml.kernel.org/r/20220919172418.45257-4-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed valueAkinobu Mita1-2/+10
[ Upstream commit 2e41f274f9aa71cdcc69dc1f26a3f9304a651804 ] Patch series "fix error when writing negative value to simple attribute files". The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), but some attribute files want to accept a negative value. This patch (of 3): The simple attribute files do not accept a negative value since the commit 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()"), so we have to use a 64-bit value to write a negative value. This adds DEFINE_SIMPLE_ATTRIBUTE_SIGNED for a signed value. Link: https://lkml.kernel.org/r/20220919172418.45257-1-akinobu.mita@gmail.com Link: https://lkml.kernel.org/r/20220919172418.45257-2-akinobu.mita@gmail.com Fixes: 488dac0c9237 ("libfs: fix error cast of negative value in simple_attr_write()") Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reported-by: Zhao Gongyi <zhaogongyi@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Wei Yongjun <weiyongjun1@huawei.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31timerqueue: Use rb_entry_safe() in timerqueue_getnext()Barnabás Pőcze1-1/+1
[ Upstream commit 2f117484329b233455ee278f2d9b0a4356835060 ] When `timerqueue_getnext()` is called on an empty timer queue, it will use `rb_entry()` on a NULL pointer, which is invalid. Fix that by using `rb_entry_safe()` which handles NULL pointers. This has not caused any issues so far because the offset of the `rb_node` member in `timerqueue_node` is 0, so `rb_entry()` is essentially a no-op. Fixes: 511885d7061e ("lib/timerqueue: Rely on rbtree semantics for next timer") Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221114195421.342929-1-pobrn@protonmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-11Merge tag 'mm-hotfixes-stable-2022-12-10-1' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Nine hotfixes. Six for MM, three for other areas. Four of these patches address post-6.0 issues" * tag 'mm-hotfixes-stable-2022-12-10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: memcg: fix possible use-after-free in memcg_write_event_control() MAINTAINERS: update Muchun Song's email mm/gup: fix gup_pud_range() for dax mmap: fix do_brk_flags() modifying obviously incorrect VMAs mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bit tmpfs: fix data loss from failed fallocate kselftests: cgroup: update kmem test precision tolerance mm: do not BUG_ON missing brk mapping, because userspace can unmap it mailmap: update Matti Vaittinen's email address
2022-12-10memcg: fix possible use-after-free in memcg_write_event_control()Tejun Heo1-0/+1
memcg_write_event_control() accesses the dentry->d_name of the specified control fd to route the write call. As a cgroup interface file can't be renamed, it's safe to access d_name as long as the specified file is a regular cgroup file. Also, as these cgroup interface files can't be removed before the directory, it's safe to access the parent too. Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a call to __file_cft() which verified that the specified file is a regular cgroupfs file before further accesses. The cftype pointer returned from __file_cft() was no longer necessary and the commit inadvertently dropped the file type check with it allowing any file to slip through. With the invarients broken, the d_name and parent accesses can now race against renames and removals of arbitrary files and cause use-after-free's. Fix the bug by resurrecting the file type check in __file_cft(). Now that cgroupfs is implemented through kernfs, checking the file operations needs to go through a layer of indirection. Instead, let's check the superblock and dentry type. Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jann Horn <jannh@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-10mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bitDavid Hildenbrand1-3/+5
We use "unsigned long" to store a PFN in the kernel and phys_addr_t to store a physical address. On a 64bit system, both are 64bit wide. However, on a 32bit system, the latter might be 64bit wide. This is, for example, the case on x86 with PAE: phys_addr_t and PTEs are 64bit wide, while "unsigned long" only spans 32bit. The current definition of SWP_PFN_BITS without MAX_PHYSMEM_BITS misses that case, and assumes that the maximum PFN is limited by an 32bit phys_addr_t. This implies, that SWP_PFN_BITS will currently only be able to cover 4 GiB - 1 on any 32bit system with 4k page size, which is wrong. Let's rely on the number of bits in phys_addr_t instead, but make sure to not exceed the maximum swap offset, to not make the BUILD_BUG_ON() in is_pfn_swap_entry() unhappy. Note that swp_entry_t is effectively an unsigned long and the maximum swap offset shares that value with the swap type. For example, on an 8 GiB x86 PAE system with a kernel config based on Debian 11.5 (-> CONFIG_FLATMEM=y, CONFIG_X86_PAE=y), we will currently fail removing migration entries (remove_migration_ptes()), because mm/page_vma_mapped.c:check_pte() will fail to identify a PFN match as swp_offset_pfn() wrongly masks off PFN bits. For example, split_huge_page_to_list()->...->remap_page() will leave migration entries in place and continue to unlock the page. Later, when we stumble over these migration entries (e.g., via /proc/self/pagemap), pfn_swap_entry_to_page() will BUG_ON() because these migration entries shouldn't exist anymore and the page was unlocked. [ 33.067591] kernel BUG at include/linux/swapops.h:497! [ 33.067597] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 33.067602] CPU: 3 PID: 742 Comm: cow Tainted: G E 6.1.0-rc8+ #16 [ 33.067605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 [ 33.067606] EIP: pagemap_pmd_range+0x644/0x650 [ 33.067612] Code: 00 00 00 00 66 90 89 ce b9 00 f0 ff ff e9 ff fb ff ff 89 d8 31 db e8 48 c6 52 00 e9 23 fb ff ff e8 61 83 56 00 e9 b6 fe ff ff <0f> 0b bf 00 f0 ff ff e9 38 fa ff ff 3e 8d 74 26 00 55 89 e5 57 31 [ 33.067615] EAX: ee394000 EBX: 00000002 ECX: ee394000 EDX: 00000000 [ 33.067617] ESI: c1b0ded4 EDI: 00024a00 EBP: c1b0ddb4 ESP: c1b0dd68 [ 33.067619] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246 [ 33.067624] CR0: 80050033 CR2: b7a00000 CR3: 01bbbd20 CR4: 00350ef0 [ 33.067625] Call Trace: [ 33.067628] ? madvise_free_pte_range+0x720/0x720 [ 33.067632] ? smaps_pte_range+0x4b0/0x4b0 [ 33.067634] walk_pgd_range+0x325/0x720 [ 33.067637] ? mt_find+0x1d6/0x3a0 [ 33.067641] ? mt_find+0x1d6/0x3a0 [ 33.067643] __walk_page_range+0x164/0x170 [ 33.067646] walk_page_range+0xf9/0x170 [ 33.067648] ? __kmem_cache_alloc_node+0x2a8/0x340 [ 33.067653] pagemap_read+0x124/0x280 [ 33.067658] ? default_llseek+0x101/0x160 [ 33.067662] ? smaps_account+0x1d0/0x1d0 [ 33.067664] vfs_read+0x90/0x290 [ 33.067667] ? do_madvise.part.0+0x24b/0x390 [ 33.067669] ? debug_smp_processor_id+0x12/0x20 [ 33.067673] ksys_pread64+0x58/0x90 [ 33.067675] __ia32_sys_ia32_pread64+0x1b/0x20 [ 33.067680] __do_fast_syscall_32+0x4c/0xc0 [ 33.067683] do_fast_syscall_32+0x29/0x60 [ 33.067686] do_SYSENTER_32+0x15/0x20 [ 33.067689] entry_SYSENTER_32+0x98/0xf1 Decrease the indentation level of SWP_PFN_BITS and SWP_PFN_MASK to keep it readable and consistent. [david@redhat.com: rely on sizeof(phys_addr_t) and min_t() instead] Link: https://lkml.kernel.org/r/20221206105737.69478-1-david@redhat.com [david@redhat.com: use "int" for comparison, as we're only comparing numbers < 64] Link: https://lkml.kernel.org/r/1f157500-2676-7cef-a84e-9224ed64e540@redhat.com Link: https://lkml.kernel.org/r/20221205150857.167583-1-david@redhat.com Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09Merge tag 'net-6.1-rc9' of ↵Linus Torvalds2-4/+11
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, can and netfilter. Current release - new code bugs: - bonding: ipv6: correct address used in Neighbour Advertisement parsing (src vs dst typo) - fec: properly scope IRQ coalesce setup during link up to supported chips only Previous releases - regressions: - Bluetooth fixes for fake CSR clones (knockoffs): - re-add ERR_DATA_REPORTING quirk - fix crash when device is replugged - Bluetooth: - silence a user-triggerable dmesg error message - L2CAP: fix u8 overflow, oob access - correct vendor codec definition - fix support for Read Local Supported Codecs V2 - ti: am65-cpsw: fix RGMII configuration at SPEED_10 - mana: fix race on per-CQ variable NAPI work_done Previous releases - always broken: - af_unix: diag: fetch user_ns from in_skb in unix_diag_get_exact(), avoid null-deref - af_can: fix NULL pointer dereference in can_rcv_filter - can: slcan: fix UAF with a freed work - can: can327: flush TX_work on ldisc .close() - macsec: add missing attribute validation for offload - ipv6: avoid use-after-free in ip6_fragment() - nft_set_pipapo: actually validate intervals in fields after the first one - mvneta: prevent oob access in mvneta_config_rss() - ipv4: fix incorrect route flushing when table ID 0 is used, or when source address is deleted - phy: mxl-gpy: add workaround for IRQ bug on GPY215B and GPY215C" * tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits) net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing() s390/qeth: fix use-after-free in hsci macsec: add missing attribute validation for offload net: mvneta: Fix an out of bounds check net: thunderbolt: fix memory leak in tbnet_open() ipv6: avoid use-after-free in ip6_fragment() net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq() net: phy: mxl-gpy: add MDINT workaround net: dsa: mv88e6xxx: accept phy-mode = "internal" for internal PHY ports xen/netback: don't call kfree_skb() under spin_lock_irqsave() dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove() ethernet: aeroflex: fix potential skb leak in greth_init_rings() tipc: call tipc_lxc_xmit without holding node_read_lock can: esd_usb: Allow REC and TEC to return to zero can: can327: flush TX_work on ldisc .close() can: slcan: fix freed work crash can: af_can: fix NULL pointer dereference in can_rcv_filter net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions() ipv4: Fix incorrect route flushing when table ID 0 is used ipv4: Fix incorrect route flushing when source address is deleted ...
2022-12-08memcg: Fix possible use-after-free in memcg_write_event_control()Tejun Heo1-0/+1
memcg_write_event_control() accesses the dentry->d_name of the specified control fd to route the write call. As a cgroup interface file can't be renamed, it's safe to access d_name as long as the specified file is a regular cgroup file. Also, as these cgroup interface files can't be removed before the directory, it's safe to access the parent too. Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a call to __file_cft() which verified that the specified file is a regular cgroupfs file before further accesses. The cftype pointer returned from __file_cft() was no longer necessary and the commit inadvertently dropped the file type check with it allowing any file to slip through. With the invarients broken, the d_name and parent accesses can now race against renames and removals of arbitrary files and cause use-after-free's. Fix the bug by resurrecting the file type check in __file_cft(). Now that cgroupfs is implemented through kernfs, checking the file operations needs to go through a layer of indirection. Instead, let's check the superblock and dentry type. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft") Cc: stable@kernel.org # v3.14+ Reported-by: Jann Horn <jannh@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-07fscache: Fix oops due to race with cookie_lru and use_cookieDave Wysochanski1-0/+2
If a cookie expires from the LRU and the LRU_DISCARD flag is set, but the state machine has not run yet, it's possible another thread can call fscache_use_cookie and begin to use it. When the cookie_worker finally runs, it will see the LRU_DISCARD flag set, transition the cookie->state to LRU_DISCARDING, which will then withdraw the cookie. Once the cookie is withdrawn the object is removed the below oops will occur because the object associated with the cookie is now NULL. Fix the oops by clearing the LRU_DISCARD bit if another thread uses the cookie before the cookie_worker runs. BUG: kernel NULL pointer dereference, address: 0000000000000008 ... CPU: 31 PID: 44773 Comm: kworker/u130:1 Tainted: G E 6.0.0-5.dneg.x86_64 #1 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022 Workqueue: events_unbound netfs_rreq_write_to_cache_work [netfs] RIP: 0010:cachefiles_prepare_write+0x28/0x90 [cachefiles] ... Call Trace: netfs_rreq_write_to_cache_work+0x11c/0x320 [netfs] process_one_work+0x217/0x3e0 worker_thread+0x4a/0x3b0 kthread+0xd6/0x100 Fixes: 12bb21a29c19 ("fscache: Implement cookie user counting and resource pinning") Reported-by: Daire Byrne <daire.byrne@gmail.com> Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Daire Byrne <daire@dneg.com> Link: https://lore.kernel.org/r/20221117115023.1350181-1-dwysocha@redhat.com/ # v1 Link: https://lore.kernel.org/r/20221117142915.1366990-1-dwysocha@redhat.com/ # v2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-03Merge tag 'mmc-v6.1-rc5-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix ambiguous TRIM and DISCARD args - Fix removal of debugfs file for mmc_test MMC host: - mtk-sd: Add missing clk_disable_unprepare() in an error path - sdhci: Fix I/O voltage switch delay for UHS-I SD cards - sdhci-esdhc-imx: Fix CQHCI exit halt state check - sdhci-sprd: Fix voltage switch" * tag 'mmc-v6.1-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-sprd: Fix no reset data and command after voltage switch mmc: sdhci: Fix voltage switch delay mmc: mtk-sd: Fix missing clk_disable_unprepare in msdc_of_clock_parse() mmc: mmc_test: Fix removal of debugfs file mmc: sdhci-esdhc-imx: correct CQHCI exit halt state check mmc: core: Fix ambiguous TRIM and DISCARD arg
2022-12-03Merge tag 'mm-hotfixes-stable-2022-12-02' of ↵Linus Torvalds5-12/+59
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes, 11 marked cc:stable. Only three or four of the latter address post-6.0 issues, which is hopefully a sign that things are converging" * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible" Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: take the right locks for page table retraction mm: migrate: fix THP's mapcount on isolation mm: introduce arch_has_hw_nonleaf_pmd_young() mm: add dummy pmd_young() for architectures not having it mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes() tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep" nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing madvise: use zap_page_range_single for madvise dontneed mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-12-03Bluetooth: Remove codec id field in vendor codec definitionChethan T N1-1/+0
As per the specfication vendor codec id is defined. BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 2127 Fixes: 9ae664028a9e ("Bluetooth: Add support for Read Local Supported Codecs V2") Signed-off-by: Chethan T N <chethan.tumkur.narayan@intel.com> Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-03Bluetooth: btusb: Fix CSR clones again by re-adding ERR_DATA_REPORTING quirkIsmael Ferreras Morezuelas1-0/+11
A patch series by a Qualcomm engineer essentially removed my quirk/workaround because they thought it was unnecessary. It wasn't, and it broke everything again: https://patchwork.kernel.org/project/netdevbpf/list/?series=661703&archive=both&state=* He argues that the quirk is not necessary because the code should check if the dongle says if it's supported or not. The problem is that for these Chinese CSR clones they say that it would work: = New Index: 00:00:00:00:00:00 (Primary,USB,hci0) = Open Index: 00:00:00:00:00:00 < HCI Command: Read Local Version Information (0x04|0x0001) plen 0 > HCI Event: Command Complete (0x0e) plen 12 > [hci0] 11.276039 Read Local Version Information (0x04|0x0001) ncmd 1 Status: Success (0x00) HCI version: Bluetooth 5.0 (0x09) - Revision 2064 (0x0810) LMP version: Bluetooth 5.0 (0x09) - Subversion 8978 (0x2312) Manufacturer: Cambridge Silicon Radio (10) ... < HCI Command: Read Local Supported Features (0x04|0x0003) plen 0 > HCI Event: Command Complete (0x0e) plen 68 > [hci0] 11.668030 Read Local Supported Commands (0x04|0x0002) ncmd 1 Status: Success (0x00) Commands: 163 entries ... Read Default Erroneous Data Reporting (Octet 18 - Bit 2) Write Default Erroneous Data Reporting (Octet 18 - Bit 3) ... ... < HCI Command: Read Default Erroneous Data Reporting (0x03|0x005a) plen 0 = Close Index: 00:1A:7D:DA:71:XX So bring it back wholesale. Fixes: 63b1a7dd38bf ("Bluetooth: hci_sync: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING") Fixes: e168f6900877 ("Bluetooth: btusb: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING for fake CSR") Fixes: 766ae2422b43 ("Bluetooth: hci_sync: Check LMP feature bit instead of quirk") Cc: stable@vger.kernel.org Cc: Zijun Hu <quic_zijuhu@quicinc.com> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Tested-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-12-01inet: ping: use hlist_nulls rcu iterator during lookupFlorian Westphal1-3/+0
ping_lookup() does not acquire the table spinlock, so iteration should use hlist_nulls_for_each_entry_rcu(). Spotted during code review. Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20221129140644.28525-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01revert "kbuild: fix -Wimplicit-function-declaration in ↵Andrew Morton1-2/+0
license_is_gpl_compatible" It causes build failures with unusual CC/HOSTCC combinations. Quoting https://lkml.kernel.org/r/A222B1E6-69B8-4085-AD1B-27BDB72CA971@goldelico.com: HOSTCC scripts/mod/modpost.o - due to target missing In file included from include/linux/string.h:5, from scripts/mod/../../include/linux/license.h:5, from scripts/mod/modpost.c:24: include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory 246 | #include <asm/rwonce.h> | ^~~~~~~~~~~~~~ compilation terminated. ... The problem is that HOSTCC is not necessarily the same compiler or even architecture as CC and pulling in <linux/compiler.h> or <asm/rwonce.h> files indirectly isn't a good idea then. My toolchain is providing HOSTCC = gcc (MacPorts) and CC = arm-linux-gnueabihf (built from gcc source) and all running on Darwin. If I change the include to <string.h> I can then "HOSTCC scripts/mod/modpost.c" but then it fails for "CC kernel/module/main.c" not finding <string.h>: CC kernel/module/main.o - due to target missing In file included from kernel/module/main.c:43:0: ./include/linux/license.h:5:20: fatal error: string.h: No such file or directory #include <string.h> ^ compilation terminated. Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com> Cc: Sam James <sam@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-01mm/khugepaged: fix GUP-fast interaction by sending IPIJann Horn1-0/+4
Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to ensure that the page table was not removed by khugepaged in between. However, lockless_pages_from_mm() still requires that the page table is not concurrently freed. Fix it by sending IPIs (if the architecture uses semi-RCU-style page table freeing) before freeing/reusing page tables. Link: https://lkml.kernel.org/r/20221129154730.2274278-2-jannh@google.com Link: https://lkml.kernel.org/r/20221128180252.1684965-2-jannh@google.com Link: https://lkml.kernel.org/r/20221125213714.4115729-2-jannh@google.com Fixes: ba76149f47d8 ("thp: khugepaged") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-01mm: introduce arch_has_hw_nonleaf_pmd_young()Juergen Gross1-0/+11
When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in pmdp_test_and_clear_young(): BUG: unable to handle page fault for address: ffff8880083374d0 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1 RIP: e030:pmdp_test_and_clear_young+0x25/0x40 This happens because the Xen hypervisor can't emulate direct writes to page table entries other than PTEs. This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young() similar to arch_has_hw_pte_young() and test that instead of CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG. Link: https://lkml.kernel.org/r/20221123064510.16225-1-jgross@suse.com Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") Signed-off-by: Juergen Gross <jgross@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: David Hildenbrand <david@redhat.com> [core changes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-01mm: add dummy pmd_young() for architectures not having itJuergen Gross1-0/+7
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-01hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processingMike Kravetz1-0/+2
madvise(MADV_DONTNEED) ends up calling zap_page_range() to clear page tables associated with the address range. For hugetlb vmas, zap_page_range will call __unmap_hugepage_range_final. However, __unmap_hugepage_range_final assumes the passed vma is about to be removed and deletes the vma_lock to prevent pmd sharing as the vma is on the way out. In the case of madvise(MADV_DONTNEED) the vma remains, but the missing vma_lock prevents pmd sharing and could potentially lead to issues with truncation/fault races. This issue was originally reported here [1] as a BUG triggered in page_try_dup_anon_rmap. Prior to the introduction of the hugetlb vma_lock, __unmap_hugepage_range_final cleared the VM_MAYSHARE flag to prevent pmd sharing. Subsequent faults on this vma were confused as VM_MAYSHARE indicates a sharable vma, but was not set so page_mapping was not set in new pages added to the page table. This resulted in pages that appeared anonymous in a VM_SHARED vma and triggered the BUG. Address issue by adding a new zap flag ZAP_FLAG_UNMAP to indicate an unmap call from unmap_vmas(). This is used to indicate the 'final' unmapping of a hugetlb vma. When called via MADV_DONTNEED, this flag is not set and the vm_lock is not deleted. [1] https://lore.kernel.org/lkml/CAO4mrfdLMXsao9RF4fUE8-Wfde8xmjsKrTNMNC9wjUb6JudD0g@mail.gmail.com/ Link: https://lkml.kernel.org/r/20221114235507.294320-3-mike.kravetz@oracle.com Fixes: 90e7e7f5ef3f ("mm: enable MADV_DONTNEED for hugetlb mappings") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Wei Chen <harperchen1110@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mina Almasry <almasrymina@google.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>