summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-03-17Linux 6.1.20v6.1.20Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20230315115740.429574234@linuxfoundation.org Tested-by: Chris Paterson (CIP) <chris.paterson2@renesas.com> Tested-by: Markus Reichelt <lkt+2023@mareichelt.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230316083444.336870717@linuxfoundation.org Tested-by: Chris Paterson (CIP) <chris.paterson2@renesas.com> Tested-by: Markus Reichelt <lkt+2023@mareichelt.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17UML: define RUNTIME_DISCARD_EXITMasahiro Yamada1-1/+1
commit b99ddbe8336ee680257c8ab479f75051eaa49dcf upstream. With CONFIG_VIRTIO_UML=y, GNU ld < 2.36 fails to link UML vmlinux (w/wo CONFIG_LD_SCRIPT_STATIC). `.exit.text' referenced in section `.uml.exitcall.exit' of arch/um/drivers/virtio_uml.o: defined in discarded section `.exit.text' of arch/um/drivers/virtio_uml.o collect2: error: ld returned 1 exit status This fix is similar to the following commits: - 4b9880dbf3bd ("powerpc/vmlinux.lds: Define RUNTIME_DISCARD_EXIT") - a494398bde27 ("s390: define RUNTIME_DISCARD_EXIT to fix link error with GNU ld < 2.36") - c1c551bebf92 ("sh: define RUNTIME_DISCARD_EXIT") Fixes: 99cb0d917ffa ("arch: fix broken BuildID for arm64 and riscv") Reported-by: SeongJae Park <sj@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: SeongJae Park <sj@kernel.org> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES"Martin KaFai Lau2-27/+9
commit 181127fb76e62d06ab17a75fd610129688612343 upstream. This reverts commit 6c20822fada1b8adb77fa450d03a0d449686a4a9. build bot failed on arch with different cache line size: https://lore.kernel.org/bpf/50c35055-afa9-d01e-9a05-ea5351280e4f@intel.com/ Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17filelocks: use mount idmapping for setlease permission checkSeth Forshee1-1/+2
commit 42d0c4bdf753063b6eec55415003184d3ca24f6e upstream. A user should be allowed to take out a lease via an idmapped mount if the fsuid matches the mapped uid of the inode. generic_setlease() is checking the unmapped inode uid, causing these operations to be denied. Fix this by comparing against the mapped inode uid instead of the unmapped uid. Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP") Cc: stable@vger.kernel.org Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17drm/amd/display: adjust MALL size available for DCN32 and DCN321Samson Tam5-5/+78
commit 235fef6c7fd341026eee90cc546e6e8ff8b2c315 upstream. [Why] MALL size available can vary for different SKUs. Use num_chans read from VBIOS to determine the available MALL size we can use [How] Define max_chans for DCN32 and DCN321. If num_chans is max_chans, then return max_chans as we can access the entire MALL space. Otherwise, define avail_chans as the number of available channels we are allowed instead. Return corresponding number of channels back and use this to calculate available MALL size. Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Acked-by: Alan Liu <HaoPing.Liu@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17drm/amd/display: Allow subvp on vactive pipes that are 2560x1440@60Alvin Lee2-1/+32
commit 2ebd1036209c2e7b61e6bc6e5bee4b67c1684ac6 upstream. Enable subvp on specifically 1440p@60hz displays even though it can switch in vactive. Tested-by: Daniel Wheeler <Daniel.Wheeler@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17media: rc: gpio-ir-recv: add remove functionLi Jun1-0/+18
[ Upstream commit 30040818b338b8ebc956ce0ebd198f8d593586a6 ] In case runtime PM is enabled, do runtime PM clean up to remove cpu latency qos request, otherwise driver removal may have below kernel dump: [ 19.463299] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048 [ 19.472161] Mem abort info: [ 19.474985] ESR = 0x0000000096000004 [ 19.478754] EC = 0x25: DABT (current EL), IL = 32 bits [ 19.484081] SET = 0, FnV = 0 [ 19.487149] EA = 0, S1PTW = 0 [ 19.490361] FSC = 0x04: level 0 translation fault [ 19.495256] Data abort info: [ 19.498149] ISV = 0, ISS = 0x00000004 [ 19.501997] CM = 0, WnR = 0 [ 19.504977] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000049f81000 [ 19.511432] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000 [ 19.518245] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 19.524520] Modules linked in: gpio_ir_recv(+) rc_core [last unloaded: rc_core] [ 19.531845] CPU: 0 PID: 445 Comm: insmod Not tainted 6.2.0-rc1-00028-g2c397a46d47c #72 [ 19.531854] Hardware name: FSL i.MX8MM EVK board (DT) [ 19.531859] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 19.551777] pc : cpu_latency_qos_remove_request+0x20/0x110 [ 19.557277] lr : gpio_ir_recv_runtime_suspend+0x18/0x30 [gpio_ir_recv] [ 19.557294] sp : ffff800008ce3740 [ 19.557297] x29: ffff800008ce3740 x28: 0000000000000000 x27: ffff800008ce3d50 [ 19.574270] x26: ffffc7e3e9cea100 x25: 00000000000f4240 x24: ffffc7e3f9ef0e30 [ 19.574284] x23: 0000000000000000 x22: ffff0061803820f4 x21: 0000000000000008 [ 19.574296] x20: ffffc7e3fa75df30 x19: 0000000000000020 x18: ffffffffffffffff [ 19.588570] x17: 0000000000000000 x16: ffffc7e3f9efab70 x15: ffffffffffffffff [ 19.595712] x14: ffff800008ce37b8 x13: ffff800008ce37aa x12: 0000000000000001 [ 19.602853] x11: 0000000000000001 x10: ffffcbe3ec0dff87 x9 : 0000000000000008 [ 19.609991] x8 : 0101010101010101 x7 : 0000000000000000 x6 : 000000000f0bfe9f [ 19.624261] x5 : 00ffffffffffffff x4 : 0025ab8e00000000 x3 : ffff006180382010 [ 19.631405] x2 : ffffc7e3e9ce8030 x1 : ffffc7e3fc3eb810 x0 : 0000000000000020 [ 19.638548] Call trace: [ 19.640995] cpu_latency_qos_remove_request+0x20/0x110 [ 19.646142] gpio_ir_recv_runtime_suspend+0x18/0x30 [gpio_ir_recv] [ 19.652339] pm_generic_runtime_suspend+0x2c/0x44 [ 19.657055] __rpm_callback+0x48/0x1dc [ 19.660807] rpm_callback+0x6c/0x80 [ 19.664301] rpm_suspend+0x10c/0x640 [ 19.667880] rpm_idle+0x250/0x2d0 [ 19.671198] update_autosuspend+0x38/0xe0 [ 19.675213] pm_runtime_set_autosuspend_delay+0x40/0x60 [ 19.680442] gpio_ir_recv_probe+0x1b4/0x21c [gpio_ir_recv] [ 19.685941] platform_probe+0x68/0xc0 [ 19.689610] really_probe+0xc0/0x3dc [ 19.693189] __driver_probe_device+0x7c/0x190 [ 19.697550] driver_probe_device+0x3c/0x110 [ 19.701739] __driver_attach+0xf4/0x200 [ 19.705578] bus_for_each_dev+0x70/0xd0 [ 19.709417] driver_attach+0x24/0x30 [ 19.712998] bus_add_driver+0x17c/0x240 [ 19.716834] driver_register+0x78/0x130 [ 19.720676] __platform_driver_register+0x28/0x34 [ 19.725386] gpio_ir_recv_driver_init+0x20/0x1000 [gpio_ir_recv] [ 19.731404] do_one_initcall+0x44/0x2ac [ 19.735243] do_init_module+0x48/0x1d0 [ 19.739003] load_module+0x19fc/0x2034 [ 19.742759] __do_sys_finit_module+0xac/0x12c [ 19.747124] __arm64_sys_finit_module+0x20/0x30 [ 19.751664] invoke_syscall+0x48/0x114 [ 19.755420] el0_svc_common.constprop.0+0xcc/0xec [ 19.760132] do_el0_svc+0x38/0xb0 [ 19.763456] el0_svc+0x2c/0x84 [ 19.766516] el0t_64_sync_handler+0xf4/0x120 [ 19.770789] el0t_64_sync+0x190/0x194 [ 19.774460] Code: 910003fd a90153f3 aa0003f3 91204021 (f9401400) [ 19.780556] ---[ end trace 0000000000000000 ]--- Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17media: ov5640: Fix analogue gain controlPaul Elder1-1/+1
[ Upstream commit afa4805799c1d332980ad23339fdb07b5e0cf7e0 ] Gain control is badly documented in publicly available (including leaked) documentation. There is an AGC pre-gain in register 0x3a13, expressed as a 6-bit value (plus an enable bit in bit 6). The driver hardcodes it to 0x43, which one application note states is equal to x1.047. The documentation also states that 0x40 is equel to x1.000. The pre-gain thus seems to be expressed as in 1/64 increments, and thus ranges from x1.00 to x1.984. What the pre-gain does is however unspecified. There is then an AGC gain limit, in registers 0x3a18 and 0x3a19, expressed as a 10-bit "real gain format" value. One application note sets it to 0x00f8 and states it is equal to x15.5, so it appears to be expressed in 1/16 increments, up to x63.9375. The manual gain is stored in registers 0x350a and 0x350b, also as a 10-bit "real gain format" value. It is documented in the application note as a Q6.4 values, up to x63.9375. One version of the datasheet indicates that the sensor supports a digital gain: The OV5640 supports 1/2/4 digital gain. Normally, the gain is controlled automatically by the automatic gain control (AGC) block. It isn't clear how that would be controlled manually. There appears to be no indication regarding whether the gain controlled through registers 0x350a and 0x350b is an analogue gain only or also includes digital gain. The words "real gain" don't necessarily mean "combined analogue and digital gains". Some OmniVision sensors (such as the OV8858) are documented as supoprting different formats for the gain values, selectable through a register bit, and they are called "real gain format" and "sensor gain format". For that sensor, we have (one of) the gain registers documented as 0x3503[2]=0, gain[7:0] is real gain format, where low 4 bits are fraction bits, for example, 0x10 is 1x gain, 0x28 is 2.5x gain If 0x3503[2]=1, gain[7:0] is sensor gain format, gain[7:4] is coarse gain, 00000: 1x, 00001: 2x, 00011: 4x, 00111: 8x, gain[7] is 1, gain[3:0] is fine gain. For example, 0x10 is 1x gain, 0x30 is 2x gain, 0x70 is 4x gain (The second part of the text makes little sense) "Real gain" may thus refer to the combination of the coarse and fine analogue gains as a single value. The OV5640 0x350a and 0x350b registers thus appear to control analogue gain. The driver incorrectly uses V4L2_CID_GAIN as V4L2 has a specific control for analogue gain, V4L2_CID_ANALOGUE_GAIN. Use it. If registers 0x350a and 0x350b are later found to control digital gain as well, the driver could then restrict the range of the analogue gain control value to lower than x64 and add a separate digital gain control. Signed-off-by: Paul Elder <paul.elder@ideasonboard.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: Jai Luthra <j-luthra@ti.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17scripts: handle BrokenPipeError for python scriptsMasahiro Yamada3-10/+40
[ Upstream commit 87c7ee67deb7fce9951a5f9d80641138694aad17 ] In the follow-up of commit fb3041d61f68 ("kbuild: fix SIGPIPE error message for AR=gcc-ar and AR=llvm-ar"), Kees Cook pointed out that tools should _not_ catch their own SIGPIPEs [1] [2]. Based on his feedback, LLVM was fixed [3]. However, Python's default behavior is to show noisy bracktrace when SIGPIPE is sent. So, scripts written in Python are basically in the same situation as the buggy llvm tools. Example: $ make -s allnoconfig $ make -s allmodconfig $ scripts/diffconfig .config.old .config | head -n1 -ALIX n Traceback (most recent call last): File "/home/masahiro/linux/scripts/diffconfig", line 132, in <module> main() File "/home/masahiro/linux/scripts/diffconfig", line 130, in main print_config("+", config, None, b[config]) File "/home/masahiro/linux/scripts/diffconfig", line 64, in print_config print("+%s %s" % (config, new_value)) BrokenPipeError: [Errno 32] Broken pipe Python documentation [4] notes how to make scripts die immediately and silently: """ Piping output of your program to tools like head(1) will cause a SIGPIPE signal to be sent to your process when the receiver of its standard output closes early. This results in an exception like BrokenPipeError: [Errno 32] Broken pipe. To handle this case, wrap your entry point to catch this exception as follows: import os import sys def main(): try: # simulate large output (your code replaces this loop) for x in range(10000): print("y") # flush output here to force SIGPIPE to be triggered # while inside this try block. sys.stdout.flush() except BrokenPipeError: # Python flushes standard streams on exit; redirect remaining output # to devnull to avoid another BrokenPipeError at shutdown devnull = os.open(os.devnull, os.O_WRONLY) os.dup2(devnull, sys.stdout.fileno()) sys.exit(1) # Python exits with error code 1 on EPIPE if __name__ == '__main__': main() Do not set SIGPIPE’s disposition to SIG_DFL in order to avoid BrokenPipeError. Doing that would cause your program to exit unexpectedly whenever any socket connection is interrupted while your program is still writing to it. """ Currently, tools/perf/scripts/python/intel-pt-events.py seems to be the only script that fixes the issue that way. tools/perf/scripts/python/compaction-times.py uses another approach signal.signal(signal.SIGPIPE, signal.SIG_DFL) but the Python documentation clearly says "Don't do it". I cannot fix all Python scripts since there are so many. I fixed some in the scripts/ directory. [1]: https://lore.kernel.org/all/202211161056.1B9611A@keescook/ [2]: https://github.com/llvm/llvm-project/issues/59037 [3]: https://github.com/llvm/llvm-project/commit/4787efa38066adb51e2c049499d25b3610c0877b [4]: https://docs.python.org/3/library/signal.html#note-on-sigpipe Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17PCI: Add SolidRun vendor IDAlvaro Karsz1-0/+2
[ Upstream commit db6c4dee4c104f50ed163af71c53bfdb878a8318 ] Add SolidRun vendor ID to pci_ids.h The vendor ID is used in 2 different source files, the SNET vDPA driver and PCI quirks. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-Id: <20230110165638.123745-2-alvaro.karsz@solid-run.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17macintosh: windfarm: Use unsigned type for 1-bit bitfieldsNathan Chancellor2-4/+4
[ Upstream commit 748ea32d2dbd813d3bd958117bde5191182f909a ] Clang warns: drivers/macintosh/windfarm_lm75_sensor.c:63:14: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] lm->inited = 1; ^ ~ drivers/macintosh/windfarm_smu_sensors.c:356:19: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] pow->fake_volts = 1; ^ ~ drivers/macintosh/windfarm_smu_sensors.c:368:18: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] pow->quadratic = 1; ^ ~ There is no bug here since no code checks the actual value of these fields, just whether or not they are zero (boolean context), but this can be easily fixed by switching to an unsigned type. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230215-windfarm-wsingle-bit-bitfield-constant-conversion-v1-1-26415072e855@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17alpha: fix R_ALPHA_LITERAL reloc for large modulesEdward Humes1-3/+1
[ Upstream commit b6b17a8b3ecd878d98d5472a9023ede9e669ca72 ] Previously, R_ALPHA_LITERAL relocations would overflow for large kernel modules. This was because the Alpha's apply_relocate_add was relying on the kernel's module loader to have sorted the GOT towards the very end of the module as it was mapped into memory in order to correctly assign the global pointer. While this behavior would mostly work fine for small kernel modules, this approach would overflow on kernel modules with large GOT's since the global pointer would be very far away from the GOT, and thus, certain entries would be out of range. This patch fixes this by instead using the Tru64 behavior of assigning the global pointer to be 32KB away from the start of the GOT. The change made in this patch won't work for multi-GOT kernel modules as it makes the assumption the module only has one GOT located at the beginning of .got, although for the vast majority kernel modules, this should be fine. Of the kernel modules that would previously result in a relocation error, none of them, even modules like nouveau, have even come close to filling up a single GOT, and they've all worked fine under this patch. Signed-off-by: Edward Humes <aurxenon@lunos.org> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/kcsan: Exclude udelay to prevent recursive instrumentationRohan McLure1-2/+2
[ Upstream commit 2a7ce82dc46c591c9244057d89a6591c9639b9b9 ] In order for KCSAN to increase its likelihood of observing a data race, it sets a watchpoint on memory accesses and stalls, allowing for detection of conflicting accesses by other kernel threads or interrupts. Stalls are implemented by injecting a call to udelay in instrumented code. To prevent recursive instrumentation, exclude udelay from being instrumented. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206021801.105268-3-rmclure@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/64: Move paca allocation to early_setup()Nicholas Piggin5-16/+13
[ Upstream commit dc222fa7737212fe0da513e5b8937c156d02225d ] The early paca and boot cpuid dance is complicated and currently does not quite work as expected for boot cpuid != 0 cases. early_init_devtree() currently allocates the paca_ptrs and boot cpuid paca, but until that returns and early_setup() calls setup_paca(), this thread is currently still executing with smp_processor_id() == 0. One problem this causes is the paca_ptrs[smp_processor_id()] pointer is poisoned, so valid_emergency_stack() (any backtrace) and any similar users will crash. Another is that the hardware id which is set here will not be returned by get_hard_smp_processor_id(smp_processor_id()), but it would work correctly for boot_cpuid == 0, which could lead to difficult to reproduce or find bugs. The hard id does not seem to be used by the rest of early_init_devtree(), it just looks like all this code might have been put here to allocate somewhere to store boot CPU hardware id while scanning the devtree. Rearrange things so the hwid is put in a global variable like boot_cpuid, and do all the paca allocation and boot paca setup in the 64-bit early_setup() after we have everything ready to go. The paca_ptrs[0] re-poisoning code in early_setup does not seem to have ever worked, because paca_ptrs[0] was never not-poisoned when boot_cpuid is not 0. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix build error on 32-bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-4-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/64: Fix task_cpu in early boot when booting non-zero cpuidNicholas Piggin1-0/+5
[ Upstream commit 9fa24404f5044967753a6cd3e5e36f57686bec6e ] powerpc/64 can boot on a non-zero SMP processor id. Initially, the boot CPU is said to be "assumed to be 0" until early_init_devtree() discovers the id from the device tree. That is not a good description because the assumption can be wrong and that has to be handled, the better description is that 0 is used as a placeholder, and things are fixed after the real id is discovered. smp_processor_id() is set to the boot cpuid, but task_cpu(current) is not, which causes the smp_processor_id() == task_cpu(current) invariant to be broken until init_idle() in sched_init(). This is quite fragile and could lead to subtle bugs in future. One bug is that validate_sp_size uses task_cpu() to get the process stack, so any stack trace from the booting CPU between early_init_devtree() and sched_init() will have problems. Early on paca_ptrs[0] will be poisoned, so that can cause machine checks dereferencing that memory in real mode. Later, validating the current stack pointer against the idle task of a different secondary will probably cause no stack trace to be printed. Fix this by setting thread_info->cpu right after smp_processor_id() is set to the boot cpuid. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix SMP=n build as reported by sfr] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216115930.2667772-3-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/bpf/32: Only set a stack frame when necessaryChristophe Leroy1-2/+18
[ Upstream commit d084dcf256bc4565b4b1af9b00297ac7b51c7049 ] Until now a stack frame was set at all time due to the need to keep tail call counter in the stack. But since commit 89d21e259a94 ("powerpc/bpf/32: Fix Oops on tail call tests") the tail call counter is passed via register r4. It is therefore not necessary anymore to have a stack frame for that. Just like PPC64, implement bpf_has_stack_frame() and only sets the frame when needed. The difference with PPC64 is that PPC32 doesn't have a redzone, so the stack is required as soon as non volatile registers are used or when tail call count is set up. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fix commit reference in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/62d7b654a3cfe73d998697cb29bbc5ffd89bfdb1.1675245773.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17clk: renesas: rcar-gen3: Disable R-Car H3 ES1.*Wolfram Sang5-173/+13
[ Upstream commit b1dec4e78599a2ce5bf8557056cd6dd72e1096b0 ] R-Car H3 ES1.* was only available to an internal development group and needed a lot of quirks and workarounds. These become a maintenance burden now, so our development group decided to remove upstream support for this SoC. Public users only have ES2 onwards. In addition to the ES1 specific removals, a check for it was added preventing the machine to boot further. It may otherwise inherit wrong clock settings from ES2 which could damage the hardware. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230202092332.2504-1-wsa+renesas@sang-engineering.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/iommu: fix memory leak with using debugfs_lookup()Greg Kroah-Hartman1-3/+1
[ Upstream commit b505063910c134778202dfad9332dfcecb76bab3 ] When calling debugfs_lookup() the result must have dput() called on it, otherwise the memory will leak over time. To make things simpler, just call debugfs_lookup_and_remove() instead which handles all of the logic at once. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230202141919.2298821-1-gregkh@linuxfoundation.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17powerpc/64: Don't recurse irq replayNicholas Piggin2-37/+70
[ Upstream commit 5746ca131e2496ccd5bb4d7a0244d6c38070cbf5 ] Interrupt handlers called by soft-pending irq replay code can run softirqs, softirq replay enables and disables local irqs, which allows interrupts to come in including soft-masked interrupts, and it can cause pending irqs to be replayed again. That makes the soft irq replay state machine and possible races more complicated and fragile than it needs to be. Use irq_enter/irq_exit around irq replay to prevent softirqs running while interrupts are being replayed. Softirqs will now be run at the irq_exit() call after all the irq replaying is done. This prevents irqs being replayed while irqs are being replayed, and should hopefully make things simpler and easier to think about and debug. A new PACA_IRQ_REPLAYING is added to prevent asynchronous interrupt handlers hard-enabling EE while pending irqs are being replayed, because that causes new pending irqs to arrive which is also a complexity. This means pending irqs won't be profiled quite so well because perf irqs can't be taken. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230121102618.2824429-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17MIPS: Fix a compilation issuexurui1-1/+1
[ Upstream commit 109d587a4b4d7ccca2200ab1f808f43ae23e2585 ] arch/mips/include/asm/mach-rc32434/pci.h:377: cc1: error: result of ‘-117440512 << 16’ requires 44 bits to represent, but ‘int’ only has 32 bits [-Werror=shift-overflow=] All bits in KORINA_STAT are already at the correct position, so there is no addtional shift needed. Signed-off-by: xurui <xurui@kylinos.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17tpm/eventlog: Don't abort tpm_read_log on faulty ACPI addressMorten Linderud1-1/+5
[ Upstream commit 80a6c216b16d7f5c584d2148c2e4345ea4eb06ce ] tpm_read_log_acpi() should return -ENODEV when no eventlog from the ACPI table is found. If the firmware vendor includes an invalid log address we are unable to map from the ACPI memory and tpm_read_log() returns -EIO which would abort discovery of the eventlog. Change the return value from -EIO to -ENODEV when acpi_os_map_iomem() fails to map the event log. The following hardware was used to test this issue: Framework Laptop (Pre-production) BIOS: INSYDE Corp, Revision: 3.2 TPM Device: NTC, Firmware Revision: 7.2 Dump of the faulty ACPI TPM2 table: [000h 0000 4] Signature : "TPM2" [Trusted Platform Module hardware interface Table] [004h 0004 4] Table Length : 0000004C [008h 0008 1] Revision : 04 [009h 0009 1] Checksum : 2B [00Ah 0010 6] Oem ID : "INSYDE" [010h 0016 8] Oem Table ID : "TGL-ULT" [018h 0024 4] Oem Revision : 00000002 [01Ch 0028 4] Asl Compiler ID : "ACPI" [020h 0032 4] Asl Compiler Revision : 00040000 [024h 0036 2] Platform Class : 0000 [026h 0038 2] Reserved : 0000 [028h 0040 8] Control Address : 0000000000000000 [030h 0048 4] Start Method : 06 [Memory Mapped I/O] [034h 0052 12] Method Parameters : 00 00 00 00 00 00 00 00 00 00 00 00 [040h 0064 4] Minimum Log Length : 00010000 [044h 0068 8] Log Address : 000000004053D000 Fixes: 0cf577a03f21 ("tpm: Fix handling of missing event log") Tested-by: Erkki Eilonen <erkki@bearmetal.eu> Signed-off-by: Morten Linderud <morten@linderud.pw> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error pathsDavid Disseldorp1-0/+1
[ Upstream commit 03e1d60e177eedbd302b77af4ea5e21b5a7ade31 ] The watch_queue_set_size() allocation error paths return the ret value set via the prior pipe_resize_ring() call, which will always be zero. As a result, IOC_WATCH_QUEUE_SET_SIZE callers such as "keyctl watch" fail to detect kernel wqueue->notes allocation failures and proceed to KEYCTL_WATCH_KEY, with any notifications subsequently lost. Fixes: c73be61cede58 ("pipe: Add general notification queue support") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/msm/adreno: fix runtime PM imbalance at unbindJohan Hovold1-1/+2
[ Upstream commit 6153c44392b04ff2da1e9aa82ba87da9ab9a0fc1 ] A recent commit moved enabling of runtime PM from adreno_gpu_init() to adreno_load_gpu() (called on first open()), which means that unbind() may now be called with runtime PM disabled in case the device was never opened in between. Make sure to only forcibly suspend and disable runtime PM at unbind() in case runtime PM has been enabled to prevent a disable count imbalance. This specifically avoids leaving runtime PM disabled when the device is later opened after a successful bind: msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 Fixes: 4b18299b3365 ("drm/msm/adreno: Defer enabling runpm until hw_init()") Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com> Link: https://lore.kernel.org/lkml/20230203181245.3523937-1-quic_bjorande@quicinc.com Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Patchwork: https://patchwork.freedesktop.org/patch/523549/ Link: https://lore.kernel.org/r/20230221101430.14546-2-johan+linaro@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17adreno: Shutdown the GPU properlyJoel Fernandes (Google)1-2/+3
[ Upstream commit e752e5454e6417da3f40ec1306a041ea96c56423 ] During kexec on ARM device, we notice that device_shutdown() only calls pm_runtime_force_suspend() while shutting down the GPU. This means the GPU kthread is still running and further, there maybe active submits. This causes all kinds of issues during a kexec reboot: Warning from shutdown path: [ 292.509662] WARNING: CPU: 0 PID: 6304 at [...] adreno_runtime_suspend+0x3c/0x44 [ 292.509863] Hardware name: Google Lazor (rev3 - 8) with LTE (DT) [ 292.509872] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 292.509881] pc : adreno_runtime_suspend+0x3c/0x44 [ 292.509891] lr : pm_generic_runtime_suspend+0x30/0x44 [ 292.509905] sp : ffffffc014473bf0 [...] [ 292.510043] Call trace: [ 292.510051] adreno_runtime_suspend+0x3c/0x44 [ 292.510061] pm_generic_runtime_suspend+0x30/0x44 [ 292.510071] pm_runtime_force_suspend+0x54/0xc8 [ 292.510081] adreno_shutdown+0x1c/0x28 [ 292.510090] platform_shutdown+0x2c/0x38 [ 292.510104] device_shutdown+0x158/0x210 [ 292.510119] kernel_restart_prepare+0x40/0x4c And here from GPU kthread, an SError OOPs: [ 192.648789] el1h_64_error+0x7c/0x80 [ 192.648812] el1_interrupt+0x20/0x58 [ 192.648833] el1h_64_irq_handler+0x18/0x24 [ 192.648854] el1h_64_irq+0x7c/0x80 [ 192.648873] local_daif_inherit+0x10/0x18 [ 192.648900] el1h_64_sync_handler+0x48/0xb4 [ 192.648921] el1h_64_sync+0x7c/0x80 [ 192.648941] a6xx_gmu_set_oob+0xbc/0x1fc [ 192.648968] a6xx_hw_init+0x44/0xe38 [ 192.648991] msm_gpu_hw_init+0x48/0x80 [ 192.649013] msm_gpu_submit+0x5c/0x1a8 [ 192.649034] msm_job_run+0xb0/0x11c [ 192.649058] drm_sched_main+0x170/0x434 [ 192.649086] kthread+0x134/0x300 [ 192.649114] ret_from_fork+0x10/0x20 Fix by calling adreno_system_suspend() in the device_shutdown() path. [ Applied Rob Clark feedback on fixing adreno_unbind() similarly, also tested as above. ] Cc: Rob Clark <robdclark@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ricardo Ribalda <ribalda@chromium.org> Cc: Ross Zwisler <zwisler@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/517633/ Link: https://lore.kernel.org/r/20230109222547.1368644-1-joel@joelfernandes.org Signed-off-by: Rob Clark <robdclark@chromium.org> Stable-dep-of: 6153c44392b0 ("drm/msm/adreno: fix runtime PM imbalance at unbind") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4Veerabadhran Gopalakrishnan1-0/+1
[ Upstream commit 6ce2ea07c5ff0a8188eab0e5cd1f0e4899b36835 ] Added the video capability query support for VCN version 4_0_4 Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu/soc21: don't expose AV1 if VCN0 is harvestedAlex Deucher1-13/+48
[ Upstream commit a6de636eb04f146d23644dbbb7173e142452a9b7 ] Only VCN0 supports AV1. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 6ce2ea07c5ff ("drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17ext4: Fix deadlock during directory renameJan Kara1-9/+17
[ Upstream commit 3c92792da8506a295afb6d032b4476e46f979725 ] As lockdep properly warns, we should not be locking i_rwsem while having transactions started as the proper lock ordering used by all directory handling operations is i_rwsem -> transaction start. Fix the lock ordering by moving the locking of the directory earlier in ext4_rename(). Reported-by: syzbot+9d16c39efb5fade84574@syzkaller.appspotmail.com Fixes: 0813299c586b ("ext4: Fix possible corruption when moving a directory") Link: https://syzkaller.appspot.com/bug?extid=9d16c39efb5fade84574 Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230301141004.15087-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu: fix return value check in kfdShashank Sharma1-1/+1
[ Upstream commit 20534dbcc7b7bfb447279cdcfb0d88ee3b779a18 ] This patch fixes a return value check in kfd doorbell handling. This function should return 0(error) only when the ida_simple_get returns < 0(error), return > 0 is a success case. Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Fixes: 16f0013157bf ("drm/amdkfd: Allocate doorbells only when needed") Acked-by: Christian Koenig <chriatian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17RISC-V: Don't check text_mutex during stop_machineConor Dooley4-6/+39
[ Upstream commit 2a8db5ec4a28a0fce822d10224db9471a44b6925 ] We're currently using stop_machine() to update ftrace & kprobes, which means that the thread that takes text_mutex during may not be the same as the thread that eventually patches the code. This isn't actually a race because the lock is still held (preventing any other concurrent accesses) and there is only one thread running during stop_machine(), but it does trigger a lockdep failure. This patch just elides the lockdep check during stop_machine. Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack modeAlexandre Ghiti1-1/+1
[ Upstream commit 76950340cf03b149412fe0d5f0810e52ac1df8cb ] When CONFIG_FRAME_POINTER is unset, the stack unwinding function walk_stackframe randomly reads the stack and then, when KASAN is enabled, it can lead to the following backtrace: [ 0.000000] ================================================================== [ 0.000000] BUG: KASAN: stack-out-of-bounds in walk_stackframe+0xa6/0x11a [ 0.000000] Read of size 8 at addr ffffffff81807c40 by task swapper/0 [ 0.000000] [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-12919-g24203e6db61f #43 [ 0.000000] Hardware name: riscv-virtio,qemu (DT) [ 0.000000] Call Trace: [ 0.000000] [<ffffffff80007ba8>] walk_stackframe+0x0/0x11a [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80c49c80>] dump_stack_lvl+0x22/0x36 [ 0.000000] [<ffffffff80c3783e>] print_report+0x198/0x4a8 [ 0.000000] [<ffffffff80099ecc>] init_param_lock+0x26/0x2a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8015f68a>] kasan_report+0x9a/0xc8 [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff80007c4a>] walk_stackframe+0xa2/0x11a [ 0.000000] [<ffffffff8006e99c>] desc_make_final+0x80/0x84 [ 0.000000] [<ffffffff8009a04e>] stack_trace_save+0x88/0xa6 [ 0.000000] [<ffffffff80099fc2>] filter_irq_stacks+0x72/0x76 [ 0.000000] [<ffffffff8006b95e>] devkmsg_read+0x32a/0x32e [ 0.000000] [<ffffffff8015ec16>] kasan_save_stack+0x28/0x52 [ 0.000000] [<ffffffff8006e998>] desc_make_final+0x7c/0x84 [ 0.000000] [<ffffffff8009a04a>] stack_trace_save+0x84/0xa6 [ 0.000000] [<ffffffff8015ec52>] kasan_set_track+0x12/0x20 [ 0.000000] [<ffffffff8015f22e>] __kasan_slab_alloc+0x58/0x5e [ 0.000000] [<ffffffff8015e7ea>] __kmem_cache_create+0x21e/0x39a [ 0.000000] [<ffffffff80e133ac>] create_boot_cache+0x70/0x9c [ 0.000000] [<ffffffff80e17ab2>] kmem_cache_init+0x6c/0x11e [ 0.000000] [<ffffffff80e00fd6>] mm_init+0xd8/0xfe [ 0.000000] [<ffffffff80e011d8>] start_kernel+0x190/0x3ca [ 0.000000] [ 0.000000] The buggy address belongs to stack of task swapper/0 [ 0.000000] and is located at offset 0 in frame: [ 0.000000] stack_trace_save+0x0/0xa6 [ 0.000000] [ 0.000000] This frame has 1 object: [ 0.000000] [32, 56) 'c' [ 0.000000] [ 0.000000] The buggy address belongs to the physical page: [ 0.000000] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x81a07 [ 0.000000] flags: 0x1000(reserved|zone=0) [ 0.000000] raw: 0000000000001000 ff600003f1e3d150 ff600003f1e3d150 0000000000000000 [ 0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff [ 0.000000] page dumped because: kasan: bad access detected [ 0.000000] [ 0.000000] Memory state around the buggy address: [ 0.000000] ffffffff81807b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] >ffffffff81807c00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 f3 [ 0.000000] ^ [ 0.000000] ffffffff81807c80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ffffffff81807d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 0.000000] ================================================================== Fix that by using READ_ONCE_NOCHECK when reading the stack in imprecise mode. Fixes: 5d8544e2d007 ("RISC-V: Generic library routines and assembly") Reported-by: Chathura Rajapaksha <chathura.abeyrathne.lk@gmail.com> Link: https://lore.kernel.org/all/CAD7mqryDQCYyJ1gAmtMm8SASMWAQ4i103ptTb0f6Oda=tPY2=A@mail.gmail.com/ Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20230308091639.602024-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"Gao Xiang1-6/+6
[ Upstream commit 647dd2c3f0e16b71a1a77897d038164d48eea154 ] Let's revert commit 12724ba38992 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") since kvmalloc() already supports __GFP_NOFAIL in commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc"). So the original fix was wrong. Actually there was some issue as [1] discussed, so before that mm fix is landed, the warn could still happen but applying this commit first will cause less. [1] https://lore.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com Fixes: 12724ba38992 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL") Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230309053148.9223-1-hsiangkao@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17af_unix: fix struct pid leaks in OOB supportEric Dumazet1-2/+8
[ Upstream commit 2aab4b96900272885bc157f8b236abf1cdc02e08 ] syzbot reported struct pid leak [1]. Issue is that queue_oob() calls maybe_add_creds() which potentially holds a reference on a pid. But skb->destructor is not set (either directly or by calling unix_scm_to_skb()) This means that subsequent kfree_skb() or consume_skb() would leak this reference. In this fix, I chose to fully support scm even for the OOB message. [1] BUG: memory leak unreferenced object 0xffff8881053e7f80 (size 128): comm "syz-executor242", pid 5066, jiffies 4294946079 (age 13.220s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff812ae26a>] alloc_pid+0x6a/0x560 kernel/pid.c:180 [<ffffffff812718df>] copy_process+0x169f/0x26c0 kernel/fork.c:2285 [<ffffffff81272b37>] kernel_clone+0xf7/0x610 kernel/fork.c:2684 [<ffffffff812730cc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:2825 [<ffffffff849ad699>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff849ad699>] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 [<ffffffff84a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 314001f0bf92 ("af_unix: Add OOB support") Reported-by: syzbot+7699d9e5635c10253a27@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Rao Shoaib <rao.shoaib@oracle.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230307164530.771896-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoCVladimir Oltean1-15/+20
[ Upstream commit c8b8a3c601f2cfad25ab5ce5b04df700048aef6e ] The MT7530 switch from the MT7621 SoC has 2 ports which can be set up as internal: port 5 and 6. Arınç reports that the GMAC1 attached to port 5 receives corrupted frames, unless port 6 (attached to GMAC0) has been brought up by the driver. This is true regardless of whether port 5 is used as a user port or as a CPU port (carrying DSA tags). Offline debugging (blind for me) which began in the linked thread showed experimentally that the configuration done by the driver for port 6 contains a step which is needed by port 5 as well - the write to CORE_GSWPLL_GRP2 (note that I've no idea as to what it does, apart from the comment "Set core clock into 500Mhz"). Prints put by Arınç show that the reset value of CORE_GSWPLL_GRP2 is RG_GSWPLL_POSDIV_500M(1) | RG_GSWPLL_FBKDIV_500M(40) (0x128), both on the MCM MT7530 from the MT7621 SoC, as well as on the standalone MT7530 from MT7623NI Bananapi BPI-R2. Apparently, port 5 on the standalone MT7530 can work under both values of the register, while on the MT7621 SoC it cannot. The call path that triggers the register write is: mt753x_phylink_mac_config() for port 6 -> mt753x_pad_setup() -> mt7530_pad_clk_setup() so this fully explains the behavior noticed by Arınç, that bringing port 6 up is necessary. The simplest fix for the problem is to extract the register writes which are needed for both port 5 and 6 into a common mt7530_pll_setup() function, which is called at mt7530_setup() time, immediately after switch reset. We can argue that this mirrors the code layout introduced in mt7531_setup() by commit 42bc4fafe359 ("net: mt7531: only do PLL once after the reset"), in that the PLL setup has the exact same positioning, and further work to consolidate the separate setup() functions is not hindered. Testing confirms that: - the slight reordering of writes to MT7530_P6ECR and to CORE_GSWPLL_GRP1 / CORE_GSWPLL_GRP2 introduced by this change does not appear to cause problems for the operation of port 6 on MT7621 and on MT7623 (where port 5 also always worked) - packets sent through port 5 are not corrupted anymore, regardless of whether port 6 is enabled by phylink or not (or even present in the device tree) My algorithm for determining the Fixes: tag is as follows. Testing shows that some logic from mt7530_pad_clk_setup() is needed even for port 5. Prior to commit ca366d6c889b ("net: dsa: mt7530: Convert to PHYLINK API"), a call did exist for all phy_is_pseudo_fixed_link() ports - so port 5 included. That commit replaced it with a temporary "Port 5 is not supported!" comment, and the following commit 38f790a80560 ("net: dsa: mt7530: Add support for port 5") replaced that comment with a configuration procedure in mt7530_setup_port5() which was insufficient for port 5 to work. I'm laying the blame on the patch that claimed support for port 5, although one would have also needed the change from commit c3b8e07909db ("net: dsa: mt7530: setup core clock even in TRGMII mode") for the write to be performed completely independently from port 6's configuration. Thanks go to Arınç for describing the problem, for debugging and for testing. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/netdev/f297c2c4-6e7c-57ac-2394-f6025d309b9d@arinc9.com/ Fixes: 38f790a80560 ("net: dsa: mt7530: Add support for port 5") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230307155411.868573-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17SUNRPC: Fix a server shutdown leakBenjamin Coddington1-1/+5
[ Upstream commit 9ca6705d9d609441d34f8b853e1e4a6369b3b171 ] Fix a race where kthread_stop() may prevent the threadfn from ever getting called. If that happens the svc_rqst will not be cleaned up. Fixes: ed6473ddc704 ("NFSv4: Fix callback server shutdown") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17octeontx2-af: Unlock contexts in the queue context cache in case of fault ↵Suman Ghosh5-7/+82
detection [ Upstream commit ea9dd2e5c6d12c8b65ce7514c8359a70eeaa0e70 ] NDC caches contexts of frequently used queue's (Rx and Tx queues) contexts. Due to a HW errata when NDC detects fault/poision while accessing contexts it could go into an illegal state where a cache line could get locked forever. To makesure all cache lines in NDC are available for optimum performance upon fault/lockerror/posion errors scan through all cache lines in NDC and clear the lock bit. Fixes: 4a3581cd5995 ("octeontx2-af: NPA AQ instruction enqueue support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17net/smc: fix fallback failed while sendmsg with fastopenD. Wythe1-5/+8
[ Upstream commit ce7ca794712f186da99719e8b4e97bd5ddbb04c3 ] Before determining whether the msg has unsupported options, it has been prematurely terminated by the wrong status check. For the application, the general usages of MSG_FASTOPEN likes fd = socket(...) /* rather than connect */ sendto(fd, data, len, MSG_FASTOPEN) Hence, We need to check the flag before state check, because the sock state here is always SMC_INIT when applications tries MSG_FASTOPEN. Once we found unsupported options, fallback it to TCP. Fixes: ee9dfbef02d1 ("net/smc: handle sockopts forcing fallback") Signed-off-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> v2 -> v1: Optimize code style Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17ethernet: ice: avoid gcc-9 integer overflow warningArnd Bergmann1-4/+4
[ Upstream commit 8f5c5a790e3025d6eca96bf7ee5e3873dc92373f ] With older compilers like gcc-9, the calculation of the vlan priority field causes a false-positive warning from the byteswap: In file included from drivers/net/ethernet/intel/ice/ice_tc_lib.c:4: drivers/net/ethernet/intel/ice/ice_tc_lib.c: In function 'ice_parse_cls_flower': include/uapi/linux/swab.h:15:15: error: integer overflow in expression '(int)(short unsigned int)((int)match.key-><U67c8>.<U6698>.vlan_priority << 13) & 57344 & 255' of type 'int' results in '0' [-Werror=overflow] 15 | (((__u16)(x) & (__u16)0x00ffU) << 8) | \ | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ include/uapi/linux/swab.h:106:2: note: in expansion of macro '___constant_swab16' 106 | ___constant_swab16(x) : \ | ^~~~~~~~~~~~~~~~~~ include/uapi/linux/byteorder/little_endian.h:42:43: note: in expansion of macro '__swab16' 42 | #define __cpu_to_be16(x) ((__force __be16)__swab16((x))) | ^~~~~~~~ include/linux/byteorder/generic.h:96:21: note: in expansion of macro '__cpu_to_be16' 96 | #define cpu_to_be16 __cpu_to_be16 | ^~~~~~~~~~~~~ drivers/net/ethernet/intel/ice/ice_tc_lib.c:1458:5: note: in expansion of macro 'cpu_to_be16' 1458 | cpu_to_be16((match.key->vlan_priority << | ^~~~~~~~~~~ After a change to be16_encode_bits(), the code becomes more readable to both people and compilers, which avoids the warning. Fixes: 34800178b302 ("ice: Add support for VLAN priority filters in switchdev") Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17ice: Fix DSCP PFC TLV creationDave Ertman1-1/+1
[ Upstream commit fef3f92e8a4214652d8f33f50330dc5a92efbf11 ] When creating the TLV to send to the FW for configuring DSCP mode PFC,the PFCENABLE field was being masked with a 4 bit mask (0xF), but this is an 8 bit bitmask for enabled classes for PFC. This means that traffic classes 4-7 could not be enabled for PFC. Remove the mask completely, as it is not necessary, as we are assigning 8 bits to an 8 bit field. Fixes: 2a87bd73e50d ("ice: Add DSCP support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17NFSD: Protect against filesystem freezingChuck Lever1-0/+2
[ Upstream commit fd9a2e1d513823e840960cb3bc26d8b7749d4ac2 ] Flole observes this WARNING on occasion: [1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0 Reported-by: <flole@flole.de> Suggested-by: Jan Kara <jack@suse.cz> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217123 Fixes: 73da852e3831 ("nfsd: use vfs_iter_read/write") Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17block: fix wrong mode for blkdev_put() from disk_scan_partitions()Yu Kuai1-1/+1
[ Upstream commit 428913bce1e67ccb4dae317fd0332545bf8c9233 ] If disk_scan_partitions() is called with 'FMODE_EXCL', blkdev_get_by_dev() will be called without 'FMODE_EXCL', however, follow blkdev_put() is still called with 'FMODE_EXCL', which will cause 'bd_holders' counter to leak. Fix the problem by using the right mode for blkdev_put(). Reported-by: syzbot+2bcc0d79e548c4f62a59@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/f9649d501bc8c3444769418f6c26263555d9d3be.camel@linux.ibm.com/T/ Tested-by: Julian Ruess <julianr@linux.ibm.com> Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17platform: x86: MLX_PLATFORM: select REGMAP instead of depending on itRandy Dunlap1-1/+2
[ Upstream commit 7e7e1541c91615e9950d0b96bcd1806d297e970e ] REGMAP is a hidden (not user visible) symbol. Users cannot set it directly thru "make *config", so drivers should select it instead of depending on it if they need it. Consistently using "select" or "depends on" can also help reduce Kconfig circular dependency issues. Therefore, change the use of "depends on REGMAP" to "select REGMAP". Fixes: ef0f62264b2a ("platform/x86: mlx-platform: Add physical bus number auto detection") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vadim Pasternak <vadimp@mellanox.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mark Gross <markgross@kernel.org> Cc: platform-driver-x86@vger.kernel.org Link: https://lore.kernel.org/r/20230226053953.4681-7-rdunlap@infradead.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17platform: mellanox: select REGMAP instead of depending on itRandy Dunlap1-5/+4
[ Upstream commit 03f5eb300ad1241f854269a3e521b119189a4493 ] REGMAP is a hidden (not user visible) symbol. Users cannot set it directly thru "make *config", so drivers should select it instead of depending on it if they need it. Consistently using "select" or "depends on" can also help reduce Kconfig circular dependency issues. Therefore, change the use of "depends on REGMAP" to "select REGMAP". For NVSW_SN2201, select REGMAP_I2C instead of depending on it. Fixes: c6acad68eb2d ("platform/mellanox: mlxreg-hotplug: Modify to use a regmap interface") Fixes: 5ec4a8ace06c ("platform/mellanox: Introduce support for Mellanox register access driver") Fixes: 62f9529b8d5c ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices") Fixes: 662f24826f95 ("platform/mellanox: Add support for new SN2201 system") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Michael Shych <michaelsh@nvidia.com> Cc: Mark Gross <markgross@kernel.org> Cc: Vadim Pasternak <vadimp@nvidia.com> Cc: platform-driver-x86@vger.kernel.org Link: https://lore.kernel.org/r/20230226053953.4681-6-rdunlap@infradead.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17netfilter: conntrack: adopt safer max chain lengthEric Dumazet1-2/+2
[ Upstream commit c77737b736ceb50fdf150434347dbd81ec76dbb1 ] Customers using GKE 1.25 and 1.26 are facing conntrack issues root caused to commit c9c3b6811f74 ("netfilter: conntrack: make max chain length random"). Even if we assume Uniform Hashing, a bucket often reachs 8 chained items while the load factor of the hash table is smaller than 0.5 With a limit of 16, we reach load factors of 3. With a limit of 32, we reach load factors of 11. With a limit of 40, we reach load factors of 15. With a limit of 50, we reach load factors of 24. This patch changes MIN_CHAINLEN to 50, to minimize risks. Ideally, we could in the future add a cushion based on expected load factor (2 * nf_conntrack_max / nf_conntrack_buckets), because some setups might expect unusual values. Fixes: c9c3b6811f74 ("netfilter: conntrack: make max chain length random") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17scsi: sd: Fix wrong zone_write_granularity value during revalidateShin'ichiro Kawasaki2-9/+6
[ Upstream commit 288b3271d920c9ba949c3bab0f749f4cecc70e09 ] When the sd driver revalidates host-managed SMR disks, it calls disk_set_zoned() which changes the zone_write_granularity attribute value to the logical block size regardless of the device type. After that, the sd driver overwrites the value in sd_zbc_read_zone() with the physical block size, since ZBC/ZAC requires this for host-managed disks. Between the calls to disk_set_zoned() and sd_zbc_read_zone(), there exists a window where the attribute shows the logical block size as the zone_write_granularity value, which is wrong for host-managed disks. The duration of the window is from 20ms to 200ms, depending on report zone command execution time. To avoid the wrong zone_write_granularity value between disk_set_zoned() and sd_zbc_read_zone(), modify the value not in sd_zbc_read_zone() but just after disk_set_zoned() call. Fixes: a805a4fa4fa3 ("block: introduce zone_write_granularity limit") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20230306063024.3376959-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17scsi: megaraid_sas: Update max supported LD IDs to 240Chandrakanth Patil2-1/+3
[ Upstream commit bfa659177dcba48cf13f2bd88c1972f12a60bf1c ] The firmware only supports Logical Disk IDs up to 240 and LD ID 255 (0xFF) is reserved for deleted LDs. However, in some cases, firmware was assigning LD ID 254 (0xFE) to deleted LDs and this was causing the driver to mark the wrong disk as deleted. This in turn caused the wrong disk device to be taken offline by the SCSI midlayer. To address this issue, limit the LD ID range from 255 to 240. This ensures the deleted LD ID is properly identified and removed by the driver without accidently deleting any valid LDs. Fixes: ae6874ba4b43 ("scsi: megaraid_sas: Early detection of VD deletion through RaidMap update") Reported-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20230302105342.34933-2-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17net: tls: fix device-offloaded sendpage straddling recordsJakub Kicinski1-0/+2
[ Upstream commit e539a105f947b9db470fec39fe91d85fe737a432 ] Adrien reports that incorrect data is transmitted when a single page straddles multiple records. We would transmit the same data in all iterations of the loop. Reported-by: Adrien Moulin <amoulin@corp.free.fr> Link: https://lore.kernel.org/all/61481278.42813558.1677845235112.JavaMail.zimbra@corp.free.fr Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()") Tested-by: Adrien Moulin <amoulin@corp.free.fr> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Link: https://lore.kernel.org/r/20230304192610.3818098-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17net: ethernet: mtk_eth_soc: fix RX data corruption issueDaniel Golle2-1/+3
[ Upstream commit 193250ace270fecd586dd2d0dfbd9cbd2ade977f ] Fix data corruption issue with SerDes connected PHYs operating at 1.25 Gbps speed where we could previously observe about 30% packet loss while the bad packet counter was increasing. As almost all boards with MediaTek MT7622 or MT7986 use either the MT7531 switch IC operating at 3.125Gbps SerDes rate or single-port PHYs using rate-adaptation to 2500Base-X mode, this issue only got exposed now when we started trying to use SFP modules operating with 1.25 Gbps with the BananaPi R3 board. The fix is to set bit 12 which disables the RX FIFO clear function when setting up MAC MCR, MediaTek SDK did the same change stating: "If without this patch, kernel might receive invalid packets that are corrupted by GMAC."[1] [1]: https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/d8a2975939a12686c4a95c40db21efdc3f821f63 Fixes: 42c03844e93d ("net-next: mediatek: add support for MediaTek MT7622 SoC") Tested-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/138da2735f92c8b6f8578ec2e5a794ee515b665f.1677937317.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17net: phy: smsc: fix link up detection in forced irq modeHeiner Kallweit1-11/+3
[ Upstream commit 58aac3a2ef414fea6d7fdf823ea177744a087d13 ] Currently link up can't be detected in forced mode if polling isn't used. Only link up interrupt source we have is aneg complete which isn't applicable in forced mode. Therefore we have to use energy-on as link up indicator. Fixes: 7365494550f6 ("net: phy: smsc: skip ENERGYON interrupt if disabled") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTRLorenz Bauer1-0/+1
[ Upstream commit 9b459804ff9973e173fabafba2a1319f771e85fa ] btf_datasec_resolve contains a bug that causes the following BTF to fail loading: [1] DATASEC a size=2 vlen=2 type_id=4 offset=0 size=1 type_id=7 offset=1 size=1 [2] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [3] PTR (anon) type_id=2 [4] VAR a type_id=3 linkage=0 [5] INT (anon) size=1 bits_offset=0 nr_bits=8 encoding=(none) [6] TYPEDEF td type_id=5 [7] VAR b type_id=6 linkage=0 This error message is printed during btf_check_all_types: [1] DATASEC a size=2 vlen=2 type_id=7 offset=1 size=1 Invalid type By tracing btf_*_resolve we can pinpoint the problem: btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_TBD) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_TBD) = 0 btf_ptr_resolve(depth: 3, type_id: 3, mode: RESOLVE_PTR) = 0 btf_var_resolve(depth: 2, type_id: 4, mode: RESOLVE_PTR) = 0 btf_datasec_resolve(depth: 1, type_id: 1, mode: RESOLVE_PTR) = -22 The last invocation of btf_datasec_resolve should invoke btf_var_resolve by means of env_stack_push, instead it returns EINVAL. The reason is that env_stack_push is never executed for the second VAR. if (!env_type_is_resolve_sink(env, var_type) && !env_type_is_resolved(env, var_type_id)) { env_stack_set_next_member(env, i + 1); return env_stack_push(env, var_type, var_type_id); } env_type_is_resolve_sink() changes its behaviour based on resolve_mode. For RESOLVE_PTR, we can simplify the if condition to the following: (btf_type_is_modifier() || btf_type_is_ptr) && !env_type_is_resolved() Since we're dealing with a VAR the clause evaluates to false. This is not sufficient to trigger the bug however. The log output and EINVAL are only generated if btf_type_id_size() fails. if (!btf_type_id_size(btf, &type_id, &type_size)) { btf_verifier_log_vsi(env, v->t, vsi, "Invalid type"); return -EINVAL; } Most types are sized, so for example a VAR referring to an INT is not a problem. The bug is only triggered if a VAR points at a modifier. Since we skipped btf_var_resolve that modifier was also never resolved, which means that btf_resolved_type_id returns 0 aka VOID for the modifier. This in turn causes btf_type_id_size to return NULL, triggering EINVAL. To summarise, the following conditions are necessary: - VAR pointing at PTR, STRUCT, UNION or ARRAY - Followed by a VAR pointing at TYPEDEF, VOLATILE, CONST, RESTRICT or TYPE_TAG The fix is to reset resolve_mode to RESOLVE_TBD before attempting to resolve a VAR from a DATASEC. Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/r/20230306112138.155352-2-lmb@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMESAlexander Lobakin2-9/+27
[ Upstream commit 6c20822fada1b8adb77fa450d03a0d449686a4a9 ] &xdp_buff and &xdp_frame are bound in a way that xdp_buff->data_hard_start == xdp_frame It's always the case and e.g. xdp_convert_buff_to_frame() relies on this. IOW, the following: for (u32 i = 0; i < 0xdead; i++) { xdpf = xdp_convert_buff_to_frame(&xdp); xdp_convert_frame_to_buff(xdpf, &xdp); } shouldn't ever modify @xdpf's contents or the pointer itself. However, "live packet" code wrongly treats &xdp_frame as part of its context placed *before* the data_hard_start. With such flow, data_hard_start is sizeof(*xdpf) off to the right and no longer points to the XDP frame. Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several places and praying that there are no more miscalcs left somewhere in the code, unionize ::frm with ::data in a flex array, so that both starts pointing to the actual data_hard_start and the XDP frame actually starts being a part of it, i.e. a part of the headroom, not the context. A nice side effect is that the maximum frame size for this mode gets increased by 40 bytes, as xdp_buff::frame_sz includes everything from data_hard_start (-> includes xdpf already) to the end of XDP/skb shared info. Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it hardcoded for 64 bit && 4k pages, it can be made more flexible later on. Minor: align `&head->data` with how `head->frm` is assigned for consistency. Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for clarity. (was found while testing XDP traffic generator on ice, which calls xdp_convert_frame_to_buff() for each XDP frame) Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>