kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2019-12-20	Merge tag 'tty-5.5-rc3' of ↵	Linus Torvalds	5	-24/+39
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty and serial driver fixes for 5.5-rc3. Only four small patches here: - atmel serial driver fix - msm_serial driver fix - sprd serial driver fix - tty core port fix The last tty core fix should resolve a long-standing bug with a race at port creation time that some people would see, and Sudip finally tracked down. All of these have been in linux-next with no reported issues" * tag 'tty-5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty/serial: atmel: fix out of range clock divider handling tty: link tty and port before configuring it as console serial: sprd: Add clearing break interrupt operation tty: serial: msm_serial: Fix lockup for sysrq and oops
2019-12-20	Merge tag 'usb-5.5-rc3' of ↵	Linus Torvalds	5	-6/+26
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for some reported issues. Included in here are: - xhci build warning fix - ehci disconnect warning fix - usbip lockup fix and error cleanup fix - typec build fix All of these have been in linux-next with no reported issues" * tag 'usb-5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: xhci: Fix build warning seen with CONFIG_PM=n usbip: Fix error path of vhci_recv_ret_submit() usbip: Fix receive error in vhci-hcd when using scatter-gather USB: EHCI: Do not return -EPIPE when hub is disconnected usb: typec: fusb302: Fix an undefined reference to 'extcon_get_state'
2019-12-20	Merge tag 'pinctrl-v5.5-3' of ↵	Linus Torvalds	6	-152/+184
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Sorry that this fixes pull request took a while. Too much christmas business going on. This contains a few really important Intel fixes and some odd fixes: - A host of fixes for the Intel baytrail and cherryview: properly serialize all register accesses and add the irqchip with the gpiochip as we need to, fix some pin lists and initialize the hardware in the right order. - Fix the Aspeed G6 LPC configuration. - Handle a possible NULL pointer exception in the core. - Fix the Kconfig dependencies for the Equilibrium driver" * tag 'pinctrl-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: ingenic: Fixup PIN_CONFIG_OUTPUT config pinctrl: Modify Kconfig to fix linker error pinctrl: pinmux: fix a possible null pointer in pinmux_can_be_used_for_gpio pinctrl: aspeed-g6: Fix LPC/eSPI mux configuration pinctrl: cherryview: Pass irqchip when adding gpiochip pinctrl: cherryview: Add GPIO <-> pin mapping ranges via callback pinctrl: cherryview: Split out irq hw-init into a separate helper function pinctrl: baytrail: Pass irqchip when adding gpiochip pinctrl: baytrail: Add GPIO <-> pin mapping ranges via callback pinctrl: baytrail: Update North Community pin list pinctrl: baytrail: Really serialize all register accesses
2019-12-20	io_uring: pass in 'sqe' to the prep handlers	Jens Axboe	1	-242/+251
	This moves the prep handlers outside of the opcode handlers, and allows us to pass in the sqe directly. If the sqe is non-NULL, it means that the request should be prepared for the first time. With the opcode handlers not having access to the sqe at all, we are guaranteed that the prep handler has setup the request fully by the time we get there. As before, for opcodes that need to copy in more data then the io_kiocb allows for, the io_async_ctx holds that info. If a prep handler is invoked with req->io set, it must use that to retain information for later. Finally, we can remove io_kiocb->sqe as well. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	io_uring: standardize the prep methods	Jens Axboe	1	-65/+63
	We currently have a mix of use cases. Most of the newer ones are pretty uniform, but we have some older ones that use different calling calling conventions. This is confusing. For the opcodes that currently rely on the req->io->sqe copy saving them from reuse, add a request type struct in the io_kiocb command union to store the data they need. Prepare for all opcodes having a standard prep method, so we can call it in a uniform fashion and outside of the opcode handler. This is in preparation for passing in the 'sqe' pointer, rather than storing it in the io_kiocb. Once we have uniform prep handlers, we can leave all the prep work to that part, and not even pass in the sqe to the opcode handler. This ensures that we don't reuse sqe data inadvertently. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	platform/x86: pcengines-apuv2: Spelling fixes in the driver	Andy Shevchenko	1	-20/+20
	Mainly does: - capitalize gpio and bios to GPIO and BIOS - capitalize beginning of comments - add periods in multi-line comments Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	platform/x86: pcengines-apuv2: detect apuv4 board	Enrico Weigelt, metux IT consult	1	-0/+27
	GPIO stuff on APUv4 seems to be the same as on APUv2, so we just need to match on DMI data. Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	platform/x86: pcengines-apuv2: fix simswap GPIO assignment	Enrico Weigelt, metux IT consult	1	-1/+1
	The mapping entry has to hold the GPIO line index instead of controller's register number. Fixes: 5037d4ddda31 ("platform/x86: pcengines-apuv2: wire up simswitch gpio as led") Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	platform/x86: pmc_atom: Add Siemens CONNECT X300 to critclk_systems DMI table	Michael Haener	1	-0/+8
	The CONNECT X300 uses the PMC clock for on-board components and gets stuck during boot if the clock is disabled. Therefore, add this device to the critical systems list. Tested on CONNECT X300. Fixes: 648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") Signed-off-by: Michael Haener <michael.haener@siemens.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	platform/x86: hp-wmi: Make buffer for HPWMI_FEATURE2_QUERY 128 bytes	Hans de Goede	1	-1/+1
	At least on the HP Envy x360 15-cp0xxx model the WMI interface for HPWMI_FEATURE2_QUERY requires an outsize of at least 128 bytes, otherwise it fails with an error code 5 (HPWMI_RET_INVALID_PARAMETERS): Dec 06 00:59:38 kernel: hp_wmi: query 0xd returned error 0x5 We do not care about the contents of the buffer, we just want to know if the HPWMI_FEATURE2_QUERY command is supported. This commits bumps the buffer size, fixing the error. Fixes: 8a1513b4932 ("hp-wmi: limit hotkey enable") Cc: stable@vger.kernel.org BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1520703 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	platform/mellanox: fix the mlx-bootctl sysfs	Liming Sun	2	-6/+6
	This is a follow-up commit for the sysfs attributes to change from DRIVER_ATTR to DEVICE_ATTR according to some initial comments. In such case, it's better to point the sysfs path to the device itself instead of the driver. The ABI document is also updated. Fixes: 79e29cb8fbc5 ("platform/mellanox: Add bootctl driver for Mellanox BlueField Soc") Signed-off-by: Liming Sun <lsun@mellanox.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-12-20	io_uring: read 'count' for IORING_OP_TIMEOUT in prep handler	Jens Axboe	1	-3/+8
	Add the count field to struct io_timeout, and ensure the prep handler has read it. Timeout also needs an async context always, set it up in the prep handler if we don't have one. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	io_uring: move all prep state for IORING_OP_{SEND,RECV}_MGS to prep handler	Jens Axboe	1	-31/+33
	Add struct io_sr_msg in our io_kiocb per-command union, and ensure that the send/recvmsg prep handlers have grabbed what they need from the SQE by the time prep is done. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	io_uring: move all prep state for IORING_OP_CONNECT to prep handler	Jens Axboe	1	-18/+22
	Add struct io_connect in our io_kiocb per-command union, and ensure that io_connect_prep() has grabbed what it needs from the SQE. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	io_uring: add and use struct io_rw for read/writes	Jens Axboe	1	-46/+50
	Put the kiocb in struct io_rw, and add the addr/len for the request as well. Use the kiocb->private field for the buffer index for fixed reads and writes. Any use of kiocb->ki_filp is flipped to req->file. It's the same thing, and less confusing. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	rxrpc: Fix missing security check on incoming calls	David Howells	6	-60/+59
	Fix rxrpc_new_incoming_call() to check that we have a suitable service key available for the combination of service ID and security class of a new incoming call - and to reject calls for which we don't. This causes an assertion like the following to appear: rxrpc: Assertion failed - 6(0x6) == 12(0xc) is false kernel BUG at net/rxrpc/call_object.c:456! Where call->state is RXRPC_CALL_SERVER_SECURING (6) rather than RXRPC_CALL_COMPLETE (12). Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2019-12-20	rxrpc: Don't take call->user_mutex in rxrpc_new_incoming_call()	David Howells	1	-17/+3
	Standard kernel mutexes cannot be used in any way from interrupt or softirq context, so the user_mutex which manages access to a call cannot be a mutex since on a new call the mutex must start off locked and be unlocked within the softirq handler to prevent userspace interfering with a call we're setting up. Commit a0855d24fc22d49cdc25664fb224caee16998683 ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") causes big warnings to be splashed in dmesg for each a new call that comes in from the server. Whilst it seems like it should be okay, since the accept path uses trylock, there are issues with PI boosting and marking the wrong task as the owner. Fix this by not taking the mutex in the softirq path at all. It's not obvious that there should be any need for it as the state is set before the first notification is generated for the new call. There's also no particular reason why the link-assessing ping should be triggered inside the mutex. It's not actually transmitted there anyway, but rather it has to be deferred to a workqueue. Further, I don't think that there's any particular reason that the socket notification needs to be done from within rx->incoming_lock, so the amount of time that lock is held can be shortened too and the ping prepared before the new call notification is sent. Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Signed-off-by: David Howells <dhowells@redhat.com> cc: Peter Zijlstra (Intel) <peterz@infradead.org> cc: Ingo Molnar <mingo@redhat.com> cc: Will Deacon <will@kernel.org> cc: Davidlohr Bueso <dave@stgolabs.net>
2019-12-20	rxrpc: Unlock new call in rxrpc_new_incoming_call() rather than the caller	David Howells	2	-26/+28
	Move the unlock and the ping transmission for a new incoming call into rxrpc_new_incoming_call() rather than doing it in the caller. This makes it clearer to see what's going on. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> cc: Ingo Molnar <mingo@redhat.com> cc: Will Deacon <will@kernel.org> cc: Davidlohr Bueso <dave@stgolabs.net>
2019-12-20	xfs: Make the symbol 'xfs_rtalloc_log_count' static	Chen Wandun	1	-1/+1
	Fix the following sparse warning: fs/xfs/libxfs/xfs_trans_resv.c:206:1: warning: symbol 'xfs_rtalloc_log_count' was not declared. Should it be static? Fixes: b1de6fc7520f ("xfs: fix log reservation overflows when allocating large rt extents") Signed-off-by: Chen Wandun <chenwandun@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-20	io_uring: use u64_to_user_ptr() consistently	Jens Axboe	1	-9/+7
	We use it in some spots, but not consistently. Convert the rest over, makes it easier to read as well. No functional changes in this patch. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-20	xen/grant-table: remove multiple BUG_ON on gnttab_interface	Aditya Pakki	1	-4/+0
	gnttab_request_version() always sets the gnttab_interface variable and the assertions to check for empty gnttab_interface is unnecessary. The patch eliminates multiple such assertions. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	xen-blkback: support dynamic unbind/bind	Paul Durrant	1	-18/+38
	By simply re-attaching to shared rings during connect_ring() rather than assuming they are freshly allocated (i.e assuming the counters are zero) it is possible for vbd instances to be unbound and re-bound from and to (respectively) a running guest. This has been tested by running: while true; do fio --name=randwrite --ioengine=libaio --iodepth=16 \ --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32; done in a PV guest whilst running: while true; do echo vbd-$DOMID-$VBD >unbind; echo unbound; sleep 5; echo vbd-$DOMID-$VBD >bind; echo bound; sleep 3; done in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and re-bind its system disk image. This is a highly useful feature for a backend module as it allows it to be unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. This was also tested by running: while true; do echo vbd-$DOMID-$VBD >unbind; echo unbound; sleep 5; rmmod xen-blkback; echo unloaded; sleep 1; modprobe xen-blkback; echo bound; cd $(pwd); sleep 3; done in dom0 whilst running the same loop as above in the (single) PV guest. Some (less stressful) testing has also been done using a Windows HVM guest with the latest 9.0 PV drivers installed. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	xen/interface: re-define FRONT/BACK_RING_ATTACH()	Paul Durrant	1	-20/+9
	Currently these macros are defined to re-initialize a front/back ring (respectively) to values read from the shared ring in such a way that any requests/responses that are added to the shared ring whilst the front/back is detached will be skipped over. This, in general, is not a desirable semantic since most frontend implementations will eventually block waiting for a response which would either never appear or never be processed. Since the macros are currently unused, take this opportunity to re-define them to re-initialize a front/back ring using specified values. This also allows FRONT/BACK_RING_INIT() to be re-defined in terms of FRONT/BACK_RING_ATTACH() using a specified value of 0. NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	xenbus: limit when state is forced to closed	Paul Durrant	2	-2/+11
	If a driver probe() fails then leave the xenstore state alone. There is no reason to modify it as the failure may be due to transient resource allocation issues and hence a subsequent probe() may succeed. If the driver supports re-binding then only force state to closed during remove() only in the case when the toolstack may need to clean up. This can be detected by checking whether the state in xenstore has been set to closing prior to device removal. NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver, which defaults to false. Subsequent patches will add support to some backend drivers. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	xenbus: move xenbus_dev_shutdown() into frontend code...	Paul Durrant	4	-27/+23
	...and make it static xenbus_dev_shutdown() is seemingly intended to cause clean shutdown of PV frontends when a guest is rebooted. Indeed the function waits for a conpletion which is only set by a call to xenbus_frontend_closed(). This patch removes the shutdown() method from backends and moves xenbus_dev_shutdown() from xenbus_probe.c into xenbus_probe_frontend.c, renaming it appropriately and making it static. NOTE: In the case where the backend is running in a driver domain, the toolstack should have already terminated any frontends that may be using it (since Xen does not support re-startable PV driver domains) so xenbus_dev_shutdown() should never be called. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk	Nathan Chancellor	1	-2/+2
	Clang warns: ../drivers/block/xen-blkfront.c:1117:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] nr_parts = PARTS_PER_DISK; ^ ../drivers/block/xen-blkfront.c:1115:3: note: previous statement is here if (err) ^ This is because there is a space at the beginning of this line; remove it so that the indentation is consistent according to the Linux kernel coding style and clang no longer warns. While we are here, the previous line has some trailing whitespace; clean that up as well. Fixes: c80a420995e7 ("xen-blkfront: handle Xen major numbers other than XENVBD") Link: https://github.com/ClangBuiltLinux/linux/issues/791 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-12-20	riscv: move sifive_l2_cache.c to drivers/soc	Christoph Hellwig	8	-2/+17
	The sifive_l2_cache.c is in no way related to RISC-V architecture memory management. It is a little stub driver working around the fact that the EDAC maintainers prefer their drivers to be structured in a certain way that doesn't fit the SiFive SOCs. Move the file to drivers/soc and add a Kconfig option for it, as well as the whole drivers/soc boilerplate for CONFIG_SOC_SIFIVE. Fixes: a967a289f169 ("RISC-V: sifive_l2_cache: Add L2 cache controller driver for SiFive SoCs") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Borislav Petkov <bp@suse.de> [paul.walmsley@sifive.com: keep the MAINTAINERS change specific to the L2$ controller code] Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20	riscv: define vmemmap before pfn_to_page calls	David Abdurachmanov	1	-17/+21
	pfn_to_page & page_to_pfn depend on vmemmap being available before the calls if kernel is configured with CONFIG_SPARSEMEM_VMEMMAP=y. This was caused by NOMMU changes which moved vmemmap definition bellow functions definitions calling pfn_to_page & page_to_pfn. Noticed while compiled 5.5-rc2 kernel for Fedora/RISCV. v2: - Add a comment for vmemmap in source Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com> Fixes: 6bd33e1ece52 ("riscv: add nommu support") Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20	riscv: fix scratch register clearing in M-mode.	Greentime Hu	1	-1/+1
	This patch fixes that the sscratch register clearing in M-mode. It cleared sscratch register in M-mode, but it should clear mscratch register. That will cause kernel trap if the CPU core doesn't support S-mode when trying to access sscratch. Fixes: 9e80635619b5 ("riscv: clear the instruction cache and all registers when booting") Signed-off-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20	riscv: Fix use of undefined config option CONFIG_CONFIG_MMU	Andreas Schwab	1	-1/+1
	In Kconfig files, config options are written without the CONFIG_ prefix. Fixes: 6bd33e1ece52 ("riscv: add nommu support") Signed-off-by: Andreas Schwab <schwab@suse.de> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-12-20	Merge branch 'replace-cg_bpf-prog'	Alexei Starovoitov	15	-626/+652
	Andrey Ignatov says: ==================== v3->v4: - use OPTS_VALID and OPTS_GET to handle bpf_prog_attach_opts. v2->v3: - rely on DECLARE_LIBBPF_OPTS from libbpf_common.h; - separate "required" and "optional" arguments in bpf_prog_attach_xattr; - convert test_cgroup_attach to prog_tests; - move the new selftest to prog_tests/cgroup_attach_multi. v1->v2: - move DECLARE_LIBBPF_OPTS from libbpf.h to bpf.h (patch 4); - switch new libbpf API to OPTS framework; - switch selftest to libbpf OPTS framework. This patch set adds support for replacing cgroup-bpf programs attached with BPF_F_ALLOW_MULTI flag so that any program in a list can be updated to a new version without service interruption and order of programs can be preserved. Please see patch 3 for details on the use-case and API changes. Other patches: Patch 1 is preliminary refactoring of __cgroup_bpf_attach to simplify it. Patch 2 is minor cleanup of hierarchy_allows_attach. Patch 4 extends libbpf API to support new set of attach attributes. Patch 5 converts test_cgroup_attach to prog_tests. Patch 6 adds selftest coverage for the new API. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-20	selftests/bpf: Test BPF_F_REPLACE in cgroup_attach_multi	Andrey Ignatov	1	-3/+50
	Test replacing a cgroup-bpf program attached with BPF_F_ALLOW_MULTI and possible failure modes: invalid combination of flags, invalid replace_bpf_fd, replacing a non-attachd to specified cgroup program. Example of program replacing: # gdb -q --args ./test_progs --name=cgroup_attach_multi ... Breakpoint 1, test_cgroup_attach_multi () at cgroup_attach_multi.c:227 (gdb) [1]+ Stopped gdb -q --args ./test_progs --name=cgroup_attach_multi # bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1 ID AttachType AttachFlags Name 2133 egress multi 2134 egress multi # fg gdb -q --args ./test_progs --name=cgroup_attach_multi (gdb) c Continuing. Breakpoint 2, test_cgroup_attach_multi () at cgroup_attach_multi.c:233 (gdb) [1]+ Stopped gdb -q --args ./test_progs --name=cgroup_attach_multi # bpftool c s /mnt/cgroup2/cgroup-test-work-dir/cg1 ID AttachType AttachFlags Name 2139 egress multi 2134 egress multi Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/7b9b83e8d5fb82e15b034341bd40b6fb2431eeba.1576741281.git.rdna@fb.com
2019-12-20	selftests/bpf: Convert test_cgroup_attach to prog_tests	Andrey Ignatov	6	-574/+498
	Convert test_cgroup_attach to prog_tests. This change does a lot of things but in many cases it's pretty expensive to separate them, so they go in one commit. Nevertheless the logic is ketp as is and changes made are just moving things around, simplifying them (w/o changing the meaning of the tests) and making prog_tests compatible: * split the 3 tests in the file into 3 separate files in prog_tests/; * rename the test functions to test_<file_base_name>; * remove unused includes, constants, variables and functions from every test; * replace `if`-s with or `if (CHECK())` where additional context should be logged and with `if (CHECK_FAIL())` where line number is enough; * switch from `log_err()` to logging via `CHECK()`; * replace `assert`-s with `CHECK_FAIL()` to avoid crashing the whole test_progs if one assertion fails; * replace cgroup_helpers with test__join_cgroup() in cgroup_attach_override only, other tests need more fine-grained control for cgroup creation/deletion so cgroup_helpers are still used there; * simplify cgroup_attach_autodetach by switching to easiest possible program since this test doesn't really need such a complicated program as cgroup_attach_multi does; * remove test_cgroup_attach.c itself. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/0ff19cc64d2dc5cf404349f07131119480e10e32.1576741281.git.rdna@fb.com
2019-12-20	libbpf: Introduce bpf_prog_attach_xattr	Andrey Ignatov	3	-1/+28
	Introduce a new bpf_prog_attach_xattr function that, in addition to program fd, target fd and attach type, accepts an extendable struct bpf_prog_attach_opts. bpf_prog_attach_opts relies on DECLARE_LIBBPF_OPTS macro to maintain backward and forward compatibility and has the following "optional" attach attributes: * existing attach_flags, since it's not required when attaching in NONE mode. Even though it's quite often used in MULTI and OVERRIDE mode it seems to be a good idea to reduce number of arguments to bpf_prog_attach_xattr; * newly introduced attribute of BPF_PROG_ATTACH command: replace_prog_fd that is fd of previously attached cgroup-bpf program to replace if BPF_F_REPLACE flag is used. The new function is named to be consistent with other xattr-functions (bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr). The struct bpf_prog_attach_opts is supposed to be used with DECLARE_LIBBPF_OPTS macro. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/bd6e0732303eb14e4b79cb128268d9e9ad6db208.1576741281.git.rdna@fb.com
2019-12-20	bpf: Support replacing cgroup-bpf program in MULTI mode	Andrey Ignatov	6	-9/+54
	The common use-case in production is to have multiple cgroup-bpf programs per attach type that cover multiple use-cases. Such programs are attached with BPF_F_ALLOW_MULTI and can be maintained by different people. Order of programs usually matters, for example imagine two egress programs: the first one drops packets and the second one counts packets. If they're swapped the result of counting program will be different. It brings operational challenges with updating cgroup-bpf program(s) attached with BPF_F_ALLOW_MULTI since there is no way to replace a program: * One way to update is to detach all programs first and then attach the new version(s) again in the right order. This introduces an interruption in the work a program is doing and may not be acceptable (e.g. if it's egress firewall); * Another way is attach the new version of a program first and only then detach the old version. This introduces the time interval when two versions of same program are working, what may not be acceptable if a program is not idempotent. It also imposes additional burden on program developers to make sure that two versions of their program can co-exist. Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to replace That way user can replace any program among those attached with BPF_F_ALLOW_MULTI flag without the problems described above. Details of the new API: * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid descriptor of BPF program, BPF_PROG_ATTACH will return corresponding error (EINVAL or EBADF). * If replace_bpf_fd has valid descriptor of BPF program but such a program is not attached to specified cgroup, BPF_PROG_ATTACH will return ENOENT. BPF_F_REPLACE is introduced to make the user intent clear, since replace_bpf_fd alone can't be used for this (its default value, 0, is a valid fd). BPF_F_REPLACE also makes it possible to extend the API in the future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed). Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Narkyiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
2019-12-20	bpf: Remove unused new_flags in hierarchy_allows_attach()	Andrey Ignatov	1	-3/+2
	new_flags is unused, remove it. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/2c49b30ab750f93cfef04a1e40b097d70c3a39a1.1576741281.git.rdna@fb.com
2019-12-20	bpf: Simplify __cgroup_bpf_attach	Andrey Ignatov	1	-39/+23
	__cgroup_bpf_attach has a lot of identical code to handle two scenarios: BPF_F_ALLOW_MULTI is set and unset. Simplify it by splitting the two main steps: * First, the decision is made whether a new bpf_prog_list entry should be allocated or existing entry should be reused for the new program. This decision is saved in replace_pl pointer; * Next, replace_pl pointer is used to handle both possible states of BPF_F_ALLOW_MULTI flag (set / unset) instead of doing similar work for them separately. This splitting, in turn, allows to make further simplifications: * The check for attaching same program twice in BPF_F_ALLOW_MULTI mode can be done before allocating cgroup storage, so that if user tries to attach same program twice no alloc/free happens as it was before; * pl_was_allocated becomes redundant so it's removed. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/c6193db6fe630797110b0d3ff06c125d093b834c.1576741281.git.rdna@fb.com
2019-12-20	Merge branch 'simplify-do_redirect'	Alexei Starovoitov	8	-197/+75
	Björn Töpel says: ==================== This series aims to simplify the XDP maps and xdp_do_redirect_map()/xdp_do_flush_map(), and to crank out some more performance from XDP_REDIRECT scenarios. The first part of the series simplifies all XDP_REDIRECT capable maps, so that __XXX_flush_map() does not require the map parameter, by moving the flush list from the map to global scope. This results in that the map_to_flush member can be removed from struct bpf_redirect_info, and its corresponding logic. Simpler code, and more performance due to that checks/code per-packet is moved to flush. Pre-series performance: $ sudo taskset -c 22 ./xdpsock -i enp134s0f0 -q 20 -n 1 -r -z sock0@enp134s0f0:20 rxdrop xdp-drv pps pkts 1.00 rx 20,797,350 230,942,399 tx 0 0 $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0 Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs XDP-cpumap CPU:to pps drop-pps extra-info XDP-RX 20 7723038 0 0 XDP-RX total 7723038 0 cpumap_kthread total 0 0 0 redirect_err total 0 0 xdp_exception total 0 0 Post-series performance: $ sudo taskset -c 22 ./xdpsock -i enp134s0f0 -q 20 -n 1 -r -z sock0@enp134s0f0:20 rxdrop xdp-drv pps pkts 1.00 rx 21,524,979 86,835,327 tx 0 0 $ sudo ./xdp_redirect_cpu --dev enp134s0f0 --cpu 22 xdp_cpu_map0 Running XDP/eBPF prog_name:xdp_cpu_map5_lb_hash_ip_pairs XDP-cpumap CPU:to pps drop-pps extra-info XDP-RX 20 7840124 0 0 XDP-RX total 7840124 0 cpumap_kthread total 0 0 0 redirect_err total 0 0 xdp_exception total 0 0 Results: +3.5% and +1.5% for the ubenchmarks. v1->v2 [1]: * Removed 'unused-variable' compiler warning (Jakub) [1] https://lore.kernel.org/bpf/20191218105400.2895-1-bjorn.topel@gmail.com/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-12-20	xdp: Simplify __bpf_tx_xdp_map()	Björn Töpel	1	-26/+7
	The explicit error checking is not needed. Simply return the error instead. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-9-bjorn.topel@gmail.com
2019-12-20	xdp: Remove map_to_flush and map swap detection	Björn Töpel	2	-25/+3
	Now that all XDP maps that can be used with bpf_redirect_map() tracks entries to be flushed in a global fashion, there is not need to track that the map has changed and flush from xdp_do_generic_map() anymore. All entries will be flushed in xdp_do_flush_map(). This means that the map_to_flush can be removed, and the corresponding checks. Moving the flush logic to one place, xdp_do_flush_map(), give a bulking behavior and performance boost. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-8-bjorn.topel@gmail.com
2019-12-20	xdp: Make cpumap flush_list common for all map instances	Björn Töpel	3	-21/+21
	The cpumap flush list is used to track entries that need to flushed from via the xdp_do_flush_map() function. This list used to be per-map, but there is really no reason for that. Instead make the flush list global for all devmaps, which simplifies __cpu_map_flush() and cpu_map_alloc(). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-7-bjorn.topel@gmail.com
2019-12-20	xdp: Make devmap flush_list common for all map instances	Björn Töpel	3	-25/+16
	The devmap flush list is used to track entries that need to flushed from via the xdp_do_flush_map() function. This list used to be per-map, but there is really no reason for that. Instead make the flush list global for all devmaps, which simplifies __dev_map_flush() and dev_map_init_map(). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-6-bjorn.topel@gmail.com
2019-12-20	xsk: Make xskmap flush_list common for all map instances	Björn Töpel	4	-35/+20
	The xskmap flush list is used to track entries that need to flushed from via the xdp_do_flush_map() function. This list used to be per-map, but there is really no reason for that. Instead make the flush list global for all xskmaps, which simplifies __xsk_map_flush() and xsk_map_alloc(). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-5-bjorn.topel@gmail.com
2019-12-20	xdp: Fix graze->grace type-o in cpumap comments	Björn Töpel	1	-3/+3
	Simple spelling fix. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-4-bjorn.topel@gmail.com
2019-12-20	xdp: Simplify cpumap cleanup	Björn Töpel	1	-29/+5
	After the RCU flavor consolidation [1], call_rcu() and synchronize_rcu() waits for preempt-disable regions (NAPI) in addition to the read-side critical sections. As a result of this, the cleanup code in cpumap can be simplified * There is no longer a need to flush in __cpu_map_entry_free, since we know that this has been done when the call_rcu() callback is triggered. * When freeing the map, there is no need to explicitly wait for a flush. It's guaranteed to be done after the synchronize_rcu() call in cpu_map_free(). [1] https://lwn.net/Articles/777036/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-3-bjorn.topel@gmail.com
2019-12-20	xdp: Simplify devmap cleanup	Björn Töpel	1	-38/+5
	After the RCU flavor consolidation [1], call_rcu() and synchronize_rcu() waits for preempt-disable regions (NAPI) in addition to the read-side critical sections. As a result of this, the cleanup code in devmap can be simplified * There is no longer a need to flush in __dev_map_entry_free, since we know that this has been done when the call_rcu() callback is triggered. * When freeing the map, there is no need to explicitly wait for a flush. It's guaranteed to be done after the synchronize_rcu() call in dev_map_free(). The rcu_barrier() is still needed, so that the map is not freed prior the elements. [1] https://lwn.net/Articles/777036/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191219061006.21980-2-bjorn.topel@gmail.com
2019-12-20	NFC: pn544: Adjust indentation in pn544_hci_check_presence	Nathan Chancellor	1	-1/+1
	Clang warns ../drivers/nfc/pn544/pn544.c:696:4: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] return nfc_hci_send_cmd(hdev, NFC_HCI_RF_READER_A_GATE, ^ ../drivers/nfc/pn544/pn544.c:692:3: note: previous statement is here if (target->nfcid1_len != 4 && target->nfcid1_len != 7 && ^ 1 warning generated. This warning occurs because there is a space after the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Fixes: da052850b911 ("NFC: Add pn544 presence check for different targets") Link: https://github.com/ClangBuiltLinux/linux/issues/814 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20	Merge branch 'bcmgenet-Turn-on-offloads-by-default'	David S. Miller	2	-50/+67
	Doug Berger says: ==================== net: bcmgenet: Turn on offloads by default This commit stack is based on Florian's commit 4e8aedfe78c7 ("net: systemport: Turn on offloads by default") and enables the offloads for the bcmgenet driver by default. The first commit adds support for the HIGHDMA feature to the driver. The second converts the Tx checksum implementation to use the generic hardware logic rather than the deprecated IP centric methods. The third modifies the Rx checksum implementation to use the hardware offload to compute the complete checksum rather than filtering out bad packets detected by the hardware's IP centric implementation. This may increase processing load by passing bad packets to the network stack, but it provides for more flexible handling of packets by the network stack without requiring software computation of the checksum. The remaining commits mirror the extensions Florian made to the sysport driver to retain symmetry with that driver and to make the benefits of the hardware offloads more ubiquitous. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20	net: bcmgenet: Add software counters to track reallocations	Doug Berger	2	-0/+8
	When inserting the TSB, keep track of how many times we had to do it and if there was a failure in doing so, this helps profile the driver for possibly incorrect headroom settings. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20	net: bcmgenet: Be drop monitor friendly while re-allocating headroom	Doug Berger	1	-1/+2
	During bcmgenet_put_tx_csum() make sure we differentiate a SKB headroom re-allocation failure from the normal swap and replace path. Signed-off-by: Doug Berger <opendmb@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>