summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-02-22io_uring: disallow modification of rsrc_data during quiesceDylan Yudaken1-1/+9
io_rsrc_ref_quiesce will unlock the uring while it waits for references to the io_rsrc_data to be killed. There are other places to the data that might add references to data via calls to io_rsrc_node_switch. There is a race condition where this reference can be added after the completion has been signalled. At this point the io_rsrc_ref_quiesce call will wake up and relock the uring, assuming the data is unused and can be freed - although it is actually being used. To fix this check in io_rsrc_ref_quiesce if a resource has been revived. Reported-by: syzbot+ca8bf833622a1662745b@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220222161751.995746-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-22hwmon: (pmbus) Clear pmbus fault/warning bits after readVikash Chandola1-0/+5
Almost all fault/warning bits in pmbus status registers remain set even after fault/warning condition are removed. As per pmbus specification these faults must be cleared by user. Modify hwmon behavior to clear fault/warning bit after fetching data if fault/warning bit was set. This allows to get fresh data in next read. Signed-off-by: Vikash Chandola <vikash.chandola@linux.intel.com> Link: https://lore.kernel.org/r/20220222131253.2426834-1-vikash.chandola@linux.intel.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-02-22hwmon: Handle failure to register sensor with thermal zone correctlyGuenter Roeck1-6/+8
If an attempt is made to a sensor with a thermal zone and it fails, the call to devm_thermal_zone_of_sensor_register() may return -ENODEV. This may result in crashes similar to the following. Unable to handle kernel NULL pointer dereference at virtual address 00000000000003cd ... Internal error: Oops: 96000021 [#1] PREEMPT SMP ... pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mutex_lock+0x18/0x60 lr : thermal_zone_device_update+0x40/0x2e0 sp : ffff800014c4fc60 x29: ffff800014c4fc60 x28: ffff365ee3f6e000 x27: ffffdde218426790 x26: ffff365ee3f6e000 x25: 0000000000000000 x24: ffff365ee3f6e000 x23: ffffdde218426870 x22: ffff365ee3f6e000 x21: 00000000000003cd x20: ffff365ee8bf3308 x19: ffffffffffffffed x18: 0000000000000000 x17: ffffdde21842689c x16: ffffdde1cb7a0b7c x15: 0000000000000040 x14: ffffdde21a4889a0 x13: 0000000000000228 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000001120000 x7 : 0000000000000001 x6 : 0000000000000000 x5 : 0068000878e20f07 x4 : 0000000000000000 x3 : 00000000000003cd x2 : ffff365ee3f6e000 x1 : 0000000000000000 x0 : 00000000000003cd Call trace: mutex_lock+0x18/0x60 hwmon_notify_event+0xfc/0x110 0xffffdde1cb7a0a90 0xffffdde1cb7a0b7c irq_thread_fn+0x2c/0xa0 irq_thread+0x134/0x240 kthread+0x178/0x190 ret_from_fork+0x10/0x20 Code: d503201f d503201f d2800001 aa0103e4 (c8e47c02) Jon Hunter reports that the exact call sequence is: hwmon_notify_event() --> hwmon_thermal_notify() --> thermal_zone_device_update() --> update_temperature() --> mutex_lock() The hwmon core needs to handle all errors returned from calls to devm_thermal_zone_of_sensor_register(). If the call fails with -ENODEV, report that the sensor was not attached to a thermal zone but continue to register the hwmon device. Reported-by: Jon Hunter <jonathanh@nvidia.com> Cc: Dmitry Osipenko <digetx@gmail.com> Fixes: 1597b374af222 ("hwmon: Add notification support") Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-02-22block: clear iocb->private in blkdev_bio_end_io_async()Stefano Garzarella1-0/+2
iocb_bio_iopoll() expects iocb->private to be cleared before releasing the bio. We already do this in blkdev_bio_end_io(), but we forgot in the recently added blkdev_bio_end_io_async(). Fixes: 54a88eb838d3 ("block: add single bio async direct IO helper") Cc: asml.silence@gmail.com Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220211090136.44471-1-sgarzare@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-22Merge branch 'net-dsa-b53-non-legacy'David S. Miller5-48/+87
Russell King says: ==================== net: dsa: b53: convert to phylink_generic_validate() and mark as non-legacy This series converts b53 to use phylink_generic_validate() and also marks this driver as non-legacy. Patch 1 cleans up an if() condition to be more readable before we proceed with the conversion. Patch 2 populates the supported_interfaces and mac_capabilities members of phylink_config. Patch 3 drops the use of phylink_helper_basex_speed() which is now not necessary. Patch 4 switches the driver to use phylink_generic_validate() Patch 5 marks the driver as non-legacy. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: mark as non-legacyRussell King (Oracle)1-0/+6
The B53 driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: switch to using phylink_generic_validate()Russell King (Oracle)5-76/+13
Switch the Broadcom b53 driver to using the phylink_generic_validate() implementation by removing its own .phylink_validate method and associated code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: drop use of phylink_helper_basex_speed()Russell King (Oracle)1-2/+0
Now that we have a better method to select SFP interface modes, we no longer need to use phylink_helper_basex_speed() in a driver's validation function. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: populate supported_interfaces and mac_capabilitiesRussell King (Oracle)5-0/+95
Populate the supported interfaces and MAC capabilities for the Broadcom B53 DSA switches in preparation to using these for the generic validation functionality. The interface modes are derived from: - b53_serdes_phylink_validate() - SRAB mux configuration Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22net: dsa: b53: clean up if() condition to be more readableRussell King (Oracle)1-4/+7
I've stared at this if() statement for a while trying to work out if it really does correspond with the comment above, and it does seem to. However, let's make it more readable and phrase it in the same way as the comment. Also add a FIXME into the comment - we appear to deny Gigabit modes for 802.3z interface modes, but 802.3z interface modes only operate at gigabit and above. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller9-9/+58
Pablo Neira Ayuso says: ==================== Netfilter fixes for net This is fixing up the use without proper initialization in patch 5/5 -o- Hi, The following patchset contains Netfilter fixes for net: 1) Missing #ifdef CONFIG_IP6_NF_IPTABLES in recent xt_socket fix. 2) Fix incorrect flow action array size in nf_tables. 3) Unregister flowtable hooks from netns exit path. 4) Fix missing limit object release, from Florian Westphal. 5) Memleak in nf_tables object update path, also from Florian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-22netfilter: nf_tables: fix memory leak during stateful obj updateFlorian Westphal1-4/+9
stateful objects can be updated from the control plane. The transaction logic allocates a temporary object for this purpose. The ->init function was called for this object, so plain kfree() leaks resources. We must call ->destroy function of the object. nft_obj_destroy does this, but it also decrements the module refcount, but the update path doesn't increment it. To avoid special-casing the update object release, do module_get for the update case too and release it via nft_obj_destroy(). Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Cc: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-22net: dm9051: Fix use after free in dm9051_loop_tx()Dan Carpenter1-1/+3
This code dereferences "skb" after calling dev_kfree_skb(). Fixes: 2dc95a4d30ed ("net: Add dm9051 driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220221105440.GA10045@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22net: hsr: fix hsr build error when lockdep is not enabledJuhee Kang2-11/+22
In hsr, lockdep_is_held() is needed for rcu_dereference_bh_check(). But if lockdep is not enabled, lockdep_is_held() causes a build error: ERROR: modpost: "lockdep_is_held" [net/hsr/hsr.ko] undefined! Thus, this patch solved by adding lockdep_hsr_is_held(). This helper function calls the lockdep_is_held() when lockdep is enabled, and returns 1 if not defined. Fixes: e7f27420681f ("net: hsr: fix suspicious RCU usage warning in hsr_node_get_first()") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Link: https://lore.kernel.org/r/20220220153250.5285-1-claudiajkang@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-21Merge tag 'platform-drivers-x86-v5.17-3' of ↵Linus Torvalds3-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Two small fixes and one hardware-id addition" * tag 'platform-drivers-x86-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: int3472: Add terminator to gpiod_lookup_table platform/x86: asus-wmi: Fix regression when probing for fan curve control platform/x86: thinkpad_acpi: Add dual-fan quirk for T15g (2nd gen)
2022-02-21lib/iov_iter: initialize "flags" in new pipe_bufferMax Kellermann1-0/+2
The functions copy_page_to_iter_pipe() and push_pipe() can both allocate a new pipe_buffer, but the "flags" member initializer is missing. Fixes: 241699cd72a8 ("new iov_iter flavour: pipe-backed") To: Alexander Viro <viro@zeniv.linux.org.uk> To: linux-fsdevel@vger.kernel.org To: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-02-21netfilter: nft_limit: fix stateful object memory leakFlorian Westphal1-0/+18
We need to provide a destroy callback to release the extra fields. Fixes: 3b9e2ea6c11b ("netfilter: nft_limit: move stateful fields out of expression data") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-21netfilter: nf_tables: unregister flowtable hooks on netns exitPablo Neira Ayuso1-0/+3
Unregister flowtable hooks before they are releases via nf_tables_flowtable_destroy() otherwise hook core reports UAF. BUG: KASAN: use-after-free in nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142 Read of size 4 at addr ffff8880736f7438 by task syz-executor579/3666 CPU: 0 PID: 3666 Comm: syz-executor579 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] __dump_stack lib/dump_stack.c:88 [inline] lib/dump_stack.c:106 dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106 lib/dump_stack.c:106 print_address_description+0x65/0x380 mm/kasan/report.c:247 mm/kasan/report.c:247 __kasan_report mm/kasan/report.c:433 [inline] __kasan_report mm/kasan/report.c:433 [inline] mm/kasan/report.c:450 kasan_report+0x19a/0x1f0 mm/kasan/report.c:450 mm/kasan/report.c:450 nf_hook_entries_grow+0x5a7/0x700 net/netfilter/core.c:142 net/netfilter/core.c:142 __nf_register_net_hook+0x27e/0x8d0 net/netfilter/core.c:429 net/netfilter/core.c:429 nf_register_net_hook+0xaa/0x180 net/netfilter/core.c:571 net/netfilter/core.c:571 nft_register_flowtable_net_hooks+0x3c5/0x730 net/netfilter/nf_tables_api.c:7232 net/netfilter/nf_tables_api.c:7232 nf_tables_newflowtable+0x2022/0x2cf0 net/netfilter/nf_tables_api.c:7430 net/netfilter/nf_tables_api.c:7430 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline] net/netfilter/nfnetlink.c:652 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline] net/netfilter/nfnetlink.c:652 nfnetlink_rcv+0x10e6/0x2550 net/netfilter/nfnetlink.c:652 net/netfilter/nfnetlink.c:652 __nft_release_hook() calls nft_unregister_flowtable_net_hooks() which only unregisters the hooks, then after RCU grace period, it is guaranteed that no packets add new entries to the flowtable (no flow offload rules and flowtable hooks are reachable from packet path), so it is safe to call nf_flow_table_free() which cleans up the remaining entries from the flowtable (both software and hardware) and it unbinds the flow_block. Fixes: ff4bf2f42a40 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()") Reported-by: syzbot+e918523f77e62790d6d9@syzkaller.appspotmail.com Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-21platform/x86: int3472: Add terminator to gpiod_lookup_tableDaniel Scally1-1/+2
Without the terminator, if a con_id is passed to gpio_find() that does not exist in the lookup table the function will not stop looping correctly, and eventually cause an oops. Fixes: 19d8d6e36b4b ("platform/x86: int3472: Pass tps68470_regulator_platform_data to the tps68470-regulator MFD-cell") Signed-off-by: Daniel Scally <djrscally@gmail.com> Link: https://lore.kernel.org/r/20220216225304.53911-5-djrscally@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-02-21Merge branch 'octeontx2-ptp-updates'David S. Miller6-10/+155
Rakesh Babu Saladi says: ==================== RVU AF and NETDEV drivers' PTP updates. Patch 1: Add suppot such that RVU drivers support new timestamp format. Patch 2: This patch adds workaround for PTP errata. Changes made from v1 to v2 1. CC'd Richard Cochran to review PTP related patches. 2. Removed a patch from the old patch series. Will submit the removed patch separately. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21octeontx2-af: cn10k: add workaround for ptp errataNaveen Mamindlapalli1-7/+80
This patch adds workaround for PTP errata given below. 1. At the time of 1 sec rollover of nano-second counter, the nano-second counter is set to 0. However, it should be set to (existing counter_value - 10^9). This leads to an accumulating error in the timestamp value with each sec rollover. 2. Additionally, the nano-second counter currently is rolling over at 'h3B9A_C9FF. It should roll over at 'h3B9A_CA00. The workaround for issue #1 is to speed up the ptp clock by adjusting PTP_CLOCK_COMP register to the desired value to compensate for the nanoseconds lost per each second. The workaround for issue #2 is to slow down the ptp clock such that the rollover occurs at ~1sec. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21octeontx2-pf: cn10k: add support for new ptp timestamp formatNaveen Mamindlapalli6-3/+75
The cn10k hardware ptp timestamp format has been modified primarily to support 1-step ptp clock. The 64-bit timestamp used by hardware is split into two 32-bit fields, the upper one holds seconds, the lower one nanoseconds. A new register (PTP_CLOCK_SEC) has been added that returns the current seconds value. The nanoseconds register PTP_CLOCK_HI resets after every second. The cn10k RPM block provides Rx/Tx timestamps to the NIX block using the new timestamp format. The software can read the current timestamp in nanoseconds by reading both PTP_CLOCK_SEC & PTP_CLOCK_HI registers. This patch provides support for new timestamp format. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21net: mdio-ipq4019: add delay after clock enableBaruch Siach1-1/+5
Experimentation shows that PHY detect might fail when the code attempts MDIO bus read immediately after clock enable. Add delay to stabilize the clock before bus access. PHY detect failure started to show after commit 7590fc6f80ac ("net: mdio: Demote probed message to debug print") that removed coincidental delay between clock enable and bus access. 10ms is meant to match the time it take to send the probed message over UART at 115200 bps. This might be a far overshoot. Fixes: 23a890d493e3 ("net: mdio: Add the reset function for IPQ MDIO driver") Signed-off-by: Baruch Siach <baruch.siach@siklu.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21io_uring: don't convert to jiffies for waiting on timeoutsJens Axboe1-6/+7
If an application calls io_uring_enter(2) with a timespec passed in, convert that timespec to ktime_t rather than jiffies. The latter does not provide the granularity the application may expect, and may in fact provided different granularity on different systems, depending on what the HZ value is configured at. Turn the timespec into an absolute ktime_t, and use that with schedule_hrtimeout() instead. Link: https://github.com/axboe/liburing/issues/531 Cc: stable@vger.kernel.org Reported-by: Bob Chen <chenbo.chen@alibaba-inc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-21Merge branch 'bonding-ipv6-NA-NS-monitor'David S. Miller10-71/+481
Hangbin Liu says: ==================== bonding: add IPv6 NS/NA monitor support This patch add bond IPv6 NS/NA monitor support. A new option ns_ip6_target is added, which is similar with arp_ip_target. The IPv6 NS/NA monitor will take effect when there is a valid IPv6 address. Both ARP monitor and NS monitor will working at the same time. A new extra storage field is added to struct bond_opt_value for IPv6 support. Function bond_handle_vlan() is split from bond_arp_send() for both IPv4/IPv6 usage. To alloc NS message and send out. ndisc_ns_create() and ndisc_send_skb() are exported. v1 -> v2: 1. remove sysfs entry[1] and only keep netlink support. RFC -> v1: 1. define BOND_MAX_ND_TARGETS as BOND_MAX_ARP_TARGETS 2. adjust for reverse xmas tree ordering of local variables 3. remove bond_do_ns_validate() 4. add extra field for bond_opt_value 5. set IS_ENABLED(CONFIG_IPV6) for IPv6 codes [1] https://lore.kernel.org/netdev/8863.1645071997@famine ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21bonding: add new option ns_ip6_targetHangbin Liu7-0/+155
This patch add a new bonding option ns_ip6_target, which correspond to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS request to determine the health of the link. For other related options like the validation, we still use arp_validate, and will change to ns_validate later. Note: the sysfs configuration support was removed based on https://lore.kernel.org/netdev/8863.1645071997@famine Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21bonding: add new parameter ns_targetsHangbin Liu3-21/+237
Add a new bonding parameter ns_targets to store IPv6 address. Add required bond_ns_send/rcv functions first before adding IPv6 address option setting. Add two functions bond_send/rcv_validate so we can send/recv ARP and NS at the same time. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21bonding: add extra field for bond_opt_valueHangbin Liu1-9/+18
Adding an extra storage field for bond_opt_value so we can set large bytes of data for bonding options in future, e.g. IPv6 address. Define a new call bond_opt_initextra(). Also change the checking order of __bond_opt_init() and check values first. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21Bonding: split bond_handle_vlan from bond_arp_sendHangbin Liu1-24/+34
Function bond_handle_vlan() is split from bond_arp_send() for later IPv6 usage. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21ipv6: separate ndisc_ns_create() from ndisc_send_ns()Hangbin Liu2-17/+37
This patch separate NS message allocation steps from ndisc_send_ns(), so it could be used in other places, like bonding, to allocate and send IPv6 NS message. Also export ndisc_send_skb() and ndisc_ns_create() for later bonding usage. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21ravb: Use GFP_KERNEL instead of GFP_ATOMIC when possibleChristophe JAILLET1-1/+1
'max_rx_len' can be up to GBETH_RX_BUFF_MAX (i.e. 8192) (see 'gbeth_hw_info'). The default value of 'num_rx_ring' can be BE_RX_RING_SIZE (i.e. 1024). So this loop can allocate 8 Mo of memory. Previous memory allocations in this function already use GFP_KERNEL, so use __netdev_alloc_skb() and an explicit GFP_KERNEL instead of a implicit GFP_ATOMIC. This gives more opportunities of successful allocation. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21net: qualcomm: rmnet: Use skb_put_zero() to simplify codeChristophe JAILLET1-3/+1
Use skb_put_zero() instead of hand-writing it. This saves a few lines of code and is more readable. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21Merge branch 'ipv4-invalidate-broadcast-neigh-upon-address-addition'David S. Miller4-4/+69
Ido Schimmel says: ==================== ipv4: Invalidate neighbour for broadcast address upon address addition Patch #1 solves a recently reported issue [1]. See detailed description in the changelog. Patch #2 adds a matching test case. Targeting at net-next since as far as I can tell this use case never worked. There are no regressions in fib_tests.sh with this change: # ./fib_tests.sh ... Tests passed: 186 Tests failed: 0 [1] https://lore.kernel.org/netdev/55a04a8f-56f3-f73c-2aea-2195923f09d1@huawei.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21selftests: fib_test: Add a test case for IPv4 broadcast neighboursIdo Schimmel1-1/+57
Test that resolved neighbours for IPv4 broadcast addresses are unaffected by the configuration of matching broadcast routes, whereas unresolved neighbours are invalidated. Without previous patch: # ./fib_tests.sh -t ipv4_bcast_neigh IPv4 broadcast neighbour tests TEST: Resolved neighbour for broadcast address [ OK ] TEST: Resolved neighbour for network broadcast address [ OK ] TEST: Unresolved neighbour for broadcast address [FAIL] TEST: Unresolved neighbour for network broadcast address [FAIL] Tests passed: 2 Tests failed: 2 With previous patch: # ./fib_tests.sh -t ipv4_bcast_neigh IPv4 broadcast neighbour tests TEST: Resolved neighbour for broadcast address [ OK ] TEST: Resolved neighbour for network broadcast address [ OK ] TEST: Unresolved neighbour for broadcast address [ OK ] TEST: Unresolved neighbour for network broadcast address [ OK ] Tests passed: 4 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21ipv4: Invalidate neighbour for broadcast address upon address additionIdo Schimmel3-3/+12
In case user space sends a packet destined to a broadcast address when a matching broadcast route is not configured, the kernel will create a unicast neighbour entry that will never be resolved [1]. When the broadcast route is configured, the unicast neighbour entry will not be invalidated and continue to linger, resulting in packets being dropped. Solve this by invalidating unresolved neighbour entries for broadcast addresses after routes for these addresses are internally configured by the kernel. This allows the kernel to create a broadcast neighbour entry following the next route lookup. Another possible solution that is more generic but also more complex is to have the ARP code register a listener to the FIB notification chain and invalidate matching neighbour entries upon the addition of broadcast routes. It is also possible to wave off the issue as a user space problem, but it seems a bit excessive to expect user space to be that intimately familiar with the inner workings of the FIB/neighbour kernel code. [1] https://lore.kernel.org/netdev/55a04a8f-56f3-f73c-2aea-2195923f09d1@huawei.com/ Reported-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21gso: do not skip outer ip header in case of ipip and net_failoverTao Liu2-1/+6
We encounter a tcp drop issue in our cloud environment. Packet GROed in host forwards to a VM virtio_net nic with net_failover enabled. VM acts as a IPVS LB with ipip encapsulation. The full path like: host gro -> vm virtio_net rx -> net_failover rx -> ipvs fullnat -> ipip encap -> net_failover tx -> virtio_net tx When net_failover transmits a ipip pkt (gso_type = 0x0103, which means SKB_GSO_TCPV4, SKB_GSO_DODGY and SKB_GSO_IPXIP4), there is no gso did because it supports TSO and GSO_IPXIP4. But network_header points to inner ip header. Call Trace: tcp4_gso_segment ------> return NULL inet_gso_segment ------> inner iph, network_header points to ipip_gso_segment inet_gso_segment ------> outer iph skb_mac_gso_segment Afterwards virtio_net transmits the pkt, only inner ip header is modified. And the outer one just keeps unchanged. The pkt will be dropped in remote host. Call Trace: inet_gso_segment ------> inner iph, outer iph is skipped skb_mac_gso_segment __skb_gso_segment validate_xmit_skb validate_xmit_skb_list sch_direct_xmit __qdisc_run __dev_queue_xmit ------> virtio_net dev_hard_start_xmit __dev_queue_xmit ------> net_failover ip_finish_output2 ip_output iptunnel_xmit ip_tunnel_xmit ipip_tunnel_xmit ------> ipip dev_hard_start_xmit __dev_queue_xmit ip_finish_output2 ip_output ip_forward ip_rcv __netif_receive_skb_one_core netif_receive_skb_internal napi_gro_receive receive_buf virtnet_poll net_rx_action The root cause of this issue is specific with the rare combination of SKB_GSO_DODGY and a tunnel device that adds an SKB_GSO_ tunnel option. SKB_GSO_DODGY is set from external virtio_net. We need to reset network header when callbacks.gso_segment() returns NULL. This patch also includes ipv6_gso_segment(), considering SIT, etc. Fixes: cb32f511a70b ("ipip: add GSO/TSO support") Signed-off-by: Tao Liu <thomas.liu@ucloud.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21net: core: Use csum_replace_by_diff() and csum_sub() instead of opencodingChristophe Leroy1-2/+2
Open coded calculation can be avoided and replaced by the equivalent csum_replace_by_diff() and csum_sub(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21tools/cgroup/slabinfo: update to work with struct slabRoman Gushchin1-15/+15
After the introduction of the dedicated struct slab to describe slab pages by commit d122019bf061 ("mm: Split slab into its own type") and the following removal of the corresponding struct page's fields by commit 07f910f9b729 ("mm: Remove slab from struct page") the memcg_slabinfo tool broke. An attempt to run it produces a trace like this: Traceback (most recent call last): File "/usr/bin/drgn", line 33, in <module> sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')()) File "/usr/lib64/python3.9/site-packages/drgn/internal/cli.py", line 133, in main runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__") File "/usr/lib64/python3.9/runpy.py", line 268, in run_path return _run_module_code(code, init_globals, run_name, File "/usr/lib64/python3.9/runpy.py", line 97, in _run_module_code _run_code(code, mod_globals, init_globals, File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code exec(code, run_globals) File "memcg_slabinfo.py", line 226, in <module> main() File "memcg_slabinfo.py", line 199, in main cache = page.slab_cache AttributeError: 'struct page' has no member 'slab_cache' The problem can be fixed by explicitly casting struct page * to struct slab * for slab pages. The tools works as expected with this fix, e.g.: cred_jar 776 776 192 21 1 : tunables 0 0 0 : slabdata 547 547 0 kmalloc-cg-32 6 6 32 128 1 : tunables 0 0 0 : slabdata 9 9 0 files_cache 3 3 832 39 8 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-512 1 1 512 32 4 : tunables 0 0 0 : slabdata 10 10 0 task_struct 10 10 6720 4 8 : tunables 0 0 0 : slabdata 63 63 0 mm_struct 3 3 1664 19 8 : tunables 0 0 0 : slabdata 9 9 0 kmalloc-cg-16 1 1 16 256 1 : tunables 0 0 0 : slabdata 8 8 0 pde_opener 1 1 40 102 1 : tunables 0 0 0 : slabdata 8 8 0 anon_vma_chain 375 375 64 64 1 : tunables 0 0 0 : slabdata 81 81 0 radix_tree_node 3 3 584 28 4 : tunables 0 0 0 : slabdata 419 419 0 dentry 98 98 312 26 2 : tunables 0 0 0 : slabdata 1420 1420 0 btrfs_inode 3 3 2368 13 8 : tunables 0 0 0 : slabdata 730 730 0 signal_cache 3 3 1600 20 8 : tunables 0 0 0 : slabdata 17 17 0 sighand_cache 3 3 2240 14 8 : tunables 0 0 0 : slabdata 20 20 0 filp 90 90 512 32 4 : tunables 0 0 0 : slabdata 95 95 0 anon_vma 214 214 200 20 1 : tunables 0 0 0 : slabdata 162 162 0 kmalloc-cg-1k 1 1 1024 32 8 : tunables 0 0 0 : slabdata 22 22 0 pid 10 10 256 32 2 : tunables 0 0 0 : slabdata 14 14 0 kmalloc-cg-64 2 2 64 64 1 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-96 3 3 96 42 1 : tunables 0 0 0 : slabdata 8 8 0 sock_inode_cache 5 5 1408 23 8 : tunables 0 0 0 : slabdata 29 29 0 UNIX 7 7 1920 17 8 : tunables 0 0 0 : slabdata 21 21 0 inode_cache 36 36 1152 28 8 : tunables 0 0 0 : slabdata 680 680 0 proc_inode_cache 26 26 1224 26 8 : tunables 0 0 0 : slabdata 64 64 0 kmalloc-cg-2k 2 2 2048 16 8 : tunables 0 0 0 : slabdata 9 9 0 v2: change naming and count_partial()/count_free()/for_each_slab() signatures to work with slabs, suggested by Matthew Wilcox Fixes: 07f910f9b729 ("mm: Remove slab from struct page") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Roman Gushchin <guro@fb.com> Tested-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/linux-patches/Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com/
2022-02-21slab: remove __alloc_size attribute from __kmalloc_track_callerGreg Kroah-Hartman1-2/+1
Commit c37495d6254c ("slab: add __alloc_size attributes for better bounds checking") added __alloc_size attributes to a bunch of kmalloc function prototypes. Unfortunately the change to __kmalloc_track_caller seems to cause clang to generate broken code and the first time this is called when booting, the box will crash. While the compiler problems are being reworked and attempted to be solved [1], let's just drop the attribute to solve the issue now. Once it is resolved it can be added back. [1] https://github.com/ClangBuiltLinux/linux/issues/1599 Fixes: c37495d6254c ("slab: add __alloc_size attributes for better bounds checking") Cc: stable <stable@vger.kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Nathan Chancellor <nathan@kernel.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Cc: llvm@lists.linux.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20220218131358.3032912-1-gregkh@linuxfoundation.org
2022-02-21Linux 5.17-rc5Linus Torvalds1-1/+1
2022-02-20Merge tag 'locking_urgent_for_v5.17_rc5' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: "Fix a NULL ptr dereference when dumping lockdep chains through /proc/lockdep_chains" * tag 'locking_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Correct lock_classes index mapping
2022-02-20Merge tag 'x86_urgent_for_v5.17_rc5' of ↵Linus Torvalds3-16/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix the ptrace regset xfpregs_set() callback to behave according to the ABI - Handle poisoned pages properly in the SGX reclaimer code * tag 'x86_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ptrace: Fix xfpregs_set()'s incorrect xmm clearing x86/sgx: Fix missing poison handling in reclaimer
2022-02-20Merge tag 'sched_urgent_for_v5.17_rc5' of ↵Linus Torvalds3-16/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: "Fix task exposure order when forking tasks" * tag 'sched_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix yet more sched_fork() races
2022-02-20Merge tag 'edac_urgent_for_v5.17_rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: "Fix a long-standing struct alignment bug in the EDAC struct allocation code" * tag 'edac_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC: Fix calculation of returned address and next offset in edac_align_ptr()
2022-02-20Merge tag 'scsi-fixes' of ↵Linus Torvalds6-8/+29
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes, all in drivers. The ufs and qedi fixes are minor; the lpfc one is a bit bigger because it involves adding a heuristic to detect and deal with common but not standards compliant behaviour" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix divide by zero in ufshcd_map_queues() scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp()
2022-02-20Merge tag 'dmaengine-fix-5.17' of ↵Linus Torvalds5-13/+25
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "A bunch of driver fixes for: - ptdma error handling in init - lock fix in at_hdmac - error path and error num fix for sh dma - pm balance fix for stm32" * tag 'dmaengine-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: shdma: Fix runtime PM imbalance on error dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe dmaengine: sh: rcar-dmac: Check for error num after setting mask dmaengine: at_xdmac: Fix missing unlock in at_xdmac_tasklet() dmaengine: ptdma: Fix the error handling path in pt_core_init()
2022-02-20Merge branch 'i2c/for-current' of ↵Linus Torvalds5-17/+26
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some driver updates, a MAINTAINERS fix, and additions to COMPILE_TEST (so we won't miss build problems again)" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: remove duplicate entry for i2c-qcom-geni i2c: brcmstb: fix support for DSL and CM variants i2c: qup: allow COMPILE_TEST i2c: imx: allow COMPILE_TEST i2c: cadence: allow COMPILE_TEST i2c: qcom-cci: don't put a device tree node before i2c_add_adapter() i2c: qcom-cci: don't delete an unregistered adapter i2c: bcm2835: Avoid clock stretching timeouts
2022-02-20Merge branch 'for-linus' of ↵Linus Torvalds3-0/+28
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix for Synaptics touchpads in RMI4 mode failing to suspend/resume properly because I2C client devices are now being suspended and resumed asynchronously which changed the ordering - a change to make sure we do not set right and middle buttons capabilities on touchpads that are "buttonpads" (i.e. do not have separate physical buttons) - a change to zinitix touchscreen driver adding more compatible strings/IDs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: psmouse - set up dependency between PS/2 and SMBus companions Input: zinitix - add new compatible strings Input: clear BTN_RIGHT/MIDDLE on buttonpads
2022-02-20Merge tag 'for-v5.17-rc' of ↵Linus Torvalds3-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: "Three regression fixes for the 5.17 cycle: - build warning fix for power-supply documentation - pointer size fix in cw2015 battery driver - OOM handling in bq256xx charger driver" * tag 'for-v5.17-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: bq256xx: Handle OOM correctly power: supply: core: fix application of sizeof to pointer power: supply: fix table problem in sysfs-class-power
2022-02-20Merge tag 'fs.mount_setattr.v5.17-rc4' of ↵Linus Torvalds3-2/+41
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull mount_setattr test/doc fixes from Christian Brauner: "This contains a fix for one of the selftests for the mount_setattr syscall to create idmapped mounts, an entry for idmapped mounts for maintainers, and missing kernel documentation for the helper we split out some time ago to get and yield write access to a mount when changing mount properties" * tag 'fs.mount_setattr.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: fs: add kernel doc for mnt_{hold,unhold}_writers() MAINTAINERS: add entry for idmapped mounts tests: fix idmapped mount_setattr test