kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2018-11-18	Merge branch 'ncsi-Allow-enabling-multiple-packages-and-channels'	David S. Miller	6	-206/+660
	Samuel Mendoza-Jonas says: ==================== net/ncsi: Allow enabling multiple packages & channels This series extends the NCSI driver to configure multiple packages and/or channels simultaneously. Since the RFC series this includes a few extra changes to fix areas in the driver that either made this harder or were roadblocks due to deviations from the NCSI specification. Patches 1 & 2 fix two issues where the driver made assumptions about the capabilities of the NCSI topology. Patches 3 & 4 change some internal semantics slightly to make multi-mode easier. Patch 5 introduces a cleaner way of reconfiguring the NCSI configuration and keeping track of channel states. Patch 6 implements the main multi-package/multi-channel configuration, configured via the Netlink interface. Readers who have an interesting NCSI setup - especially multi-package with HWA - please test! I think I've covered all permutations but I don't have infinite hardware to test on. Changes in v2: - Updated use of the channel lock in ncsi_reset_dev(), making the channel invisible and leaving the monitor check to ncsi_stop_channel_monitor(). - Fixed ncsi_channel_is_tx() to consider the state of channels in other packages. Changes in v3: - Fixed bisectability bug in patch 1 - Consider channels on all packages in a few places when multi-package is enabled. - Avoid doubling up reset operations, and check the current driver state before reset to let any running operations complete. - Reorganise the LSC handler slightly to avoid enabling Tx twice. Changes in v4: - Fix failover in the single-channel case - Better handle ncsi_reset_dev() entry during a current config/suspend operation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Configure multi-package, multi-channel modes with failover	Samuel Mendoza-Jonas	6	-88/+493
	This patch extends the ncsi-netlink interface with two new commands and three new attributes to configure multiple packages and/or channels at once, and configure specific failover modes. NCSI_CMD_SET_PACKAGE mask and NCSI_CMD_SET_CHANNEL_MASK set a whitelist of packages or channels allowed to be configured with the NCSI_ATTR_PACKAGE_MASK and NCSI_ATTR_CHANNEL_MASK attributes respectively. If one of these whitelists is set only packages or channels matching the whitelist are considered for the channel queue in ncsi_choose_active_channel(). These commands may also use the NCSI_ATTR_MULTI_FLAG to signal that multiple packages or channels may be configured simultaneously. NCSI hardware arbitration (HWA) must be available in order to enable multi-package mode. Multi-channel mode is always available. If the NCSI_ATTR_CHANNEL_ID attribute is present in the NCSI_CMD_SET_CHANNEL_MASK command the it sets the preferred channel as with the NCSI_CMD_SET_INTERFACE command. The combination of preferred channel and channel whitelist defines a primary channel and the allowed failover channels. If the NCSI_ATTR_MULTI_FLAG attribute is also present then the preferred channel is configured for Tx/Rx and the other channels are enabled only for Rx. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Reset channel state in ncsi_start_dev()	Samuel Mendoza-Jonas	3	-12/+113
	When the NCSI driver is stopped with ncsi_stop_dev() the channel monitors are stopped and the state set to "inactive". However the channels are still configured and active from the perspective of the network controller. We should suspend each active channel but in the context of ncsi_stop_dev() the transmit queue has been or is about to be stopped so we won't have time to do so. Instead when ncsi_start_dev() is called if the NCSI topology has already been probed then call ncsi_reset_dev() to suspend any channels that were previously active. This resets the network controller to a known state, provides an up to date view of channel link state, and makes sure that mode flags such as NCSI_MODE_TX_ENABLE are properly reset. In addition to ncsi_start_dev() use ncsi_reset_dev() in ncsi-netlink.c to update the channel configuration more cleanly. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Don't mark configured channels inactive	Samuel Mendoza-Jonas	2	-8/+12
	The concepts of a channel being 'active' and it having link are slightly muddled in the NCSI driver. Tweak this slightly so that NCSI_CHANNEL_ACTIVE represents a channel that has been configured and enabled, and NCSI_CHANNEL_INACTIVE represents a de-configured channel. This distinction is important because a channel can be 'active' but have its link down; in this case the channel may still need to be configured so that it may receive AEN link-state-change packets. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Don't deselect package in suspend if active	Samuel Mendoza-Jonas	1	-2/+13
	When a package is deselected all channels of that package cease communication. If there are other channels active on the package of the suspended channel this will disable them as well, so only send a deselect-package command if no other channels are active. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Probe single packages to avoid conflict	Samuel Mendoza-Jonas	2	-55/+31
	Currently the NCSI driver sends a select-package command to all possible packages simultaneously to discover what packages are available. However at this stage in the probe process the driver does not know if hardware arbitration is available: if it isn't then this process could cause collisions on the RMII bus when packages try to respond. Update the probe loop to probe each package one by one, and once complete check if HWA is universally supported. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	net/ncsi: Don't enable all channels when HWA available	Samuel Mendoza-Jonas	2	-48/+5
	NCSI hardware arbitration allows multiple packages to be enabled at once and share the same wiring. If the NCSI driver recognises that HWA is available it unconditionally enables all packages and channels; but that is a configuration decision rather than something required by HWA. Additionally the current implementation will not failover on link events which can cause connectivity to be lost unless the interface is manually bounced. Retain basic HWA support but remove the separate configuration path to enable all channels, leaving this to be handled by a later implementation. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	cxgb4: Remove SGE_HOST_PAGE_SIZE dependency on page size	Arjun Vynipadath	1	-11/+0
	The SGE Host Page Size has nothing to do with the actual Host Page Size. It's the SGE's BAR2 Doorbell/GTS Page Size for interpreting the SGE Ingress/Egress Queue per Page values. Firmware reads all of these things and makes all the subsequent changes necessary. The Host Driver uses the SGE Host Page Size in order to properly calculate BAR2 Offsets. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	tcp: add SRTT to SCM_TIMESTAMPING_OPT_STATS	Yousuk Seung	2	-0/+3
	Add TCP_NLA_SRTT to SCM_TIMESTAMPING_OPT_STATS that reports the smoothed round trip time in microseconds (tcp_sock.srtt_us >> 3). Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	uapi/ethtool: fix spelling errors	Stephen Hemminger	1	-2/+2
	Trivial spelling errors found by codespell. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-18	tun: Adjust on-stack tun_page initialization.	David S. Miller	1	-1/+3
	Instead of constantly playing with the struct initializer syntax trying to make gcc and CLang both happy, just clear it out using memset(). >> drivers/net/tun.c:2503:42: warning: Using plain integer as NULL pointer Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	tuntap: free XDP dropped packets in a batch	Jason Wang	1	-3/+26
	Thanks to the batched XDP buffs through msg_control. Instead of calling put_page() for each page which involves a atomic operation, let's batch them by record the last page that needs to be freed and its refcnt count and free them in a batch. Testpmd(virtio-user + vhost_net) + XDP_DROP shows 3.8% improvement. Before: 4.71Mpps After : 4.89Mpps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	vhost_net: mitigate page reference counting during page frag refill	Jason Wang	1	-3/+51
	We do a get_page() which involves a atomic operation. This patch tries to mitigate a per packet atomic operation by maintaining a reference bias which is initially USHRT_MAX. Each time a page is got, instead of calling get_page() we decrease the bias and when we find it's time to use a new page we will decrease the bias at one time through __page_cache_drain_cache(). Testpmd(virtio_user + vhost_net) + XDP_DROP on TAP shows about 1.6% improvement. Before: 4.63Mpps After: 4.71Mpps Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	netfilter: xt_hashlimit: fix a possible memory leak in htable_create()	Taehee Yoo	1	-6/+3
	In the htable_create(), hinfo is allocated by vmalloc() So that if error occurred, hinfo should be freed. Fixes: 11d5f15723c9 ("netfilter: xt_hashlimit: Create revision 2 to support higher pps rates") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-17	dax: Fix huge page faults	Matthew Wilcox	1	-8/+4
	Using xas_load() with a PMD-sized xa_state would work if either a PMD-sized entry was present or a PTE sized entry was present in the first 64 entries (of the 512 PTEs in a PMD on x86). If there was no PTE in the first 64 entries, grab_mapping_entry() would believe there were no entries present, allocate a PMD-sized entry and overwrite the PTE in the page cache. Use xas_find_conflict() instead which turns out to simplify both get_unlocked_entry() and grab_mapping_entry(). Also remove a WARN_ON_ONCE from grab_mapping_entry() as it will have already triggered in get_unlocked_entry(). Fixes: cfc93c6c6c96 ("dax: Convert dax_insert_pfn_mkwrite to XArray") Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-17	dax: Fix dax_unlock_mapping_entry for PMD pages	Matthew Wilcox	1	-9/+8
	Device DAX PMD pages do not set the PageHead bit for compound pages. Fix for now by retrieving the PMD bit from the entry, but eventually we will be passed the page size by the caller. Reported-by: Dan Williams <dan.j.williams@intel.com> Fixes: 9f32d221301c ("dax: Convert dax_lock_mapping_entry to XArray") Signed-off-by: Matthew Wilcox <willy@infradead.org>
2018-11-17	Merge branch 'net-sched-gred-introduce-per-virtual-queue-attributes'	David S. Miller	2	-27/+281
	Jakub Kicinski says: ==================== net: sched: gred: introduce per-virtual queue attributes This series updates the GRED Qdisc. The Qdisc matches nfp offload very well, but before we can offload it there are a number of improvements to make. First few patches add extack messages to the Qdisc and pass extack to netlink validation. Next a new netlink attribute group is added, to allow GRED to be extended more easily. Currently GRED passes C structures as attributes, and even an array of C structs for virtual queue configuration. User space has hard coded the expected length of that array, so adding new fields is not possible. New two-level attribute group is added: [TCA_GRED_VQ_LIST] [TCA_GRED_VQ_ENTRY] [TCA_GRED_VQ_DP] [TCA_GRED_VQ_FLAGS] [TCA_GRED_VQ_STAT_] [TCA_GRED_VQ_ENTRY] [TCA_GRED_VQ_DP] [TCA_GRED_VQ_FLAGS] [TCA_GRED_VQ_STAT_] [TCA_GRED_VQ_ENTRY] ... Statistics are dump only. Patch 4 switches the byte counts to be 64 bit, and patch 5 introduces the new stats attributes for dump. Patch 6 switches RED flags to be per-virtual queue, and patch 7 allows them to be dumped and set at virtual queue granularity. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: allow manipulating per-DP RED flags	Jakub Kicinski	2	-1/+144
	Allow users to set and dump RED flags (ECN enabled and harddrop) on per-virtual queue basis. Validation of attributes is split from changes to make sure we won't have to undo previous operations when we find out configuration is invalid. The objective is to allow changing per-Qdisc parameters without overwriting the per-vq configured flags. Old user space will not pass the TCA_GRED_VQ_FLAGS attribute and per-Qdisc flags will always get propagated to the virtual queues. New user space which wants to make use of per-vq flags should set per-Qdisc flags to 0 and then configure per-vq flags as it sees fit. Once per-vq flags are set per-Qdisc flags can't be changed to non-zero. Vice versa - if the per-Qdisc flags are non-zero the TCA_GRED_VQ_FLAGS attribute has to either be omitted or set to the same value as per-Qdisc flags. Update per-Qdisc parameters: per-Qdisc \| per-VQ \| result 0 \| 0 \| all vq flags updated 0 \| non-0 \| error (vq flags in use) non-0 \| 0 \| -- impossible -- non-0 \| non-0 \| all vq flags updated Update per-VQ state (flags parameter not specified): no change to flags Update per-VQ state (flags parameter set): per-Qdisc \| per-VQ \| result 0 \| any \| per-vq flags updated non-0 \| 0 \| -- impossible -- non-0 \| non-0 \| error (per-Qdisc flags in use) Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: store red flags per virtual queue	Jakub Kicinski	1	-6/+18
	Right now ECN marking and HARD drop (the common RED flags) can only be configured for the entire Qdisc. In preparation for per-vq flags store the values in the virtual queue structure. Setting per-vq flags will only be allowed when no flags are set for the entire Qdisc. For the new flags we will also make sure undefined bits are 0. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: provide a better structured dump and expose stats	Jakub Kicinski	2	-1/+78
	Currently all GRED's virtual queue data is dumped in a single array in a single attribute. This makes it pretty much impossible to add new fields. In order to expose more detailed stats add a new set of attributes. We can now expose the 64 bit value of bytesin and all the mark stats which were not part of the original design. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: store bytesin as a 64 bit value	Jakub Kicinski	1	-1/+1
	32 bit counters for bytes are not really going to last long in modern world. Make sch_gred count bytes on a 64 bit counter. It will still get truncated during dump but follow up patch will add set of new stat dump attributes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: use extack to provide more details on configuration errors	Jakub Kicinski	1	-11/+33
	Add extack messages to -EINVAL errors, to help users identify their mistakes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: pass extack to nla_parse_nested()	Jakub Kicinski	1	-2/+2
	In case netlink wants to provide parsing error pass extack to nla_parse_nested(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: sched: gred: separate error and non-error path in gred_change()	Jakub Kicinski	1	-6/+6
	We will soon want to add more code to the non-error path, separate it from the error handling flow. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"	David S. Miller	1	-9/+5
	This reverts commit dfa0d55ff6be64e7b6881212a291cb95f8da3b08. Discussion still ongoing, I shouldn't have applied this. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	selftests: add explicit test for multiple concurrent GRO sockets	Paolo Abeni	1	-0/+34
	This covers for proper accounting of encap needed static keys Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	isdn/hisax: remove set but not used variable 'total'	YueHaibing	1	-2/+1
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/isdn/hisax/hfc_pci.c:277:6: warning: variable ‘total’ set but not used [-Wunused-but-set-variable] It never used since git history start. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	udp: fix jump label misuse	Paolo Abeni	2	-4/+4
	The commit 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed") introduced a severe misuse of jump label APIs, which syzbot, as reported by Eric, was able to exploit. When multiple sockets/process can concurrently request (and than disable) the udp encap, we need to track the activation counter with _inc()/_dec() jump label variants, or we can experience bad things at disable time. Fixes: 60fb9567bf30 ("udp: implement complete book-keeping for encap_needed") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	bpf: allocate local storage buffers using GFP_ATOMIC	Roman Gushchin	1	-1/+2
	Naresh reported an issue with the non-atomic memory allocation of cgroup local storage buffers: [ 73.047526] BUG: sleeping function called from invalid context at /srv/oe/build/tmp-rpb-glibc/work-shared/intel-corei7-64/kernel-source/mm/slab.h:421 [ 73.060915] in_atomic(): 1, irqs_disabled(): 0, pid: 3157, name: test_cgroup_sto [ 73.068342] INFO: lockdep is turned off. [ 73.072293] CPU: 2 PID: 3157 Comm: test_cgroup_sto Not tainted 4.20.0-rc2-next-20181113 #1 [ 73.080548] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017 [ 73.088018] Call Trace: [ 73.090463] dump_stack+0x70/0xa5 [ 73.093783] ___might_sleep+0x152/0x240 [ 73.097619] __might_sleep+0x4a/0x80 [ 73.101191] __kmalloc_node+0x1cf/0x2f0 [ 73.105031] ? cgroup_storage_update_elem+0x46/0x90 [ 73.109909] cgroup_storage_update_elem+0x46/0x90 cgroup_storage_update_elem() (as well as other update map update callbacks) is called with disabled preemption, so GFP_ATOMIC allocation should be used: e.g. alloc_htab_elem() in hashtab.c. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-17	bpf: fix off-by-one error in adjust_subprog_starts	Edward Cree	2	-1/+20
	When patching in a new sequence for the first insn of a subprog, the start of that subprog does not change (it's the first insn of the sequence), so adjust_subprog_starts should check start <= off (rather than < off). Also added a test to test_verifier.c (it's essentially the syz reproducer). Fixes: cc8b0b92a169 ("bpf: introduce function calls (function boundaries)") Reported-by: syzbot+4fc427c7af994b0948be@syzkaller.appspotmail.com Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-17	bpf: fix null pointer dereference on pointer offload	Colin Ian King	1	-2/+3
	Pointer offload is being null checked however the following statement dereferences the potentially null pointer offload when assigning offload->dev_state. Fix this by only assigning it if offload is not null. Detected by CoverityScan, CID#1475437 ("Dereference after null check") Fixes: 00db12c3d141 ("bpf: call verifier_prep from its callback in struct bpf_offload_dev") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-17	bpftool: make libbfd optional	Stanislav Fomichev	5	-6/+35
	Make it possible to build bpftool without libbfd. libbfd and libopcodes are typically provided in dev/dbg packages (binutils-dev in debian) which we usually don't have installed on the fleet machines and we'd like a way to have bpftool version that works without installing any additional packages. This excludes support for disassembling jit-ted code and prints an error if the user tries to use these features. Tested by: cat > FEATURES_DUMP.bpftool <<EOF feature-libbfd=0 feature-disassembler-four-args=1 feature-reallocarray=0 feature-libelf=1 feature-libelf-mmap=1 feature-bpf=1 EOF FEATURES_DUMP=$PWD/FEATURES_DUMP.bpftool make ldd bpftool \| grep libbfd Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-17	etf: Drop all expired packets	Jesus Sanchez-Palencia	1	-15/+21
	Currently on dequeue() ETF only drops the first expired packet, which causes a problem if the next packet is already expired. When this happens, the watchdog will be configured with a time in the past, fire straight way and the packet will finally be dropped once the dequeue() function of the qdisc is called again. We can save quite a few cycles and improve the overall behavior of the qdisc if we drop all expired packets if the next packet is expired. This should allow ETF to recover faster from bad situations. But packet drops are still a very serious warning that the requirements imposed on the system aren't reasonable. This was inspired by how the implementation of hrtimers use the rb_tree inside the kernel. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	etf: Split timersortedlist_erase()	Jesus Sanchez-Palencia	1	-15/+29
	This is just a refactor that will simplify the implementation of the next patch in this series which will drop all expired packets on the dequeue flow. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	etf: Use cached rb_root	Jesus Sanchez-Palencia	1	-9/+12
	ETF's peek() operation is heavily used so use an rb_root_cached instead and leverage rb_first_cached() which will run in O(1) instead of O(log n). Even if on 'timesortedlist_clear()' we could be using rb_erase(), we choose to use rb_erase_cached(), because if in the future we allow runtime changes to ETF parameters, and need to do a '_clear()', this might cause some hard to debug issues. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	etf: Cancel timer if there are no pending skbs	Jesus Sanchez-Palencia	1	-1/+3
	There is no point in firing the qdisc watchdog if there are no future skbs pending in the queue and the watchdog had been set previously. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	tcp: clean up STATE_TRACE	Yafang Shao	2	-16/+0
	Currently we can use bpf or tcp tracepoint to conveniently trace the tcp state transition at the run time. So we don't need to do this stuff at the compile time anymore. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	Merge tag 'batadv-net-for-davem-20181114' of git://git.open-mesh.org/linux-merge	David S. Miller	2	-3/+5
	Simon Wunderlich says: ==================== Here are two batman-adv bugfixes: - Explicitly pad short ELP packets with zeros, by Sven Eckelmann - Fix packet size calculation when merging fragments, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs	Martin Schiller	1	-5/+9
	This commit re-enables support for slow GPIO pins. It was initially introduced by commit 2d6c9091ab76 ("net: mdio-gpio: support access that may sleep") and got lost by commit 7e5fbd1e0700 ("net: mdio-gpio: Convert to use gpiod functions where possible"). Also add a warning about slow GPIO pins like it is done in i2c-gpio. Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	Merge branch 'SMSC95xx-driver-updates'	David S. Miller	1	-21/+34
	Ben Dooks says: ==================== SMSC95xx driver updates (round 1) This is a series of a few driver cleanups and some fixups of the code for the SMSC95XX driver. There have been a few reviews, and the issues have been fixed so this should be ready for merging. I will work on the tx-alignment and the other bits of usbnet changes and produce at least two more patch series for this later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	usbnet: smsc95xx: check for csum being in last four bytes	Ben Dooks	1	-1/+18
	The manual states that the checksum cannot lie in the last DWORD of the transmission, so add a basic check for this and fall back to software checksumming the packet. This only seems to trigger for ACK packets with no options or data to return to the other end, and the use of the tx-alignment option makes it more likely to happen. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	usbnet: smsc95xx: fix memcpy for accessing rx-data	Ben Dooks	1	-5/+2
	Change the RX code to use get_unaligned_le32() instead of the combo of memcpy and cpu_to_le32s(&var). Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	usbnet: smsc95xx: simplify tx_fixup code	Ben Dooks	1	-15/+13
	The smsc95xx_tx_fixup is doing multiple calls to skb_push() to put an 8-byte command header onto the packet. It would be easier to do one skb_push() and then copy the data in once the push is done. We also make the code smaller by using proper unaligned puts for the header. This merges in the CPU to LE32 conversion as well and makes the whole sequence easier to understand hopefully. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	usbnet: smsc95xx: fix rx packet alignment	Ben Dooks	1	-0/+1
	The smsc95xx driver already takes into account the NET_IP_ALIGN parameter when setting up the receive packet data, which means we do not need to worry about aligning the packets in the usbnet driver. Adding the EVENT_NO_IP_ALIGN means that the IPv4 header is now passed to the ip_rcv() routine with the start on an aligned address. Tested on Raspberry Pi B3. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	Merge branch 'dpaa2-eth-add-bql-support'	David S. Miller	2	-25/+54
	Ioana Ciocoi Radulescu says: ==================== dpaa2-eth: add bql support The first two patches make minor tweaks to the driver to simplify bql implementation. The third patch adds the actual bql support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	dpaa2-eth: bql support	Ioana Ciocoi Radulescu	2	-15/+46
	Add support for byte queue limit. On NAPI poll, we save the total number of Tx confirmed frames/bytes and register them with bql at the end of the poll function. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	dpaa2-eth: Update callback signature	Ioana Ciocoi Radulescu	2	-9/+6
	Change the frame consume callback signature: * the entire FQ structure is passed to the callback instead of just the queue index * the NAPI structure can be easily obtained from the channel it is associated to, so we don't need to pass it explicitly Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	dpaa2-eth: Don't use multiple queues per channel	Ioana Ciocoi Radulescu	2	-2/+3
	The DPNI object on which we build a network interface has a certain number of {Rx, Tx, Tx confirmation} frame queues as resources. The default hardware setup offers one queue of each type, as well as one DPCON channel, for each core available in the system. There are however cases where the number of queues is greater than the number of cores or channels. Until now, we configured and used all the frame queues associated with a DPNI, even if it meant assigning multiple queues of one type to the same channel. Update the driver to only use a number of queues equal to the number of channels, ensuring each channel will contain exactly one Rx and one Tx confirmation queue. >From the user viewpoint, this change is completely transparent. Performance wise there is no impact in most scenarios. In case the number of queues is larger than and not a multiple of the number of channels, Rx hash distribution offers now better load balancing between cores, which can have a positive impact on overall system performance. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net/sched: act_pedit: fix memory leak when IDR allocation fails	Davide Caratti	1	-1/+2
	tcf_idr_check_alloc() can return a negative value, on allocation failures (-ENOMEM) or IDR exhaustion (-ENOSPC): don't leak keys_ex in these cases. Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-17	net: 8021q: move vlan offload registrations into vlan_core	Jiri Pirko	2	-96/+99
	Currently, the vlan packet offloads are registered only upon 8021q module load. However, even without this module loaded, the offloads could be utilized, for example by openvswitch datapath. As reported by Michael, that causes 2x to 5x performance improvement, depending on a testcase. So move the vlan offload registrations into vlan_core and make this available even without 8021q module loaded. Reported-by: Michael Shteinbok <michaelsh86@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Michael Shteinbok <michaelsh86@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>