summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2017-01-10Merge tag 'mlx5-4kuar-for-4.11' of ↵David S. Miller6-60/+81
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5 4K UAR The following series of patches optimizes the usage of the UAR area which is contained within the BAR 0-1. Previous versions of the firmware and the driver assumed each system page contains a single UAR. This patch set will query the firmware for a new capability that if published, means that the firmware can support UARs of fixed 4K regardless of system page size. In the case of powerpc, where page size equals 64KB, this means we can utilize 16 UARs per system page. Since user space processes by default consume eight UARs per context this means that with this change a process will need a single system page to fulfill that requirement and in fact make use of more UARs which is better in terms of performance. In addition to optimizing user-space processes, we introduce an allocator that can be used by kernel consumers to allocate blue flame registers (which are areas within a UAR that are used to write doorbells). This provides further optimization on using the UAR area since the Ethernet driver makes use of a single blue flame register per system page and now it will use two blue flame registers per 4K. The series also makes changes to naming conventions and now the terms used in the driver code match the terms used in the PRM (programmers reference manual). Thus, what used to be called UUAR (micro UAR) is now called BFREG (blue flame register). In order to support compatibility between different versions of library/driver/firmware, the library has now means to notify the kernel driver that it supports the new scheme and the kernel can notify the library if it supports this extension. So mixed versions of libraries can run concurrently without any issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10bpf: rename ARG_PTR_TO_STACKAlexei Starovoitov1-6/+6
since ARG_PTR_TO_STACK is no longer just pointer to stack rename it to ARG_PTR_TO_MEM and adjust comment. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10smc: netlink interface for SMC socketsUrsula Braun4-0/+109
Support for SMC socket monitoring via netlink sockets of protocol NETLINK_SOCK_DIAG. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10smc: establish pnet table managementThomas Richter1-0/+35
Connection creation with SMC-R starts through an internal TCP-connection. The Ethernet interface for this TCP-connection is not restricted to the Ethernet interface of a RoCE device. Any existing Ethernet interface belonging to the same physical net can be used, as long as there is a defined relation between the Ethernet interface and some RoCE devices. This relation is defined with the help of an identification string called "Physical Net ID" or short "pnet ID". Information about defined pnet IDs and their related Ethernet interfaces and RoCE devices is stored in the SMC-R pnet table. A pnet table entry consists of the identifying pnet ID and the associated network and IB device. This patch adds pnet table configuration support using the generic netlink message interface referring to network and IB device by their names. Commands exist to add, delete, and display pnet table entries, and to flush or display the entire pnet table. There are cross-checks to verify whether the ethernet interfaces or infiniband devices really exist in the system. If either device is not available, the pnet ID entry is not created. Loss of network devices and IB devices is also monitored; a pnet ID entry is removed when an associated network or IB device is removed. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10smc: establish new socket familyUrsula Braun1-1/+6
* enable smc module loading and unloading * register new socket family * basic smc socket creation and deletion * use backing TCP socket to run CLC (Connection Layer Control) handshake of SMC protocol * Setup for infiniband traffic is implemented in follow-on patches. For now fallback to TCP socket is always used. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-10net: introduce keepalive function in struct protoUrsula Braun1-0/+1
Direct call of tcp_set_keepalive() function from protocol-agnostic sock_setsockopt() function in net/core/sock.c violates network layering. And newly introduced protocol (SMC-R) will need its own keepalive function. Therefore, add "keepalive" function pointer to "struct proto", and call it from sock_setsockopt() via this pointer. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09Merge tag 'rxrpc-rewrite-20170109' of ↵David S. Miller1-0/+184
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== afs: Refcount afs_call struct These patches provide some tracepoints for AFS and fix a potential leak by adding refcounting to the afs_call struct. The patches are: (1) Add some tracepoints for logging incoming calls and monitoring notifications from AF_RXRPC and data reception. (2) Get rid of afs_wait_mode as it didn't turn out to be as useful as initially expected. It can be brought back later if needed. This clears some stuff out that I don't then need to fix up in (4). (3) Allow listen(..., 0) to be used to disable listening. This makes shutting down the AFS cache manager server in the kernel much easier and the accounting simpler as we can then be sure that (a) all preallocated afs_call structs are relesed and (b) no new incoming calls are going to be started. For the moment, listening cannot be reenabled. (4) Add refcounting to the afs_call struct to fix a potential multiple release detected by static checking and add a tracepoint to follow the lifecycle of afs_call objects. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net: dsa: Make dsa_switch_ops constFlorian Fainelli1-2/+2
Now that we have properly encapsulated and made drivers utilize exported functions, we can switch dsa_switch_ops to be a annotated with const. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net: dsa: Encapsulate legacy switch drivers into dsa_switch_driverFlorian Fainelli1-4/+7
In preparation for making struct dsa_switch_ops const, encapsulate it within a dsa_switch_driver which has a list pointer and a pointer to dsa_switch_ops. This allows us to take the list_head pointer out of dsa_switch_ops, which is written to by {un,}register_switch_driver. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller9-80/+55
2017-01-09stmmac: move stmmac_clk, pclk, clk_ptp_ref and stmmac_rst to platform structurejpinto1-0/+5
This patch moves stmmac_clk, pclk, clk_ptp_ref and stmmac_rst to the plat_stmmacenet_data structure. It also moves these platform variables initialization to stmmac_platform. This was done for two reasons: a) If PCI is used, platform related code is being executed in stmmac_main resulting in warnings that have no sense and conceptually was not right b) stmmac as a synopsys reference ethernet driver stack will be hosting more and more drivers to its structure like synopsys/dwc_eth_qos.c. These drivers have their own DT bindings that are not compatible with stmmac's. One of the most important are the clock names, and so they need to be parsed in the glue logic and initialized there, and that is the main reason why the clocks were passed to the platform structure. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Tested-by: Niklas Cassel <niklas.cassel@axis.com> Reviewed-by: Lars Persson <larper@axis.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09stmmac: adding DT parameter for LPI tx clock gatingjpinto1-0/+1
This patch adds a new parameter to the stmmac DT: snps,en-tx-lpi-clockgating. It was ported from synopsys/dwc_eth_qos.c and it is useful if lpi tx clock gating is needed by stmmac users also. Signed-off-by: Joao Pinto <jpinto@synopsys.com> Tested-by: Niklas Cassel <niklas.cassel@axis.com> Reviewed-by: Lars Persson <larper@axis.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net/sched: act_csum: compute crc32c on SCTP packetsDavide Caratti1-1/+2
modify act_csum to compute crc32c on IPv4/IPv6 packets having SCTP in their payload, and extend UAPI definitions accordingly. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09siphash: implement HalfSipHash1-3 for hash tablesJason A. Donenfeld1-1/+56
HalfSipHash, or hsiphash, is a shortened version of SipHash, which generates 32-bit outputs using a weaker 64-bit key. It has *much* lower security margins, and shouldn't be used for anything too sensitive, but it could be used as a hashtable key function replacement, if the output is never exposed, and if the security requirement is not too high. The goal is to make this something that performance-critical jhash users would be willing to use. On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias SipHash1-3 to HalfSipHash1-3 on those systems. 64-bit x86_64: [ 0.509409] test_siphash: SipHash2-4 cycles: 4049181 [ 0.510650] test_siphash: SipHash1-3 cycles: 2512884 [ 0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920 [ 0.512904] test_siphash: JenkinsHash cycles: 978267 So, we map hsiphash() -> SipHash1-3 32-bit x86: [ 0.509868] test_siphash: SipHash2-4 cycles: 14812892 [ 0.513601] test_siphash: SipHash1-3 cycles: 9510710 [ 0.515263] test_siphash: HalfSipHash1-3 cycles: 3856157 [ 0.515952] test_siphash: JenkinsHash cycles: 1148567 So, we map hsiphash() -> HalfSipHash1-3 hsiphash() is roughly 3 times slower than jhash(), but comes with a considerable security improvement. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09siphash: add cryptographically secure PRFJason A. Donenfeld1-0/+85
SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster and more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straight-forward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch-up. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09IB/mlx5: Support 4k UAR for libmlx5Eli Cohen3-13/+8
Add fields to structs to convey to kernel an indication whether the library supports multi UARs per page and return to the library the size of a UAR based on the queried value. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09IB/mlx5: Allow future extension of libmlx5 input dataEli Cohen2-13/+11
Current check requests that new fields in struct mlx5_ib_alloc_ucontext_req_v2 that are not known to the driver be zero. This was introduced so new libraries passing additional information to the kernel through struct mlx5_ib_alloc_ucontext_req_v2 will be notified by old kernels that do not support their request by failing the operation. This schecme is problematic since it requires libmlx5 to issue the requests with descending input size for struct mlx5_ib_alloc_ucontext_req_v2. To avoid this, we require that new features that will obey the following rules: If the feature requires one or more fields in the response and the at least one of the fields can be encoded such that a zero value means the kernel ignored the request then this field will provide the indication to the library. If no response is required or if zero is a valid response, a new field should be added that indicates to the library whether its request was processed. Fixes: b368d7cb8ceb ('IB/mlx5: Add hca_core_clock_offset to udata in init_ucontext') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09IB/mlx5: Use blue flame register allocator in mlx5_ibEli Cohen3-23/+5
Make use of the blue flame registers allocator at mlx5_ib. Since blue flame was not really supported we remove all the code that is related to blue flame and we let all consumers to use the same blue flame register. Once blue flame is supported we will add the code. As part of this patch we also move the definition of struct mlx5_bf to mlx5_ib.h as it is only used by mlx5_ib. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09net/mlx5: Add interface to get reference to a UAREli Cohen1-1/+4
A reference to a UAR is required to generate CQ or EQ doorbells. Since CQ or EQ doorbells can all be generated using the same UAR area without any effect on performance, we are just getting a reference to any available UAR, If one is not available we allocate it but we don't waste the blue flame registers it can provide and we will use them for subsequent allocations. We get a reference to such UAR and put in mlx5_priv so any kernel consumer can make use of it. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09afs: Refcount the afs_call structDavid Howells1-0/+75
A static checker warning occurs in the AFS filesystem: fs/afs/cmservice.c:155 SRXAFSCB_CallBack() error: dereferencing freed memory 'call' due to the reply being sent before we access the server it points to. The act of sending the reply causes the call to be freed if an error occurs (but not if it doesn't). On top of this, the lifetime handling of afs_call structs is fragile because they get passed around through workqueues without any sort of refcounting. Deal with the issues by: (1) Fix the maybe/maybe not nature of the reply sending functions with regards to whether they release the call struct. (2) Refcount the afs_call struct and sort out places that need to get/put references. (3) Pass a ref through the work queue and release (or pass on) that ref in the work function. Care has to be taken because a work queue may already own a ref to the call. (4) Do the cleaning up in the put function only. (5) Simplify module cleanup by always incrementing afs_outstanding_calls whenever a call is allocated. (6) Set the backlog to 0 with kernel_listen() at the beginning of the process of closing the socket to prevent new incoming calls from occurring and to remove the contribution of preallocated calls from afs_outstanding_calls before we wait on it. A tracepoint is also added to monitor the afs_call refcount and lifetime. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Fixes: 08e0e7c82eea: "[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC."
2017-01-09afs: Add some tracepointsDavid Howells1-0/+109
Add three tracepoints to the AFS filesystem: (1) The afs_recv_data tracepoint logs data segments that are extracted from the data received from the peer through afs_extract_data(). (2) The afs_notify_call tracepoint logs notification from AF_RXRPC of data coming in to an asynchronous call. (3) The afs_cb_call tracepoint logs incoming calls that have had their operation ID extracted and mapped into a supported cache manager service call. To make (3) work, the name strings in the afs_call_type struct objects have to be annotated with __tracepoint_string. This is done with the CM_NAME() macro. Further, the AFS call state enum needs a name so that it can be used to declare parameter types. Signed-off-by: David Howells <dhowells@redhat.com>
2017-01-09net-tc: convert tc_from to tc_from_ingress and tc_redirectedWillem de Bruijn3-8/+5
The tc_from field fulfills two roles. It encodes whether a packet was redirected by an act_mirred device and, if so, whether act_mirred was called on ingress or egress. Split it into separate fields. The information is needed by the special IFB loop, where packets are taken out of the normal path by act_mirred, forwarded to IFB, then reinjected at their original location (ingress or egress) by IFB. The IFB device cannot use skb->tc_at_ingress, because that may have been overwritten as the packet travels from act_mirred to ifb_xmit, when it passes through tc_classify on the IFB egress path. Cache this value in skb->tc_from_ingress. That field is valid only if a packet arriving at ifb_xmit came from act_mirred. Other packets can be crafted to reach ifb_xmit. These must be dropped. Set tc_redirected on redirection and drop all packets that do not have this bit set. Both fields are set only on cloned skbs in tc actions, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net-tc: convert tc_at to tc_at_ingressWillem de Bruijn2-3/+3
Field tc_at is used only within tc actions to distinguish ingress from egress processing. A single bit is sufficient for this purpose. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net-tc: convert tc_verd to integer bitfieldsWillem de Bruijn3-36/+11
Extract the remaining two fields from tc_verd and remove the __u16 completely. TC_AT and TC_FROM are converted to equivalent two-bit integer fields tc_at and tc_from. Where possible, use existing helper skb_at_tc_ingress when reading tc_at. Introduce helper skb_reset_tc to clear fields. Not documenting tc_from and tc_at, because they will be replaced with single bit fields in follow-on patches. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net-tc: extract skip classify bit from tc_verdWillem de Bruijn3-7/+15
Packets sent by the IFB device skip subsequent tc classification. A single bit governs this state. Move it out of tc_verd in anticipation of removing that __u16 completely. The new bitfield tc_skip_classify temporarily uses one bit of a hole, until tc_verd is removed completely in a follow-up patch. Remove the bit hole comment. It could be 2, 3, 4 or 5 bits long. With that many options, little value in documenting it. Introduce a helper function to deduplicate the logic in the two sites that check this bit. The field tc_skip_classify is set only in IFB on skbs cloned in act_mirred, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net-tc: make MAX_RECLASSIFY_LOOP localWillem de Bruijn1-5/+0
This field is no longer kept in tc_verd. Remove it from the global definition of that struct. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net-tc: remove unused tc_verd fieldsWillem de Bruijn1-7/+0
Remove the last reference to tc_verd's munge and redirect ttl bits. These fields are no longer used. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net: make ndo_get_stats64 a void functionstephen hemminger2-6/+6
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09net: ipmr: Remove nowait arg to ipmr_get_routeDavid Ahern1-1/+1
ipmr_get_route has 1 caller and the nowait arg is 0. Remove the arg and simplify ipmr_get_route accordingly. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08Merge tag 'usb-4.10-rc3' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a bunch of USB fixes for 4.10-rc3. Yeah, it's a lot, an artifact of the holiday break I think. Lots of gadget and the usual XHCI fixups for reported issues (one day that driver will calm down...) Also included are a bunch of usb-serial driver fixes, and for good measure, a number of much-reported MUSB driver issues have finally been resolved. All of these have been in linux-next with no reported issues" * tag 'usb-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (72 commits) USB: fix problems with duplicate endpoint addresses usb: ohci-at91: use descriptor-based gpio APIs correctly usb: storage: unusual_uas: Add JMicron JMS56x to unusual device usb: hub: Move hub_port_disable() to fix warning if PM is disabled usb: musb: blackfin: add bfin_fifo_offset in bfin_ops usb: musb: fix compilation warning on unused function usb: musb: Fix trying to free already-free IRQ 4 usb: musb: dsps: implement clear_ep_rxintr() callback usb: musb: core: add clear_ep_rxintr() to musb_platform_ops USB: serial: ti_usb_3410_5052: fix NULL-deref at open USB: serial: spcp8x5: fix NULL-deref at open USB: serial: quatech2: fix sleep-while-atomic in close USB: serial: pl2303: fix NULL-deref at open USB: serial: oti6858: fix NULL-deref at open USB: serial: omninet: fix NULL-derefs at open and disconnect USB: serial: mos7840: fix misleading interrupt-URB comment USB: serial: mos7840: remove unused write URB USB: serial: mos7840: fix NULL-deref at open USB: serial: mos7720: remove obsolete port initialisation USB: serial: mos7720: fix parallel probe ...
2017-01-08Merge tag 'staging-4.10-rc3' of ↵Linus Torvalds1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO fixes from Greg KH: "Here are some staging and IIO driver fixes for 4.10-rc3. Most of these are minor IIO fixes of reported issues, along with one network driver fix to resolve an issue. And a MAINTAINERS update with a new mailing list. All of these, except the MAINTAINERS file update, have been in linux-next with no reported issues (the MAINTAINERS patch happened on Friday...)" * tag 'staging-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: MAINTAINERS: add greybus subsystem mailing list staging: octeon: Call SET_NETDEV_DEV() iio: accel: st_accel: fix LIS3LV02 reading and scaling iio: common: st_sensors: fix channel data parsing iio: max44000: correct value in illuminance_integration_time_available iio: adc: TI_AM335X_ADC should depend on HAS_DMA iio: bmi160: Fix time needed to sleep after command execution iio: 104-quad-8: Fix active level mismatch for the preset enable option iio: 104-quad-8: Fix off-by-one errors when addressing IOR iio: 104-quad-8: Fix index control configuration
2017-01-08net/mlx5: Introduce blue flame register allocatorEli Cohen3-2/+44
Here is an implementation of an allocator that allocates blue flame registers. A blue flame register is used for generating send doorbells. A blue flame register can be used to generate either a regular doorbell or a blue flame doorbell where the data to be sent is written to the device's I/O memory hence saving the need to read the data from memory. For blue flame kind of doorbells to succeed, the blue flame register need to be mapped as write combining. The user can specify what kind of send doorbells she wishes to use. If she requested write combining mapping but that failed, the allocator will fall back to non write combining mapping and will indicate that to the user. Subsequent patches in this series will make use of this allocator. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-08mlx5: Fix naming convention with respect to UARsEli Cohen3-17/+18
This establishes a solid naming conventions for UARs. A UAR (User Access Region) can have size identical to a system page or can be fixed 4KB depending on a value queried by firmware. Each UAR always has 4 blue flame register which are used to post doorbell to send queue. In addition, a UAR has section used for posting doorbells to CQs or EQs. In this patch we change names to reflect this conventions. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-08mm: workingset: fix use-after-free in shadow node shrinkerJohannes Weiner1-1/+3
Several people report seeing warnings about inconsistent radix tree nodes followed by crashes in the workingset code, which all looked like use-after-free access from the shadow node shrinker. Dave Jones managed to reproduce the issue with a debug patch applied, which confirmed that the radix tree shrinking indeed frees shadow nodes while they are still linked to the shadow LRU: WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200 CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3 Call Trace: delete_node+0x1e4/0x200 __radix_tree_delete_node+0xd/0x10 shadow_lru_isolate+0xe6/0x220 __list_lru_walk_one.isra.4+0x9b/0x190 list_lru_walk_one+0x23/0x30 scan_shadow_nodes+0x2e/0x40 shrink_slab.part.44+0x23d/0x5d0 shrink_node+0x22c/0x330 kswapd+0x392/0x8f0 This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the inlined radix_tree_shrink(). The problem is with 14b468791fa9 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking"), which passes an update callback into the radix tree to link and unlink shadow leaf nodes when tree entries change, but forgot to pass the callback when reclaiming a shadow node. While the reclaimed shadow node itself is unlinked by the shrinker, its deletion from the tree can cause the left-most leaf node in the tree to be shrunk. If that happens to be a shadow node as well, we don't unlink it from the LRU as we should. Consider this tree, where the s are shadow entries: root->rnode | [0 n] | | [s ] [sssss] Now the shadow node shrinker reclaims the rightmost leaf node through the shadow node LRU: root->rnode | [0 ] | [s ] Because the parent of the deleted node is the first level below the root and has only one child in the left-most slot, the intermediate level is shrunk and the node containing the single shadow is put in its place: root->rnode | [s ] The shrinker again sees a single left-most slot in a first level node and thus decides to store the shadow in root->rnode directly and free the node - which is a leaf node on the shadow node LRU. root->rnode | s Without the update callback, the freed node remains on the shadow LRU, where it causes later shrinker runs to crash. Pass the node updater callback into __radix_tree_delete_node() in case the deletion causes the left-most branch in the tree to collapse too. Also add warnings when linked nodes are freed right away, rather than wait for the use-after-free when the list is scanned much later. Fixes: 14b468791fa9 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking") Reported-by: Dave Chinner <david@fromorbit.com> Reported-by: Hugh Dickins <hughd@google.com> Reported-by: Andrea Arcangeli <aarcange@redhat.com> Reported-and-tested-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Leech <cleech@redhat.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-08net: netcp: extract eflag from desc for rx_hook handlingKaricheri, Muralidharan1-0/+2
Extract the eflag bits from the received desc and pass it down the rx_hook chain to be available for netcp modules. Also the psdata and epib data has to be inspected by the netcp modules. So the desc can be freed only after returning from the rx_hook. So move knav_pool_desc_put() after the rx_hook processing. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-07Merge branch 'rc-fixes' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild fix from Michal Marek: "The asm-prototypes.h file added in the last merge window results in invalid code with CONFIG_KMEMCHECK=y. The net result is that genksyms segfaults. This pull request fixes the header, the genksyms fix is in my kbuild branch for 4.11" * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: asm-prototypes: Clear any CPP defines before declaring the functions
2017-01-07sctp: prepare asoc stream for stream reconfXin Long2-46/+31
sctp stream reconf, described in RFC 6525, needs a structure to save per stream information in assoc, like stream state. In the future, sctp stream scheduler also needs it to save some stream scheduler params and queues. This patchset is to prepare the stream array in assoc for stream reconf. It defines sctp_stream that includes stream arrays inside to replace ssnmap. Note that we use different structures for IN and OUT streams, as the members in per OUT stream will get more and more different from per IN stream. v1->v2: - put these patches into a smaller group. v2->v3: - define sctp_stream to contain stream arrays, and create stream.c to put stream-related functions. - merge 3 patches into 1, as new sctp_stream has the same name with before. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06net: ipv4: make fib_select_default staticDavid Ahern1-1/+0
fib_select_default has a single caller within the same file. Make it static. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-06Merge tag 'vfio-v4.10-rc3' of git://github.com/awilliam/linux-vfioLinus Torvalds1-43/+13
Pull VFIO fixes from Alex Williamson: - Add mtty sample driver properly into build system (Alex Williamson) - Restore type1 mapping performance after mdev (Alex Williamson) - Fix mdev device race (Alex Williamson) - Cleanups to the mdev ABI used by vendor drivers (Alex Williamson) - Build fix for old compilers (Arnd Bergmann) - Fix sample driver error path (Dan Carpenter) - Handle pci_iomap() error (Arvind Yadav) - Fix mdev ioctl return type (Paul Gortmaker) * tag 'vfio-v4.10-rc3' of git://github.com/awilliam/linux-vfio: vfio-mdev: fix non-standard ioctl return val causing i386 build fail vfio-pci: Handle error from pci_iomap vfio-mdev: fix some error codes in the sample code vfio-pci: use 32-bit comparisons for register address for gcc-4.5 vfio-mdev: Make mdev_device private and abstract interfaces vfio-mdev: Make mdev_parent private vfio-mdev: de-polute the namespace, rename parent_device & parent_ops vfio-mdev: Fix remove race vfio/type1: Restore mapping performance with mdev support vfio-mdev: Fix mtty sample driver building
2017-01-06Merge branch 'stable/for-linus-4.10' of ↵Linus Torvalds2-8/+20
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fixes from Konrad Rzeszutek Wilk: "This has one fix to make i915 work when using Xen SWIOTLB, and a feature from Geert to aid in debugging of devices that can't do DMA outside the 32-bit address space. The feature from Geert is on top of v4.10 merge window commit (specifically you pulling my previous branch), as his changes were dependent on the Documentation/ movement patches. I figured it would just easier than me trying than to cherry-pick the Documentation patches to satisfy git. The patches have been soaking since 12/20, albeit I updated the last patch due to linux-next catching an compiler error and adding an Tested-and-Reported-by tag" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: Export swiotlb_max_segment to users swiotlb: Add swiotlb=noforce debug option swiotlb: Convert swiotlb_force from int to enum x86, swiotlb: Simplify pci_swiotlb_detect_override()
2017-01-06swiotlb: Export swiotlb_max_segment to usersKonrad Rzeszutek Wilk1-0/+3
So they can figure out what is the optimal number of pages that can be contingously stitched together without fear of bounce buffer. We also expose an mechanism for sub-users of SWIOTLB API, such as Xen-SWIOTLB to set the max segment value. And lastly if swiotlb=force is set (which mandates we bounce buffer everything) we set max_segment so at least we can bounce buffer one 4K page instead of a giant 512KB one for which we may not have space. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-and-Tested-by: Juergen Gross <jgross@suse.com>
2017-01-06Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds1-2/+0
Pull audit fixes from Paul Moore: "Two small fixes relating to audit's use of fsnotify. The first patch plugs a leak and the second fixes some lock shenanigans. The patches are small and I banged on this for an afternoon with our testsuite and didn't see anything odd" * 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit: audit: Fix sleep in atomic fsnotify: Remove fsnotify_duplicate_mark()
2017-01-05Merge tag 'armsoc-fixes' of ↵Linus Torvalds1-26/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This is a rather large set of bugfixes, as we just returned from the Christmas break. Most of these are relatively unimportant fixes for regressions introduced during the merge window, and about half of the changes are for mach-omap2. A couple of patches are just cleanups and dead code removal that I would not normally have considered for merging after -rc2, but I decided to take them along with the fixes this time. Notable fixes include: - removing the skeleton.dtsi include broke a number of machines, and we have to put empty /chosen nodes back to be able to pass kernel command lines as before - enabling Samsung platforms no longer hardwires CONFIG_HZ to 200, as it had been for no good reason for a long time" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits) MAINTAINERS: extend PSCI entry to cover the newly add PSCI checker code drivers: psci: annotate timer on stack to silence odebug messages ARM64: defconfig: enable DRM_MESON as module ARM64: dts: meson-gx: Add Graphic Controller nodes ARM64: dts: meson-gxl: fix GPIO include ARM: dts: imx6: Disable "weim" node in the dtsi files ARM: dts: qcom: apq8064: Add missing scm clock ARM: davinci: da8xx: Fix sleeping function called from invalid context ARM: davinci: Make __clk_{enable,disable} functions public ARM: davinci: da850: don't add emac clock to lookup table twice ARM: davinci: da850: fix infinite loop in clk_set_rate() ARM: i.MX: remove map_io callback ARM: dts: vf610-zii-dev-rev-b: Add missing newline ARM: dts: imx6qdl-nitrogen6x: remove duplicate iomux entry ARM: dts: imx31: fix AVIC base address ARM: dts: am572x-idk: Add gpios property to control PCIE_RESETn arm64: dts: vexpress: Support GICC_DIR operations ARM: dts: vexpress: Support GICC_DIR operations firmware: arm_scpi: fix reading sensor values on pre-1.0 SCPI firmwares arm64: dts: msm8996: Add required memory carveouts ...
2017-01-05Merge tag 'rxrpc-rewrite-20170105' of ↵David S. Miller1-26/+492
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Update tracing and proc interfaces This set of patches fixes and extends tracing: (1) Fix the handling of enum-to-string translations so that external tracing tools can make use of it by using TRACE_DEFINE_ENUM. (2) Extend a couple of tracepoints to export some extra available information and add three new tracepoints to allow monitoring of received DATA packets, call disconnection and improper/implicit call termination. and adds a bit more procfs-exported information: (3) Show a call's hard-ACK cursors in /proc/net/rxrpc_calls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller7-109/+6
2017-01-05asm-prototypes: Clear any CPP defines before declaring the functionsMichal Marek1-0/+6
The asm-prototypes.h file is used to provide dummy function declarations for genksyms, when processing asm files with EXPORT_SYMBOL. Make sure that any architecture defines get out of our way. x86 currently has an issue with memcpy on 64bit with CONFIG_KMEMCHECK=y and with memset/__memset on 32bit: $ cat init/test.c #include <asm/asm-prototypes.h> $ make -s init/test.o In file included from ./arch/x86/include/asm/string.h:4:0, from ./include/linux/string.h:18, from ./include/linux/bitmap.h:8, from ./include/linux/cpumask.h:11, from ./arch/x86/include/asm/cpumask.h:4, from ./arch/x86/include/asm/msr.h:10, from ./arch/x86/include/asm/processor.h:20, from ./arch/x86/include/asm/cpufeature.h:4, from ./arch/x86/include/asm/thread_info.h:52, from ./include/linux/thread_info.h:25, from ./arch/x86/include/asm/preempt.h:6, from ./include/linux/preempt.h:59, from ./include/linux/spinlock.h:50, from ./include/linux/seqlock.h:35, from ./include/linux/time.h:5, from ./include/uapi/linux/timex.h:56, from ./include/linux/timex.h:56, from ./include/linux/sched.h:19, from ./include/linux/uaccess.h:4, from ./arch/x86/include/asm/asm-prototypes.h:2, from init/test.c:1: ./arch/x86/include/asm/string_64.h:52:47: error: expected declaration specifiers or ‘...’ before ‘(’ token #define memcpy(dst, src, len) __inline_memcpy((dst), (src), (len)) ./include/asm-generic/asm-prototypes.h:6:14: note: in expansion of macro ‘memcpy’ extern void *memcpy(void *, const void *, __kernel_size_t); ^ ... During real build, this manifests itself by genksyms segfaulting. Fixes: 334bb7738764 ("x86/kbuild: enable modversions for symbols exported from asm") Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Cc: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Michal Marek <mmarek@suse.com>
2017-01-05rxrpc: Add some more tracingDavid Howells1-5/+89
Add the following extra tracing information: (1) Modify the rxrpc_transmit tracepoint to record the Tx window size as this is varied by the slow-start algorithm. (2) Modify the rxrpc_rx_ack tracepoint to record more information from received ACK packets. (3) Add an rxrpc_rx_data tracepoint to record the information in DATA packets. (4) Add an rxrpc_disconnect_call tracepoint to record call disconnection, including the reason the call was disconnected. (5) Add an rxrpc_improper_term tracepoint to record implicit termination of a call by a client either by starting a new call on a particular connection channel without first transmitting the final ACK for the previous call. Signed-off-by: David Howells <dhowells@redhat.com>
2017-01-05rxrpc: Fix handling of enums-to-string translation in tracingDavid Howells1-21/+403
Fix the way enum values are translated into strings in AF_RXRPC tracepoints. The problem with just doing a lookup in a normal flat array of strings or chars is that external tracing infrastructure can't find it. Rather, TRACE_DEFINE_ENUM must be used. Also sort the enums and string tables to make it easier to keep them in order so that a future patch to __print_symbolic() can be optimised to try a direct lookup into the table first before iterating over it. A couple of _proto() macro calls are removed because they refered to tables that got moved to the tracing infrastructure. The relevant data can be found by way of tracing. Signed-off-by: David Howells <dhowells@redhat.com>
2017-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds4-99/+2
Pull networking fixes from David Miller: 1) stmmac_drv_probe() can race with stmmac_open() because we register the netdevice too early. Fix from Florian Fainelli. 2) UFO handling in __ip6_append_data() and ip6_finish_output() use different tests for deciding whether a frame will be fragmented or not, put them in sync. Fix from Zheng Li. 3) The rtnetlink getstats handlers need to validate that the netlink request is large enough, fix from Mathias Krause. 4) Use after free in mlx4 driver, from Jack Morgenstein. 5) Fix setting of garbage UID value in sockets during setattr() calls, from Eric Biggers. 6) Packet drop_monitor doesn't format the netlink messages properly such that nlmsg_next fails to work, fix from Reiter Wolfgang. 7) Fix handling of wildcard addresses in l2tp lookups, from Guillaume Nault. 8) __skb_flow_dissect() can crash on pptp packets, from Ian Kumlien. 9) IGMP code doesn't reset group query timers properly, from Michal Tesar. 10) Fix overzealous MAIN/LOCAL route table combining in ipv4, from Alexander Duyck. 11) vxlan offload check needs to be more strict in be2net driver, from Sabrina Dubroca. 12) Moving l3mdev to packet hooks lost RX stat counters unintentionally, fix from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) sh_eth: enable RX descriptor word 0 shift on SH7734 sfc: don't report RX hash keys to ethtool when RSS wasn't enabled dpaa_eth: Initialize CGR structure before init dpaa_eth: cleanup after init_phy() failure net: systemport: Pad packet before inserting TSB net: systemport: Utilize skb_put_padto() LiquidIO VF: s/select/imply/ for PTP_1588_CLOCK libcxgb: fix error check for ip6_route_output() net: usb: asix_devices: add .reset_resume for USB PM net: vrf: Add missing Rx counters drop_monitor: consider inserted data in genlmsg_end benet: stricter vxlan offloading check in be_features_check ipv4: Do not allow MAIN to be alias for new LOCAL w/ custom rules net: macb: Updated resource allocation function calls to new version of API. net: stmmac: dwmac-oxnas: use generic pm implementation net: stmmac: dwmac-oxnas: fix fixed-link-phydev leaks net: stmmac: dwmac-oxnas: fix of-node leak Documentation/networking: fix typo in mpls-sysctl igmp: Make igmp group member RFC 3376 compliant flow_dissector: Update pptp handling to avoid null pointer deref. ...
2017-01-05dsa: mv88e6xxx: Optimise atu_getAndrew Lunn1-0/+60
Lookup in the ATU can be performed starting from a given MAC address. This is faster than starting with the first possible MAC address and iterating all entries. Entries are returned in numeric order. So if the MAC address returned is bigger than what we are searching for, we know it is not in the ATU. Using the benchmark provided by Volodymyr Bendiuga <volodymyr.bendiuga@gmail.com>, https://www.spinics.net/lists/netdev/msg411550.html on an Marvell Armada 370 RD, the test to add a number of static fdb entries went from 1.616531 seconds to 0.312052 seconds. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>