path: root/include
Age  Commit message  Author  Files  Lines
2014-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller30-42/+192
Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds11-23/+91
Pull networking fixes from David Miller: "It looks like a sizeable collection but this is nearly 3 weeks of bug fixing while you were away. 1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute the packet through a non-IPSEC protected path and the code has to be able to handle SKBs attached to routes lacking an attached xfrm state. From Steffen Klassert. 2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported sub-protocols, also from Steffen Klassert. 3) Set local_df on fragmented netfilter skbs otherwise we won't be able to forward successfully, from Florian Westphal. 4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without holding RCU lock, from Bjorn Mork. 5) local_df test in ip_may_fragment is inverted, from Florian Westphal. 6) jme driver doesn't check for DMA mapping failures, from Neil Horman. 7) qlogic driver doesn't calculate number of TX queues properly, from Shahed Shaikh. 8) fib_info_cnt can drift irreversibly positive if we fail to allocate the fi->fib_metrics array, from Sergey Popovich. 9) Fix use after free in ip6_route_me_harder(), also from Sergey Popovich. 10) When SYSCTL is disabled, we don't handle local_port_range and ping_group_range defaults properly at all, from Cong Wang. 11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim driver, fix from Bjorn Mork. 12) cassini driver needs nested lock annotations for TX locking, from Emil Goode. 13) On init error ipv6 VTI driver can unregister pernet ops twice, oops. Fix from Mathias Krause. 14) If macvlan device is down, don't propagate IFF_ALLMULTI changes, from Peter Christensen. 15) Missing NULL pointer check while parsing netlink config options in ip6_tnl_validate(). From Susant Sahani. 16) Fix handling of neighbour entries during ipv6 router reachability probing, from Duan Jiong. 17) x86 and s390 JIT address randomization has some address calculation bugs leading to crashes, from Alexei Starovoitov and Heiko Carstens. 
18) Clear up those uglies with nop patching and net_get_random_once(), from Hannes Frederic Sowa. 19) Option length miscalculated in ip6_append_data(), fix also from Hannes Frederic Sowa. 20) A while ago we fixed a race during device unregistry when a namespace went down, turns out there is a second place that needs similar protection. From Cong Wang. 21) In the new Altera TSE driver multicast filtering isn't working, disable it and just use promisc mode until the cause is found. From Vince Bridgers. 22) When we disable router enabling in ipv6 we have to flush the cached routes explicitly, from Duan Jiong. 23) NBMA tunnels should not cache routes on the tunnel object because the key is variable, from Timo Teräs. 24) With stacked devices GRO information in skb->cb[] can be not setup properly, make sure it is in all code paths. From Eric Dumazet. 25) Really fix stacked vlan locking, multiple levels of nesting with intervening non-vlan devices are possible. From Vlad Yasevich. 26) Fallback ipip tunnel device's mtu is not setup properly, from Steffen Klassert. 27) The packet scheduler's tcindex filter can crash because we structure copy objects with list_head's inside, oops. From Cong Wang. 28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric Dumazet. 29) In some configurations 'itag' in __mkroute_input() can end up being used uninitialized because of how fib_validate_source() works. Fix it by explicitly initializing itag to zero like all the other fib_validate_source() callers do, from Li RongQing" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) batman: fix a bogus warning from batadv_is_on_batman_iface() ipv4: initialise the itag variable in __mkroute_input bonding: Send ALB learning packets using the right source bonding: Don't assume 802.1Q when sending alb learning packets. 
net: doc: Update references to skb->rxhash stmmac: Remove unbalanced clk_disable call ipv6: gro: fix CHECKSUM_COMPLETE support net_sched: fix an oops in tcindex filter can: peak_pci: prevent use after free at netdev removal ip_tunnel: Initialize the fallback device properly vlan: Fix build error wth vlan_get_encap_level() can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. bnx2x: Convert return 0 to return rc bonding: Fix alb mode to only use first level vlans. bonding: Fix stacked device detection in arp monitoring macvlan: Fix lockdep warnings with stacked macvlan devices vlan: Fix lockdep warning with stacked vlan devices. net: Allow for more then a single subclass for netif_addr_lock net: Find the nesting level of a given device by type. ...
2014-05-24net: filter: let unattached filters use sock_fprog_kernDaniel Borkmann1-1/+1
The sk_unattached_filter_create() API is used by BPF filters that are not directly attached or related to sockets, and are used in team, ptp, xt_bpf, cls_bpf, etc. As such all users do their own internal management of obtaining filter blocks and thus already have them in kernel memory and set up before calling into sk_unattached_filter_create(). As a result, due to __user annotation in sock_fprog, sparse triggers false positives (incorrect type in assignment [different address space]) when filters are set up before passing them to sk_unattached_filter_create(). Therefore, let sk_unattached_filter_create() API use sock_fprog_kern to overcome this issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24net: filter: remove DL macroDaniel Borkmann1-3/+0
Let's get rid of this macro. After commit 5bcfedf06f7f ("net: filter: simplify label names from jump-table"), labels have become more readable due to omission of BPF_ prefix but at the same time more generic, so that things like `git grep -n` would not find them. As a middle path, let's get rid of the DL macro as it's not strictly needed and would otherwise just hide the full name. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24l2tp: Add support for zero IPv6 checksumsTom Herbert1-0/+2
Added new L2TP configuration options to allow TX and RX of zero checksums in IPv6. Default is not to use them. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24net: Make enabling of zero UDP6 csums more restrictiveTom Herbert2-1/+25
RFC 6935 permits zero checksums to be used in IPv6; however, this is recommended only for certain tunnel protocols, and it does not make checksums completely optional like they are in IPv4. This patch restricts the use of IPv6 zero checksums that was previously introduced. no_check6_tx and no_check6_rx have been added to control the use of checksums in UDP6 RX and TX path. The normal sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when dealing with a dual stack socket). A helper function has been added (udp_set_no_check6) which can be called by tunnel implementations to allow zero checksums (send on the socket, and accept them as valid). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
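The opt-in behaviour described above can be illustrated with a small userspace sketch. It is not the kernel code: the struct and the `sketch_` functions are hypothetical stand-ins, and only the field names no_check6_tx and no_check6_rx come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket flags named in the commit. */
struct sketch_udp6_flags {
	bool no_check6_tx; /* may send UDP6 datagrams with a zero checksum */
	bool no_check6_rx; /* may accept UDP6 datagrams with a zero checksum */
};

/* Models the described helper: a tunnel opts in (or out) in both directions. */
static void sketch_set_no_check6(struct sketch_udp6_flags *f, bool val)
{
	f->no_check6_tx = val;
	f->no_check6_rx = val;
}

/*
 * A zero checksum on the wire is acceptable only if the socket opted in.
 * Non-zero checksums go through normal verification, which is not modeled.
 */
static bool sketch_rx_zero_csum_ok(const struct sketch_udp6_flags *f,
				   uint16_t wire_csum)
{
	if (wire_csum != 0)
		return true;
	return f->no_check6_rx;
}
```

The default (both flags false) keeps IPv6 checksums mandatory, which matches the commit's intent that zero checksums are an explicit choice made per socket by a tunnel implementation.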
2014-05-24net: Split sk_no_check into sk_no_check_{rx,tx}Tom Herbert2-11/+4
Define separate fields in the sock structure for configuring disabling checksums in both TX and RX-- sk_no_check_tx and sk_no_check_rx. The SO_NO_CHECK socket option only affects sk_no_check_tx. Also, removed UDP_CSUM_* defines since they are no longer necessary. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24net: Eliminate no_check from protoswTom Herbert1-1/+0
It doesn't seem like any protocols are setting anything other than the default, and allowing to arbitrarily disable checksums for a whole protocol seems dangerous. This can be done on a per socket basis. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate ↵Sucheta Chakraborty3-5/+15
through ip tool. o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed to have a bandwidth of at least this value. max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth of up to this value. o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced which takes 4 arguments: netdev, VF number, min_tx_rate, max_tx_rate o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler. o Drivers that currently implement ndo_set_vf_tx_rate should now call ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet implemented by driver. o If user enters only one of either min_tx_rate or max_tx_rate, then, userland should read back the other value from driver and set both for IFLA_VF_RATE. Drivers that have not yet implemented IFLA_VF_RATE should always return min_tx_rate as 0 when read from ip tool. o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then IFLA_VF_RATE should override. o Idea is to have consistent display of rate values to user. 
o Usage example: - ./ip link set p4p1 vf 0 rate 900 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
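The precedence rules in this message (unspecified values are read back, and max_tx_rate overrides the legacy rate when both are given) can be sketched as follows. This is an illustrative userspace model under stated assumptions: the structs and `sketch_vf_rate_resolve` are hypothetical names, not part of the kernel or the ip tool.

```c
#include <stdbool.h>

/* Hypothetical model of a VF's current limits, in Mbps. */
struct sketch_vf_rate {
	unsigned int min_tx_rate;
	unsigned int max_tx_rate;
};

/* Hypothetical model of what the user typed on the command line. */
struct sketch_vf_request {
	bool has_rate, has_min, has_max;
	unsigned int rate, min, max;
};

/* Resolve a request against the current settings. */
static struct sketch_vf_rate
sketch_vf_rate_resolve(struct sketch_vf_rate cur, struct sketch_vf_request req)
{
	struct sketch_vf_rate next = cur; /* unspecified values are read back */

	if (req.has_rate)
		next.max_tx_rate = req.rate; /* legacy "rate" acts as a cap */
	if (req.has_max)
		next.max_tx_rate = req.max;  /* the new attribute overrides it */
	if (req.has_min)
		next.min_tx_rate = req.min;
	return next;
}
```

Feeding the three commands from the usage example through this model gives max 900, then min 200 and max 300, then min 200 and max 600, which agrees with the `ip link show` output quoted in the message.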
2014-05-23Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "The biggest commit is an irqtime accounting loop latency fix, the rest are misc fixes all over the place: deadline scheduling, docs, numa, balancer and a bad to-idle latency fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/numa: Initialize newidle balance stats in sd_numa_init() sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance() sched: Skip double execution of pick_next_task_fair() sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check sched/deadline: Fix memory leak sched/deadline: Fix sched_yield() behavior sched: Sanitize irq accounting madness sched/docbook: Fix 'make htmldocs' warnings caused by missing description
2014-05-23Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "The biggest changes are fixes for races that kept triggering Trinity crashes, plus liblockdep build fixes and smaller misc fixes. The liblockdep bits in perf/urgent are a pull mistake - they should have been in locking/urgent - but by the time I noticed other commits were added and testing was done :-/ Sorry about that" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix a race between ring_buffer_detach() and ring_buffer_attach() perf: Prevent false warning in perf_swevent_add perf: Limit perf_event_attr::sample_period to 63 bits tools/liblockdep: Remove all build files when doing make clean tools/liblockdep: Build liblockdep from tools/Makefile perf/x86/intel: Fix Silvermont's event constraints perf: Fix perf_event_init_context() perf: Fix race in removing an event
2014-05-23wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STRMasatake YAMATO1-1/+1
In commit ad86622b478e ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide EXIT_TRACE from user-space") the order of task state definitions was changed: EXIT_DEAD and EXIT_ZOMBIE were swapped. However, the characters for the states in the TASK_STATE_TO_CHAR_STR string were not updated. This patch synchronizes the string to the order of definitions. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
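To see why the string has to follow the bit order, here is a small userspace sketch of a state-to-character lookup. The bit values and the string contents are assumptions based on my reading of include/linux/sched.h from that period and should be checked against the tree; the `sketch_` names are hypothetical.

```c
#include <stddef.h>

/* Assumed bit values after commit ad86622b478e. */
#define SKETCH_EXIT_DEAD   16u
#define SKETCH_EXIT_ZOMBIE 32u

/* Assumed string after this patch: X (dead) now precedes Z (zombie). */
static const char sketch_state_chars[] = "RSDTtXZxKWP";

/*
 * State 0 maps to index 0; otherwise the lowest set bit n maps to
 * index n + 1, so the string must list characters in bit order.
 */
static char sketch_state_to_char(unsigned int state)
{
	size_t idx = 0;

	if (state != 0) {
		idx = 1;
		while ((state & 1u) == 0) {
			state >>= 1;
			idx++;
		}
	}
	return idx < sizeof(sketch_state_chars) - 1 ?
	       sketch_state_chars[idx] : '?';
}
```

With the unpatched string the two exit states would have reported each other's letters, which is the mismatch the commit corrects.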
2014-05-23openvswitch: Avoid assigning a NULL pointer to flow actions.Jarno Rajahalme1-1/+3
Flow SET can accept an empty set of actions, with the intended semantics of leaving existing actions unmodified. This seems to have been broken after OVS 1.7, as we have assigned the flow's actions pointer to NULL in this case, but we never check for the NULL pointer later on. This patch restores the intended behavior and documents it in the include/linux/openvswitch.h. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-05-23net/mlx4_core: Deprecate use_prio module parameterAmir Vadai1-1/+0
use_prio was added as part of an infrastructure for running FCoE in A0 mode. FCoE didn't get into Mellanox Upstream driver, and when it will, it won't be using A0 steering mode. Therefore we can safely deprecate this module parameter without hurting any existing user. CC: Carol Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23Merge branch 'master' of ↵David S. Miller1-35/+22
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-05-22 This is the last ipsec pull request before I leave for a three weeks vacation tomorrow. David, can you please take urgent ipsec patches directly into net/net-next during this time? I'll continue to run the ipsec/ipsec-next trees as soon as I'm back. 1) Simplify the xfrm audit handling, from Tetsuo Handa. 2) Codingstyle cleanup for xfrm_output, from Fabian Frederick. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22of: mdio: remove of_phy_connect_fixed_linkFlorian Fainelli1-10/+0
All in-tree drivers have been converted to use the new pair of functions: of_is_fixed_phy_link() plus of_phy_register_fixed_link(), we can now safely remove of_phy_connect_fixed_link. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22vlan: more careful checksum features handlingMichal Kubeček1-0/+14
When combining real_dev's features and vlan_features, simple bitwise AND is used. This doesn't work well for checksum offloading features as if one set has NETIF_F_HW_CSUM and the other NETIF_F_IP_CSUM and/or NETIF_F_IPV6_CSUM, we end up with no checksum offloading. However, from the logical point of view (how can_checksum_protocol() works), NETIF_F_HW_CSUM contains the functionality of NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM so that the result should be IP/IPV6. Add helper function netdev_intersect_features() implementing this logic and use it in vlan_dev_fix_features(). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
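The intersection rule described here can be sketched in plain C. The bit positions below are made up for illustration (the real NETIF_F_* values are defined in the kernel headers), and `sketch_intersect_features` is a hypothetical name that only mirrors the idea behind netdev_intersect_features().

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define SKETCH_F_IP_CSUM   (1u << 0)
#define SKETCH_F_HW_CSUM   (1u << 1)
#define SKETCH_F_IPV6_CSUM (1u << 2)
#define SKETCH_F_SG        (1u << 3)

/*
 * Generic hardware checksumming implies the IPv4- and IPv6-specific
 * variants, so widen each set before taking the bitwise AND.
 */
static uint32_t sketch_intersect_features(uint32_t f1, uint32_t f2)
{
	if (f1 & SKETCH_F_HW_CSUM)
		f1 |= SKETCH_F_IP_CSUM | SKETCH_F_IPV6_CSUM;
	if (f2 & SKETCH_F_HW_CSUM)
		f2 |= SKETCH_F_IP_CSUM | SKETCH_F_IPV6_CSUM;
	return f1 & f2;
}
```

For one set with HW_CSUM and another with only IP_CSUM, a plain AND yields no checksum offload at all, while the widened intersection keeps IP_CSUM, which is the result the commit message argues for.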
2014-05-22net: Add a software TSO helper APIEzequiel Garcia1-0/+20
Although the implementation probably needs a lot of work, this initial API allows to implement software TSO in mvneta and mv643xx_eth drivers in a not so intrusive way. Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftablesDavid S. Miller3-11/+192
Pablo Neira Ayuso says: ==================== Netfilter/nftables updates for net-next The following patchset contains Netfilter/nftables updates for net-next, most relevantly they are: 1) Add set element update notification via netlink, from Arturo Borrero. 2) Put all object updates in one single message batch that is sent to kernel-space. Before this patch only rules were included in the batch. This series also introduces the generic transaction infrastructure so updates to all objects (tables, chains, rules and sets) are applied in an all-or-nothing fashion, these series from me. 3) Defer release of objects via call_rcu to reduce the time required to commit changes. The assumption is that all objects are destroyed in reverse order to ensure that dependencies between them are fulfilled (ie. rules and sets are destroyed first, then chains, and finally tables). 4) Allow to match by bridge port name, from Tomasz Bursztyka. This series includes two patches to prepare this new feature. 5) Implement the proper set selection based on the characteristics of the data. The new infrastructure also allows you to specify your preferences in terms of memory and computational complexity so the underlying set type is also selected according to your needs, from Patrick McHardy. 6) Several cleanup patches for nft expressions, including one minor possible compilation breakage due to missing mark support, also from Patrick. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22tcp: make cwnd-limited checks measurement-based, and gentlerNeal Cardwell2-5/+12
Experience with the recent e114a710aa50 ("tcp: fix cwnd limited checking to improve congestion control") has shown that there are common cases where that commit can cause cwnd to be much larger than necessary. This leads to TSO autosizing cooking skbs that are too large, among other things. The main problems seemed to be: (1) That commit attempted to predict the future behavior of the connection by looking at the write queue (if TSO or TSQ limit sending). That prediction sometimes overestimated future outstanding packets. (2) That commit always allowed cwnd to grow to twice the number of outstanding packets (even in congestion avoidance, where this is not needed). This commit improves both of these, by: (1) Switching to a measurement-based approach where we explicitly track the largest number of packets in flight during the past window ("max_packets_out"), and remember whether we were cwnd-limited at the moment we finished sending that flight. (2) Only allowing cwnd to grow to twice the number of outstanding packets ("max_packets_out") in slow start. In congestion avoidance mode we now only allow cwnd to grow if it was fully utilized. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
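The two rules in this commit message can be sketched as a standalone predicate. This is a simplified userspace model, not the kernel implementation: the struct is a hypothetical reduction of the TCP socket state, and the per-window reset of the measurement is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of the per-connection state used by the check. */
struct sketch_tcp {
	uint32_t snd_cwnd;        /* congestion window, in packets */
	uint32_t snd_ssthresh;    /* slow start threshold */
	uint32_t packets_out;     /* packets currently in flight */
	uint32_t max_packets_out; /* largest flight seen in this window */
	bool is_cwnd_limited;     /* were we cwnd-limited at that flight? */
};

/* Record a measurement after sending; window reset is not modeled. */
static void sketch_cwnd_record(struct sketch_tcp *tp, bool cwnd_limited)
{
	if (tp->max_packets_out == 0 ||
	    tp->packets_out > tp->max_packets_out) {
		tp->max_packets_out = tp->packets_out;
		tp->is_cwnd_limited = cwnd_limited;
	}
}

/* May cwnd grow? Slow start allows up to twice the measured flight. */
static bool sketch_is_cwnd_limited(const struct sketch_tcp *tp)
{
	if (tp->snd_cwnd <= tp->snd_ssthresh)
		return tp->snd_cwnd < 2 * tp->max_packets_out;
	return tp->is_cwnd_limited;
}
```

The point of the design is that both branches rely on what was actually observed in flight rather than on a prediction from the write queue.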
2014-05-22net: filter: cleanup invocation of internal BPFAlexei Starovoitov1-4/+2
Kernel API for classic BPF socket filters is:

sk_unattached_filter_create() - validate classic BPF, convert, JIT
SK_RUN_FILTER() - run it
sk_unattached_filter_destroy() - destroy socket filter

Cleanup internal BPF kernel API as following:

sk_filter_select_runtime() - final step of internal BPF creation. Try to JIT internal BPF program, if JIT is not available select interpreter
SK_RUN_FILTER() - run it
sk_filter_free() - free internal BPF program

Disallow direct calls to BPF interpreter. Execution of the BPF program should be done with SK_RUN_FILTER() macro.

Example of internal BPF create, run, destroy:

  struct sk_filter *fp;

  fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL);
  memcpy(fp->insni, prog, prog_len * sizeof(fp->insni[0]));
  fp->len = prog_len;

  sk_filter_select_runtime(fp);

  SK_RUN_FILTER(fp, ctx);

  sk_filter_free(fp);

Sockets, seccomp, testsuite, tracing are using different ways to populate sk_filter, so first steps of program creation are not common. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22Merge branch 'renameat2' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull renameat2 arch support from Miklos Szeredi: "I've collected architecture patches for the renameat2 syscall that maintainers acked and/or asked me to queue. This adds architecture support for the renameat2 syscall to m68k, parisc, ia64 and through asm-generic to arc, arm64, c6x, hexagon, metag, openrisc, score, tile, unicore32" * 'renameat2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: scripts/checksyscalls.sh: Make renameat optional asm-generic: Add renameat2 syscall ia64: add renameat2 syscall parisc: add renameat2 syscall m68k: add renameat2 syscall
2014-05-21Merge tag 'driver-core-3.15-rc6' of ↵Linus Torvalds1-1/+18
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve some reported issues and a regression from 3.13" * tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: sysfs: make sure read buffer is zeroed kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs
2014-05-21Merge branch 'for-3.15-fixes' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull more cgroup fixes from Tejun Heo: "Three more patches to fix cgroup_freezer breakage due to the recent cgroup internal locking changes - an operation cgroup_freezer was using now requires sleepable context and cgroup_freezer was invoking that while holding a spin lock. cgroup_freezer was using an overly elaborate hierarchical locking scheme. While it's possible to convert the hierarchical spinlocks directly to mutexes, this patch simplifies the overall locking so that it uses a global mutex. This has the added benefit of avoiding iterating potentially huge number of tasks under a spinlock. While the patch is on the larger side in the devel cycle, the changes made are mostly straight-forward and the locking logic is a lot simpler afterwards" * 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix rcu_read_lock() leak in update_if_frozen() cgroup_freezer: replace freezer->lock with freezer_mutex cgroup: introduce task_css_is_root()
2014-05-21Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linuxLinus Torvalds1-1/+1
Pull device tree fixes from Grant Likely: "Drivercore bugfixes for v3.15 This branch contains bug fixes important to get into v3.15. There is a fix for modifying properties seen during early boot, a fix for an incorrect prototype when CONFIG_OF=n, and a couple of corrections to device tree memory nodes on a few platforms" * tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: mips: dts: Fix missing device_type="memory" property in memory nodes arm: dts: Fix missing device_type="memory" for ste-ccu8540 of: fix CONFIG_OF=n prototype of of_node_full_name() of: make of_update_property() usable earlier in the boot process
2014-05-21Merge tag 'linux-can-next-for-3.16-20140519' of ↵David S. Miller15-42/+59
git://gitorious.org/linux-can/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2014-05-19 this is a pull request of 13 patches for net-next/master. A patch by Dan Carpenter fixes a coccinelle warning in the mcp251x driver. Jean Delvare contributes three patches to tightening the Kconfig dependencies for some drivers. Then come three patches by Pavel Machek that improve the c_can driver support on the socfpga platform. Sergei Shtylyov's patch brings support for the CAN hardware found on Renesas R-Car CAN controllers. Four patches by Oliver Hartkopp, the first cleans up the guard macros in the CAN headers the other three improve the EFF frame filtering. Maximilian Schneider's patch adds support for the GS_USB CAN devices. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-21net: cdc_ncm: fix 64bit division build errorBjørn Mork1-2/+2
The upper timer_interval limit is arbitrary and much higher than anything usable in the real world. Reducing it from 15s to ~4s to make the timer_interval fit in an u32 does not make much difference. The limit is still outside the practical bounds. This eliminates the need for a 64bit timer_interval, fixing a build error related to 64bit division: drivers/built-in.o: In function `cdc_ncm_get_coalesce': ak8975.c:(.text+0x1ac994): undefined reference to `__aeabi_uldivmod' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
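The arithmetic behind the new limit can be checked with a short sketch. It assumes the interval is configured in microseconds and stored in nanoseconds, which is my reading of where the "~4s" figure comes from; the names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_USEC 1000u

/* True if an interval given in microseconds fits in a u32 of nanoseconds. */
static bool sketch_interval_fits_u32(uint64_t usec)
{
	return usec * SKETCH_NSEC_PER_USEC <= UINT32_MAX;
}

/* Largest microsecond value that still fits: 4294967 us, about 4.29 s. */
static uint32_t sketch_interval_max_usec(void)
{
	return UINT32_MAX / SKETCH_NSEC_PER_USEC;
}
```

Under that assumption the old 15 s limit needs 64-bit storage (15,000,000,000 ns), so any division on it pulls in a 64-bit divide helper on 32-bit ARM, while everything up to about 4.29 s stays within 32 bits.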
2014-05-20vlan: Fix build error wth vlan_get_encap_level()Vlad Yasevich1-5/+11
The new function vlan_get_encap_level() uses vlan_dev_priv() which is only conditionally available when VLAN support is enabled. Make vlan_get_encap_level() conditionally available as well. Fixes: 44a4085538c8 ("bonding: Fix stacked device detection in arp monitoring") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> CC: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-20asm-generic: Add renameat2 syscallJames Hogan1-1/+3
Add the renameat2 syscall to the generic syscall list, which is used by the following architectures: arc, arm64, c6x, hexagon, metag, openrisc, score, tile, unicore32. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: linux-arch@vger.kernel.org Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: linux-hexagon@vger.kernel.org Cc: linux-metag@vger.kernel.org Cc: Jonas Bonn <jonas@southpole.se> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
2014-05-20Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds1-0/+12
Pull MIPS fixes from Ralf Baechle: "MIPS fixes for various loose ends: - Fix workarounds for R4000 erratum. - Patch up DEC, Siemens-Nixdorf and Loongson hardware support. - Wire up renameat2 syscall. - Delete unused file - it was causing false warnings from maintenance scripts. - Revert a patch because its functionality is now implemented twice which causes superfluous /proc/cpuinfo output. - Fix a microMIPS regression" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: mm: Fix broken microMIPS kernel regression. MIPS: Add new AUDIT_ARCH token for the N32 ABI on MIPS64 MIPS: Wire up renameat2 syscall. MIPS: inst.h: Rename BITFIELD_FIELD to __BITFIELD_FIELD. MIPS: Remove file missed when removing rm9k support a while ago. MIPS/loongson2_cpufreq: Fix CPU clock rate setting MIPS: Loongson: No need to select GENERIC_HARDIRQS_NO__DO_IRQ MIPS: csum_partial.S CPU_DADDI_WORKAROUNDS bug fix MIPS: __strncpy_from_user_asm CPU_DADDI_WORKAROUNDS bug fix MIPS: __delay CPU_DADDI_WORKAROUNDS bug fix MIPS: DEC/SNI: O32 wrapper stack switching fixes MIPS: DEC: Bus error handler <asm/cpu-type.h> fixes MAINTAINERS: TURBOchannel: Update entry Revert "MIPS: MT: proc: Add support for printing VPE and TC ids"
2014-05-20Merge tag 'metag-for-v3.15-2' of ↵Linus Torvalds2-8/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull Metag architecture and related fixes from James Hogan: "Mostly fixes for metag and parisc relating to upgrowing stacks. - Fix missing compiler barriers in metag memory barriers. - Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased beyond safe value. - Make maximum stack size configurable. This reduces the default user stack size back to 80MB (especially on parisc after their removal of _STK_LIM_MAX override). This only affects metag and parisc. - Remove metag _STK_LIM_MAX override to match other arches and follow parisc, now that it is safe to do so (due to the BUG_ON fix mentioned above). - Finally now that both metag and parisc _STK_LIM_MAX overrides have been removed, it makes sense to remove _STK_LIM_MAX altogether" * tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: asm-generic: remove _STK_LIM_MAX metag: Remove _STK_LIM_MAX override parisc,metag: Do not hardcode maximum userspace stack size metag: Reduce maximum stack size to 256MB metag: fix memory barriers
2014-05-20Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two small updates from the irq department: - Provide missing inline stub for a SMP only function - Add sub-maintainer for the drivers/irqchip/ part of the irq subsystem. YAY!" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Add co-maintainer for drivers/irqchip genirq: Provide irq_force_affinity fallback for non-SMP
2014-05-19perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()Peter Zijlstra1-0/+2
Alexander noticed that we use RCU iteration on rb->event_list but do not use list_{add,del}_rcu() to add or remove entries on that list, nor do we observe proper grace periods when re-using the entries. Merge ring_buffer_detach() into ring_buffer_attach() such that attaching to the NULL buffer is detaching. Furthermore, ensure that between any 'detach' and 'attach' of the same event we observe the required grace period, but only when strictly required. In effect this means that only ioctl(.request = PERF_EVENT_IOC_SET_OUTPUT) will wait for a grace period, while the normal initial attach and final detach will not be delayed. This patch should, I think, do the right thing under all circumstances; the 'normal' cases should never see the extra grace period, but there are two cases that do: 1) PERF_EVENT_IOC_SET_OUTPUT on an event which already has a ring_buffer set will now observe the required grace period between removing itself from the old buffer and attaching itself to the new one. This case is 'simple' in that both buffers are present in perf_event_set_output(); one could think an unconditional synchronize_rcu() would be sufficient; however... 2) an event that has a buffer attached, the buffer is destroyed (munmap) and then the event is attached to a new/different buffer using PERF_EVENT_IOC_SET_OUTPUT. This case is more complex because the buffer destruction does: ring_buffer_attach(.rb = NULL) followed by the ioctl() doing: ring_buffer_attach(.rb = foo); and we still need to observe the grace period between these two calls due to us reusing the event->rb_entry list_head. In order to make 2 happen we use Paul's latest cond_synchronize_rcu() call. Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140507123526.GD13658@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
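The conditional grace period described in the perf commit above can be pictured with a small userspace model. This is only a sketch, not the kernel code: every `toy_` name is made up for illustration, the grace-period counter stands in for the real RCU machinery, and the `rcu_batches`/`rcu_pending` fields are assumed bookkeeping. It shows the idea that a detach records a cookie and a later attach blocks only if no grace period has elapsed since.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the RCU primitives; NOT the kernel implementation. */
static unsigned long toy_gp_completed; /* grace periods that have elapsed */
static unsigned long toy_gp_waits;     /* times a caller had to block */

/* Take a cookie describing the current grace-period state. */
static unsigned long toy_get_state_synchronize_rcu(void)
{
    return toy_gp_completed;
}

/* Model some other activity letting a grace period elapse. */
static void toy_grace_period_elapses(void)
{
    toy_gp_completed++;
}

/* Block only if no grace period has elapsed since the cookie was taken. */
static void toy_cond_synchronize_rcu(unsigned long cookie)
{
    if (toy_gp_completed == cookie) {
        toy_gp_waits++;     /* the caller would sleep here */
        toy_gp_completed++; /* ...until one grace period has passed */
    }
}

/* Hypothetical event; the field names are assumptions for this sketch. */
struct toy_event {
    const void *rb;            /* currently attached buffer, or NULL */
    unsigned long rcu_batches; /* cookie taken at the last detach */
    bool rcu_pending;          /* a detach happened, no attach since */
};

/* Attaching NULL is detaching; attaching non-NULL may need a grace period. */
static void toy_ring_buffer_attach(struct toy_event *ev, const void *rb)
{
    assert(ev != NULL);

    if (ev->rb) {
        /* Detach from the old buffer and remember when that happened. */
        ev->rcu_batches = toy_get_state_synchronize_rcu();
        ev->rcu_pending = true;
    }

    if (ev->rcu_pending && rb) {
        /* The list entry is about to be reused: wait only if needed. */
        toy_cond_synchronize_rcu(ev->rcu_batches);
        ev->rcu_pending = false;
    }

    ev->rb = rb;
}
```

In this model the initial attach and the final detach never block. A buffer switch on an event that already has a buffer blocks once. A detach followed later by an attach blocks only if nothing else let a grace period pass in between.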
2014-05-19netfilter: nf_tables: defer all object release via rcuPablo Neira Ayuso1-0/+2
Now that all objects are released in the reverse order via the transaction infrastructure, we can enqueue the release via call_rcu to save one synchronize_rcu. For small rule-sets loaded via nft -f, it now takes around 50ms less here. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: remove skb and nlh from context structurePablo Neira Ayuso1-4/+6
Instead of caching the original skbuff that contains the netlink messages, this stores the netlink message sequence number, the netlink portID and the report flag. This helps to prepare the introduction of the object release via call_rcu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle elementsPablo Neira Ayuso1-0/+10
Leave the set content in consistent state if we fail to load the batch. Use the new generic transaction infrastructure to achieve this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle tablePablo Neira Ayuso1-0/+10
This patch speeds up rule-set updates and it also provides a way to revert updates and leave things in consistent state in case that the batch needs to be aborted. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle chainPablo Neira Ayuso1-0/+17
This patch speeds up rule-set updates and it also introduces a way to revert chain updates if the batch is aborted. The idea is to store the changes in the transaction to apply that in the commit step. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
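The "store the change, apply it at commit" idea from the chain commit above can be sketched in a few lines of plain C. This is a toy model under stated assumptions: the `toy_` types and functions are hypothetical and do not mirror the nf_tables structures; the only point is that staging leaves the live object untouched until commit, so abort is trivial.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical live object; only one field is modelled. */
struct toy_chain {
    int policy;
};

/* Hypothetical staged update kept in the transaction. */
struct toy_chain_update {
    struct toy_chain *chain; /* object the update refers to */
    int new_policy;          /* value to apply at commit time */
    bool pending;            /* true until committed or aborted */
};

/* Record the change without touching the live chain. */
static void toy_stage_policy(struct toy_chain_update *u,
                             struct toy_chain *chain, int new_policy)
{
    assert(u != NULL && chain != NULL);
    u->chain = chain;
    u->new_policy = new_policy;
    u->pending = true;
}

/* Commit step: apply the staged value to the live chain. */
static void toy_commit(struct toy_chain_update *u)
{
    if (u->pending) {
        u->chain->policy = u->new_policy;
        u->pending = false;
    }
}

/* Abort step: drop the staged value; the live chain was never modified. */
static void toy_abort(struct toy_chain_update *u)
{
    u->pending = false;
}
```

Because nothing is written to the chain before commit, an aborted batch needs no undo logic in this model.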
2014-05-19netfilter: nf_tables: use new transaction infrastructure to handle setsPablo Neira Ayuso2-0/+18
This patch reworks the nf_tables API so set updates are included in the same batch that contains rule updates. This speeds up rule-set updates since we skip a dialog of four messages between kernel and user-space (two in each direction), from: 1) create the set and send netlink message to the kernel 2) process the response from the kernel that contains the allocated name. 3) add the set elements and send netlink message to the kernel. 4) process the response from the kernel (to check for errors). To: 1) add the set to the batch. 2) add the set elements to the batch. 3) add the rule that points to the set. 4) send batch to the kernel. This also introduces an internal set ID (NFTA_SET_ID) that is unique in the batch so set elements and rules can refer to new sets. Backward compatibility has only been retained in userspace; this means that new nft versions can talk to the kernel both in the new and the old fashion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: add message type to transactionsPablo Neira Ayuso1-0/+2
The patch adds message type to the transaction to simplify the commit and the abort routines. Yet another step forward in the generalisation of the transaction infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19netfilter: nf_tables: generalise transaction infrastructurePablo Neira Ayuso1-4/+11
This patch generalises the existing rule transaction infrastructure so it can be used to handle set, table and chain object transactions as well. The transaction provides a data area that stores private information depending on the transaction type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
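A transaction record with a type-dependent private data area, as described in the commit above, can be sketched with a C flexible array member. The names below (`toy_trans`, `toy_trans_chain`, and so on) are hypothetical and chosen for illustration; the real kernel structure is not reproduced here. Copies go through `memcpy` so the sketch does not depend on the alignment of the data area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transaction types. */
enum toy_trans_type { TOY_TRANS_TABLE, TOY_TRANS_CHAIN, TOY_TRANS_SET };

/* Generic transaction header followed by a private data area. */
struct toy_trans {
    struct toy_trans *next;   /* link in the pending-transaction list */
    enum toy_trans_type type; /* tells how to interpret data[] */
    size_t size;              /* size of the private area in bytes */
    unsigned char data[];     /* private, type-specific payload */
};

/* Example private payloads; the fields are illustrative only. */
struct toy_trans_chain { char name[16]; int policy; };
struct toy_trans_set   { unsigned int set_id; };

/* Allocate a transaction and copy the private payload into its data area. */
static struct toy_trans *toy_trans_alloc(enum toy_trans_type type,
                                         const void *priv, size_t size)
{
    struct toy_trans *t = calloc(1, sizeof(*t) + size);

    if (t == NULL)
        return NULL;
    t->type = type;
    t->size = size;
    if (size > 0)
        memcpy(t->data, priv, size);
    return t;
}

/* Copy the private payload back out; the caller must know the type. */
static void toy_trans_read(const struct toy_trans *t, void *out, size_t size)
{
    assert(t != NULL && size == t->size);
    memcpy(out, t->data, size);
}
```

A single list of `toy_trans` records can then carry table, chain and set changes side by side, with each handler reading only the payload that matches `type`.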
2014-05-19netfilter: nf_tables: deconstify table and chain in context structurePablo Neira Ayuso1-3/+3
The new transaction infrastructure updates the family, table and chain objects in the context structure, so let's deconstify them. While at it, move the context structure initialization routine to the top of the source file as it will be also used from the table and chain routines. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19can: unify identifiers to ensure unique include processingOliver Hartkopp14-42/+42
Armin pointed me to the fact that the identifier which is used to ensure the unique include processing in linux/include/uapi/linux/can.h is CAN_H. This clashed with his own source as includes from libraries and APIs should use an underscore '_' at the identifier start. This patch fixes the protection identifiers in all CAN-relevant includes. Reported-by: Armin Burchardt <armin@uni-bremen.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
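The clash described above can be reproduced in a single translation unit. The sketch below simulates an application that already uses `CAN_H` for its own header, then "includes" two stand-in library headers inline: one guarded by the unprefixed name and one guarded by an underscore-prefixed name. The guard `_EXAMPLE_UAPI_CAN_H` and the `TOY_` macros are made up for this illustration and are not claimed to be the identifiers the patch introduced.

```c
#include <assert.h>

/* The application's own can.h was included first and claimed this guard. */
#define CAN_H

/* Stand-in for a library header guarded by the same unprefixed name. */
#ifndef CAN_H
#define CAN_H
#define TOY_OLD_HEADER_SEEN 1 /* never defined: the guard already exists */
#endif

/* Stand-in for the same header with an underscore-prefixed guard. */
#ifndef _EXAMPLE_UAPI_CAN_H
#define _EXAMPLE_UAPI_CAN_H
#define TOY_NEW_HEADER_SEEN 1 /* defined: no clash with application names */
#endif

/* Returns 1 if the body of the old-style header was processed. */
static int toy_old_guard_header_seen(void)
{
#ifdef TOY_OLD_HEADER_SEEN
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the body of the prefixed-guard header was processed. */
static int toy_new_guard_header_seen(void)
{
#ifdef TOY_NEW_HEADER_SEEN
    return 1;
#else
    return 0;
#endif
}
```

With the unprefixed guard the library header's contents are silently skipped, which is the failure mode the report describes; the prefixed guard avoids it.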
2014-05-19can: add Renesas R-Car CAN driverSergei Shtylyov1-0/+17
Add support for the CAN controller found in Renesas R-Car SoCs. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2014-05-17net: cdc_ncm: remove redundant "disconnected" flagBjørn Mork1-1/+0
Calling netif_carrier_{on,off} is sufficient. There is no need to duplicate the carrier state in a driver specific flag. Acked-by: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-17net: cdc_ncm: use sane defaults for rx/tx buffersBjørn Mork1-0/+4
Lots of devices request much larger buffers than reasonable. This causes real problems for users of hosts with limited resources. Reducing the default buffer size to 16kB for such devices is a reasonable trade-off between allowing them to aggregate traffic and avoiding memory exhaustion on resource-constrained hosts. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-17net: cdc_ncm/cdc_mbim: adding NCM protocol statisticsBjørn Mork1-0/+11
To have an idea of the effects of the protocol coalescing it's useful to have some counters showing the different aspects. Due to the asymmetrical usbnet interface the netdev rx_bytes counter has been counting real received payload, while the tx_bytes counter has included the NCM/MBIM framing overhead. This overhead can be many times the payload because of the aggressive padding strategy of this driver, and will vary a lot depending on device and traffic. With very few exceptions, users are only interested in the payload size. Having a somewhat accurate payload byte counter is particularly important for mobile broadband devices, which many NCM devices and of course all MBIM devices are. Users and userspace applications will use this counter to monitor account quotas. Having protocol specific counters for the overhead, we are now able to correct the tx_bytes netdev counter so that it shows the real payload. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
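The accounting split described above can be sketched as follows. This is a hypothetical model, not the driver's code: the struct, its field names and `toy_account_tx()` are assumptions made for illustration. Each transmitted block is divided into payload, which goes to `tx_bytes`, and framing plus padding, which goes to a separate overhead counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters; the names are illustrative only. */
struct toy_ncm_stats {
    unsigned long long tx_ntbs;     /* transfer blocks handed to USB */
    unsigned long long tx_overhead; /* framing and padding bytes */
    unsigned long long tx_bytes;    /* payload bytes, what users care about */
};

/* Account one transmitted block of ntb_len bytes carrying payload_len bytes. */
static void toy_account_tx(struct toy_ncm_stats *s,
                           unsigned int ntb_len, unsigned int payload_len)
{
    assert(s != NULL && payload_len <= ntb_len);
    s->tx_ntbs++;
    s->tx_overhead += ntb_len - payload_len;
    s->tx_bytes += payload_len;
}
```

For a 16384-byte block carrying a single 1500-byte datagram, this model counts 1500 payload bytes and 14884 overhead bytes, which illustrates how overhead can be many times the payload.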
2014-05-17net: cdc_ncm: set reasonable padding limitsBjørn Mork1-0/+1
We pad frames larger than X to maximum size for devices which don't need a ZLP after maximum sized frames. This allows the device to optimize its transfers for one fixed buffer size. X was arbitrarily set at 512 bytes regardless of real buffer maximum, causing extreme overheads due to excessive padding of larger tx buffers. Limit the padding to at most 3 full USB packets, still allowing the overhead to payload ratio of 3/1. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
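One way to read the new padding limit is that the threshold X becomes the maximum frame size minus three full USB packets, so padding can never add more than three packets. The sketch below encodes that reading; it is an assumption drawn from the commit text, not a copy of the driver, and the `toy_` function names are hypothetical.

```c
#include <assert.h>

/* Threshold X: frames longer than this are padded up to tx_max. */
static unsigned int toy_min_tx_pkt(unsigned int tx_max, unsigned int max_packet)
{
    unsigned int pad_budget;

    assert(max_packet > 0);
    pad_budget = 3 * max_packet; /* at most three full USB packets of padding */
    return tx_max > pad_budget ? tx_max - pad_budget : 0;
}

/* Length actually sent for a frame of frame_len bytes under this model. */
static unsigned int toy_padded_len(unsigned int frame_len,
                                   unsigned int tx_max, unsigned int max_packet)
{
    assert(frame_len <= tx_max);
    return frame_len > toy_min_tx_pkt(tx_max, max_packet) ? tx_max : frame_len;
}
```

With a 16384-byte maximum and 512-byte packets the threshold is 14848 bytes, so a 1000-byte frame is sent as is. Under the old fixed threshold of 512 bytes the same frame would have been padded to the full 16384 bytes.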
2014-05-17net: cdc_ncm: use true max dgram count for header estimatesBjørn Mork1-9/+1
Many newer NCM and MBIM devices will request a maximum tx datagram count which is much smaller than our hard-coded absolute max. We can reduce the overhead without sacrificing any of the simplicity for these devices, by simply using the true negotiated count when calculating the maximum NTH and NDP header sizes. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
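The effect of the datagram count on the header estimate can be worked through with the 16-bit NCM framing sizes: a 12-byte NTH16, an 8-byte fixed NDP16 part, and one 4-byte pointer entry per datagram plus a terminating zero entry. The sketch below is a simplified estimate under those assumptions; the `TOY_`/`toy_` names are hypothetical and alignment padding between headers is deliberately left out.

```c
#include <assert.h>

/* Sizes for the 16-bit NCM formats, in bytes. */
#define TOY_NTH16_SIZE 12u /* transfer header */
#define TOY_NDP16_SIZE  8u /* fixed part of the datagram pointer table */
#define TOY_DPE16_SIZE  4u /* one datagram pointer entry (index + length) */

/* Worst-case NDP16 size for max_datagrams entries plus the zero terminator. */
static unsigned int toy_max_ndp_size(unsigned int max_datagrams)
{
    assert(max_datagrams > 0);
    return TOY_NDP16_SIZE + (max_datagrams + 1u) * TOY_DPE16_SIZE;
}

/* Header space to reserve per transfer block, ignoring alignment padding. */
static unsigned int toy_max_header_size(unsigned int max_datagrams)
{
    return TOY_NTH16_SIZE + toy_max_ndp_size(max_datagrams);
}
```

As an illustration, reserving for 40 datagrams costs 184 bytes of headers in this model, while a device that negotiates 16 datagrams needs only 88, leaving the difference for payload.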
2014-05-17net: cdc_ncm: use ethtool to tune coalescing settingsBjørn Mork1-1/+5
Datagram coalescing is an integral part of the NCM and MBIM protocols, intended to reduce the interrupt load primarily on the device end of the USB link. As with all coalescing solutions, there is a trade-off between buffering and interrupts. The current defaults are based on the assumption that device side buffers should be the limiting factor. However, many modern high speed LTE modems suffer from buffer-bloat, making this assumption fail. This results in sub-optimal performance due to excessive coalescing. And in cases where such modems are connected to cheap embedded hosts there are often severe buffer allocation issues, giving very noticeable performance degradation. A start on improving this is going from build time hard coded limits to per device user configurable limits. The ethtool coalescing API was selected as user interface because, although the tuned values are buffer sizes, these settings directly control datagram coalescing. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>