summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-01-29ethoc: Cleanup driver formatBarry Grussling1-17/+17
Cleanup the format of ethoc.c to meet network driver style as per checkpatch.pl. Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29ip_gre: When TOS is inherited, use configured TOS value for non-IP packetsDavid Ward1-2/+2
A GRE tunnel can be configured so that outgoing tunnel packets inherit the value of the TOS field from the inner IP header. In doing so, when a non-IP packet is transmitted through the tunnel, the TOS field will always be set to 0. Instead, the user should be able to configure a different TOS value as the fallback to use for non-IP packets. This is helpful when the non-IP packets are all control packets and should be handled by routers outside the tunnel as having Internet Control precedence. One example of this is the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are encapsulated directly by GRE and do not contain an inner IP header. Under the existing behavior, the IFLA_GRE_TOS parameter must be set to '1' for the TOS value to be inherited. Now, only the least significant bit of this parameter must be set to '1', and when a non-IP packet is sent through the tunnel, the upper 6 bits of this same parameter will be copied into the TOS field. (The ECN bits get masked off as before.) This behavior is backwards-compatible with existing configurations and iproute2 versions. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29ipv4: introduce address lifetimeJiri Pirko4-9/+220
There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: Pavel Šimerda <psimerda@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge branch 'ipfrags'David S. Miller6-56/+118
Jesper Dangaard Brouer says: ==================== This patchset is V2, with some trivial code fixes, which were noticed by DaveM. It is still a partly respin of my fragmentation optimization patches: http://thread.gmane.org/gmane.linux.network/250914 This is not the complete patchset, from the gmane link above. In this patchset, I primarily focus on adjusting cacheline for better SMP/NUMA performance. Once this patchset have been agreed upon, I will continue and respin the rest of my patches. This time around, I have created a frag DoS generator, via the tool trafgen (http://netsniff-ng.org/). To create a stable DoS scenario (no longer relying on frame dropping due to disabled flow-control). Two 10G interfaces are under-test, and uses Ethernet flow-control. A third interface is used for generating the DoS attack (this interface is also 10G, but it does not need to be, as 500Kpps DoS is enough). Test types summary (netperf): Test-20G64K == 2x10G with 65K fragments Test-20G3F == 2x10G with 3x fragments (3*1472 bytes) Test-20G64K+DoS == Same as 20G64K with frag DoS Test-20G3F+DoS == Same as 20G3F with frag DoS Patch list: Patch-01 - net: cacheline adjust struct netns_frags for better frag performance Patch-02 - net: cacheline adjust struct inet_frags for better frag performance Patch-03 - net: cacheline adjust struct inet_frag_queue Patch-04 - net: frag helper functions for mem limit tracking Patch-05 - net: use lib/percpu_counter API for fragmentation mem accounting Patch-06 - net: frag, move LRU list maintenance outside of rwlock Performance table summary: Test-type: Test-20G64K Test-20G3F 20G64K+DoS 20G3F+DoS ---------- ----------- ---------- ---------- --------- net-next: 15114.5 Mbit/s 8954.21 2444.28 3918.01 Mbit/s Patch-01: 16075.8 Mbit/s 8976.18 2621.49 4072.79 Mbit/s Patch-02: 17806.9 Mbit/s 9280.32 2478.62 4274.59 Mbit/s Patch-03: 17317.4 Mbit/s 9308.62 2546.05 4336.59 Mbit/s Patch-04: 17635.9 Mbit/s 9256.16 2535.25 4327.63 Mbit/s Patch-05: 18027.0 Mbit/s 9918.99 2492.62 3621.68 Mbit/s Patch-06: 18486.7 Mbit/s 10723.20 3657.85 4560.64 Mbit/s I cannot explain the under-DoS regression that patch-05/percpu_counter introduces. But patch-06/LRU-lock corrects the situation again. Below is a testlab setup description, with links to the trafgen DoS packet config used. Testlab ======= Server setup ------------ The machine acting as a server: - 2x CPU (E5-2630) - Thus a NUMA arch/machine - 4x 10Gbit/s ports - NICs 2x Intel Dual port 82599 based (driver ixgbe) Setup: - Interfaces uses Ethernet flow control - Flush all iptables - Remove all iptables related module. - Kill irqbalance - Pin each 10G NIC port to a *single* CPU each Pinning can easily be done by command hacks:: for x in /proc/irq/*/eth8*/../smp_affinity_list ; do echo 1 > $x; done for x in /proc/irq/*/eth9*/../smp_affinity_list ; do echo 3 > $x; done for x in /proc/irq/*/eth31*/../smp_affinity_list; do echo 6 > $x; done for x in /proc/irq/*/eth32*/../smp_affinity_list; do echo 8 > $x; done Notice NUMA setting: The CPU to NIC tying is carefully choosen according to the NUMA node setup. Thus, NICs connected to a PCI-e slot that is connected to a physical CPU socket are tied together. Choosing only a single CPU per NIC (port) is just to ease provoking and debugging this performance issue. (In real setups, you can choose more CPU, just remember the NUMA node in the equation). Tools ----- Netperf is used, with option -T to ensure CPU binding. The netserver processes, are NAPI pinned:: numactl -m0 -c0 netserver numactl -m1 -c 1 netserver -p 1337 I now have a frag DoS generator, created via the tool: trafgen (see: http://netsniff-ng.org/) Trafgen packet config file: http://people.netfilter.org/hawk/frag_work/trafgen/frag_packet03_small_frag.txf Notice, I'm using features of trafgen, recently developed by Daniel Borkmann, thus you need the latest git tree to use my trafgen packet config. git://github.com/borkmann/netsniff-ng.git Command line: trafgen --dev eth51 --conf frag_packet03_small_frag.txf -V -k 100 --cpus 2 Tests types ----------- Test(20G64K) UDP-64K 2x 10Gbit/s with no DoS traffic: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ export SIZE=$((65507)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\ netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\ netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \ wait $! && tail -n3 ${LOG}.* && \ tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' Test(20G3F) UDP-3xfrags 2x 10Gbit/s with no DoS traffic: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ export SIZE=$((3*1472)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\ netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\ netperf -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \ wait $! && tail -n3 ${LOG}.* && \ tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' Awk script for summming results: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992 / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}' ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: frag, move LRU list maintenance outside of rwlockJesper Dangaard Brouer5-14/+33
Updating the fragmentation queues LRU (Least-Recently-Used) list, required taking the hash writer lock. However, the LRU list isn't tied to the hash at all, so we can use a separate lock for it. Original-idea-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: use lib/percpu_counter API for fragmentation mem accountingJesper Dangaard Brouer2-8/+20
Replace the per network namespace shared atomic "mem" accounting variable, in the fragmentation code, with a lib/percpu_counter. Getting percpu_counter to scale to the fragmentation code usage requires some tweaks. At first view, percpu_counter looks superfast, but it does not scale on multi-CPU/NUMA machines, because the default batch size is too small, for frag code usage. Thus, I have adjusted the batch size by using __percpu_counter_add() directly, instead of percpu_counter_sub() and percpu_counter_add(). The batch size is increased to 130.000, based on the largest 64K fragment memory usage. This does introduce some imprecise memory accounting, but its does not need to be strict for this use-case. It is also essential, that the percpu_counter, does not share cacheline with other writers, to make this scale. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: frag helper functions for mem limit trackingJesper Dangaard Brouer6-33/+57
This change is primarily a preparation to ease the extension of memory limit tracking. The change does reduce the number atomic operation, during freeing of a frag queue. This does introduce a some performance improvement, as these atomic operations are at the core of the performance problems seen on NUMA systems. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct inet_frag_queueJesper Dangaard Brouer1-4/+5
Fragmentation code cacheline adjusting of struct inet_frag_queue. Take advantage of the size of struct timer_list, and move all but spinlock_t lock, below the timer struct. On 64-bit 'lru_list', 'list' and 'refcnt', fits exactly into the next cacheline, and a new cacheline starts at 'fragments'. The netns_frags *net pointer is moved to the end of the struct, because its used in a compare, with "next/close-by" elements of which this struct is embedded into. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct inet_frags for better frag performanceJesper Dangaard Brouer1-4/+7
The globally shared rwlock, of struct inet_frags, shares cacheline with the 'rnd' number, which is used by the hash calculations. Fix this, as this obviously is a bad idea, as unnecessary cache-misses will occur when accessing the 'rnd' number. Also small note that, moving function ptr (*match) up in struct, is to avoid it lands on the next cacheline (on 64-bit). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: cacheline adjust struct netns_frags for better frag performanceJesper Dangaard Brouer1-1/+4
This small cacheline adjustment of struct netns_frags improves performance significantly for the fragmentation code. Struct members 'lru_list' and 'mem' are both hot elements, and it hurts performance, due to cacheline bouncing at every call point, when they share a cacheline. Also notice, how mem is placed together with 'high_thresh' and 'low_thresh', as they are used in the compare operations together. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: ks8851: convert to threaded IRQFelipe Balbi1-31/+12
just as it should have been. It also helps removing the, now unnecessary, workqueue. Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net neigh: Optimize neighbor entry size calculation.YOSHIFUJI Hideaki / 吉藤英明2-10/+8
When allocating memory for neighbour cache entry, if tbl->entry_size is not set, we always calculate sizeof(struct neighbour) + tbl->key_len, which is common in the same table. With this change, set tbl->entry_size during the table initialization phase, if it was not set, and use it in neigh_alloc() and neighbour_priv(). This change also allow us to have both of protocol private data and device priate data at tha same time. Note that the only user of prototcol private is DECnet and the only user of device private is ATM CLIP. Since those are exclusive, we have not been facing issues here. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: avoid to hang up on sending due to sysctl configuration overflow.bingtian.ly@taobao.com2-8/+17
I found if we write a larger than 4GB value to some sysctl variables, the sending syscall will hang up forever, because these variables are 32 bits, such large values make them overflow to 0 or negative. This patch try to fix overflow or prevent from zero value setup of below sysctl variables: net.core.wmem_default net.core.rmem_default net.core.rmem_max net.core.wmem_max net.ipv4.udp_rmem_min net.ipv4.udp_wmem_min net.ipv4.tcp_wmem net.ipv4.tcp_rmem Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Li Yu <raise.sail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge branch 'merge' of ↵Linus Torvalds6-5/+33
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Benjamin Herrenschmidt: "Whenever you have a chance between two dives, you might want to consider pulling my merge branch to pickup a few fixes for 3.8 that have been accumulating for the last couple of weeks (I was myself travelling then on vacation). Nothing major, just a handful of powerpc bug fixes that I consider worth getting in before 3.8 goes final." And I'll have everybody know that I'm not diving for several days yet. Snif. * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Max next_tb to prevent from replaying timer interrupt powerpc: kernel/kgdb.c: Fix memory leakage powerpc/book3e: Disable interrupt after preempt_schedule_irq powerpc/oprofile: Fix error in oprofile power7_marked_instr_event() function powerpc/pasemi: Fix crash on reboot powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning for ppc32
2013-01-29via-rhine: add 64bit statistics.Jamie Gloudon1-8/+39
Switch to use ndo_get_stats64 to get 64bit statistics. Signed-off-by: Jamie Gloudon <jamie.gloudon@gmail.com> Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29drivers/net/phy/micrel_phy: Add support for new PHYsDavid J. Choi2-5/+68
Summary of changes: .Newly added phys -KSZ8081/KSZ8091, which has some phy ids. -KSZ8061 -KSZ9031, which is Gigabit phy. -KSZ886X, which has a switch function. -KSZ8031, which has a same phy ids with KSZ8021. Signed-off-by: David J. Choi <david.choi@micrel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29net: phy: realtek: add rtl8211e driverGiuseppe CAVALLARO1-6/+44
This patch adds the minimal driver to manage the Realtek RTL8211E 10/100/1000 Transceivers. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29netpoll: use the net namespace of current process instead of init_netCong Wang1-2/+4
This will allow us to setup netconsole in a different namespace rather than where init_net is. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29netpoll: use ipv6_addr_equal() to compare ipv6 addrCong Wang1-3/+3
ipv6_addr_equal() is faster. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29netpoll: add RCU annotation to npinfo fieldCong Wang1-1/+1
dev->npinfo is protected by RCU. This fixes the following sparse warnings: net/core/netpoll.c:177:48: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:200:35: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:221:35: error: incompatible types in comparison expression (different address spaces) net/core/netpoll.c:327:18: error: incompatible types in comparison expression (different address spaces) Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge branch 'for-davem' of ↵David S. Miller210-4987/+7702
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Included is an NFC pull. Samuel says: "It brings the following goodies: - LLCP socket timestamping (To be used e.g with the recently released nfctool application for a more efficient skb timestamping when sniffing). - A pretty big pn533 rework from Waldemar, preparing the driver to support more flavours of pn533 based devices. - HCI changes from Eric in preparation for the microread driver support. - Some LLCP memory leak fixes, cleanups and slight improvements. - pn544 and nfcwilink move to the devm_kzalloc API. - An initial Secure Element (SE) API. - An nfc.h license change from the original author, allowing non GPL application code to safely include it." Also included are a pair of mac80211 pulls. Johannes says: "We found two bugs in the previous code, so I'm sending you a pull request again this soon. This contains two regulatory bug fixes, some of Thomas's hwsim beacon timer work and a documentation fix from Bob." "Another pull request for mac80211-next. This time, I have a number of things, the patches are mostly self-explanatory. There are a few fixes from Felix and myself, and random cleanups & improvements. The biggest thing is the partial patchset from Marco preparing for mesh powersave." Additionally, there are a pair of iwlwifi pulls. Johannes says: "For iwlwifi-next, I have a few cleanups/improvements as well as a few not very important fixes and more preparations for new devices." "Please pull a few updates for iwlwifi. These are just some cleanups and a debug improvement." On top of that, there is a slew of driver updates. This includes brcmfmac, mwifiex, ath9k, carl9170, and mwl8k as well as a handful of others. The bcma and ssb busses get some attention as well. Still, I don't see any big headliners here. Also included is a pull of the wireless tree, in order to resolve some merge conflicts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge branch 'master' of ↵David S. Miller12-162/+76
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== This series contains updates to e1000e, ixgbevf, igb and igbvf. Majority of the patches are code cleanups of e1000e where code is removed (Yeah!). The other two e1000e patches are fixes. The first is to fix the maximum frame size for 82579 devices. The second fix is to resolve an issue with devices other than 82579 that suffer from dropped transactions on platforms with deep C-states when jumbo frames are enabled. The ixgbevf patch is to ensure that the driver fetches the correct, refreshed value for link status and speed when the values have changed. The igb and igbvf patches are a solution to an issue Stefan Assmann reported, where when the PF is up and igbvf is loaded, the MAC address is not generated using eth_hw_addr_random(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29powerpc: Max next_tb to prevent from replaying timer interruptTiejun Chen1-2/+7
With lazy interrupt, we always call __check_irq_replaysome with decrementers_next_tb to check if we need to replay timer interrupt. So in hotplug case we also need to set decrementers_next_tb as MAX to make sure __check_irq_replay don't replay timer interrupt when return as we expect, otherwise we'll trap here infinitely. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc: kernel/kgdb.c: Fix memory leakageCong Ding1-2/+3
the variable backup_current_thread_info isn't freed before existing the function. Signed-off-by: Cong Ding <dinggnu@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/book3e: Disable interrupt after preempt_schedule_irqTiejun Chen1-0/+13
In preempt case current arch_local_irq_restore() from preempt_schedule_irq() may enable hard interrupt but we really should disable interrupts when we return from the interrupt, and so that we don't get interrupted after loading SRR0/1. Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/oprofile: Fix error in oprofile power7_marked_instr_event() functionCarl E. Love1-1/+1
The calculation for the left shift of the mask OPROFILE_PM_PMCSEL_MSK has an error. The calculation is should be to shift left by (max_cntrs - cntr) times the width of the pmsel field width. However, the #define OPROFILE_MAX_PMC_NUM was used instead of OPROFILE_PMSEL_FIELD_WIDTH. This patch fixes the calculation. Signed-off-by: Carl Love <cel@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29powerpc/pasemi: Fix crash on rebootSteven Rostedt1-0/+7
commit f96972f2dc "kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()" added a call to disable_nonboot_cpus() on kernel_restart(), which tries to shutdown all the CPUs except the first one. The issue with the PA Semi, is that it does not support CPU hotplug. When the call is made to __cpu_down(), it calls the notifiers CPU_DOWN_PREPARE, and then tries to take the CPU down. One of the notifiers to the CPU hotplug code, is the cpufreq. The DOWN_PREPARE will call __cpufreq_remove_dev() which calls cpufreq_driver->exit. The PA Semi exit handler unmaps regions of I/O that is used by an interrupt that goes off constantly (system_reset_common, but it goes off during normal system operations too). I'm not sure exactly what this interrupt does. Running a simple function trace, you can see it goes off quite a bit: # tracer: function # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <idle>-0 [001] 1558.859363: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.860112: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.861109: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [001] 1558.861361: .pasemi_system_reset_exception <-.system_reset_exception <idle>-0 [000] 1558.861437: .pasemi_system_reset_exception <-.system_reset_exception When the region is unmapped, the system crashes with: Disabling non-boot CPUs ... Error taking CPU1 down: -38 Unable to handle kernel paging request for data at address 0xd0000800903a0100 Faulting instruction address: 0xc000000000055fcc Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT SMP NR_CPUS=64 NUMA PA Semi PWRficient Modules linked in: shpchp NIP: c000000000055fcc LR: c000000000055fb4 CTR: c0000000000df1fc REGS: c0000000012175d0 TRAP: 0300 Not tainted (3.8.0-rc4-test-dirty) MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 24000088 XER: 00000000 SOFTE: 0 DAR: d0000800903a0100, DSISR: 42000000 TASK = c0000000010e9008[0] 'swapper/0' THREAD: c000000001214000 CPU: 0 GPR00: d0000800903a0000 c000000001217850 c0000000012167e0 0000000000000000 GPR04: 0000000000000000 0000000000000724 0000000000000724 0000000000000000 GPR08: 0000000000000000 0000000000000000 0000000000000001 0000000000a70000 GPR12: 0000000024000080 c00000000fff0000 ffffffffffffffff 000000003ffffae0 GPR16: ffffffffffffffff 0000000000a21198 0000000000000060 0000000000000000 GPR20: 00000000008fdd35 0000000000a21258 000000003ffffaf0 0000000000000417 GPR24: 0000000000a226d0 c000000000000000 0000000000000000 0000000000000000 GPR28: c00000000138b358 0000000000000000 c000000001144818 d0000800903a0100 NIP [c000000000055fcc] .set_astate+0x5c/0xa4 LR [c000000000055fb4] .set_astate+0x44/0xa4 Call Trace: [c000000001217850] [c000000000055fb4] .set_astate+0x44/0xa4 (unreliable) [c0000000012178f0] [c00000000005647c] .restore_astate+0x2c/0x34 [c000000001217980] [c000000000054668] .pasemi_system_reset_exception+0x6c/0x88 [c000000001217a00] [c000000000019ef0] .system_reset_exception+0x48/0x84 [c000000001217a80] [c000000000001e40] system_reset_common+0x140/0x180 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-29can: rework skb reserved data handlingOliver Hartkopp6-13/+21
Added accessor and skb_reserve helpers for struct can_skb_priv. Removed pointless skb_headroom() check. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> CC: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29Merge tag 'md-3.8-fixes' of git://neil.brown.name/mdLinus Torvalds2-64/+38
Pull dmraid fix from NeilBrown: "Just one fix for md in 3.8 dmraid assess redundancy and replacements slightly inaccurately which could lead to some degraded arrays failing to assemble." * tag 'md-3.8-fixes' of git://neil.brown.name/md: DM-RAID: Fix RAID10's check for sufficient redundancy
2013-01-29powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning for ppc32Li Zhong1-0/+2
This patch fixes MAX_STACK_TRACE_ENTRIES too low warning for ppc32, which is similar to commit 12660b17. Reported-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixesLinus Torvalds1-1/+6
Pull GFS2 fix from Steven Whitehouse. * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: fix skip unlock condition
2013-01-28Merge tag 'iommu-fixes-v3.8-rc5' of ↵Linus Torvalds1-0/+34
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "One fix for the AMD IOMMU driver to work around broken BIOSes found in the field. Some BIOSes forget to enable a workaround for a hardware problem which might cause the IOMMU to stop working under high load conditions. The fix makes sure this workaround is enabled." * tag 'iommu-fixes-v3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: IOMMU, AMD Family15h Model10-1Fh erratum 746 Workaround
2013-01-28Merge tag 'mfd-for-linus-3.8-1' of ↵Linus Torvalds24-129/+314
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 Pull MFD fixes from Samuel Ortiz: "This is the first pull request for MFD fixes for 3.8 We have some build failure fixes (twl4030, vexpress, abx500 and tps65910), some actual runtime oops and lockup fixes (rtsx, da9052), and some more hypothetical NULL pointers dereferences fixes for pcf50633 and max776xx. Then we also have additional rtsx fixes for a correct switch output voltage and clock divider correctness for rtl8411 (rtsx driver), and irqdomain fix for db8550-prcmu, and some more cosmetic fixes for arizona and wm5102." * tag 'mfd-for-linus-3.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: rtsx: Fix oops when rtsx_pci_sdmmc is not probed mfd: wm5102: Fix definition of WM5102_MAX_REGISTER mfd: twl4030: Don't warn about uninitialized return code mfd: da9052/53 lockup fix mfd: rtsx: Add clock divider hook mmc: rtsx: Call MFD hook to switch output voltage mfd: rtsx: Add output voltage switch hook mfd: Fix compile errors and warnings when !CONFIG_AB8500_BM mfd: vexpress: Export global functions to fix build error mfd: arizona: Check errors from regcache_sync() mfd: tc3589x: Use simple irqdomain mfd: pcf50633: Init pcf->dev before using it mfd: max77693: Init max77693->dev before using it mfd: max77686: Init max77686->dev before using it mfd: db8500-prcmu: Fix irqdomain usage mfd: tps65910: Select REGMAP_IRQ in Kconfig to fix build error mfd: arizona: Disable control interface reporting for WM5102 and WM5110
2013-01-28batman-adv: fix local translation table outputAntonio Quartulli1-2/+2
The last-seen field has to be printed for all the local entries but the one marked with the no-purge flag Introduced by 15727323d9f8864b2d41930940acc38de987045a ("batman-adv: don't print the last_seen time for bat0 TT local entry") Signed-off-by: Antonio Quartulli <ordex@autistici.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2013-01-28Merge branch 'master' of ↵John W. Linville210-4987/+7702
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2013-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds97-418/+851
Pull networking updates from David Miller: "Much more accumulated than I would have liked due to an unexpected bout with a nasty flu: 1) AH and ESP input don't set ECN field correctly because the transport head of the SKB isn't set correctly, fix from Li RongQing. 2) If netfilter conntrack zones are disabled, we can return an uninitialized variable instead of the proper error code. Fix from Borislav Petkov. 3) Fix double SKB free in ath9k driver beacon handling, from Felix Feitkau. 4) Remove bogus assumption about netns cleanup ordering in nf_conntrack, from Pablo Neira Ayuso. 5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric Dumazet. It uses spin_is_locked() in it's test and is therefore unsuitable for UP. 6) Fix SELINUX labelling regressions added by the tuntap multiqueue changes, from Paul Moore. 7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin Nayak Sujir. 8) CXGB4 driver sets interrupt coalescing parameters only on first queue, rather than all of them. Fix from Thadeu Lima de Souza Cascardo. 9) Fix regression in the dispatch of read/write registers in dm9601 driver, from Tushar Behera. 10) ipv6_append_data miscalculates header length, from Romain KUNTZ. 11) Fix PMTU handling regressions on ipv4 routes, from Steffen Klassert, Timo Teräs, and Julian Anastasov. 12) In 3c574_cs driver, add necessary parenthesis to "x << y & z" expression. From Nickolai Zeldovich. 13) macvlan_get_size() causes underallocation netlink message space, fix from Eric Dumazet. 14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai Zeldovich. Amusingly the zero check was already there, we were just performing it after the modulus :-) 15) Some more splice bug fixes from Eric Dumazet, which fix things mostly eminating from how we now more aggressively use high-order pages in SKBs. 16) Fix size calculation bug when freeing hash tables in the IPSEC xfrm code, from Michal Kubecek. 17) Fix PMTU event propagation into socket cached routes, from Steffen Klassert. 18) Fix off by one in TX buffer release in netxen driver, from Eric Dumazet. 19) Fix rediculous memory allocation requirements introduced by the tuntap multiqueue changes, from Jason Wang. 20) Remove bogus AMD platform workaround in r8169 driver that causes major problems in normal operation, from Timo Teräs. 21) virtio-net set affinity and select queue don't handle discontiguous cpu numbers properly, fix from Wanlong Gao. 22) Fix a route refcounting issue in loopback driver, from Eric Dumazet. There's a similar fix coming that we might add to the macvlan driver as well. 23) Fix SKB leaks in batman-adv's distributed arp table code, from Matthias Schiffer. 24) r8169 driver gives descriptor ownership back the hardware before we're done reading the VLAN tag out of it, fix from Francois Romieu. 25) Checksums not calculated properly in GRE tunnel driver fix from Pravin B Shelar. 26) Fix SCTP memory leak on namespace exit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits) dm9601: support dm9620 variant SCTP: Free the per-net sysctl table on net exit. v2 net: phy: icplus: fix broken INTR pin settings net: phy: icplus: Use the RGMII interface mode to configure clock delays IP_GRE: Fix kernel panic in IP_GRE with GRE csum. sctp: set association state to established in dupcook_a handler ip6mr: limit IPv6 MRT_TABLE identifiers r8169: fix vlan tag read ordering. net: cdc_ncm: use IAD provided by the USB core batman-adv: filter ARP packets with invalid MAC addresses in DAT batman-adv: check for more types of invalid IP addresses in DAT batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply() net: loopback: fix a dst refcounting issue virtio-net: reset virtqueue affinity when doing cpu hotplug virtio-net: split out clean affinity function virtio-net: fix the set affinity bug when CPU IDs are not consecutive can: pch_can: fix invalid error codes can: ti_hecc: fix invalid error codes can: c_can: fix invalid error codes r8169: remove the obsolete and incorrect AMD workaround ...
2013-01-28Merge branch 'master' of ↵John W. Linville70-217/+6497
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/ath/ath9k/main.c drivers/net/wireless/iwlwifi/dvm/tx.c
2013-01-28IOMMU, AMD Family15h Model10-1Fh erratum 746 WorkaroundSuravee Suthikulpanit1-0/+34
The IOMMU may stop processing page translations due to a perceived lack of credits for writing upstream peripheral page service request (PPR) or event logs. If the L2B miscellaneous clock gating feature is enabled the IOMMU does not properly register credits after the log request has completed, leading to a potential system hang. BIOSes are supposed to disable L2B micellaneous clock gating by setting L2_L2B_CK_GATE_CONTROL[CKGateL2BMiscDisable](D0F2xF4_x90[2]) = 1b. This patch corrects that for those which do not enable this workaround. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-01-28igbvf: be sane about random MAC addressesMitch A Williams1-14/+10
Tighten up some of the code surrounding MAC addresses. Since the PF is now giving all zeros instead of a random address, check for this case and generate a random address. This ensures that we always know when we have a random address and udev won't get upset about it. Additionally, tighten up some of the log messages and clean up the formatting. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28igb: Don't give VFs random MAC addressesMitch A Williams1-3/+3
If the user has not assigned a MAC address to a VM, then don't give it a random one. Instead, just give it zeros and let it figure out what to do with them. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28GFS2: fix skip unlock conditionDavid Teigland1-1/+6
The recent commit fb6791d100d1bba20b5cdbc4912e1f7086ec60f8 included the wrong logic. The lvbptr check was incorrectly added after the patch was tested. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-01-28ixgbevf: Make sure link status and speed are fetchedGreg Rose1-0/+1
A recent change makes it necessary to set get_link_status to ensure that the driver fetches the correct, refreshed value for link status and speed when it has changed in the physical function device. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup: remove comments which are no longer applicableBruce Allan1-7/+0
Code was removed but the applicable comments were not. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup hw.hBruce Allan1-15/+10
Remove unnecessary #include, forward prototype of struct e1000_adapter and an empty comment; fix a comment which mentions "static data for the MAC" which is not applicable to the following struct; and cleanup some whitespace issues. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup: remove unused #defineBruce Allan1-3/+0
All references to E1000_ERT_2048 have been removed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: adjust PM QoS requestBruce Allan2-15/+19
It has been found that devices other than 82579 (a.k.a. e1000_pch2lan) suffer from dropped transactions on platforms with deep C-states when jumbo frames are enabled. For example, LOMs on ICH9- and ICH10-based platforms which recently had early-receive de-featured (for stability reasons) suffer from this. To resolve this for all devices, when jumbo frames are enabled set the PM QoS DMA latency request based on the size of the receive packet buffer less one full frame. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: correct maximum frame size on 82579Bruce Allan1-1/+1
The largest jumbo frame supported by the 82579 hardware is 9018. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup: remove e1000e_commit_phy()Bruce Allan4-28/+16
Remove the function e1000e_commit_phy() and replace the few calls to it with the same function pointer that it would call. The function pointer is almost always set for the devices that access these code paths so there is no risk of a NULL pointer dereference; for the few instances where the function pointer might not be set (i.e. can be called for the few devices which do not have this function pointer set), check for a valid function pointer. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup: remove e1000_get_cable_length()Bruce Allan2-7/+2
Remove the function e1000_get_cable_length() and replace the two calls to it with the same function pointer that it would call. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28e1000e: cleanup: remove e1000_get_phy_cfg_done()Bruce Allan1-19/+1
Remove the function e1000_get_phy_cfg_done() and replace the single call to it with the same function pointer that it would call. The function pointer is always set so there is no risk of a NULL pointer dereference. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>