summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-03-26ixgbe: In SR-IOV mode insert delay before bring the adapter upGreg Rose1-0/+8
VFs running in guest VMs do not respond in as timely a manner to PF indication it is going down as they do when running in the host domain. If the adapter is in SR-IOV mode insert a two second delay to guarantee that all VFs have had time to respond to the PF reset. In any case resetting the PF while VFs are active should be discouraged but if it must be done then there will be a two second delay to help synchronize resets among the PF and all the VFs. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26ixgbevf: Fix signed/unsigned int errorGreg Rose1-1/+2
In the Tx mapping function if a DMA error occurred then the unwind of previously mapped sections would improperly check an unsigned int if it was less than zero. Changed the index variable to signed to avoid the error. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26netxen: update version to 4.0.73Amit Kumar Salecha1-2/+2
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26netxen: added sanity check for pci mapAmit Kumar Salecha1-18/+27
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Return value of ioremap is not checked, NULL check added. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26netxen: fix warning in ioaddr for NX3031 chipAmit Kumar Salecha1-6/+8
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> crb_intr_mask/crb_sts_consumer is predefined for NX2031 not for NX3031. For NX3031, these values get defined in rx context creation. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26netxen: fix bios version calculationAmit Kumar Salecha1-1/+1
Bios sub version from unified fw image is calculated incorrect. Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26Revert "r8169: enable 64-bit DMA by default for PCI Express devices (v2)"David S. Miller1-14/+7
This reverts commit 353176888386d9025062a12dcec08d49af10cf2c. People are reporting problems due to this change and there is no anticipation that the cause will be tracked down any time soon. We can try next time to selectively re-enable this based upon chip type, or have a black list of some sort. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-26isdn: Add netdev to lists in MAINTAINERS entry.David S. Miller1-0/+1
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25TIPC: Removed inactive maintainerJon Maloy1-1/+0
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25isdn: Cleanup Sections in PCMCIA driver elsaHenne1-6/+6
Compiling this driver gave a section mismatch, so I reviewed the init/exit paths of the driver and made the correct changes. WARNING: drivers/isdn/hisax/built-in.o(.text+0x55e37): Section mismatch in reference from the function elsa_cs_config() to the function .devinit.text:hisax_init_pcmcia() The function elsa_cs_config() references the function __devinit hisax_init_pcmcia(). This is often because elsa_cs_config lacks a __devinit annotation or the annotation of hisax_init_pcmcia is wrong. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25isdn: Cleanup Sections in PCMCIA driver avma1Henne1-6/+6
Compiling this driver gave a section mismatch, so I reviewed the init/exit paths of the driver and made the correct changes. WARNING: drivers/isdn/hisax/built-in.o(.text+0x56512): Section mismatch in reference from the function avma1cs_config() to the function .devinit.text:hisax_init_pcmcia() The function avma1cs_config() references the function __devinit hisax_init_pcmcia(). This is often because avma1cs_config lacks a __devinit annotation or the annotation of hisax_init_pcmcia is wrong. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25isdn: Cleanup Sections in PCMCIA driver telesHenne1-6/+6
Compiling this driver gave a section mismatch, so I reviewed the init/exit paths of the driver and made the correct changes. WARNING: drivers/isdn/hisax/built-in.o(.text+0x56bfb): Section mismatch in reference from the function teles_cs_config() to the function .devinit.text:hisax_init_pcmcia() The function teles_cs_config() references the function __devinit hisax_init_pcmcia(). This is often because teles_cs_config lacks a __devinit annotation or the annotation of hisax_init_pcmcia is wrong. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25isdn: Cleanup Sections in PCMCIA driver sedlbauerHenne1-6/+6
Compiling this driver gave a section mismatch, so I reviewed the init/exit paths of the driver and made the correct changes. WARNING: drivers/isdn/hisax/built-in.o(.text+0x558d6): Section mismatch in reference from the function sedlbauer_config() to the function .devinit.text:hisax_init_pcmcia() The function sedlbauer_config() references the function __devinit hisax_init_pcmcia(). This is often because sedlbauer_config lacks a __devinit annotation or the annotation of hisax_init_pcmcia is wrong. Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25via-velocity: Fix FLOW_CNTL_TX_RX handling in set_mii_flow_control()David S. Miller1-1/+1
Clear, don't set, ANAR_ASMDIR in this case. Noticed by Roel Kluin. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-25Merge branch 'master' of ↵David S. Miller4-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-03-25netfilter: xt_hashlimit: IPV6 bugfixEric Dumazet1-0/+1
A missing break statement in hashlimit_ipv6_mask(), and masks between /64 and /95 are not working at all... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25netfilter: ip6table_raw: fix table priorityJozsef Kadlecsik2-1/+2
The order of the IPv6 raw table is currently reversed, that makes impossible to use the NOTRACK target in IPv6: for example if someone enters ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK and if we receive fragmented packets then the first fragment will be untracked and thus skip nf_ct_frag6_gather (and conntrack), while all subsequent fragments enter nf_ct_frag6_gather and reassembly will never successfully be finished. Singed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-25netfilter: xt_hashlimit: dl_seq_stop() fixEric Dumazet1-1/+2
If dl_seq_start() memory allocation fails, we crash later in dl_seq_stop(), trying to kfree(ERR_PTR(-ENOMEM)) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-24af_key: return error if pfkey_xfrm_policy2msg_prep() failsDan Carpenter1-5/+3
The original code saved the error value but just returned 0 in the end. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24skbuff: remove unused dma_head & dma_maps fieldsAlexander Duyck1-6/+0
The dma map fields in the skb_shared_info structure no longer has any users and can be dropped since it is making the skb_shared_info unecessarily larger. Running slabtop show that we were using 4K slabs for the skb->head on x86_64 w/ an allocation size of 1522. It turns out that the dma_head and dma_maps array made skb_shared large enough that we had crossed over the 2k boundary with standard frames and as such we were using 4k blocks of memory for all skbs. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24vlan: updates vlan real_num_tx_queuesVasu Dev1-0/+2
Updates real_num_tx_queues in case underlying real device has changed real_num_tx_queues. -v2 As per Eric Dumazet<eric.dumazet@gmail.com> comment:- -- adds BUG_ON to catch case of real_num_tx_queues exceeding num_tx_queues. -- created this self contained patch to just update real_num_tx_queues. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24vlan: adds vlan_dev_select_queueVasu Dev1-3/+68
This is required to correctly select vlan tx queue for a driver supporting multi tx queue with ndo_select_queue implemented since currently selected vlan tx queue is unaligned to selected queue by real net_devce ndo_select_queue. Unaligned vlan tx queue selection causes thrash with higher vlan tx lock contention for least fcoe traffic and wrong socket tx queue_mapping for ixgbe having ndo_select_queue implemented. -v2 As per Eric Dumazet<eric.dumazet@gmail.com> comments, mirrored vlan net_device_ops to have them with and without vlan_dev_select_queue and then select according to real dev ndo_select_queue present or not for a vlan net_device. This is to completely skip vlan_dev_select_queue calling for real net_device not supporting ndo_select_queue. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24igb: only use vlan_gro_receive if vlans are registeredAlexander Duyck1-1/+1
This change makes it so that vlan_gro_receive is only used if vlans have been registered to the adapter structure. Previously we were just sending all vlan tagged frames in via this function but this results in a null pointer dereference when vlans are not registered. [ This fixes bugzilla entry 15582 -Eric Dumazet] Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24igb: do not modify tx_queue_len on link speed changeEmil Tantilov2-10/+1
Previously the driver tweaked txqueuelen to avoid false Tx hang reports seen at half duplex. This had the effect of overriding user set values on link change/reset. Testing shows that adjusting only the timeout factor is sufficient to prevent Tx hang reports at half duplex. Based on e1000e patch by Franco Fichtner <franco@lastsummer.de> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24igb: count Rx FIFO errors correctlyMitch Williams1-5/+5
Don't aggregate rx_no_buffer_count into rx_fifo_errors. RNBC counts packets that get queued temporarily in the adapter's FIFO. These packets are not dropped and are not errors. The correct counter is rx_missed_errors (MPC). Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24bnx2: Use proper handler during netpoll.Michael Chan1-3/+5
Netpoll needs to call the proper handler depending on the IRQ mode and the vector. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-24bnx2: Fix netpoll crash.Benjamin Li1-2/+4
The bnx2 driver calls netif_napi_add() for all the NAPI structs during ->probe() time but not all of them will be used if we're not in MSI-X mode. This creates a problem for netpoll since it will poll all the NAPI structs in the dev_list whether or not they are scheduled, resulting in a crash when we access structure fields not initialized for that vector. We fix it by moving the netif_napi_add() call to ->open() after the number of IRQ vectors has been determined. Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23ksz884x: fix return value of netdev_set_eepromJens Rottmann1-1/+1
ksz884x: fix return value of netdev_set_eeprom netdev_set_eeprom() confused ethtool by just returning 1 on error instead of a proper -EINVAL. Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23cgroups: net_cls as moduleBen Blum2-10/+31
Allows the net_cls cgroup subsystem to be compiled as a module This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem to be optionally compiled as a module instead of builtin. The cgroup_subsys struct is moved around a bit to allow the subsys_id to be either declared as a compile-time constant by the cgroup_subsys.h include in cgroup.h, or, if it's a module, initialized within the struct by cgroup_load_subsys. Signed-off-by: Ben Blum <bblum@andrew.cmu.edu> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23netxen: The driver doesn't work on NX_P3_B1 so cause probe to fail.Eric W. Biederman1-2/+2
I haven't been able to get link up on a NX_P3_B1 since 2.6.31. The driver complains about a firmware hang instead. When I asked I was told rev 0x41 was a preproduction rev. So disable support in the driver so no one is surprised the code doesn't work. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23netpoll: warn when there are spaces in parametersAmerigo Wang1-2/+5
v2: update according to Frans' comments. Currently, if we leave spaces before dst port, netconsole will silently accept it as 0. Warn about this. Also, when spaces appear in other places, make them visible in error messages. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: David Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23can: bfin_can: switch to common Blackfin can headerMike Frysinger1-90/+7
The MMR bits are being moved to this header, so include it. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-23Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller674-7950/+56924
2010-03-22Fix up prototype for sys_ipc breakageLinus Torvalds1-1/+1
Commit 45575f5a426c ("ppc64 sys_ipc breakage in 2.6.34-rc2") fixed the definition of the sys_ipc() helper, but didn't fix the prototype in <linux/syscalls.h> Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22netfilter: xt_recent: fix regression in rules using a zero hit_countPatrick McHardy1-1/+1
Commit 8ccb92ad (netfilter: xt_recent: fix false match) fixed supposedly false matches in rules using a zero hit_count. As it turns out there is nothing false about these matches and people are actually using entries with a hit_count of zero to make rules dependant on addresses inserted manually through /proc. Since this slipped past the eyes of three reviewers, instead of reverting the commit in question, this patch explicitly checks for a hit_count of zero to make the intentions more clear. Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-03-22rxrpc: Check allocation failure.Tetsuo Handa1-0/+6
alloc_skb() can return NULL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22Merge branch 'for-linus' of ↵Linus Torvalds1-2/+13
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio: console: Check if port is valid in resize_console virtio: console: Generate a kobject CHANGE event on adding 'name' attribute
2010-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds45-235/+463
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (38 commits) ip_gre: include route header_len in max_headroom calculation if_tunnel.h: add missing ams/byteorder.h include ipv4: Don't drop redirected route cache entry unless PTMU actually expired net: suppress lockdep-RCU false positive in FIB trie. Bluetooth: Fix kernel crash on L2CAP stress tests Bluetooth: Convert debug files to actually use debugfs instead of sysfs Bluetooth: Fix potential bad memory access with sysfs files netfilter: ctnetlink: fix reliable event delivery if message building fails netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err() NET_DMA: free skbs periodically netlink: fix unaligned access in nla_get_be64() tcp: Fix tcp_mark_head_lost() with packets == 0 net: ipmr/ip6mr: fix potential out-of-bounds vif_table access KS8695: update ksp->next_rx_desc_read at the end of rx loop igb: Add support for 82576 ET2 Quad Port Server Adapter ixgbevf: Message formatting cleanups ixgbevf: Shorten up delay timer for watchdog task ixgbevf: Fix VF Stats accounting after reset ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address ixgbe: fix for real_num_tx_queues update issue ...
2010-03-22Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds1-1/+6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: edac, mce: Filter out invalid values
2010-03-22rxrpc: Check allocation failure.Tetsuo Handa1-0/+6
alloc_skb() can return NULL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22AFS: Potential null dereferenceDan Carpenter1-2/+3
It seems clear from the surrounding code that xpermits is allowed to be NULL here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22ppc64 sys_ipc breakage in 2.6.34-rc2Anton Blanchard1-1/+1
I chased down a fail on ppc64 on 2.6.34-rc2 where an application that uses shared memory was getting a SEGV. Commit baed7fc9b580bd3fb8252ff1d9b36eaf1f86b670 ("Add generic sys_ipc wrapper") changed the second argument from an unsigned long to an int. When we call shmget the system call wrappers for sys_ipc will sign extend second (ie the size) which truncates it. It took a while to track down because the call succeeds and strace shows the untruncated size :) The patch below changes second from an int to an unsigned long which fixes shmget on ppc64 (and I assume s390, sparc64 and mips64). Signed-off-by: Anton Blanchard <anton@samba.org> -- I assume the function prototypes for the other IPC methods would cause us to sign or zero extend second where appropriate (avoiding any security issues). Come to think of it, the syscall wrappers for each method should do that for us as well. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22x86 / perf: Fix suspend to RAM on HP nx6325Rafael J. Wysocki1-3/+5
Commit 3f6da3905398826d85731247e7fbcf53400c18bd (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-22edac, mce: Filter out invalid valuesBorislav Petkov1-1/+6
Print the CPU associated with the error only when the field is valid. Cc: <stable@kernel.org> # .32.x .33.x Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-03-22virtio: console: Check if port is valid in resize_consoleAmit Shah1-0/+4
The console port could have been hot-unplugged. Check if it is valid before working on it. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-03-22virtio: console: Generate a kobject CHANGE event on adding 'name' attributeAmit Shah1-2/+9
When the host lets us know what 'name' a port is assigned, we create the sysfs 'name' attribute. Generate a 'change' event after this so that udev wakes up and acts on the rules for virtio-ports (currently there's only one rule that creates a symlink from the 'name' to the actual char device). Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-03-22ip_gre: include route header_len in max_headroom calculationTimo Teräs1-1/+3
Taking route's header_len into account, and updating gre device needed_headroom will give better hints on upper bound of required headroom. This is useful if the gre traffic is xfrm'ed. Signed-off-by: Timo Teras <timo.teras@iki.fi> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22if_tunnel.h: add missing ams/byteorder.h includePaulius Zaleckas1-0/+1
When compiling userspace application which includes if_tunnel.h and uses GRE_* defines you will get undefined reference to __cpu_to_be16. Fix this by adding missing #include <asm/byteorder.h> Cc: stable@kernel.org Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22ipv4: Don't drop redirected route cache entry unless PTMU actually expiredGuenter Roeck1-1/+2
TCP sessions over IPv4 can get stuck if routers between endpoints do not fragment packets but implement PMTU instead, and we are using those routers because of an ICMP redirect. Setup is as follows MTU1 MTU2 MTU1 A--------B------C------D with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C implement PMTU and drop packets larger than MTU2 (for example because DF is set on all packets). TCP sessions are initiated between A and D. There is packet loss between A and D, causing frequent TCP retransmits. After the number of retransmits on a TCP session reaches tcp_retries1, tcp calls dst_negative_advice() prior to each retransmit. This results in route cache entries for the peer to be deleted in ipv4_negative_advice() if the Path MTU is set. If the outstanding data on an affected TCP session is larger than MTU2, packets sent from the endpoints will be dropped by B or C, and ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and update PMTU. Before the next retransmit, tcp will again call dst_negative_advice(), causing the route cache entry (with correct PMTU) to be deleted. The retransmitted packet will be larger than MTU2, causing it to be dropped again. This sequence repeats until the TCP session aborts or is terminated. Problem is fixed by removing redirected route cache entries in ipv4_negative_advice() only if the PMTU is expired. Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22Merge branch 'master' of ↵David S. Miller6-51/+119
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6