summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2005-10-04[NET]: Fix packet timestamping.Herbert Xu7-17/+12
I've found the problem in general. It affects any 64-bit architecture. The problem occurs when you change the system time. Suppose that when you boot your system clock is forward by a day. This gets recorded down in skb_tv_base. You then wind the clock back by a day. From that point onwards the offset will be negative which essentially overflows the 32-bit variables they're stored in. In fact, why don't we just store the real time stamp in those 32-bit variables? After all, we're not going to overflow for quite a while yet. When we do overflow, we'll need a better solution of course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-30[ATM]: [lec] reset retry counter when new arp issuedScott Talbert1-0/+6
From: Scott Talbert <scott.talbert@lmco.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-30[ATM]: [lec] attempt to support cisco failoverScott Talbert1-3/+34
From: Scott Talbert <scott.talbert@lmco.com> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-30[TCP]: Don't over-clamp window in tcp_clamp_window()Alexey Kuznetsov1-2/+0
From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Handle better the case where the sender sends full sized frames initially, then moves to a mode where it trickles out small amounts of data at a time. This known problem is even mentioned in the comments above tcp_grow_window() in tcp_input.c, specifically: ... * The scheme does not work when sender sends good segments opening * window and then starts to feed us spagetti. But it should work * in common situations. Otherwise, we have to rely on queue collapsing. ... When the sender gives full sized frames, the "struct sk_buff" overhead from each packet is small. So we'll advertize a larger window. If the sender moves to a mode where small segments are sent, this ratio becomes tilted to the other extreme and we start overrunning the socket buffer space. tcp_clamp_window() tries to address this, but it's clamping of tp->window_clamp is a wee bit too aggressive for this particular case. Fix confirmed by Ion Badulescu. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-30[TCP]: Revert 6b251858d377196b8cea20e65cae60f584a42735David S. Miller1-5/+4
But retain the comment fix. Alexey Kuznetsov has explained the situation as follows: -------------------- I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is not continuous: f.e. for mss=1095 it needs initial window 1095*4, but for mss=1096 it is 1096*3. We do not know exactly what mss sender used for calculations. If we advertised 1096 (and calculate initial window 3*1096), the sender could limit it to some value < 1096 and then it will need window his_mss*4 > 3*1096 to send initial burst. See? So, the honest function for inital rcv_wnd derived from tcp_init_cwnd() is: init_rcv_wnd(mss)= min { init_cwnd(mss1)*mss1 for mss1 <= mss } It is something sort of: if (mss < 1096) return mss*4; if (mss < 1096*2) return 1096*4; return mss*2; (I just scrablled a graph of piece of paper, it is difficult to see or to explain without this) I selected it differently giving more window than it is strictly required. Initial receive window must be large enough to allow sender following to the rfc (or just setting initial cwnd to 2) to send initial burst. But besides that it is arbitrary, so I decided to give slack space of one segment. Actually, the logic was: If mss is low/normal (<=ethernet), set window to receive more than initial burst allowed by rfc under the worst conditions i.e. mss*4. This gives slack space of 1 segment for ethernet frames. For msses slighlty more than ethernet frame, take 3. Try to give slack space of 1 frame again. If mss is huge, force 2*mss. No slack space. Value 1460*3 is really confusing. Minimal one is 1096*2, but besides that it is an arbitrary value. It was meant to be ~4096. 1460*3 is just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so that I guess hands typed this themselves. -------------------- Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds13-118/+150
2005-09-29[PATCH] proc_mkdir() should be used to create procfs directoriesAl Viro2-18/+7
A bunch of create_proc_dir_entry() calls creating directories had crept in since the last sweep; converted to proc_mkdir(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-29[NET]: Fix reversed logic in eth_type_trans().David S. Miller1-1/+1
I got the second compare_eth_addr() test reversed, oops. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29[ATM]: fix bug in atm address list handlingMartin Whitaker1-2/+4
From: Martin Whitaker <atm@martin-whitaker.co.uk> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-29[ATM]: track and close listen sockets when sigd exitsChas Williams3-6/+7
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-29[ATM]: net/atm/ioctl.c: autoload pppoatm and br2684Roman Kagan1-8/+26
Signed-off-by: Roman Kagan <rkagan@mail.ru> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
2005-09-29[TCP]: Fix init_cwnd calculations in tcp_select_initial_window()David S. Miller1-5/+6
Match it up to what RFC2414 really specifies. Noticed by Rick Jones. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[APPLETALK]: Fix broadcast bug.Oliver Dawid1-9/+22
From: Oliver Dawid <oliver@helios.de> we found a bug in net/appletalk/ddp.c concerning broadcast packets. In kernel 2.4 it was working fine. The bug first occured 4 years ago when switching to new SNAP layer handling. This bug can be splitted up into a sending(1) and reception(2) problem: Sending(1) In kernel 2.4 broadcast packets were sent to a matching ethernet device and atalk_rcv() was called to receive it as "loopback" (so loopback packets were shortcutted and handled in DDP layer). When switching to the new SNAP structure, this shortcut was removed and the loopback packet was send to SNAP layer. The author forgot to replace the remote device pointer by the loopback device pointer before sending the packet to SNAP layer (by calling ddp_dl->request() ) therfor the packet was not sent back by underlying layers to ddp's atalk_rcv(). Reception(2) In atalk_rcv() a packet received by this loopback mechanism contains now the (rigth) loopback device pointer (in Kernel 2.4 it was the (wrong) remote ethernet device pointer) and therefor no matching socket will be found to deliver this packet to. Because a broadcast packet should be send to the first matching socket (as it is done in many other protocols (?)), we removed the network comparison in broadcast case. Below you will find a patch to correct this bug. Its diffed to kernel 2.6.14-rc1 Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[NET]: Slightly optimize ethernet address comparison.David S. Miller1-10/+21
We know the thing is at least 2-byte aligned, so take advantage of that instead of invoking memcmp() which results in truly horrifically inefficient code because it can't assume anything about alignment. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[ROSE]: fix typo (regeistration)Alexey Dobriyan1-1/+1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[ROSE]: check rose_ndevs earlierAlexey Dobriyan1-9/+11
* Don't bother with proto registering if rose_ndevs is bad. * Make escape structure more coherent. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[ROSE]: return sane -E* from rose_proto_init()Alexey Dobriyan1-4/+6
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[ROSE]: do proto_unregister() on exit pathsAlexey Dobriyan1-0/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[NET]: Fix module reference counts for loadable protocol modulesFrank Filz2-13/+20
I have been experimenting with loadable protocol modules, and ran into several issues with module reference counting. The first issue was that __module_get failed at the BUG_ON check at the top of the routine (checking that my module reference count was not zero) when I created the first socket. When sk_alloc() is called, my module reference count was still 0. When I looked at why sctp didn't have this problem, I discovered that sctp creates a control socket during module init (when the module ref count is not 0), which keeps the reference count non-zero. This section has been updated to address the point Stephen raised about checking the return value of try_module_get(). The next problem arose when my socket init routine returned an error. This resulted in my module reference count being decremented below 0. My socket ops->release routine was also being called. The issue here is that sock_release() calls the ops->release routine and decrements the ref count if sock->ops is not NULL. Since the socket probably didn't get correctly initialized, this should not be done, so we will set sock->ops to NULL because we will not call try_module_get(). While searching for another bug, I also noticed that sys_accept() has a possibility of doing a module_put() when it did not do an __module_get so I re-ordered the call to security_socket_accept(). Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[NET]: Prefetch dev->qdisc_lock in dev_queue_xmit()Eric Dumazet1-0/+2
We know the lock is going to be taken. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28[NET]: Use non-recursive algorithm in skb_copy_datagram_iovec()Daniel Phillips1-55/+26
Use iteration instead of recursion. Fraglists within fraglists should never occur, so we BUG check this. Signed-off-by: Daniel Phillips <phillips@istop.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[NEIGH]: Add debugging check when adding timers.David S. Miller1-9/+14
If we double-add a neighbour entry timer, which should be impossible but has been reported, dump the current state of the entry so that we can debug this. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/llc-2.6David S. Miller18-604/+806
2005-09-27[NETFILTER]: Fix invalid module autoloading by splitting iptable_natHarald Welte4-34/+35
When you've enabled conntrack and NAT as a module (standard case in all distributions), and you've also enabled the new conntrack netlink interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko. This causes a huge performance penalty, since for every packet you iterate the nat code, even if you don't want it. This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the iptables frontend (iptable_nat.ko). Threfore, ip_conntrack_netlink.ko will only pull ip_nat.ko, but not the frontend. ip_nat.ko will "only" allocate some resources, but not affect runtime performance. This separation is also a nice step in anticipation of new packet filters (nf-hipac, ipset, pkttables) being able to use the NAT core. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[AF_PACKET]: Remove bogus checks added to packet_sendmsg().David S. Miller1-6/+0
These broke existing apps, and the checks are superfluous as the values being verified aren't even used. Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[IPV6]: Fix [Bug 5306] Oops on IPv6 route lookupHerbert Xu1-0/+2
> Steps to reproduce: > 1. Boot Linux, do NOT setup any IPv6 routes > 2. ip route get 2001::1 (or any unroutable address) Well caught. We never set rt6i_idev on ip6_null_entry. This patch should make the problem go away. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27[NET]: Make sure ctl buffer is aligned properly in sys_sendmsg().Alex Williamson1-1/+3
It's on the stack and declared as "unsigned char[]", but pointers and similar can be in here thus we need to give it an explicit alignment attribute. Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25[NETFILTER] ip_conntrack: Update event cache when status changesHarald Welte3-1/+4
The GRE, SCTP and TCP protocol helpers did not call ip_conntrack_event_cache() when updating ct->status. This patch adds the respective calls. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25[IRDA]: *irttp cleanupAlexey Dobriyan1-11/+4
* Remove useless comment. * Remove useless assertions. * Remove useless comparison. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25[IRDA]: Fix memory leak in irttp_init()Alexey Dobriyan1-0/+1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25[NET]: Protect neigh_stat_seq_fops by CONFIG_PROC_FSAmos Waterland1-0/+2
From: Amos Waterland <apw@us.ibm.com> If CONFIG_PROC_FS is not selected, the compiler emits this warning: net/core/neighbour.c:64: warning: `neigh_stat_seq_fops' defined but not used Which is correct, because neigh_stat_seq_fops is in fact only initialized and used by code that is protected by CONFIG_PROC_FS. So this patch fixes that up. Signed-off-by: Amos Waterland <apw@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25[NETFILTER]: Fix ip[6]t_NFQUEUE Kconfig dependencyHarald Welte4-2/+24
We have to introduce a separate Kconfig menu entry for the NFQUEUE targets. They cannot "just" depend on nfnetlink_queue, since nfnetlink_queue could be linked into the kernel, whereas iptables can be a module. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[SCTP]: Fix SCTP_SHUTDOWN notifications.Sridhar Samudrala1-11/+11
Fix to allow SCTP_SHUTDOWN notifications to be received on 1-1 style SCTP SOCK_STREAM sockets. Add SCTP_SHUTDOWN notification to the receive queue before updating the state of the association. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[NETFILTER] Fix conntrack event cache deadlock/oopsHarald Welte5-28/+28
This patch fixes a number of bugs. It cannot be reasonably split up in multiple fixes, since all bugs interact with each other and affect the same function: Bug #1: The event cache code cannot be called while a lock is held. Therefore, the call to ip_conntrack_event_cache() within ip_ct_refresh_acct() needs to be moved outside of the locked section. This fixes a number of 2.6.14-rcX oops and deadlock reports. Bug #2: We used to call ct_add_counters() for unconfirmed connections without holding a lock. Since the add operations are not atomic, we could race with another CPU. Bug #3: ip_ct_refresh_acct() lost REFRESH events in some cases where refresh (and the corresponding event) are desired, but no accounting shall be performed. Both, evenst and accounting implicitly depended on the skb parameter bein non-null. We now re-introduce a non-accounting "ip_ct_refresh()" variant to explicitly state the desired behaviour. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[NETFILTER] Fix sparse endian warnings in pptp helperAlexey Dobriyan1-6/+8
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[NETFILTER] fix DEBUG statement in PPTP helperHarald Welte1-1/+1
As noted by Alexey Dobriyan, the DEBUGP statement prints the wrong callID. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[BRIDGE]: TSO fix in br_dev_queue_push_xmitVlad Drukker1-1/+2
Signed-off-by: Vlad Drukker <vlad@storewiz.com> Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23[TCP]: Adjust Reno SACK estimate in tcp_fragmentHerbert Xu1-0/+9
Since the introduction of TSO pcount a year ago, it has been possible for tcp_fragment() to cause packets_out to decrease. Prior to that, tcp_retrans_try_collapse() was the only way for that to happen on the retransmission path. When this happens with Reno, it is possible for sasked_out to become invalid because it is only an estimate and not tied to any particular packet on the retransmission queue. Therefore we need to adjust sacked_out as well as left_out in the Reno case. The following patch does exactly that. This bug is pretty difficult to trigger in practice though since you need a SACKless peer with a retransmission that occurs just as the cached MTU value expires. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22[LLC]: fix llc_ui_recvmsg, making it behave like tcp_recvmsgArnaldo Carvalho de Melo4-36/+153
In fact it is an exact copy of the parts that makes sense to LLC :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Fix the accept pathArnaldo Carvalho de Melo6-92/+130
Borrowing the structure of TCP/IP for this. On the receive of new connections I was bh_lock_socking the _new_ sock, not the listening one, duh, now it survives the ssh connections storm I've been using to test this specific bug. Also fixes send side skb sock accounting. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Fix sparse warningsArnaldo Carvalho de Melo5-13/+6
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[TR]: Set correct frame type for SNAP packetsJochen Friedrich1-1/+1
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Fix llc_fixup_skb() bugJochen Friedrich1-2/+6
llc_fixup_skb() had a bug dropping 3 bytes packets (like UA frames). Token ring doesn't pad these frames. Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Fix for Bugzilla ticket #5157Jochen Friedrich1-2/+2
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Fix for Bugzilla ticket #5156Jochen Friedrich2-0/+8
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Use refcounting with struct llc_sapArnaldo Carvalho de Melo7-46/+41
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Do better struct sock accounting on skbsArnaldo Carvalho de Melo4-12/+14
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Use sk_wait_dataArnaldo Carvalho de Melo1-19/+23
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Use some more likely/unlikelyArnaldo Carvalho de Melo6-47/+42
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-09-22[LLC]: Add sysctl support for the LLC timeoutsArnaldo Carvalho de Melo6-25/+195
Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>