summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2011-03-08net: allow handlers to be processed for orig_devJiri Pirko1-1/+2
This was there before, I forgot about this. Allows deliveries to ptype_base handlers registered for orig_dev. I presume this is still desired. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-08ipv4: Inline fib_semantic_match into check_leafDavid S. Miller3-75/+51
This elimiates a lot of pure overhead due to parameter passing. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-08ipv4: Validate route entry type at insert instead of every lookup.David S. Miller1-26/+28
fib_semantic_match() requires that if the type doesn't signal an automatic error, it must be of type RTN_UNICAST, RTN_LOCAL, RTN_BROADCAST, RTN_ANYCAST, or RTN_MULTICAST. Checking this every route lookup is pointless work. Instead validate it during route insertion, via fib_create_info(). Also, there was nothing making sure the type value was less than RTN_MAX, so add that missing check while we're here. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-07Merge branch 'batman-adv/next' of git://git.open-mesh.org/ecsv/linux-mergeDavid S. Miller24-1232/+1634
2011-03-05batman-adv: Disallow regular interface as mesh deviceSven Eckelmann3-12/+36
When trying to associate a net_device with another net_device which already exists, batman-adv assumes that this interface is a fully initialized batman mesh interface without checking it. The behaviour when accessing data behind netdev_priv of a random net_device is undefined and potentially dangerous. Reported-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Remove unused hdr_size variable in route_unicast_packet()Linus Lüssing3-7/+5
Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: rename batman_if struct to hard_ifaceMarek Lindner18-330/+335
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: rename global if_list to hardif_listMarek Lindner6-22/+22
Batman-adv works with "hard interfaces" as well as "soft interfaces". The new name should better make clear which kind of interfaces this list stores. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: remove orig_hash spinlockMarek Lindner7-148/+38
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: increase refcount in create_neighbor to be consistentMarek Lindner2-37/+30
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Correct rcu refcounting for orig_nodeMarek Lindner9-36/+49
It might be possible that 2 threads access the same data in the same rcu grace period. The first thread calls call_rcu() to decrement the refcount and free the data while the second thread increases the refcount to use the data. To avoid this race condition all refcount operations have to be atomic. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: remove extra layer between hash and hash element - hash bucketMarek Lindner10-294/+298
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: separate ethernet comparing calls from hash functionsMarek Lindner8-42/+51
Note: The function compare_ether_addr() provided by the Linux kernel requires aligned memory. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Fix possible buffer overflow in softif neigh list outputLinus Lüssing1-21/+1
When printing the soft interface table the number of entries in the softif neigh list are first being counted and a fitting buffer allocated. After that the softif neigh list gets locked again and the buffer printed - which has the following two issues: For one thing, the softif neigh list might have grown when reacquiring the rcu lock, which results in writing outside of the allocated buffer. Furthermore 31 Bytes are not enough for printing an entry with a vid of more than 2 digits. The manual buffering is unnecessary, we can safely print to the seq directly during the rcu_read_lock(). Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Increase orig_node refcount before releasing rcu read lockLinus Lüssing2-3/+5
When unicast_send_skb() is increasing the orig_node's refcount another thread might have been freeing this orig_node already. We need to increase the refcount in the rcu read lock protected area to avoid that. Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Make bat_priv->curr_gw an rcu protected pointerLinus Lüssing2-32/+72
The rcu protected macros rcu_dereference() and rcu_assign_pointer() for the bat_priv->curr_gw need to be used, as well as spin/rcu locking. Otherwise we might end up using a curr_gw pointer pointing to already freed memory. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Linus Lüssing <linus.luessing@ascom.ch> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: make broadcast seqno operations atomicMarek Lindner3-22/+37
Batman-adv could receive several payload broadcasts at the same time that would trigger access to the broadcast seqno sliding window to determine whether this is a new broadcast or not. If these incoming broadcasts are accessing the sliding window simultaneously it could be left in an inconsistent state. Therefore it is necessary to make sure this access is atomic. Reported-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect bit operations to count OGMs with spinlockMarek Lindner2-34/+33
Reported-by: Linus Lüssing <linus.luessing@saxnet.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Correct rcu refcounting for batman_ifMarek Lindner4-38/+33
It might be possible that 2 threads access the same data in the same rcu grace period. The first thread calls call_rcu() to decrement the refcount and free the data while the second thread increases the refcount to use the data. To avoid this race condition all refcount operations have to be atomic. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Correct rcu refcounting for softif_neighMarek Lindner2-17/+16
It might be possible that 2 threads access the same data in the same rcu grace period. The first thread calls call_rcu() to decrement the refcount and free the data while the second thread increases the refcount to use the data. To avoid this race condition all refcount operations have to be atomic. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Correct rcu refcounting for gw_nodeMarek Lindner2-22/+17
It might be possible that 2 threads access the same data in the same rcu grace period. The first thread calls call_rcu() to decrement the refcount and free the data while the second thread increases the refcount to use the data. To avoid this race condition all refcount operations have to be atomic. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: Correct rcu refcounting for neigh_nodeMarek Lindner7-174/+313
It might be possible that 2 threads access the same data in the same rcu grace period. The first thread calls call_rcu() to decrement the refcount and free the data while the second thread increases the refcount to use the data. To avoid this race condition all refcount operations have to be atomic. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect bonding with rcu locksSimon Wunderlich7-163/+195
bonding / alternating candidates need to be secured by rcu locks as well. This patch therefore converts the bonding list from a plain pointer list to a rcu securable lists and references the bonding candidates. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect ogm counter arrays with spinlockMarek Lindner3-6/+33
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect originator nodes with reference countersMarek Lindner4-19/+78
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect each hash row with rcu locksMarek Lindner8-45/+141
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect neigh_nodes used outside of rcu_locks with refcountingMarek Lindner1-9/+31
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: free neighbors when an interface is deactivatedMarek Lindner1-2/+7
hardif_disable_interface() calls purge_orig_ref() to immediately free all neighbors associated with the interface that is going down. purge_orig_neighbors() checked if the interface status is IF_INACTIVE which is set to IF_NOT_IN_USE shortly before calling purge_orig_ref(). Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect neighbor list with rcu locksMarek Lindner3-21/+57
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: convert neighbor list to hlistMarek Lindner3-28/+35
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05batman-adv: protect neighbor nodes with reference countersMarek Lindner4-8/+28
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-03-05ipv4: Remove flowi from struct rtable.David S. Miller6-85/+133
The only necessary parts are the src/dst addresses, the interface indexes, the TOS, and the mark. The rest is unnecessary bloat, which amounts to nearly 50 bytes on 64-bit. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-05ipv4: Set rt->rt_iif more sanely on output routes.David S. Miller1-1/+1
rt->rt_iif is only ever inspected on input routes, for example DCCP uses this to populate a route lookup flow key when generating replies to another packet. Therefore, setting it to anything other than zero on output routes makes no sense. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-05ipv4: Get peer more cheaply in rt_init_metrics().David S. Miller1-2/+2
We know this is a new route object, so doing atomics and stuff makes no sense at all. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-05ipv4: Optimize flow initialization in output route lookup.David S. Miller1-8/+10
We burn a lot of useless cycles, cpu store buffer traffic, and memory operations memset()'ing the on-stack flow used to perform output route lookups in __ip_route_output_key(). Only the first half of the flow object members even matter for output route lookups in this context, specifically: FIB rules matching cares about: dst, src, tos, iif, oif, mark FIB trie lookup cares about: dst FIB semantic match cares about: tos, scope, oif Therefore only initialize these specific members and elide the memset entirely. On Niagara2 this kills about ~300 cycles from the output route lookup path. Likely, we can take things further, since all callers of output route lookups essentially throw away the on-stack flow they use. So they don't care if we use it as a scratch-pad to compute the final flow key. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-03-05inetpeer: seqlock optimizationEric Dumazet1-40/+35
David noticed : ------------------ Eric, I was profiling the non-routing-cache case and something that stuck out is the case of calling inet_getpeer() with create==0. If an entry is not found, we have to redo the lookup under a spinlock to make certain that a concurrent writer rebalancing the tree does not "hide" an existing entry from us. This makes the case of a create==0 lookup for a not-present entry really expensive. It is on the order of 600 cpu cycles on my Niagara2. I added a hack to not do the relookup under the lock when create==0 and it now costs less than 300 cycles. This is now a pretty common operation with the way we handle COW'd metrics, so I think it's definitely worth optimizing. ----------------- One solution is to use a seqlock instead of a spinlock to protect struct inet_peer_base. After a failed avl tree lookup, we can easily detect if a writer did some changes during our lookup. Taking the lock and redo the lookup is only necessary in this case. Note: Add one private rcu_deref_locked() macro to place in one spot the access to spinlock included in seqlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-04Merge branch 'for-davem' of ↵David S. Miller14-159/+266
ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2011-03-04Merge branch 'master' of ↵John W. Linville14-159/+266
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2011-03-04Merge branch 'master' of ↵David S. Miller17-73/+108
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h
2011-03-04Merge branch 'for-linus' of ↵Linus Torvalds1-3/+17
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]
2011-03-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds8-19/+36
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits) MAINTAINERS: Add Andy Gospodarek as co-maintainer. r8169: disable ASPM RxRPC: Fix v1 keys AF_RXRPC: Handle receiving ACKALL packets cnic: Fix lost interrupt on bnx2x cnic: Prevent status block race conditions with hardware net: dcbnl: check correct ops in dcbnl_ieee_set() e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead igb: fix sparse warning e1000: fix sparse warning netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values dccp: fix oops on Reset after close ipvs: fix dst_lock locking on dest update davinci_emac: Add Carrier Link OK check in Davinci RX Handler bnx2x: update driver version to 1.62.00-6 bnx2x: properly calculate lro_mss bnx2x: perform statistics "action" before state transition. bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode). bnx2x: Fix ethtool -t link test for MF (non-pmf) devices. bnx2x: Fix nvram test for single port devices. ...
2011-03-04DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]David Howells1-3/+17
When a DNS resolver key is instantiated with an error indication, attempts to read that key will result in an oops because user_read() is expecting there to be a payload - and there isn't one [CVE-2011-1076]. Give the DNS resolver key its own read handler that returns the error cached in key->type_data.x[0] as an error rather than crashing. Also make the kenter() at the beginning of dns_resolver_instantiate() limit the amount of data it prints, since the data is not necessarily NUL-terminated. The buggy code was added in: commit 4a2d789267e00b5a1175ecd2ddefcc78b83fbf09 Author: Wang Lei <wang840925@gmail.com> Date: Wed Aug 11 09:37:58 2010 +0100 Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2] This can trivially be reproduced by any user with the following program compiled with -lkeyutils: #include <stdlib.h> #include <keyutils.h> #include <err.h> static char payload[] = "#dnserror=6"; int main() { key_serial_t key; key = add_key("dns_resolver", "a", payload, sizeof(payload), KEY_SPEC_SESSION_KEYRING); if (key == -1) err(1, "add_key"); if (keyctl_read(key, NULL, 0) == -1) err(1, "read_key"); return 0; } What should happen is that keyctl_read() reports error 6 (ENXIO) to the user: dns-break: read_key: No such device or address but instead the kernel oopses. This cannot be reproduced with the 'keyutils add' or 'keyutils padd' commands as both of those cut the data down below the NUL termination that must be included in the data. Without this dns_resolver_instantiate() will return -EINVAL and the key will not be instantiated such that it can be read. The oops looks like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff811b99f7>] user_read+0x4f/0x8f PGD 3bdf8067 PUD 385b9067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq CPU 0 Modules linked in: Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468 /DG965RY RIP: 0010:[<ffffffff811b99f7>] [<ffffffff811b99f7>] user_read+0x4f/0x8f RSP: 0018:ffff88003bf47f08 EFLAGS: 00010246 RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378 RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000 R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1 FS: 00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090) Stack: ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000 ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000 00000000004005a0 00007fffba368060 0000000000000000 0000000000000000 Call Trace: [<ffffffff811b708e>] keyctl_read_key+0xac/0xcf [<ffffffff811b7c07>] sys_keyctl+0x75/0xb6 [<ffffffff81001f7b>] system_call_fastpath+0x16/0x1b Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed <41> 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48 RIP [<ffffffff811b99f7>] user_read+0x4f/0x8f RSP <ffff88003bf47f08> CR2: 0000000000000010 Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> cc: Wang Lei <wang840925@gmail.com> Signed-off-by: James Morris <jmorris@namei.org>
2011-03-04netlink: kill eff_cap from struct netlink_skb_parmsPatrick McHardy1-6/+0
Netlink message processing in the kernel is synchronous these days, capabilities can be checked directly in security_netlink_recv() from the current process. Signed-off-by: Patrick McHardy <kaber@trash.net> Reviewed-by: James Morris <jmorris@namei.org> [chrisw: update to include pohmelfs and uvesafb] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03ipv6: Use ERR_CAST in addrconf_dst_alloc.David S. Miller1-6/+1
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03ipv4: Fix __ip_dev_find() to use ifa_local instead of ifa_address.David S. Miller1-2/+2
Reported-by: Stephen Hemminger <shemminger@vyatta.com> Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03net_sched: reduce fifo qdisc sizeEric Dumazet2-30/+22
Because of various alignements [SLUB / qdisc], we use 512 bytes of memory for one {p|b}fifo qdisc, instead of 256 bytes on 64bit arches and 192 bytes on 32bit ones. Move the "u32 limit" inside "struct Qdisc" (no impact on other qdiscs) Change qdisc_alloc(), first trying a regular allocation before an oversized one. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parmsPatrick McHardy3-30/+35
Netlink message processing in the kernel is synchronous these days, the session information can be collected when needed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03ipv4: Fix crash in dst_release when udp_sendmsg route lookup fails.David S. Miller1-0/+1
As reported by Eric: [11483.697233] IP: [<c12b0638>] dst_release+0x18/0x60 ... [11483.697741] Call Trace: [11483.697764] [<c12fc9d2>] udp_sendmsg+0x282/0x6e0 [11483.697790] [<c12a1c01>] ? memcpy_toiovec+0x51/0x70 [11483.697818] [<c12dbd90>] ? ip_generic_getfrag+0x0/0xb0 The pointer passed to dst_release() is -EINVAL, that's because we leave an error pointer in the local variable "rt" by accident. NULL it out to fix the bug. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03AF_RXRPC: Handle receiving ACKALL packetsDavid Howells1-0/+1
The OpenAFS server is now sending ACKALL packets, so we need to handle them. Otherwise we report a protocol error and abort. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-03dcbnl: add support for retrieving peer configuration - ceeShmulik Ravid1-4/+81
This patch adds the support for retrieving the remote or peer DCBX configuration via dcbnl for embedded DCBX stacks supporting the CEE DCBX standard. Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>