summaryrefslogtreecommitdiff
path: root/net/ipv6
AgeCommit message (Collapse)AuthorFilesLines
2016-01-30ipv6/addrlabel: fix ip6addrlbl_get()Andrey Ryabinin1-1/+1
commit e459dfeeb64008b2d23bdf600f03b3605dbb8152 upstream. ip6addrlbl_get() has never worked. If ip6addrlbl_hold() succeeded, ip6addrlbl_get() will exit with '-ESRCH'. If ip6addrlbl_hold() failed, ip6addrlbl_get() will use about to be free ip6addrlbl_entry pointer. Fix this by inverting ip6addrlbl_hold() check. Fixes: 2a8cc6c89039 ("[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Cong Wang <cwang@twopensource.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit 39b214ba1a357359f9c0be6ef8d21f2e5187567a) Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-01-30net: add validation for the socket syscall protocol argumentHannes Frederic Sowa1-0/+3
commit 79462ad02e861803b3840cc782248c7359451cd9 upstream. 郭永刚 reported that one could simply crash the kernel as root by using a simple program: int socket_fd; struct sockaddr_in addr; addr.sin_port = 0; addr.sin_addr.s_addr = INADDR_ANY; addr.sin_family = 10; socket_fd = socket(10,3,0x40000000); connect(socket_fd , &addr,16); AF_INET, AF_INET6 sockets actually only support 8-bit protocol identifiers. inet_sock's skc_protocol field thus is sized accordingly, thus larger protocol identifiers simply cut off the higher bits and store a zero in the protocol fields. This could lead to e.g. NULL function pointer because as a result of the cut off inet_num is zero and we call down to inet_autobind, which is NULL for raw sockets. kernel: Call Trace: kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70 kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80 kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110 kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80 kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200 kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10 kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89 I found no particular commit which introduced this problem. CVE: CVE-2015-8543 Cc: Cong Wang <cwang@twopensource.com> Reported-by: 郭永刚 <guoyonggang@360.cn> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 2.6.32: open-code U8_MAX; adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-01-30udp: properly support MSG_PEEK with truncated buffersEric Dumazet1-2/+4
commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream. Backport of this upstream commit into stable kernels : 89c22d8c3b27 ("net: Fix skb csum races when peeking") exposed a bug in udp stack vs MSG_PEEK support, when user provides a buffer smaller than skb payload. In this case, skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov); returns -EFAULT. This bug does not happen in upstream kernels since Al Viro did a great job to replace this into : skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg); This variant is safe vs short buffers. For the time being, instead reverting Herbert Xu patch and add back skb->ip_summed invalid changes, simply store the result of udp_lib_checksum_complete() so that we avoid computing the checksum a second time, and avoid the problematic skb_copy_and_csum_datagram_iovec() call. This patch can be applied on recent kernels as it avoids a double checksumming, then backported to stable kernels as a bug fix. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit 18a6eba2eabbcb50a78210b16f7dd43d888a537b) Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-01-30Revert "net: add length argument to skb_copy_and_csum_datagram_iovec"Willy Tarreau2-3/+2
This reverts commit c507639ba963bb47e3f515670a7cace76af76ab6. As reported by Michal Kubecek, this fix doesn't handle truncated reads correctly. Next patch from Eric fixes it better. Signed-off-by: Willy Tarreau <w@1wt.eu>
2016-01-30ip6mr: call del_timer_sync() in ip6mr_free_table()WANG Cong1-1/+1
commit 7ba0c47c34a1ea5bc7a24ca67309996cce0569b5 upstream. We need to wait for the flying timers, since we are going to free the mrtable right after it. Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> [ wt: 2.6.32 has a single table hence a single timer. ip6_mr_init() has the same del_timer() call on the error path, but we don't need to change it since at this point the timer hasn't been started yet ] Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-12-06net: add length argument to skb_copy_and_csum_datagram_iovecSabrina Dubroca2-2/+3
Without this length argument, we can read past the end of the iovec in memcpy_toiovec because we have no way of knowing the total length of the iovec's buffers. This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb csum races when peeking") has been backported but that don't have the ioviter conversion, which is almost all the stable trees <= 3.18. This also fixes a kernel crash for NFS servers when the client uses -onfsvers=3,proto=udp to mount the export. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> [bwh: Backported to 3.2: adjust context in include/linux/skbuff.h] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit 127500d724f8c43f452610c9080444eedb5eaa6c) Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-12-06ipv6: addrconf: validate new MTU before applying itMarcelo Leitner1-1/+36
commit 77751427a1ff25b27d47a4c36b12c3c8667855ac upstream. Currently we don't check if the new MTU is valid or not and this allows one to configure a smaller than minimum allowed by RFCs or even bigger than interface own MTU, which is a problem as it may lead to packet drops. If you have a daemon like NetworkManager running, this may be exploited by remote attackers by forging RA packets with an invalid MTU, possibly leading to a DoS. (NetworkManager currently only validates for values too small, but not for too big ones.) The fix is just to make sure the new value is valid. That is, between IPV6_MIN_MTU and interface's MTU. Note that similar check is already performed at ndisc_router_discovery(), for when kernel itself parses the RA. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 2.6.32: - Add a strategy for the sysctl as we don't get a default strategy - Adjust context, spacing] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-09-18ipv6: Fix return of xfrm6_tunnel_rcv()David S. Miller1-1/+1
commit 6ac3f6649223d916bbdf1e823926f8f3b34b5d99 upstream. Like ipv4, just return xfrm6_rcv_spi()'s return value directly. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-09-18udp: fix behavior of wrong checksumsEric Dumazet1-4/+2
commit beb39db59d14990e401e235faf66a6b9b31240b0 upstream. We have two problems in UDP stack related to bogus checksums : 1) We return -EAGAIN to application even if receive queue is not empty. This breaks applications using edge trigger epoll() 2) Under UDP flood, we can loop forever without yielding to other processes, potentially hanging the host, especially on non SMP. This patch is an attempt to make things better. We might in the future add extra support for rt applications wanting to better control time spent doing a recv() in a hostile environment. For example we could validate checksums before queuing packets in socket receive queue. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> CVE-2015-5364 CVE-2015-5366 Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-05-24udp: only allow UFO for packets from SOCK_DGRAM socketsMichal Kubeček1-1/+2
[ Upstream commit acf8dd0a9d0b9e4cdb597c2f74802f79c699e802 ] If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer checksum is to be computed on segmentation. However, in this case, skb->csum_start and skb->csum_offset are never set as raw socket transmit path bypasses udp_send_skb() where they are usually set. As a result, driver may access invalid memory when trying to calculate the checksum and store the result (as observed in virtio_net driver). Moreover, the very idea of modifying the userspace provided UDP header is IMHO against raw socket semantics (I wasn't able to find a document clearly stating this or the opposite, though). And while allowing CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit too intrusive change just to handle a corner case like this. Therefore disallowing UFO for packets from SOCK_DGRAM seems to be the best option. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit 332640b2821f75381b1049a904d93d4fb846334f) Signed-off-by: Willy Tarreau <w@1wt.eu>
2015-05-24ipv6: Don't reduce hop limit for an interfaceD.S. Ljungmark1-1/+8
commit 6fd99094de2b83d1d4c8457f2c83483b2828e75a upstream. A local route may have a lower hop_limit set than global routes do. RFC 3756, Section 4.2.7, "Parameter Spoofing" > 1. The attacker includes a Current Hop Limit of one or another small > number which the attacker knows will cause legitimate packets to > be dropped before they reach their destination. > As an example, one possible approach to mitigate this threat is to > ignore very small hop limits. The nodes could implement a > configurable minimum hop limit, and ignore attempts to set it below > said limit. Signed-off-by: D.S. Ljungmark <ljungmark@modio.se> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 3.2: adjust ND_PRINTK() usage] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit f10f7d2a8200fe33c5030c7e32df3a2b3561f3cd) Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET ↵Hannes Frederic Sowa1-1/+6
pending data CVE-2013-4162 BugLink: http://bugs.launchpad.net/bugs/1205070 We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <davej@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> (back ported from commit 8822b64a0fa64a5dd1dfcf837c5b0be83f8c05d1) Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19inet: fix possible memory corruption with UDP_CORK and UFOHannes Frederic Sowa1-1/+1
[ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <jiri@resnulli.us> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> (cherry picked from commit 5124ae99ac8a8f63d0fca9b75adaef40b20678ff) Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: udp packets following an UFO enqueued packet need also be handled by UFOHannes Frederic Sowa1-19/+12
In the following scenario the socket is corked: If the first UDP packet is larger then the mtu we try to append it to the write queue via ip6_ufo_append_data. A following packet, which is smaller than the mtu would be appended to the already queued up gso-skb via plain ip6_append_data. This causes random memory corruptions. In ip6_ufo_append_data we also have to be careful to not queue up the same skb multiple times. So setup the gso frame only when no first skb is available. This also fixes a shortcoming where we add the current packet's length to cork->length but return early because of a packet > mtu with dontfrag set (instead of sutracting it again). Found with trinity. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 2811ebac2521ceac84f2bdae402455baa6a7fb47) [wt: 2.6.32 doesn't have dontfrag so remove the optimization] Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: fix possible seqlock deadlock in ip6_finish_output2Hannes Frederic Sowa1-2/+2
[ Upstream commit 7f88c6b23afbd31545c676dea77ba9593a1a14bf ] IPv6 stats are 64 bits and thus are protected with a seqlock. By not disabling bottom-half we could deadlock here if we don't disable bh and a softirq reentrantly updates the same mib. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: fix leaking uninitialized port number of offender sockaddrHannes Frederic Sowa1-0/+1
[ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ] Offenders don't have port numbers, so set it to 0. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu ↵Hannes Frederic Sowa3-3/+4
functions [ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ] Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") conditionally updated addr_len if the msg_name is written to. The recv_error and rxpmtu functions relied on the recvmsg functions to set up addr_len before. As this does not happen any more we have to pass addr_len to those functions as well and set it to the size of the corresponding sockaddr length. This broke traceroute and such. Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: Brad Spengler <spender@grsecurity.net> Reported-by: Tom Labanowski Cc: mpb <mpb.mail@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19inet: prevent leakage of uninitialized memory to user in recv syscallsHannes Frederic Sowa2-7/+2
[ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ] Only update *addr_len when we actually fill in sockaddr, otherwise we can return uninitialized memory from the stack to the caller in the recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL) checks because we only get called with a valid addr_len pointer either from sock_common_recvmsg or inet_recvmsg. If a blocking read waits on a socket which is concurrently shut down we now return zero and set msg_msgnamelen to 0. Reported-by: mpb <mpb.mail@gmail.com> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> [wt: no ieee802154, ping nor l2tp in 2.6.32] Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcvDuan Jiong1-2/+5
[ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ] As the rfc 4191 said, the Router Preference and Lifetime values in a ::/0 Route Information Option should override the preference and lifetime values in the Router Advertisement header. But when the kernel deals with a ::/0 Route Information Option, the rt6_get_route_info() always return NULL, that means that overriding will not happen, because those default routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router(). In order to deal with that condition, we should call rt6_get_dflt_router when the prefix length is 0. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ICMPv6: treat dest unreachable codes 5 and 6 as EACCES, not EPROTOJiri Bohac1-1/+9
[ Upstream commit 61e76b178dbe7145e8d6afa84bb4ccea71918994 ] RFC 4443 has defined two additional codes for ICMPv6 type 1 (destination unreachable) messages: 5 - Source address failed ingress/egress policy 6 - Reject route to destination Now they are treated as protocol error and icmpv6_err_convert() converts them to EPROTO. RFC 4443 says: "Codes 5 and 6 are more informative subsets of code 1." Treat codes 5 and 6 as code 1 (EACCES) Btw, connect() returning -EPROTO confuses firefox, so that fallback to other/IPv4 addresses does not work: https://bugzilla.mozilla.org/show_bug.cgi?id=910773 Signed-off-by: Jiri Bohac <jbohac@suse.cz> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: Don't depend on per socket memory for neighbour discovery messagesThomas Graf1-7/+9
[ Upstream commit 25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 ] Allocating skbs when sending out neighbour discovery messages currently uses sock_alloc_send_skb() based on a per net namespace socket and thus share a socket wmem buffer space. If a netdevice is temporarily unable to transmit due to carrier loss or for other reasons, the queued up ndisc messages will cosnume all of the wmem space and will thus prevent from any more skbs to be allocated even for netdevices that are able to transmit packets. The number of neighbour discovery messages sent is very limited, use of alloc_skb() bypasses the socket wmem buffer size enforcement while the manual call to skb_set_owner_w() maintains the socket reference needed for the IPv6 output path. This patch has orginally been posted by Eric Dumazet in a modified form. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Fabio Estevam <festevam@gmail.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Tested-by: Stephen Warren <swarren@nvidia.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: drop packets with multiple fragmentation headersHannes Frederic Sowa1-0/+5
[ Upstream commit f46078cfcd77fa5165bf849f5e568a7ac5fa569c ] It is not allowed for an ipv6 packet to contain multiple fragmentation headers. So discard packets which were already reassembled by fragmentation logic and send back a parameter problem icmp. The updates for RFC 6980 will come in later, I have to do a bit more research here. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: remove max_addresses check from ipv6_create_tempaddrHannes Frederic Sowa1-6/+4
[ Upstream commit 4b08a8f1bd8cb4541c93ec170027b4d0782dab52 ] Because of the max_addresses check attackers were able to disable privacy extensions on an interface by creating enough autoconfigured addresses: <http://seclists.org/oss-sec/2012/q4/292> But the check is not actually needed: max_addresses protects the kernel to install too many ipv6 addresses on an interface and guards addrconf_prefix_rcv to install further addresses as soon as this limit is reached. We only generate temporary addresses in direct response of a new address showing up. As soon as we filled up the maximum number of addresses of an interface, we stop installing more addresses and thus also stop generating more temp addresses. Even if the attacker tries to generate a lot of temporary addresses by announcing a prefix and removing it again (lifetime == 0) we won't install more temp addresses, because the temporary addresses do count to the maximum number of addresses, thus we would stop installing new autoconfigured addresses when the limit is reached. This patch fixes CVE-2013-0343 (but other layer-2 attacks are still possible). Thanks to Ding Tianhong to bring this topic up again. Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: George Kargiotakis <kargig@void.gr> Cc: P J P <ppandit@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not matchHannes Frederic Sowa1-4/+12
[ Upstream commit 3e3be275851bc6fc90bfdcd732cd95563acd982b ] In case a subtree did not match we currently stop backtracking and return NULL (root table from fib_lookup). This could yield in invalid routing table lookups when using subtrees. Instead continue to backtrack until a valid subtree or node is found and return this match. Also remove unneeded NULL check. Reported-by: Teco Boot <teco@inf-net.nl> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Lamparter <equinox@diac24.net> Cc: <boutier@pps.univ-paris-diderot.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: fix possible crashes in ip6_cork_release()Eric Dumazet1-1/+1
[ Upstream commit 284041ef21fdf2e0d216ab6b787bc9072b4eb58a ] commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data") added some code duplication and bad error recovery, leading to potential crash in ip6_cork_release() as kfree() could be called with garbage. use kzalloc() to make sure this wont happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_putSalam Noureddine1-2/+2
[ Upstream commit 9260d3e1013701aa814d10c8fc6a9f92bd17d643 ] It is possible for the timer handlers to run after the call to ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the handler function in order to do proper cleanup when the refcnt reaches 0. Otherwise, the refcnt can reach zero without the inet6_dev being destroyed and we end up leaking a reference to the net_device and see messages like the following, unregister_netdevice: waiting for eth0 to become free. Usage count = 1 Tested on linux-3.4.43. Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19net: do not call sock_put() on TIMEWAIT socketsEric Dumazet1-1/+1
[ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: tcp: fix panic in SYN processingEric Dumazet1-1/+1
commit 72a3effaf633bc ([NET]: Size listen hash tables using backlog hint) added a bug allowing inet6_synq_hash() to return an out of bound array index, because of u16 overflow. Bug can happen if system admins set net.core.somaxconn & net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit c16a98ed91597b40b22b540c6517103497ef8e74) Signed-off-by: Willy Tarreau <w@1wt.eu>
2014-05-19ipv6: ip6_sk_dst_check() must not assume ipv6 dstEric Dumazet1-1/+7
commit a963a37d384d71ad43b3e9e79d68d42fbe0901f3 upstream It's possible to use AF_INET6 sockets and to connect to an IPv4 destination. After this, socket dst cache is a pointer to a rtable, not rt6_info. ip6_sk_dst_check() should check the socket dst cache is IPv6, or else various corruptions/crashes can happen. Dave Jones can reproduce immediate crash with trinity -q -l off -n -c sendmsg -c connect With help from Hannes Frederic Sowa Reported-by: Dave Jones <davej@redhat.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Willy Tarreau <w@1wt.eu>
2013-06-10ipv6: make fragment identifications less predictableEric Dumazet3-6/+38
[ Backport of upstream commit 87c48fa3b4630905f98268dde838ee43626a060c ] Fernando Gont reported current IPv6 fragment identification generation was not secure, because using a very predictable system-wide generator, allowing various attacks. IPv4 uses inetpeer cache to address this problem and to get good performance. We'll use this mechanism when IPv6 inetpeer is stable enough in linux-3.1 For the time being, we use jhash on destination address to provide less predictable identifications. Also remove a spinlock and use cmpxchg() to get better SMP performance. Reported-by: Fernando Gont <fernando@gont.com.ar> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> [bwh: Backport further to 2.6.32] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
2013-06-10ipv6: discard overlapping fragmentNicolas Dichtel1-59/+15
commit 70789d7052239992824628db8133de08dc78e593 upstream RFC5722 prohibits reassembling fragments when some data overlaps. Bug spotted by Zhang Zuotao <zuotao.zhang@6wind.com>. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> [dannf: backported to Debian's 2.6.32] Signed-off-by: Willy Tarreau <w@1wt.eu>
2013-06-10inet: add RCU protection to inet->optEric Dumazet1-1/+1
commit f6d8bd051c391c1c0458a30b2a7abcd939329259 upstream. We lack proper synchronization to manipulate inet->opt ip_options Problem is ip_make_skb() calls ip_setup_cork() and ip_setup_cork() possibly makes a copy of ipc->opt (struct ip_options), without any protection against another thread manipulating inet->opt. Another thread can change inet->opt pointer and free old one under us. Use RCU to protect inet->opt (changed to inet->inet_opt). Instead of handling atomic refcounts, just copy ip_options when necessary, to avoid cache line dirtying. We cant insert an rcu_head in struct ip_options since its included in skb->cb[], so this patch is large because I had to introduce a new ip_options_rcu structure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> [dannf/bwh: backported to Debian's 2.6.32] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
2012-10-08ipsec: be careful of non existing mac headersEric Dumazet2-9/+3
[ Upstream commit 03606895cd98c0a628b17324fd7b5ff15db7e3cd ] Niccolo Belli reported ipsec crashes in case we handle a frame without mac header (atm in his case) Before copying mac header, better make sure it is present. Bugzilla reference: https://bugzilla.kernel.org/show_bug.cgi?id=42809 Reported-by: Niccolò Belli <darkbasic@linuxsystems.it> Tested-by: Niccolò Belli <darkbasic@linuxsystems.it> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
2012-02-13net: sock_queue_err_skb() dont mess with sk_forward_allocEric Dumazet1-4/+2
commit b1faf5666438090a4dc4fceac8502edc7788b7e3 upstream. Correct sk_forward_alloc handling for error_queue would need to use a backlog of frames that softirq handler could not deliver because socket is owned by user thread. Or extend backlog processing to be able to process normal and error packets. Another possibility is to not use mem charge for error queue, this is what I implemented in this patch. Note: this reverts commit 29030374 (net: fix sk_forward_alloc corruptions), since we dont need to lock socket anymore. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: 单卫 <shanwei88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-13net: fix sk_forward_alloc corruptionsEric Dumazet1-2/+4
commit 2903037400a26e7c0cc93ab75a7d62abfacdf485 upstream. As David found out, sock_queue_err_skb() should be called with socket lock hold, or we risk sk_forward_alloc corruption, since we use non atomic operations to update this field. This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots. (BH already disabled) 1) skb_tstamp_tx() 2) Before calling ip_icmp_error(), in __udp4_lib_err() 3) Before calling ipv6_icmp_error(), in __udp6_lib_err() Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: 单卫 <shanwei88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2011-11-26ipv6: udp: fix the wrong headroom checkShan Wei1-1/+1
commit a9cf73ea7ff78f52662c8658d93c226effbbedde upstream. At this point, skb->data points to skb_transport_header. So, headroom check is wrong. For some case:bridge(UFO is on) + eth device(UFO is off), there is no enough headroom for IPv6 frag head. But headroom check is always false. This will bring about data be moved to there prior to skb->head, when adding IPv6 frag header to skb. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-08ipv6: Add GSO support on forwarding pathHerbert Xu1-1/+1
commit 0aa68271510ae2b221d4b60892103837be63afe4 upstream. Currently we disallow GSO packets on the IPv6 forward path. This patch fixes this. Note that I discovered that our existing GSO MTU checks (e.g., IPv4 forwarding) are buggy in that they skip the check altogether, when they really should be checking gso_size + header instead. I have also been lazy here in that I haven't bothered to segment the GSO packet by hand before generating an ICMP message. Someone should add that to be 100% correct. Reported-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Apollon Oikonomopoulos <apoikos@gmail.com> Signed-off-by: Faidon Liambotis <paravoid@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-11-08Fix broken backport for IPv6 tunnelsStratos Psomadakis1-2/+2
Fix broken backport for IPv6 tunnels in 2.6.32-longterm kernels. upstream commit d5aa407f59f5b83d2c50ec88f5bf56d40f1f8978 ("tunnels: fix netns vs proto registration ordering") , which was included in 2.6.32.44-longterm, was not backported correctly, and results in a NULL pointer dereference in ip6_tunnel.c for longterm kernels >=2.6.32.44 Use [un]register_pernet_gen_device() instead of [un]register_pernet_device() to fix it. Signed-off-by: Stratos Psomadakis <psomas@gentoo.org> Cc: Wolfgang Walter <wolfgang.walter@stwm.de> Cc: Tim Gardner <tim.gardner@canonical.com> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-16net: Compute protocol sequence numbers and fragment IDs using MD5.David S. Miller2-0/+2
Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-08tunnels: fix netns vs proto registration orderingAlexey Dobriyan3-31/+32
commit d5aa407f59f5b83d2c50ec88f5bf56d40f1f8978 upstream. Same stuff as in ip_gre patch: receive hook can be called before netns setup is done, oopsing in net_generic(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-08netns xfrm: fixup xfrm6_tunnel error propagationAlexey Dobriyan1-5/+11
commit e924960dacdf85d118a98c7262edf2f99c3015cf upstream. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-13udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packetXufeng Zhang1-0/+3
[ Upstream commit 9cfaa8def1c795a512bc04f2aec333b03724ca2e ] Consider this scenario: When the size of the first received udp packet is bigger than the receive buffer, MSG_TRUNC bit is set in msg->msg_flags. However, if checksum error happens and this is a blocking socket, it will goto try_again loop to receive the next packet. But if the size of the next udp packet is smaller than receive buffer, MSG_TRUNC flag should not be set, but because MSG_TRUNC bit is not cleared in msg->msg_flags before receive the next packet, MSG_TRUNC is still set, which is wrong. Fix this problem by clearing MSG_TRUNC flag when starting over for a new packet. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-13ipv6/udp: Use the correct variable to determine non-blocking conditionXufeng Zhang1-1/+1
[ Upstream commit 32c90254ed4a0c698caa0794ebb4de63fcc69631 ] udpv6_recvmsg() function is not using the correct variable to determine whether or not the socket is in non-blocking operation, this will lead to unexpected behavior when a UDP checksum error occurs. Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call: err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT, flags & ~MSG_DONTWAIT, &addr_len); i.e. with udpv6_recvmsg() getting these values: int noblock = flags & MSG_DONTWAIT int flags = flags & ~MSG_DONTWAIT So, when udp checksum error occurs, the execution will go to csum_copy_err, and then the problem happens: csum_copy_err: ............... if (flags & MSG_DONTWAIT) return -EAGAIN; goto try_again; ............... But it will always go to try_again as MSG_DONTWAIT has been cleared from flags at call time -- only noblock contains the original value of MSG_DONTWAIT, so the test should be: if (noblock) return -EAGAIN; This is also consistent with what the ipv4/udp code does. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-24netfilter: IPv6: initialize TOS field in REJECT target moduleFernando Luis Vazquez Cao1-1/+3
commit 4319cc0cf5bb894b7368008cdf6dd20eb8868018 upstream. The IPv6 header is not zeroed out in alloc_skb so we must initialize it properly unless we want to see IPv6 packets with random TOS fields floating around. The current implementation resets the flow label but this could be changed if deemed necessary. We stumbled upon this issue when trying to apply a mangle rule to the RST packet generated by the REJECT target module. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-24netfilter: nf_conntrack_reasm: properly handle packets fragmented into a ↵Patrick McHardy1-7/+1
single fragment commit 9e2dcf72023d1447f09c47d77c99b0c49659e5ce upstream. When an ICMPV6_PKT_TOOBIG message is received with a MTU below 1280, all further packets include a fragment header. Unlike regular defragmentation, conntrack also needs to "reassemble" those fragments in order to obtain a packet without the fragment header for connection tracking. Currently nf_conntrack_reasm checks whether a fragment has either IP6_MF set or an offset != 0, which makes it ignore those fragments. Remove the invalid check and make reassembly handle fragment queues containing only a single fragment. Reported-and-tested-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-10ipv6: Silence privacy extensions initializationRomain Francoise1-3/+0
commit 2fdc1c8093255f9da877d7b9ce3f46c2098377dc upstream. When a network namespace is created (via CLONE_NEWNET), the loopback interface is automatically added to the new namespace, triggering a printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set. This is problematic for applications which use CLONE_NEWNET as part of a sandbox, like Chromium's suid sandbox or recent versions of vsftpd. On a busy machine, it can lead to thousands of useless "lo: Disabled Privacy Extensions" messages appearing in dmesg. It's easy enough to check the status of privacy extensions via the use_tempaddr sysctl, so just removing the printk seems like the most sensible solution. Signed-off-by: Romain Francoise <romain@orebokech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-04-15ipv6: netfilter: ip6_tables: fix infoleak to userspaceVasiliy Kulikov1-0/+3
commit 6a8ab060779779de8aea92ce3337ca348f973f54 upstream. Structures ip6t_replace, compat_ip6t_replace, and xt_get_revision are copied from userspace. Fields of these structs that are zero-terminated strings are not checked. When they are used as argument to a format string containing "%s" in request_module(), some sensitive information is leaked to userspace via argument of spawned modprobe process. The first bug was introduced before the git epoch; the second was introduced in 3bc3fe5e (v2.6.25-rc1); the third is introduced by 6b7d31fc (v2.6.15-rc1). To trigger the bug one should have CAP_NET_ADMIN. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-15ip6ip6: autoload ip6 tunnelstephen hemminger1-0/+1
commit 6dfbd87a20a737641ef228230c77f4262434fa24 upstream ip6ip6: autoload ip6 tunnel Add necessary alias to autoload ip6ip6 tunnel module. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-03-15net: don't allow CAP_NET_ADMIN to load non-netdev kernel modulesVasiliy Kulikov1-1/+1
commit 8909c9ad8ff03611c9c96c9a92656213e4bb495b upstream. Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod | grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod | grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-29ip: fix truesize mismatch in ip fragmentationEric Dumazet1-5/+13
[ Upstream commit 3d13008e7345fa7a79d8f6438150dc15d6ba6e9d ] Special care should be taken when slow path is hit in ip_fragment() : When walking through frags, we transfert truesize ownership from skb to frags. Then if we hit a slow_path condition, we must undo this or risk uncharging frags->truesize twice, and in the end, having negative socket sk_wmem_alloc counter, or even freeing socket sooner than expected. Many thanks to Nick Bowler, who provided a very clean bug report and test program. Thanks to Jarek for reviewing my first patch and providing a V2 While Nick bisection pointed to commit 2b85a34e911 (net: No more expensive sock_hold()/sock_put() on each tx), underlying bug is older (2.6.12-rc5) A side effect is to extend work done in commit b2722b1c3a893e (ip_fragment: also adjust skb->truesize for packets not owned by a socket) to ipv6 as well. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Tested-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>