summaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)AuthorFilesLines
2007-02-05[PATCH] TCP: skb is unexpectedly freed.Masayuki Nakagawa1-2/+4
I encountered a kernel panic with my test program, which is a very simple IPv6 client-server program. The server side sets IPV6_RECVPKTINFO on a listening socket, and the client side just sends a message to the server. Then the kernel panic occurs on the server. (If you need the test program, please let me know. I can provide it.) This problem happens because a skb is forcibly freed in tcp_rcv_state_process(). When a socket in listening state(TCP_LISTEN) receives a syn packet, then tcp_v6_conn_request() will be called from tcp_rcv_state_process(). If the tcp_v6_conn_request() successfully returns, the skb would be discarded by __kfree_skb(). However, in case of a listening socket which was already set IPV6_RECVPKTINFO, an address of the skb will be stored in treq->pktopts and a ref count of the skb will be incremented in tcp_v6_conn_request(). But, even if the skb is still in use, the skb will be freed. Then someone still using the freed skb will cause the kernel panic. I suggest to use kfree_skb() instead of __kfree_skb(). Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] TCP: Fix sorting of SACK blocks.Baruch Even1-4/+5
The sorting of SACK blocks actually munges them rather than sort, causing the TCP stack to ignore some SACK information and breaking the assumption of ordered SACK blocks after sorting. The sort takes the data from a second buffer which isn't moved causing subsequent data moves to occur from the wrong location. The fix is to use a temporary buffer as a normal sort does. Signed-off-By: Baruch Even <baruch@ev-en.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] TCP: rare bad TCP checksum with 2.6.19Jarek Poplawski1-1/+2
The patch "Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE" changed to unconditional copying of ip_summed field from collapsed skb. This patch reverts this change. The majority of substantial work including heavy testing and diagnosing by: Michael Tokarev <mjt@tls.msk.ru> Possible reasons pointed by: Herbert Xu and Patrick McHardy. Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] IPV4: Fix single-entry /proc/net/fib_trie output.Robert Olsson1-6/+7
When main table is just a single leaf this gets printed as belonging to the local table in /proc/net/fib_trie. A fix is below. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] IPV4: Fix the fib trie iterator to work with a single entry routing ↵Eric W. Biederman1-5/+16
tables In a kernel with trie routing enabled I had a simple routing setup with only a single route to the outside world and no default route. "ip route table list main" showed my the route just fine but /proc/net/route was an empty file. What was going on? Thinking it was a bug in something I did and I looked deeper. Eventually I setup a second route and everything looked correct, huh? Finally I realized that the it was just the iterator pair in fib_trie_get_first, fib_trie_get_next just could not handle a routing table with a single entry. So to save myself and others further confusion, here is a simple fix for the fib proc iterator so it works even when there is only a single route in a routing table. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] NETFILTER: ctnetlink: fix leak in ctnetlink_create_conntrack error pathPatrick McHardy1-1/+1
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] NETFILTER: ctnetlink: check for status attribute existence on ↵Pablo Neira Ayuso1-3/+5
conntrack creation Check that status flags are available in the netlink message received to create a new conntrack. Fixes a crash in ctnetlink_create_conntrack when the CTA_STATUS attribute is not present. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-02-05[PATCH] NETFILTER: Fix routing of REJECT target generated packets in output ↵Patrick McHardy1-2/+5
chain Packets generated by the REJECT target in the output chain have a local destination address and a foreign source address. Make sure not to use the foreign source address for the output route lookup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10[PATCH] IPV4/IPV6: Fix inet{,6} device initialization order.David L Stevens1-2/+3
It is important that we only assign dev->ip{,6}_ptr only after all portions of the inet{,6} are setup. Otherwise we can receive packets before the multicast spinlocks et al. are initialized. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2007-01-10[PATCH] UDP: Fix reversed logic in udp_get_port()David Miller1-5/+8
When this code was converted to use sk_for_each() the logic for the "best hash chain length" code was reversed, breaking everything. The original code was of the form: size = 0; do { if (++size >= best_size_so_far) goto next; } while ((sk = sk->next) != NULL); best_size_so_far = size; best = result; next:; and this got converted into: sk_for_each(sk2, node, head) if (++size < best_size_so_far) { best_size_so_far = size; best = result; } Which does something very very different from the original. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11[PATCH] IPSEC: Fix inetpeer leak in ipv4 xfrm dst entries.David Miller1-0/+2
We grab a reference to the route's inetpeer entry but forget to release it in xfrm4_dst_destroy(). Bug discovered by Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11[PATCH] XFRM: Use output device disable_xfrm for forwarded packetsPatrick McHardy1-1/+1
Currently the behaviour of disable_xfrm is inconsistent between locally generated and forwarded packets. For locally generated packets disable_xfrm disables the policy lookup if it is set on the output device, for forwarded traffic however it looks at the input device. This makes it impossible to disable xfrm on all devices but a dummy device and use normal routing to direct traffic to that device. Always use the output device when checking disable_xfrm. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11[PATCH] NETFILTER: Fix iptables compat hook validationDmitry Mishin1-27/+51
In compat mode, matches and targets valid hooks checks always successful due to not initialized e->comefrom field yet. This patch separates this checks from translation code and moves them after mark_source_chains() call, where these marks are initialized. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by; Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-12-11[PATCH] NETFILTER: Fix {ip, ip6, arp}_tables hook validationDmitry Mishin2-67/+49
Commit 590bdf7fd2292b47c428111cb1360e312eff207e introduced a regression in match/target hook validation. mark_source_chains builds a bitmask for each rule representing the hooks it can be reached from, which is then used by the matches and targets to make sure they are only called from valid hooks. The patch moved the match/target specific validation before the mark_source_chains call, at which point the mask is always zero. This patch returns back to the old order and moves the standard checks to mark_source_chains. This allows to get rid of a special case for standard targets as a nice side-effect. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
2006-11-29[NETFILTER]: ipt_REJECT: fix memory corruptionPatrick McHardy1-7/+9
On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder() reallocates the skb, leading to memory corruption when using the stale tcph pointer to update the checksum. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-29[NETFILTER]: conntrack: fix refcount leak when finding expectationYasuyuki Kozakai1-3/+3
All users of __{ip,nf}_conntrack_expect_find() don't expect that it increments the reference count of expectation. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-29[NETFILTER]: ctnetlink: fix reference count leakPatrick McHardy1-0/+1
When NFA_NEST exceeds the skb size the protocol reference is leaked. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-26[NET]: Fix kfifo_alloc() error check.Akinobu Mita1-0/+2
The return value of kfifo_alloc() should be checked by IS_ERR(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-26[UDP]: Make udp_encap_rcv use pskb_may_pullOlaf Kirch1-5/+14
Make udp_encap_rcv use pskb_may_pull IPsec with NAT-T breaks on some notebooks using the latest e1000 chipset, when header split is enabled. When receiving sufficiently large packets, the driver puts everything up to and including the UDP header into the header portion of the skb, and the rest goes into the paged part. udp_encap_rcv forgets to use pskb_may_pull, and fails to decapsulate it. Instead, it passes it up it to the IKE daemon. Signed-off-by: Olaf Kirch <okir@suse.de> Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-26[NETFILTER]: H.323 conntrack: fix crash with CONFIG_IP_NF_CT_ACCTFaidon Liambotis1-2/+2
H.323 connection tracking code calls ip_ct_refresh_acct() when processing RCFs and URQs but passes NULL as the skb. When CONFIG_IP_NF_CT_ACCT is enabled, the connection tracking core tries to derefence the skb, which results in an obvious panic. A similar fix was applied on the SIP connection tracking code some time ago. Signed-off-by: Faidon Liambotis <paravoid@debian.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-16[TCP]: Fix up sysctl_tcp_mem initialization.John Heffner1-3/+4
Fix up tcp_mem initial settings to take into account the size of the hash entries (different on SMP and non-SMP systems). Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-16[NETFILTER]: Use pskb_trim in {ip,ip6,nfnetlink}_queuePatrick McHardy1-3/+4
Based on patch by James D. Nurmi: I've got some code very dependant on nfnetlink_queue, and turned up a large number of warns coming from skb_trim. While it's quite possibly my code, having not seen it on older kernels made me a bit suspect. Anyhow, based on some googling I turned up this thread: http://lkml.org/lkml/2006/8/13/56 And believe the issue to be related, so attached is a small patch to the kernel -- not sure if this is completely correct, but for anyone else hitting the WARN_ON(1) in skbuff.h, it might be helpful.. Signed-off-by: James D. Nurmi <jdnurmi@gmail.com> Ported to ip6_queue and nfnetlink_queue and added return value checks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-11[IPVS]: More endianness fixed.Julian Anastasov3-6/+6
- make sure port in FTP data is in network order (in fact it was looking buggy for big endian boxes before Viro's changes) - htonl -> htons for port Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-08[TCP]: Don't use highmem in tcp hash size calculation.John Heffner1-2/+2
This patch removes consideration of high memory when determining TCP hash table sizes. Taking into account high memory results in tcp_mem values that are too large. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-11-02[TCP]: Set default congestion control when no sysctl.Stephen Hemminger2-7/+8
The setting of the default congestion control was buried in the sysctl code so it would not be done properly if SYSCTL was not enabled. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-31[NetLabel]: protect the CIPSOv4 socket option from setsockopt()Paul Moore2-5/+4
This patch makes two changes to protect applications from either removing or tampering with the CIPSOv4 IP option on a socket. The first is the requirement that applications have the CAP_NET_RAW capability to set an IPOPT_CIPSO option on a socket; this prevents untrusted applications from setting their own CIPSOv4 security attributes on the packets they send. The second change is to SELinux and it prevents applications from setting any IPv4 options when there is an IPOPT_CIPSO option already present on the socket; this prevents applications from removing CIPSOv4 security attributes from the packets they send. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-31[NETFILTER]: ip_tables: compat code module refcounting fixDmitry Mishin1-25/+11
This patch fixes bug in iptables modules refcounting on compat error way. As we are getting modules in check_compat_entry_size_and_hooks(), in case of later error, we should put them all in translate_compat_table(), not in the compat_copy_entry_from_user() or compat_copy_match_from_user(), as it is now. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-by: Vasily Averin <vvs@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-31[NETFILTER]: ip_tables: compat error way cleanupVasily Averin1-0/+1
This patch adds forgotten compat_flush_offset() call to error way of translate_compat_table(). May lead to table corruption on the next compat_do_replace(). Signed-off-by: Vasily Averin <vvs@openvz.org> Acked-by: Dmitry Mishin <dim@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-31[NETFILTER]: Missed and reordered checks in {arp,ip,ip6}_tablesDmitry Mishin2-17/+38
There is a number of issues in parsing user-provided table in translate_table(). Malicious user with CAP_NET_ADMIN may crash system by passing special-crafted table to the *_tables. The first issue is that mark_source_chains() function is called before entry content checks. In case of standard target, mark_source_chains() function uses t->verdict field in order to determine new position. But the check, that this field leads no further, than the table end, is in check_entry(), which is called later, than mark_source_chains(). The second issue, that there is no check that target_offset points inside entry. If so, *_ITERATE_MATCH macro will follow further, than the entry ends. As a result, we'll have oops or memory disclosure. And the third issue, that there is no check that the target is completely inside entry. Results are the same, as in previous issue. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-31[NET]: fix uaccess handlingHeiko Carstens1-6/+11
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-26[TCP] H-TCP: fix integer overflowGavin McCullagh1-1/+1
When using H-TCP with a single flow on a 500Mbit connection (or less actually), alpha can exceed 65000, so alpha needs to be a u32. Signed-off-by: Gavin McCullagh <gavin.mccullagh@nuim.ie> Signed-off-by: Doug Leith <doug.leith@nuim.ie> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-26[TCP] cubic: scaling errorStephen Hemminger1-3/+3
Doug Leith observed a discrepancy between the version of CUBIC described in the papers and the version in 2.6.18. A math error related to scaling causes Cubic to grow too slowly. Patch is from "Sangtae Ha" <sha2@ncsu.edu>. I validated that it does fix the problems. See the following to show behavior over 500ms 100 Mbit link. Sender (2.6.19-rc3) --- Bridge (2.6.18-rt7) ------- Receiver (2.6.19-rc3) 1G [netem] 100M http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-orig.png http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-fix.png Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-25[IPV4] ipconfig: fix RARP ic_servaddr breakageAl Viro1-1/+1
memcpy 4 bytes to address of auto unsigned long variable followed by comparison with u32 is a bloody bad idea. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-20[TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err()Eric Dumazet1-1/+1
I believe this NET_INC_STATS() call can be replaced by NET_INC_STATS_BH(), a little bit cheaper. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-20[NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layerBjörn Steinbrink1-0/+3
The 32bit compatibility layer has no CAP_NET_ADMIN check in compat_do_ipt_get_ctl, which for example allows to list the current iptables rules even without having that capability (the non-compat version requires it). Other capabilities might be required to exploit the bug (eg. CAP_NET_RAW to get the nfnetlink socket?), so a plain user can't exploit it, but a setup actually using the posix capability system might very well hit such a constellation of granted capabilities. Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-19[TCP]: Bound TSO defer timeJohn Heffner1-5/+15
This patch limits the amount of time you will defer sending a TSO segment to less than two clock ticks, or the time between two acks, whichever is longer. On slow links, deferring causes significant bursts. See attached plots, which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue for (a) non-TSO, (b) currnet TSO, and (c) patched TSO. This burstiness causes significant jitter, tends to overflow queues early (bad for short queues), and makes delay-based congestion control more difficult. Deferring by a couple clock ticks I believe will have a relatively small impact on performance. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-19[IPv4] fib: Remove unused fib_config membersThomas Graf1-5/+0
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-16[NET]: reduce sizeof(struct inet_peer), cleanup, change in peer_check_expire()Eric Dumazet1-7/+22
1) shrink struct inet_peer on 64 bits platforms.
2006-10-16NetLabel: the CIPSOv4 passthrough mapping does not pass categories correctlyPaul Moore1-2/+2
The CIPSO passthrough mapping had a problem when sending categories which would cause no or incorrect categories to be sent on the wire with a packet. This patch fixes the problem which was a simple off-by-one bug. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-10-16NetLabel: only deref the CIPSOv4 standard map fields when using standard mappingPaul Moore1-6/+12
Fix several places in the CIPSO code where it was dereferencing fields which did not have valid pointers by moving those pointer dereferences into code blocks where the pointers are valid. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-10-16[NETFILTER]: ctnetlink: Remove debugging messagesPablo Neira Ayuso1-69/+3
Remove (compilation-breaking) debugging messages introduced at early development stage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-16[NETFILTER]: ipt_ECN/ipt_TOS: fix incorrect checksum updatePatrick McHardy2-6/+6
Even though the tos field is only a single byte large, the values need to be converted to net-endian for the checkum update so they are in the corrent byte position. Also fix incorrect endian annotations. Reported by Stephane Chazelas <Stephane_Chazelas@yahoo.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-16[NETFILTER]: arp_tables: missing unregistration on module unloadPatrick McHardy1-0/+2
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-12[NET]: Do not memcmp() over pad bytes of struct flowi.David S. Miller1-3/+9
They are not necessarily initialized to zero by the compiler, for example when using run-time initializers of automatic on-stack variables. Noticed by Eric Dumazet and Patrick McHardy. Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-12[NET]: Use typesafe inet_twsk() inline function instead of cast.YOSHIFUJI Hideaki1-9/+7
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-12[NET]: Use hton{l,s}() for non-initializers.YOSHIFUJI Hideaki2-13/+22
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-12[TCP]: Use TCPOLEN_TSTAMP_ALIGNED macro instead of magic number.YOSHIFUJI Hideaki1-1/+1
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-12IPsec: correct semantics for SELinux policy matchingVenkat Yekkirala1-1/+1
Currently when an IPSec policy rule doesn't specify a security context, it is assumed to be "unlabeled" by SELinux, and so the IPSec policy rule fails to match to a flow that it would otherwise match to, unless one has explicitly added an SELinux policy rule allowing the flow to "polmatch" to the "unlabeled" IPSec policy rules. In the absence of such an explicitly added SELinux policy rule, the IPSec policy rule fails to match and so the packet(s) flow in clear text without the otherwise applicable xfrm(s) applied. The above SELinux behavior violates the SELinux security notion of "deny by default" which should actually translate to "encrypt by default" in the above case. This was first reported by Evgeniy Polyakov and the way James Morris was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. With this patch applied, SELinux "polmatching" of flows Vs. IPSec policy rules will only come into play when there's a explicit context specified for the IPSec policy rule (which also means there's corresponding SELinux policy allowing appropriate domains/flows to polmatch to this context). Secondly, when a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return errors other than access denied, such as -EINVAL. We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The solution for this is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). This patch: Fix the selinux side of things. This makes sure SELinux polmatching of flow contexts to IPSec policy rules comes into play only when an explicit context is associated with the IPSec policy rule. Also, this no longer defaults the context of a socket policy to the context of the socket since the "no explicit context" case is now handled properly. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-10-12NetLabel: fix a cache race conditionpaul.moore@hp.com1-8/+10
Testing revealed a problem with the NetLabel cache where a cached entry could be freed while in use by the LSM layer causing an oops and other problems. This patch fixes that problem by introducing a reference counter to the cache entry so that it is only freed when it is no longer in use. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11[PATCH] ptrdiff_t is %t, not %zAl Viro1-1/+1
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>