kernel/linux.git/include/net/sock.h, branch v4.11.6

net: Work around lockdep limitation in sockets that use sockets

2017-03-10T02:23:27+00:00

Lockdep issues a circular dependency warning when AFS issues an operation through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem. The theory lockdep comes up with is as follows: (1) If the pagefault handler decides it needs to read pages from AFS, it calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but creating a call requires the socket lock: mmap_sem must be taken before sk_lock-AF_RXRPC (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind() binds the underlying UDP socket whilst holding its socket lock. inet_bind() takes its own socket lock: sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET (3) Reading from a TCP socket into a userspace buffer might cause a fault and thus cause the kernel to take the mmap_sem, but the TCP socket is locked whilst doing this: sk_lock-AF_INET must be taken before mmap_sem However, lockdep's theory is wrong in this instance because it deals only with lock classes and not individual locks. The AF_INET lock in (2) isn't really equivalent to the AF_INET lock in (3) as the former deals with a socket entirely internal to the kernel that never sees userspace. This is a limitation in the design of lockdep. Fix the general case by: (1) Double up all the locking keys used in sockets so that one set are used if the socket is created by userspace and the other set is used if the socket is created by the kernel. (2) Store the kern parameter passed to sk_alloc() in a variable in the sock struct (sk_kern_sock). This informs sock_lock_init(), sock_init_data() and sk_clone_lock() as to the lock keys to be used. Note that the child created by sk_clone_lock() inherits the parent's kern setting. (3) Add a 'kern' parameter to ->accept() that is analogous to the one passed in to ->create() that distinguishes whether kernel_accept() or sys_accept4() was the caller and can be passed to sk_alloc(). Note that a lot of accept functions merely dequeue an already allocated socket. I haven't touched these as the new socket already exists before we get the parameter. Note also that there are a couple of places where I've made the accepted socket unconditionally kernel-based: irda_accept() rds_rcp_accept_one() tcp_accept_from_sock() because they follow a sock_create_kern() and accept off of that. Whilst creating this, I noticed that lustre and ocfs don't create sockets through sock_create_kern() and thus they aren't marked as for-kernel, though they appear to be internal. I wonder if these should do that so that they use the new set of lock keys. Signed-off-by: David Howells Signed-off-by: David S. Miller

net: Introduce sk_clone_lock() error path routine

2017-03-02T21:19:33+00:00

When handling problems in cloning a socket with the sk_clone_locked() function we need to perform several steps that were open coded in it and its callers, so introduce a routine to avoid this duplication: sk_free_unlock_clone(). Cc: Cong Wang Cc: Dmitry Vyukov Cc: Eric Dumazet Cc: Gerrit Renker Cc: Thomas Gleixner Link: http://lkml.kernel.org/n/net-ui6laqkotycunhtmqryl9bfx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

2017-02-07T21:29:30+00:00

The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller

net: add dst_pending_confirm flag to skbuff

2017-02-07T18:07:46+00:00

Add new skbuff flag to allow protocols to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. Add sock_confirm_neigh() helper to confirm the neighbour and use it for IPv4, IPv6 and VRF before dst_neigh_output. Signed-off-by: Julian Anastasov Acked-by: Eric Dumazet Signed-off-by: David S. Miller

sock: add sk_dst_pending_confirm flag

2017-02-07T18:07:46+00:00

Add new sock flag to allow sockets to confirm neighbour. When same struct dst_entry can be used for many different neighbours we can not use it for pending confirmations. As not all call paths lock the socket use full word for the flag. Add sk_dst_confirm as replacement for dst_confirm when called for received packets. Signed-off-by: Julian Anastasov Acked-by: Eric Dumazet Signed-off-by: David S. Miller

udp: properly cope with csum errors

2017-02-07T16:19:00+00:00

Dmitry reported that UDP sockets being destroyed would trigger the WARN_ON(atomic_read(&sk->sk_rmem_alloc)); in inet_sock_destruct() It turns out we do not properly destroy skb(s) that have wrong UDP checksum. Thanks again to syzkaller team. Fixes : 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue") Reported-by: Dmitry Vyukov Signed-off-by: Eric Dumazet Cc: Paolo Abeni Cc: Hannes Frederic Sowa Acked-by: Paolo Abeni Signed-off-by: David S. Miller

net: adjust skb->truesize in pskb_expand_head()

2017-01-27T17:03:29+00:00

Slava Shwartsman reported a warning in skb_try_coalesce(), when we detect skb->truesize is completely wrong. In his case, issue came from IPv6 reassembly coping with malicious datagrams, that forced various pskb_may_pull() to reallocate a bigger skb->head than the one allocated by NIC driver before entering GRO layer. Current code does not change skb->truesize, leaving this burden to callers if they care enough. Blindly changing skb->truesize in pskb_expand_head() is not easy, as some producers might track skb->truesize, for example in xmit path for back pressure feedback (sk->sk_wmem_alloc) We can detect the cases where it should be safe to change skb->truesize : 1) skb is not attached to a socket. 2) If it is attached to a socket, destructor is sock_edemux() My audit gave only two callers doing their own skb->truesize manipulation. I had to remove skb parameter in sock_edemux macro when CONFIG_INET is not set to avoid a compile error. Signed-off-by: Eric Dumazet Reported-by: Slava Shwartsman Signed-off-by: David S. Miller

sock: use hlist_entry_safe

2017-01-20T16:38:45+00:00

Use hlist_entry_safe() instead of open-coding it. Signed-off-by: Geliang Tang Signed-off-by: David S. Miller

smc: netlink interface for SMC sockets

2017-01-09T21:07:41+00:00

Support for SMC socket monitoring via netlink sockets of protocol NETLINK_SOCK_DIAG. Signed-off-by: Ursula Braun Signed-off-by: David S. Miller

net: introduce keepalive function in struct proto

2017-01-09T21:07:37+00:00

Direct call of tcp_set_keepalive() function from protocol-agnostic sock_setsockopt() function in net/core/sock.c violates network layering. And newly introduced protocol (SMC-R) will need its own keepalive function. Therefore, add "keepalive" function pointer to "struct proto", and call it from sock_setsockopt() via this pointer. Signed-off-by: Ursula Braun Reviewed-by: Utz Bacher Signed-off-by: David S. Miller