Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit a904a0693c189691eeee64f6c6b188bd7dc244e9 ]
Historically linux tried to stick to RFC 791, 1122, 2003
for IPv4 ID field generation.
RFC 6864 made clear that no matter how hard we try,
we can not ensure unicity of IP ID within maximum
lifetime for all datagrams with a given source
address/destination address/protocol tuple.
Linux uses a per socket inet generator (inet_id), initialized
at connection startup with a XOR of 'jiffies' and other
fields that appear clear on the wire.
Thiemo Nagel pointed that this strategy is a privacy
concern as this provides 16 bits of entropy to fingerprint
devices.
Let's switch to a random starting point, this is just as
good as far as RFC 6864 is concerned and does not leak
anything critical.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Thiemo Nagel <tnagel@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9b6c08878e23adb7cc84bdca94d8a944b03f099e upstream.
Now when sctp_connect() is called with a wrong sa_family, it binds
to a port but doesn't set bp->port, then sctp_get_af_specific will
return NULL and sctp_connect() returns -EINVAL.
Then if sctp_bind() is called to bind to another port, the last
port it has bound will leak due to bp->port is NULL by then.
sctp_connect() doesn't need to bind ports, as later __sctp_connect
will do it if bp->port is NULL. So remove it from sctp_connect().
While at it, remove the unnecessary sockaddr.sa_family len check
as it's already done in sctp_inet_connect.
Fixes: 644fbdeacf1d ("sctp: fix the issue that flags are ignored when using kernel_connect")
Reported-by: syzbot+079bf326b38072f849d9@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 644fbdeacf1d3edd366e44b8ba214de9d1dd66a9 upstream.
Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags
param can't be passed into its proto .connect where this flags is really
needed.
sctp works around it by getting flags from socket file in __sctp_connect.
It works for connecting from userspace, as inherently the user sock has
socket file and it passes f_flags as the flags param into the proto_ops
.connect.
However, the sock created by sock_create_kern doesn't have a socket file,
and it passes the flags (like O_NONBLOCK) by using the flags param in
kernel_connect, which calls proto_ops .connect later.
So to fix it, this patch defines a new proto_ops .connect for sctp,
sctp_inet_connect, which calls __sctp_connect() directly with this
flags param. After this, the sctp's proto .connect can be removed.
Note that sctp_inet_connect doesn't need to do some checks that are not
needed for sctp, which makes thing better than with inet_dgram_connect.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 63dfb7938b13fa2c2fbcb45f34d065769eb09414 ]
syzbot reported a memory leak:
BUG: memory leak, unreferenced object 0xffff888120b3d380 (size 64):
backtrace:
[...] slab_alloc mm/slab.c:3319 [inline]
[...] kmem_cache_alloc+0x13f/0x2c0 mm/slab.c:3483
[...] sctp_bucket_create net/sctp/socket.c:8523 [inline]
[...] sctp_get_port_local+0x189/0x5a0 net/sctp/socket.c:8270
[...] sctp_do_bind+0xcc/0x200 net/sctp/socket.c:402
[...] sctp_bindx_add+0x4b/0xd0 net/sctp/socket.c:497
[...] sctp_setsockopt_bindx+0x156/0x1b0 net/sctp/socket.c:1022
[...] sctp_setsockopt net/sctp/socket.c:4641 [inline]
[...] sctp_setsockopt+0xaea/0x2dc0 net/sctp/socket.c:4611
[...] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3147
[...] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
[...] __do_sys_setsockopt net/socket.c:2100 [inline]
It was caused by when sending msgs without binding a port, in the path:
inet_sendmsg() -> inet_send_prepare() -> inet_autobind() ->
.get_port/sctp_get_port(), sp->bind_hash will be set while bp->port is
not. Later when binding another port by sctp_setsockopt_bindx(), a new
bucket will be created as bp->port is not set.
sctp's autobind is supposed to call sctp_autobind() where it does all
things including setting bp->port. Since sctp_autobind() is called in
sctp_sendmsg() if the sk is not yet bound, it should have skipped the
auto bind.
THis patch is to avoid calling inet_autobind() in inet_send_prepare()
by changing sctp_prot .no_autobind with true, also remove the unused
.get_port.
Reported-by: syzbot+d44f7bbebdea49dbc84a@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cc3ccf26f0649089b3a34a2781977755ea36e72c ]
As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
This socket option allows the enabling or disabling of the
negotiation of PR-SCTP support for future associations. For existing
associations, it allows one to query whether or not PR-SCTP support
was negotiated on a particular association.
It means only sctp sock's prsctp_enable can be set.
Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
sctp in another patchset.
v1->v2:
- drop the params.assoc_id check as Neil suggested.
Fixes: 28aa4c26fce2 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b336decab22158937975293aea79396525f92bb3 ]
syzbot reported an use-after-free involving sctp_id2asoc. Dmitry Vyukov
helped to root cause it and it is because of reading the asoc after it
was freed:
CPU 1 CPU 2
(working on socket 1) (working on socket 2)
sctp_association_destroy
sctp_id2asoc
spin lock
grab the asoc from idr
spin unlock
spin lock
remove asoc from idr
spin unlock
free(asoc)
if asoc->base.sk != sk ... [*]
This can only be hit if trying to fetch asocs from different sockets. As
we have a single IDR for all asocs, in all SCTP sockets, their id is
unique on the system. An application can try to send stuff on an id
that matches on another socket, and the if in [*] will protect from such
usage. But it didn't consider that as that asoc may belong to another
socket, it may be freed in parallel (read: under another socket lock).
We fix it by moving the checks in [*] into the protected region. This
fixes it because the asoc cannot be freed while the lock is held.
Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a8dd397903a6e57157f6265911f7d35681364427 ]
Commit d04adf1b3551 ("sctp: reset owner sk for data chunks on out queues
when migrating a sock") made a mistake that using 'list' as the param of
list_for_each_entry to traverse the retransmit, sacked and abandoned
queues, while chunks are using 'transmitted_list' to link into these
queues.
It could cause NULL dereference panic if there are chunks in any of these
queues when peeling off one asoc.
So use the chunk member 'transmitted_list' instead in this patch.
Fixes: d04adf1b3551 ("sctp: reset owner sk for data chunks on out queues when migrating a sock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bab1be79a5169ac748d8292b20c86d874022d7ba ]
As Marcelo noticed, in sctp_transport_get_next, it is iterating over
transports but then also accessing the association directly, without
checking any refcnts before that, which can cause an use-after-free
Read.
So fix it by holding transport before accessing the association. With
that, sctp_transport_hold calls can be removed in the later places.
Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: syzbot+fe62a0c9aa6a85c6de16@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 81e98370293afcb58340ce8bd71af7b97f925c26 ]
Check must happen before call to ipv6_addr_v4mapped()
syzbot report was :
BUG: KMSAN: uninit-value in sctp_sockaddr_af net/sctp/socket.c:359 [inline]
BUG: KMSAN: uninit-value in sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
CPU: 0 PID: 3576 Comm: syzkaller968804 Not tainted 4.16.0+ #82
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:53
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
sctp_sockaddr_af net/sctp/socket.c:359 [inline]
sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
sctp_bind+0x149/0x190 net/sctp/socket.c:332
inet6_bind+0x1fd/0x1820 net/ipv6/af_inet6.c:293
SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
SyS_bind+0x54/0x80 net/socket.c:1460
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x43fd49
RSP: 002b:00007ffe99df3d28 EFLAGS: 00000213 ORIG_RAX: 0000000000000031
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd49
RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401670
R13: 0000000000401700 R14: 0000000000000000 R15: 0000000000000000
Local variable description: ----address@SYSC_bind
Variable was created at:
SYSC_bind+0x6f/0x4b0 net/socket.c:1461
SyS_bind+0x54/0x80 net/socket.c:1460
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6dfe4b97e08ec3d1a593fdaca099f0ef0a3a19e6 ]
Dmitry got the following recursive locking report while running syzkaller
fuzzer, the Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:52
print_deadlock_bug kernel/locking/lockdep.c:1729 [inline]
check_deadlock kernel/locking/lockdep.c:1773 [inline]
validate_chain kernel/locking/lockdep.c:2251 [inline]
__lock_acquire+0xef2/0x3430 kernel/locking/lockdep.c:3340
lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
lock_sock_nested+0xcb/0x120 net/core/sock.c:2536
lock_sock include/net/sock.h:1460 [inline]
sctp_close+0xcd/0x9d0 net/sctp/socket.c:1497
inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
sock_release+0x8d/0x1e0 net/socket.c:597
__sock_create+0x38b/0x870 net/socket.c:1226
sock_create+0x7f/0xa0 net/socket.c:1237
sctp_do_peeloff+0x1a2/0x440 net/sctp/socket.c:4879
sctp_getsockopt_peeloff net/sctp/socket.c:4914 [inline]
sctp_getsockopt+0x111a/0x67e0 net/sctp/socket.c:6628
sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2690
SYSC_getsockopt net/socket.c:1817 [inline]
SyS_getsockopt+0x240/0x380 net/socket.c:1799
entry_SYSCALL_64_fastpath+0x1f/0xc2
This warning is caused by the lock held by sctp_getsockopt() is on one
socket, while the other lock that sctp_close() is getting later is on
the newly created (which failed) socket during peeloff operation.
This patch is to avoid this warning by use lock_sock with subclass
SINGLE_DEPTH_NESTING as Wang Cong and Marcelo's suggestion.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c76f97c99ae6d26d14c7f0e50e074382bfbc9f98 ]
Some sockopt handling functions were calculating the length of the
buffer to be written to userspace and then calculating it again when
actually writing the buffer, which could lead to some write not using
an up-to-date length.
This patch updates such places to just make use of the len variable.
Also, replace some sizeof(type) to sizeof(var).
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ecca8f88da5c4260cc2bccfefd2a24976704c366 upstream.
Now in sctp_setsockopt_maxseg user_frag or frag_point can be set with
val >= 8 and val <= SCTP_MAX_CHUNK_LEN. But both checks are incorrect.
val >= 8 means frag_point can even be less than SCTP_DEFAULT_MINSEGMENT.
Then in sctp_datamsg_from_user(), when it's value is greater than cookie
echo len and trying to bundle with cookie echo chunk, the first_len will
overflow.
The worse case is when it's value is equal as cookie echo len, first_len
becomes 0, it will go into a dead loop for fragment later on. In Hangbin
syzkaller testing env, oom was even triggered due to consecutive memory
allocation in that loop.
Besides, SCTP_MAX_CHUNK_LEN is the max size of the whole chunk, it should
deduct the data header for frag_point or user_frag check.
This patch does a proper check with SCTP_DEFAULT_MINSEGMENT subtracting
the sctphdr and datahdr, SCTP_MAX_CHUNK_LEN subtracting datahdr when
setting frag_point via sockopt. It also improves sctp_setsockopt_maxseg
codes.
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a0ff660058b88d12625a783ce9e5c1371c87951f ]
After commit cea0cc80a677 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.
However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.
This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.
Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c5006b8aa74599ce19104b31d322d2ea9ff887cc ]
The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.
The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8cb38a602478e9f806571f6920b0a3298aabf042 ]
The patch(180d8cd942ce) replaces all uses of struct sock fields'
memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem
to accessor macros. But the sockets_allocated field of sctp sock is
not replaced at all. Then replace it now for unifying the code.
Fixes: 180d8cd942ce ("foundations of per-cgroup memory pressure controlling.")
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit cea0cc80a6777beb6eb643d4ad53690e1ad1d4ff ]
Commit dfcb9f4f99f1 ("sctp: deny peeloff operation on asocs with threads
sleeping on it") fixed the race between peeloff and wait sndbuf by
checking waitqueue_active(&asoc->wait) in sctp_do_peeloff().
But it actually doesn't work, as even if waitqueue_active returns false
the waiting sndbuf thread may still not yet hold sk lock. After asoc is
peeled off, sk is not asoc->base.sk any more, then to hold the old sk
lock couldn't make assoc safe to access.
This patch is to fix this by changing to hold the new sk lock if sk is
not asoc->base.sk, meanwhile, also set the sk in sctp_sendmsg with the
new sk.
With this fix, there is no more race between peeloff and waitbuf, the
check 'waitqueue_active' in sctp_do_peeloff can be removed.
Thanks Marcelo and Neil for making this clear.
v1->v2:
fix it by changing to lock the new sock instead of adding a flag in asoc.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ca3af4dd28cff4e7216e213ba3b671fbf9f84758 ]
Now in sctp_sendmsg sctp_wait_for_sndbuf could schedule out without
holding sock sk. It means the current asoc can be freed elsewhere,
like when receiving an abort packet.
If the asoc is just created in sctp_sendmsg and sctp_wait_for_sndbuf
returns err, the asoc will be freed again due to new_asoc is not nil.
An use-after-free issue would be triggered by this.
This patch is to fix it by setting new_asoc with nil if the asoc is
already dead when cpu schedules back, so that it will not be freed
again in sctp_sendmsg.
v1->v2:
set new_asoc as nil in sctp_sendmsg instead of sctp_wait_for_sndbuf.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit df80cd9b28b9ebaa284a41df611dbf3a2d05ca74 ]
Now when peeling off an association to the sock in another netns, all
transports in this assoc are not to be rehashed and keep use the old
key in hashtable.
As a transport uses sk->net as the hash key to insert into hashtable,
it would miss removing these transports from hashtable due to the new
netns when closing the sock and all transports are being freeed, then
later an use-after-free issue could be caused when looking up an asoc
and dereferencing those transports.
This is a very old issue since very beginning, ChunYu found it with
syzkaller fuzz testing with this series:
socket$inet6_sctp()
bind$inet6()
sendto$inet6()
unshare(0x40000000)
getsockopt$inet_sctp6_SCTP_GET_ASSOC_ID_LIST()
getsockopt$inet_sctp6_SCTP_SOCKOPT_PEELOFF()
This patch is to block this call when peeling one assoc off from one
netns to another one, so that the netns of all transport would not
go out-sync with the key in hashtable.
Note that this patch didn't fix it by rehashing transports, as it's
difficult to handle the situation when the tuple is already in use
in the new netns. Besides, no one would like to peel off one assoc
to another netns, considering ipaddrs, ifaces, etc. are usually
different.
Reported-by: ChunYu Wang <chunwang@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d04adf1b355181e737b6b1e23d801b07f0b7c4c0 ]
Now when migrating sock to another one in sctp_sock_migrate(), it only
resets owner sk for the data in receive queues, not the chunks on out
queues.
It would cause that data chunks length on the sock is not consistent
with sk sk_wmem_alloc. When closing the sock or freeing these chunks,
the old sk would never be freed, and the new sock may crash due to
the overflow sk_wmem_alloc.
syzbot found this issue with this series:
r0 = socket$inet_sctp()
sendto$inet(r0)
listen(r0)
accept4(r0)
close(r0)
Although listen() should have returned error when one TCP-style socket
is in connecting (I may fix this one in another patch), it could also
be reproduced by peeling off an assoc.
This issue is there since very beginning.
This patch is to reset owner sk for the chunks on out queues so that
sk sk_wmem_alloc has correct value after accept one sock or peeloff
an assoc to one sock.
Note that when resetting owner sk for chunks on outqueue, it has to
sctp_clear_owner_w/skb_orphan chunks before changing assoc->base.sk
first and then sctp_set_owner_w them after changing assoc->base.sk,
due to that sctp_wfree and it's callees are using assoc->base.sk.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ee6c88bb754e3d363e568da78086adfedb692447 ]
inet_diag_msg_sctp{,l}addr_fill() and sctp_get_sctp_info() copy
sizeof(sockaddr_storage) bytes to fill in sockaddr structs used
to export diagnostic information to userspace.
However, the memory allocated to store sockaddr information is
smaller than that and depends on the address family, so we leak
up to 100 uninitialized bytes to userspace. Just use the size of
the source structs instead, in all the three cases this is what
userspace expects. Zero out the remaining memory.
Unused bytes (i.e. when IPv4 addresses are used) in source
structs sctp_sockaddr_entry and sctp_transport are already
cleared by sctp_add_bind_addr() and sctp_transport_new(),
respectively.
Noticed while testing KASAN-enabled kernel with 'ss':
[ 2326.885243] BUG: KASAN: slab-out-of-bounds in inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag] at addr ffff881be8779800
[ 2326.896800] Read of size 128 by task ss/9527
[ 2326.901564] CPU: 0 PID: 9527 Comm: ss Not tainted 4.11.0-22.el7a.x86_64 #1
[ 2326.909236] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017
[ 2326.917585] Call Trace:
[ 2326.920312] dump_stack+0x63/0x8d
[ 2326.924014] kasan_object_err+0x21/0x70
[ 2326.928295] kasan_report+0x288/0x540
[ 2326.932380] ? inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag]
[ 2326.938500] ? skb_put+0x8b/0xd0
[ 2326.942098] ? memset+0x31/0x40
[ 2326.945599] check_memory_region+0x13c/0x1a0
[ 2326.950362] memcpy+0x23/0x50
[ 2326.953669] inet_sctp_diag_fill+0x42c/0x6c0 [sctp_diag]
[ 2326.959596] ? inet_diag_msg_sctpasoc_fill+0x460/0x460 [sctp_diag]
[ 2326.966495] ? __lock_sock+0x102/0x150
[ 2326.970671] ? sock_def_wakeup+0x60/0x60
[ 2326.975048] ? remove_wait_queue+0xc0/0xc0
[ 2326.979619] sctp_diag_dump+0x44a/0x760 [sctp_diag]
[ 2326.985063] ? sctp_ep_dump+0x280/0x280 [sctp_diag]
[ 2326.990504] ? memset+0x31/0x40
[ 2326.994007] ? mutex_lock+0x12/0x40
[ 2326.997900] __inet_diag_dump+0x57/0xb0 [inet_diag]
[ 2327.003340] ? __sys_sendmsg+0x150/0x150
[ 2327.007715] inet_diag_dump+0x4d/0x80 [inet_diag]
[ 2327.012979] netlink_dump+0x1e6/0x490
[ 2327.017064] __netlink_dump_start+0x28e/0x2c0
[ 2327.021924] inet_diag_handler_cmd+0x189/0x1a0 [inet_diag]
[ 2327.028045] ? inet_diag_rcv_msg_compat+0x1b0/0x1b0 [inet_diag]
[ 2327.034651] ? inet_diag_dump_compat+0x190/0x190 [inet_diag]
[ 2327.040965] ? __netlink_lookup+0x1b9/0x260
[ 2327.045631] sock_diag_rcv_msg+0x18b/0x1e0
[ 2327.050199] netlink_rcv_skb+0x14b/0x180
[ 2327.054574] ? sock_diag_bind+0x60/0x60
[ 2327.058850] sock_diag_rcv+0x28/0x40
[ 2327.062837] netlink_unicast+0x2e7/0x3b0
[ 2327.067212] ? netlink_attachskb+0x330/0x330
[ 2327.071975] ? kasan_check_write+0x14/0x20
[ 2327.076544] netlink_sendmsg+0x5be/0x730
[ 2327.080918] ? netlink_unicast+0x3b0/0x3b0
[ 2327.085486] ? kasan_check_write+0x14/0x20
[ 2327.090057] ? selinux_socket_sendmsg+0x24/0x30
[ 2327.095109] ? netlink_unicast+0x3b0/0x3b0
[ 2327.099678] sock_sendmsg+0x74/0x80
[ 2327.103567] ___sys_sendmsg+0x520/0x530
[ 2327.107844] ? __get_locked_pte+0x178/0x200
[ 2327.112510] ? copy_msghdr_from_user+0x270/0x270
[ 2327.117660] ? vm_insert_page+0x360/0x360
[ 2327.122133] ? vm_insert_pfn_prot+0xb4/0x150
[ 2327.126895] ? vm_insert_pfn+0x32/0x40
[ 2327.131077] ? vvar_fault+0x71/0xd0
[ 2327.134968] ? special_mapping_fault+0x69/0x110
[ 2327.140022] ? __do_fault+0x42/0x120
[ 2327.144008] ? __handle_mm_fault+0x1062/0x17a0
[ 2327.148965] ? __fget_light+0xa7/0xc0
[ 2327.153049] __sys_sendmsg+0xcb/0x150
[ 2327.157133] ? __sys_sendmsg+0xcb/0x150
[ 2327.161409] ? SyS_shutdown+0x140/0x140
[ 2327.165688] ? exit_to_usermode_loop+0xd0/0xd0
[ 2327.170646] ? __do_page_fault+0x55d/0x620
[ 2327.175216] ? __sys_sendmsg+0x150/0x150
[ 2327.179591] SyS_sendmsg+0x12/0x20
[ 2327.183384] do_syscall_64+0xe3/0x230
[ 2327.187471] entry_SYSCALL64_slow_path+0x25/0x25
[ 2327.192622] RIP: 0033:0x7f41d18fa3b0
[ 2327.196608] RSP: 002b:00007ffc3b731218 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 2327.205055] RAX: ffffffffffffffda RBX: 00007ffc3b731380 RCX: 00007f41d18fa3b0
[ 2327.213017] RDX: 0000000000000000 RSI: 00007ffc3b731340 RDI: 0000000000000003
[ 2327.220978] RBP: 0000000000000002 R08: 0000000000000004 R09: 0000000000000040
[ 2327.228939] R10: 00007ffc3b730f30 R11: 0000000000000246 R12: 0000000000000003
[ 2327.236901] R13: 00007ffc3b731340 R14: 00007ffc3b7313d0 R15: 0000000000000084
[ 2327.244865] Object at ffff881be87797e0, in cache kmalloc-64 size: 64
[ 2327.251953] Allocated:
[ 2327.254581] PID = 9484
[ 2327.257215] save_stack_trace+0x1b/0x20
[ 2327.261485] save_stack+0x46/0xd0
[ 2327.265179] kasan_kmalloc+0xad/0xe0
[ 2327.269165] kmem_cache_alloc_trace+0xe6/0x1d0
[ 2327.274138] sctp_add_bind_addr+0x58/0x180 [sctp]
[ 2327.279400] sctp_do_bind+0x208/0x310 [sctp]
[ 2327.284176] sctp_bind+0x61/0xa0 [sctp]
[ 2327.288455] inet_bind+0x5f/0x3a0
[ 2327.292151] SYSC_bind+0x1a4/0x1e0
[ 2327.295944] SyS_bind+0xe/0x10
[ 2327.299349] do_syscall_64+0xe3/0x230
[ 2327.303433] return_from_SYSCALL_64+0x0/0x6a
[ 2327.308194] Freed:
[ 2327.310434] PID = 4131
[ 2327.313065] save_stack_trace+0x1b/0x20
[ 2327.317344] save_stack+0x46/0xd0
[ 2327.321040] kasan_slab_free+0x73/0xc0
[ 2327.325220] kfree+0x96/0x1a0
[ 2327.328530] dynamic_kobj_release+0x15/0x40
[ 2327.333195] kobject_release+0x99/0x1e0
[ 2327.337472] kobject_put+0x38/0x70
[ 2327.341266] free_notes_attrs+0x66/0x80
[ 2327.345545] mod_sysfs_teardown+0x1a5/0x270
[ 2327.350211] free_module+0x20/0x2a0
[ 2327.354099] SyS_delete_module+0x2cb/0x2f0
[ 2327.358667] do_syscall_64+0xe3/0x230
[ 2327.362750] return_from_SYSCALL_64+0x0/0x6a
[ 2327.367510] Memory state around the buggy address:
[ 2327.372855] ffff881be8779700: fc fc fc fc 00 00 00 00 00 00 00 00 fc fc fc fc
[ 2327.380914] ffff881be8779780: fb fb fb fb fb fb fb fb fc fc fc fc 00 00 00 00
[ 2327.388972] >ffff881be8779800: 00 00 00 00 fc fc fc fc fb fb fb fb fb fb fb fb
[ 2327.397031] ^
[ 2327.401792] ffff881be8779880: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
[ 2327.409850] ffff881be8779900: 00 00 00 00 00 04 fc fc fc fc fc fc 00 00 00 00
[ 2327.417907] ==================================================================
This fixes CVE-2017-7558.
References: https://bugzilla.redhat.com/show_bug.cgi?id=1480266
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 912964eacb111551db73429719eb5fadcab0ff8a ]
Commit 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the
addr before looking up assoc") invoked sctp_verify_addr to verify the
addr.
But it didn't check af variable beforehand, once users pass an address
with family = 0 through sockopt, sctp_get_af_specific will return NULL
and NULL pointer dereference will be caused by af->sockaddr_len.
This patch is to fix it by returning NULL if af variable is NULL.
Fixes: 6f29a1306131 ("sctp: sctp_addr_id2transport should verify the addr before looking up assoc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 988c7322116970696211e902b468aefec95b6ec4 ]
In sctp_for_each_transport, pos is used to save how many objs it has
dumped. Now it gets the last obj by sctp_transport_get_idx, then gets
the next obj by sctp_transport_get_next.
The issue is that in the meanwhile if some objs in transport hashtable
are removed and the objs nums are less than pos, sctp_transport_get_idx
would return NULL and hti.walker.tbl is NULL as well. At this moment
it should stop hti, instead of continue getting the next obj. Or it
would cause a NULL pointer dereference in sctp_transport_get_next.
This patch is to pass pos + 1 into sctp_transport_get_idx to get the
next obj directly, even if pos > objs nums, it would return NULL and
stop hti.
Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 581409dacc9176b0de1f6c4ca8d66e13aa8e1b29 ]
Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling
BH. If CPU schedules to another thread A at this moment, the thread A may
be trying to hold the write_lock with disabling BH.
As BH is disabled and CPU cannot schedule back to the thread holding the
read_lock, while the thread A keeps waiting for the read_lock. A dead
lock would be triggered by this.
This patch is to fix this dead lock by calling read_lock_bh instead to
disable BH when holding the read_lock in sctp_for_each_endpoint.
Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6f29a130613191d3c6335169febe002cba00edf5 ]
sctp_addr_id2transport is a function for sockopt to look up assoc by
address. As the address is from userspace, it can be a v4-mapped v6
address. But in sctp protocol stack, it always handles a v4-mapped
v6 address as a v4 address. So it's necessary to convert it to a v4
address before looking up assoc by address.
This patch is to fix it by calling sctp_verify_addr in which it can do
this conversion before calling sctp_endpoint_lookup_assoc, just like
what sctp_sendmsg and __sctp_connect do for the address from users.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 34b2789f1d9bf8dcca9b5cb553d076ca2cd898ee ]
Now sctp doesn't check sock's state before listening on it. It could
even cause changing a sock with any state to become a listening sock
when doing sctp_listen.
This patch is to fix it by checking sock's state in sctp_listen, so
that it will listen on the sock with right state.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dfcb9f4f99f1e9a49e43398a7bfbf56927544af1 upstream.
commit 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
attempted to avoid a BUG_ON call when the association being used for a
sendmsg() is blocked waiting for more sndbuf and another thread did a
peeloff operation on such asoc, moving it to another socket.
As Ben Hutchings noticed, then in such case it would return without
locking back the socket and would cause two unlocks in a row.
Further analysis also revealed that it could allow a double free if the
application managed to peeloff the asoc that is created during the
sendmsg call, because then sctp_sendmsg() would try to free the asoc
that was created only for that call.
This patch takes another approach. It will deny the peeloff operation
if there is a thread sleeping on the asoc, so this situation doesn't
exist anymore. This avoids the issues described above and also honors
the syscalls that are already being handled (it can be multiple sendmsg
calls).
Joint work with Xin Long.
Fixes: 2dcab5984841 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2dcab598484185dea7ec22219c76dcdd59e3cb90 ]
Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.
This patch replaces the BUG_ON call with a proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.
Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
null
[ Upstream commit 08abb79542c9e8c367d1d8e44fe1026868d3f0a7 ]
Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
when it failed to find a transport by sctp_addrs_lookup_transport.
This patch is to fix it by moving up rcu_read_unlock right before checking
transport and also to remove the out path.
Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), sk state would be changed to
SCTP_SS_CLOSING, which is not as we expect.
Besides, after that if users try to listen on this sock, kernel
could even panic when it dereference sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is null when sock has no assoc.
This patch is to move sk state change after checking sk assocs
is not empty, and also merge these two if() conditions and reduce
indent level.
Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sctp_wait_for_connect() currently already holds the asoc to keep it
alive during the sleep, in case another thread release it. But Andrey
Konovalov and Dmitry Vyukov reported an use-after-free in such
situation.
Problem is that __sctp_connect() doesn't get a ref on the asoc and will
do a read on the asoc after calling sctp_wait_for_connect(), but by then
another thread may have closed it and the _put on sctp_wait_for_connect
will actually release it, causing the use-after-free.
Fix is, instead of doing the read after waiting for the connect, do it
before so, and avoid this issue as the socket is still locked by then.
There should be no issue on returning the asoc id in case of failure as
the application shouldn't trust on that number in such situations
anyway.
This issue doesn't exist in sctp_sendmsg() path.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In sctp_transport_lookup_process(), Commit 1cceda784980 ("sctp: fix
the issue sctp_diag uses lock_sock in rcu_read_lock") moved cb() out
of rcu lock, but it put transport and hold assoc instead, and ignore
that cb() still uses transport. It may cause a use-after-free issue.
This patch is to hold transport instead of assoc there.
Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
if (len < sizeof(int))
return -EINVAL;
On the first look, the check seems to be correct. But since len is int
and sizeof returns size_t, int gets promoted to unsigned size_t too. So
the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
false.
Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.
Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.
If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
Call Trace:
[<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
[<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
[<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
[<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
[<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
[<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
[<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
[<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Three sets of overlapping changes. Nothing serious.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sctp dumps all the ep->assocs, it needs to lock_sock first,
but now it locks sock in rcu_read_lock, and lock_sock may sleep,
which would break rcu_read_lock.
This patch is to get and hold one sock when traversing the list.
After that and get out of rcu_read_lock, lock and dump it. Then
it will traverse the list again to get the next one until all
sctp socks are dumped.
For sctp_diag_dump_one, it fixes this issue by holding asoc and
moving cb() out of rcu_read_lock in sctp_transport_lookup_process.
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Last patch "sctp: do not return the transmit err back to sctp_sendmsg"
made sctp_primitive_SEND return err only when asoc state is unavailable.
In this case, chunks are not enqueued, they have no chance to be freed if
we don't take care of them later.
This Patch is actually to revert commit 1cd4d5c4326a ("sctp: remove the
unused sctp_datamsg_free()"), commit 69b5777f2e57 ("sctp: hold the chunks
only after the chunk is enqueued in outq") and commit 8b570dc9f7b6 ("sctp:
only drop the reference on the datamsg after sending a msg"), to use
sctp_datamsg_free to free the chunks of current msg.
Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 141ddefce7c8 ("sctp: change sk state to CLOSED instead of
CLOSING in sctp_sock_migrate") changed sk state to CLOSED if the
assoc is closed when sctp_accept clones a new sk.
If there is still data in sk receive queue, users will not be able
to read it any more, as sctp_recvmsg returns directly if sk state
is CLOSED.
This patch is to add CLOSED state check in sctp_recvmsg to allow
reading data from TCP-style sk with CLOSED state as what TCP does.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I was seeing a lot of these:
BUG: sleeping function called from invalid context at mm/slab.h:388
in_atomic(): 0, irqs_disabled(): 0, pid: 14971, name: trinity-c2
Preemption disabled at:[<ffffffff819bcd46>] rhashtable_walk_start+0x46/0x150
[<ffffffff81149abb>] preempt_count_add+0x1fb/0x280
[<ffffffff83295722>] _raw_spin_lock+0x12/0x40
[<ffffffff811aac87>] console_unlock+0x2f7/0x930
[<ffffffff811ab5bb>] vprintk_emit+0x2fb/0x520
[<ffffffff811aba6a>] vprintk_default+0x1a/0x20
[<ffffffff812c171a>] printk+0x94/0xb0
[<ffffffff811d6ed0>] print_stack_trace+0xe0/0x170
[<ffffffff8115835e>] ___might_sleep+0x3be/0x460
[<ffffffff81158490>] __might_sleep+0x90/0x1a0
[<ffffffff8139b823>] kmem_cache_alloc+0x153/0x1e0
[<ffffffff819bca1e>] rhashtable_walk_init+0xfe/0x2d0
[<ffffffff82ec64de>] sctp_transport_walk_start+0x1e/0x60
[<ffffffff82edd8ad>] sctp_transport_seq_start+0x4d/0x150
[<ffffffff8143a82b>] seq_read+0x27b/0x1180
[<ffffffff814f97fc>] proc_reg_read+0xbc/0x180
[<ffffffff813d471b>] __vfs_read+0xdb/0x610
[<ffffffff813d4d3a>] vfs_read+0xea/0x2d0
[<ffffffff813d615b>] SyS_pread64+0x11b/0x150
[<ffffffff8100334c>] do_syscall_64+0x19c/0x410
[<ffffffff832960a5>] return_from_SYSCALL_64+0x0/0x6a
[<ffffffffffffffff>] 0xffffffffffffffff
Apparently we always need to call rhashtable_walk_stop(), even when
rhashtable_walk_start() fails:
* rhashtable_walk_start - Start a hash table walk
* @iter: Hash table iterator
*
* Start a hash table walk. Note that we take the RCU lock in all
* cases including when we return an error. So you must always call
* rhashtable_walk_stop to clean up.
otherwise we never call rcu_read_unlock() and we get the splat above.
Fixes: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
See-also: 53fa1036 ("sctp: fix some rhashtable functions using in sctp proc/diag")
See-also: f2dba9c6 ("rhashtable: Introduce rhashtable_walk_*")
Cc: Xin Long <lucien.xin@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit d46e416c11c8 missed to update some other places which checked for
the socket being TCP-style AND Established state, as Closing state has
some overlapping with the previous understanding of Established.
Without this fix, one of the effects is that some already queued rx
messages may not be readable anymore depending on how the association
teared down, and sending may also not be possible if peer initiated the
shutdown.
Also merge two if() blocks into one condition on sctp_sendmsg().
Cc: Xin Long <lucien.xin@gmail.com>
Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SCTP will try to access original IP headers on sctp_recvmsg in order to
copy the addresses used. There are also other places that do similar access
to IP or even SCTP headers. But after 90017accff61 ("sctp: Add GSO
support") they aren't always there because they are only present in the
header skb.
SCTP handles the queueing of incoming data by cloning the incoming skb
and limiting to only the relevant payload. This clone has its cb updated
to something different and it's then queued on socket rx queue. Thus we
need to fix this in two moments.
For rx path, not related to socket queue yet, this patch uses a
partially copied sctp_input_cb to such GSO frags. This restores the
ability to access the headers for this part of the code.
Regarding the socket rx queue, it removes iif member from sctp_event and
also add a chunk pointer on it.
With these changes we're always able to reach the headers again.
The biggest change here is that now the sctp_chunk struct and the
original skb are only freed after the application consumed the buffer.
Note however that the original payload was already like this due to the
skb cloning.
For iif, SCTP's IPv4 code doesn't use it, so no change is necessary.
IPv6 now can fetch it directly from original's IPv6 CB as the original
skb is still accessible.
In the future we probably can simplify sctp_v*_skb_iif() stuff, as
sctp_v4_skb_iif() was called but it's return value not used, and now
it's not even called, but such cleanup is out of scope for this change.
Fixes: 90017accff61 ("sctp: Add GSO support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
prsctp PRIO policy is a policy to abandon lower priority chunks when
asoc doesn't have enough snd buffer, so that the current chunk with
higher priority can be queued successfully.
Similar to TTL/RTX policy, we will set the priority of the chunk to
prsctp_param with sinfo->sinfo_timetolive in sctp_set_prsctp_policy().
So if PRIO policy is enabled, msg->expire_at won't work.
asoc->sent_cnt_removable will record how many chunks can be checked to
remove. If priority policy is enabled, when the chunk is queued into
the out_queue, we will increase sent_cnt_removable. When the chunk is
moved to abandon_queue or dequeue and free, we will decrease
sent_cnt_removable.
In sctp_sendmsg, we will check if there is enough snd buffer for current
msg and if sent_cnt_removable is not 0. Then try to abandon chunks in
sctp_prune_prsctp when sendmsg from the retransmit/transmited queue, and
free chunks from out_queue in right order until the abandon+free size >
msg_len - sctp_wfree. For the abandon size, we have to wait until it
sends FORWARD TSN, receives the sack and the chunks are really freed.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
prsctp TTL policy is a policy to abandon chunks when they expire
at the specific time in local stack. It's similar with expires_at
in struct sctp_datamsg.
This patch uses sinfo->sinfo_timetolive to set the specific time for
TTL policy. sinfo->sinfo_timetolive is also used for msg->expires_at.
So if prsctp_enable or TTL policy is not enabled, msg->expires_at
still works as before.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds SCTP_PR_ASSOC_STATUS to sctp sockopt, which is used
to dump the prsctp statistics info from the asoc. The prsctp statistics
includes abandoned_sent/unsent from the asoc. abandoned_sent is the
count of the packets we drop packets from retransmit/transmited queue,
and abandoned_unsent is the count of the packets we drop from out_queue
according to the policy.
Note: another option for prsctp statistics dump described in rfc is
SCTP_PR_STREAM_STATUS, which is used to dump the prsctp statistics
info from each stream. But by now, linux doesn't yet have per stream
statistics info, it needs rfc6525 to be implemented. As the prsctp
statistics for each stream has to be based on per stream statistics,
we will delay it until rfc6525 is done in linux.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds SCTP_DEFAULT_PRINFO to sctp sockopt. It is used
to set/get sctp Partially Reliable Policies' default params,
which includes 3 policies (ttl, rtx, prio) and their values.
Still, if we set policy params in sndinfo, we will use the params
of sndinfo against chunks, instead of the default params.
In this patch, we will use 5-8bit of sp/asoc->default_flags
to store prsctp policies, and reuse asoc->default_timetolive
to store their values. It means if we enable and set prsctp
policy, prior ttl timeout in sctp will not work any more.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
According to section 4.5 of rfc7496, prsctp_enable should be per asoc.
We will add prsctp_enable to both asoc and ep, and replace the places
where it used net.sctp->prsctp_enable with asoc->prsctp_enable.
ep->prsctp_enable will be initialized with net.sctp->prsctp_enable, and
asoc->prsctp_enable will be initialized with ep->prsctp_enable. We can
also modify it's value through sockopt SCTP_PR_SUPPORTED.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit d46e416c11c8 ("sctp: sctp should change socket state when
shutdown is received") may set sk_state CLOSING in sctp_sock_migrate,
but inet_accept doesn't allow the sk_state other than ESTABLISHED/
CLOSED for sctp. So we will change sk_state to CLOSED, instead of
CLOSING, as actually sk is closed already there.
Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now sctp doesn't change socket state upon shutdown reception. It changes
just the assoc state, even though it's a TCP-style socket.
For some cases, if we really need to check sk->sk_state, it's necessary to
fix this issue, at least when we use ss or netstat to dump, we can get a
more exact information.
As an improvement, we will change sk->sk_state when we change asoc->state
to SHUTDOWN_RECEIVED, and also do it in sctp_shutdown to keep consistent
with sctp_close.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
SCTP has this pecualiarity that its packets cannot be just segmented to
(P)MTU. Its chunks must be contained in IP segments, padding respected.
So we can't just generate a big skb, set gso_size to the fragmentation
point and deliver it to IP layer.
This patch takes a different approach. SCTP will now build a skb as it
would be if it was received using GRO. That is, there will be a cover
skb with protocol headers and children ones containing the actual
segments, already segmented to a way that respects SCTP RFCs.
With that, we can tell skb_segment() to just split based on frag_list,
trusting its sizes are already in accordance.
This way SCTP can benefit from GSO and instead of passing several
packets through the stack, it can pass a single large packet.
v2:
- Added support for receiving GSO frames, as requested by Dave Miller.
- Clear skb->cb if packet is GSO (otherwise it's not used by SCTP)
- Added heuristics similar to what we have in TCP for not generating
single GSO packets that fills cwnd.
v3:
- consider sctphdr size in skb_gso_transport_seglen()
- rebased due to 5c7cdf339af5 ("gso: Remove arbitrary checks for
unsupported GSO")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Now we cannot distinguish that one sk is a udp or sctp style when
we use ss to dump sctp_info. it's necessary to dump it as well.
For sctp_diag, ss support is not officially available, thus there
are no official users of this yet, so we can add this field in the
middle of sctp_info without breaking user API.
v1->v2:
- move 'sctpi_s_type' field to the end of struct sctp_info, so
that it won't cause incompatibility with applications already
built.
- add __reserved3 in sctp_info to make sure sctp_info is 8-byte
alignment.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When rhashtable_walk_init return err, no release function should be
called, and when rhashtable_walk_start return err, we should only invoke
rhashtable_walk_exit to release the source.
But now when sctp_transport_walk_start return err, we just call
rhashtable_walk_stop/exit, and never care about if rhashtable_walk_init
or start return err, which is so bad.
We will fix it by calling rhashtable_walk_exit if rhashtable_walk_start
return err in sctp_transport_walk_start, and if sctp_transport_walk_start
return err, we do not need to call sctp_transport_walk_stop any more.
For sctp proc, we will use 'iter->start_fail' to decide if we will call
rhashtable_walk_stop/exit.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For some main variables in sctp.ko, we couldn't export it to other modules,
so we have to define some api to access them.
It will include sctp transport and endpoint's traversal.
There are some transport traversal functions for sctp_diag, we can also
use it for sctp_proc. cause they have the similar situation to traversal
transport.
v2->v3:
- rhashtable_walk_init need the parameter gfp, because of recent upstrem
update
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|