| Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
commit 54d83fc74aa9ec72794373cb47432c5f7fb1a309 upstream.
Ben Hawkes says:
In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
is possible for a user-supplied ipt_entry structure to have a large
next_offset field. This field is not bounds checked prior to writing a
counter value at the supplied offset.
Problem is that mark_source_chains should not have been called --
the rule doesn't have a next entry, so its supposed to return
an absolute verdict of either ACCEPT or DROP.
However, the function conditional() doesn't work as the name implies.
It only checks that the rule is using wildcard address matching.
However, an unconditional rule must also not be using any matches
(no -m args).
The underflow validator only checked the addresses, therefore
passing the 'unconditional absolute verdict' test, while
mark_source_chains also tested for presence of matches, and thus
proceeeded to the next (not-existent) rule.
Unify this so that all the callers have same idea of 'unconditional rule'.
Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 81422e672f8181d7ad1ee6c60c723aac649f538f upstream.
According to Documentation/power/devices.txt
The driver should not use device_set_wakeup_enable() which is the policy
for user to decide.
Using device_init_wakeup() to initialize dev->power.should_wakeup and
dev->power.can_wakeup on driver initialization.
And use device_may_wakeup() on suspend to decide if WoL function should
be enabled on NIC.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0772a99b818079e628a1da122ac7ee023faed83e upstream.
Otherwise it might be back on resume right after going to suspend in
some hardware.
Reported-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 3ba3458fb9c050718b95275a3310b74415e767e2 ]
When sending a UDPv6 message longer than MTU, account for the length
of fragmentable IPv6 extension headers in skb->network_header offset.
Same as we do in alloc_new_skb path in __ip6_append_data().
This ensures that later on __ip6_make_skb() will make space in
headroom for fragmentable extension headers:
/* move skb->data to ip header from ext header */
if (skb->data < skb_network_header(skb))
__skb_pull(skb, skb_network_offset(skb));
Prevents a splat due to skb_under_panic:
skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] KASAN
CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
[...]
Call Trace:
[<ffffffff813eb7b9>] skb_push+0x79/0x80
[<ffffffff8143397b>] eth_header+0x2b/0x100
[<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310
[<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0
[<ffffffff814efe3a>] ip6_output+0x16a/0x280
[<ffffffff815440c1>] ip6_local_out+0xb1/0xf0
[<ffffffff814f1115>] ip6_send_skb+0x45/0xd0
[<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0
[<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090
[...]
Reported-by: Ji Jianwen <jiji@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 5745b8232e942abd5e16e85fa9b27cc21324acf0 ]
pskb_may_pull() can change skb->data, so we have to load ptr/optr at the
right place.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 071d36bf21bcc837be00cea55bcef8d129e7f609 ]
A crash is observed when a decrypted packet is processed in receive
path. get_rps_cpus() tries to dereference the skb->dev fields but it
appears that the device is freed from the poison pattern.
[<ffffffc000af58ec>] get_rps_cpu+0x94/0x2f0
[<ffffffc000af5f94>] netif_rx_internal+0x140/0x1cc
[<ffffffc000af6094>] netif_rx+0x74/0x94
[<ffffffc000bc0b6c>] xfrm_input+0x754/0x7d0
[<ffffffc000bc0bf8>] xfrm_input_resume+0x10/0x1c
[<ffffffc000ba6eb8>] esp_input_done+0x20/0x30
[<ffffffc0000b64c8>] process_one_work+0x244/0x3fc
[<ffffffc0000b7324>] worker_thread+0x2f8/0x418
[<ffffffc0000bb40c>] kthread+0xe0/0xec
-013|get_rps_cpu(
| dev = 0xFFFFFFC08B688000,
| skb = 0xFFFFFFC0C76AAC00 -> (
| dev = 0xFFFFFFC08B688000 -> (
| name =
"......................................................
| name_hlist = (next = 0xAAAAAAAAAAAAAAAA, pprev =
0xAAAAAAAAAAA
Following are the sequence of events observed -
- Encrypted packet in receive path from netdevice is queued
- Encrypted packet queued for decryption (asynchronous)
- Netdevice brought down and freed
- Packet is decrypted and returned through callback in esp_input_done
- Packet is queued again for process in network stack using netif_rx
Since the device appears to have been freed, the dereference of
skb->dev in get_rps_cpus() leads to an unhandled page fault
exception.
Fix this by holding on to device reference when queueing packets
asynchronously and releasing the reference on call back return.
v2: Make the change generic to xfrm as mentioned by Steffen and
update the title to xfrm
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jerome Stanislaus <jeromes@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2c9a266afefe137bff06bbe0fc48b4d3b3cb348c ]
When running small packets [length < 256 bytes] traffic, packets were
being dropped due to invalid data in those packets which were
delivered by the driver upto the stack. Using pci_dma_sync_single_for_cpu
ensures copying latest and updated data into skb from the receive buffer.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit e725a66c0202b5f36c2f9d59d26a65c53bbf21f7 ]
gcc-6 finds an out of bounds access in the fst_add_one function
when calculating the end of the mmio area:
drivers/net/wan/farsync.c: In function 'fst_add_one':
drivers/net/wan/farsync.c:418:53: error: index 2 denotes an offset greater than size of 'u8[2][8192] {aka unsigned char[2][8192]}' [-Werror=array-bounds]
#define BUF_OFFSET(X) (BFM_BASE + offsetof(struct buf_window, X))
^
include/linux/compiler-gcc.h:158:21: note: in definition of macro '__compiler_offsetof'
__builtin_offsetof(a, b)
^
drivers/net/wan/farsync.c:418:37: note: in expansion of macro 'offsetof'
#define BUF_OFFSET(X) (BFM_BASE + offsetof(struct buf_window, X))
^~~~~~~~
drivers/net/wan/farsync.c:2519:36: note: in expansion of macro 'BUF_OFFSET'
+ BUF_OFFSET ( txBuffer[i][NUM_TX_BUFFER][0]);
^~~~~~~~~~
The warning is correct, but not critical because this appears
to be a write-only variable that is set by each WAN driver but
never accessed afterwards.
I'm taking the minimal fix here, using the correct pointer by
pointing 'mem_end' to the last byte inside of the register area
as all other WAN drivers do, rather than the first byte outside of
it. An alternative would be to just remove the mem_end member
entirely.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 8e2ad4113ce4671686740f808ff2795395c39eef ]
The stack expects link layer headers in the skb linear section.
Macvtap can create skbs with llheader in frags in edge cases:
when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and
prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum.
Add checks to ensure linear is always at least ETH_HLEN.
At this point, len is already ensured to be >= ETH_HLEN.
For backwards compatiblity, rounds up short vnet_hdr.hdr_len.
This differs from tap and packet, which return an error.
Fixes b9fb9ee07e67 ("macvtap: add GSO/csum offload support")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: don't use macvtap16_to_cpu()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit c1b7fca65070bfadca94dd53a4e6b71cd4f69715 ]
In a low memory situation, if netdev_alloc_skb() fails on a first RX ring
loop iteration in sh_eth_ring_format(), 'rxdesc' is still NULL. Avoid
kernel oops by adding the 'rxdesc' check after the loop.
Reported-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit ea47781c26510e5d97f80f9aceafe9065bd5e3aa ]
As variable length protocol, AX25 fails link layer header validation
tests based on a minimum length. header_ops.validate allows protocols
to validate headers that are shorter than hard_header_len. Implement
this callback for AX25.
See also http://comments.gmane.org/gmane.linux.network/401064
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 2793a23aacbd754dbbb5cb75093deb7e4103bace ]
Netdevice parameter hard_header_len is variously interpreted both as
an upper and lower bound on link layer header length. The field is
used as upper bound when reserving room at allocation, as lower bound
when validating user input in PF_PACKET.
Clarify the definition to be maximum header length. For validation
of untrusted headers, add an optional validate member to header_ops.
Allow bypassing of validation by passing CAP_SYS_RAWIO, for instance
for deliberate testing of corrupt input. In this case, pad trailing
bytes, as some device drivers expect completely initialized headers.
See also http://comments.gmane.org/gmane.linux.network/401064
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: net_device has inline comments instead of kernel-doc]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 48906f62c96cc2cd35753e59310cb70eb08cc6a5 ]
Some devices will silently fail setup unless they are reset first.
This is necessary even if the data interface is already in
altsetting 0, which it will be when the device is probed for the
first time. Briefly toggling the altsetting forces a function
reset regardless of the initial state.
This fixes a setup problem observed on a number of Huawei devices,
appearing to operate in NTB-32 mode even if we explicitly set them
to NTB-16 mode.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: hard-code 1 for data_altsetting]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 40b4f0fd74e46c017814618d67ec9127ff20f157 ]
As the member .cmp_addr of sctp_af_inet6, sctp_v6_cmp_addr should also check
the port of addresses, just like sctp_v4_cmp_addr, cause it's invoked by
sctp_cmp_addr_exact().
Now sctp_v6_cmp_addr just check the port when two addresses have different
family, and lack the port check for two ipv6 addresses. that will make
sctp_hash_cmp() cannot work well.
so fix it by adding ports comparison in sctp_v6_cmp_addr().
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit ee50c130c82175eaa0820c96b6d3763928af2241 ]
The JMC260 network card fails to suspend/resume because the call to
jme_start_irq() was too early, moving the call to jme_start_irq() after
the call to jme_reset_link() makes it work.
Prior this change suspend/resume would fail unless /sys/power/pm_async=0
was explicitly specified.
Relevant bug report: https://bugzilla.kernel.org/show_bug.cgi?id=112351
Signed-off-by: Diego Viola <diego.viola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ff1cab374ad98f4b9f408525ca9c08992b4ed784 upstream.
The BSP team noticed that there is spin/mutex lock issue on sh-sci when
CPUFREQ is used. The issue is that the notifier function may call
mutex_lock() while the spinlock is held, which can lead to a BUG().
This may happen if CPUFREQ is changed while another CPU calls
clk_get_rate().
Taking the spinlock was added to the notifier function in commit
e552de2413edad1a ("sh-sci: add platform device private data"), to
protect the list of serial ports against modification during traversal.
At that time the Common Clock Framework didn't exist yet, and
clk_get_rate() just returned clk->rate without taking a mutex.
Note that since commit d535a2305facf9b4 ("serial: sh-sci: Require a
device per port mapping."), there's no longer a list of serial ports to
traverse, and taking the spinlock became superfluous.
To fix the issue, just remove the cpufreq notifier:
1. The notifier doesn't work correctly: all it does is update the
stored clock rate; it does not update the divider in the hardware.
The divider will only be updated when calling sci_set_termios().
I believe this was broken back in 2004, when the old
drivers/char/sh-sci.c driver (where the notifier did update the
divider) was replaced by drivers/serial/sh-sci.c (where the
notifier just updated port->uartclk).
Cfr. full-history-linux commits 6f8deaef2e9675d9 ("[PATCH] sh: port
sh-sci driver to the new API") and 3f73fe878dc9210a ("[PATCH]
Remove old sh-sci driver").
2. On modern SoCs, the sh-sci parent clock rate is no longer related
to the CPU clock rate anyway, so using a cpufreq notifier is
futile.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 2d99b55d378c996b9692a0c93dd25f4ed5d58934 upstream.
Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for
current->mm to see if we have a user space context and only copies data
if we do. Now if an IO gets interrupted by a signal data isn't copied
into user space any more (as we don't have a user space context) but
user space isn't notified about it.
This patch modifies the behaviour to return -EINTR from bio_uncopy_user()
to notify userland that a signal has interrupted the syscall, otherwise
it could lead to a situation where the caller may get a buffer with
no data returned.
This can be reproduced by issuing SG_IO ioctl()s in one thread while
constantly sending signals to it.
Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
[bwh: Backported to 3.2:
- Adjust filename
- Put the assignment in the existing 'else' block]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit d9749fb5942f51555dc9ce1ac0dbb1806960a975 ]
Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in
its size computation, observing that the current method never guaranteed
that the hashsize (measured in number of entries) would be a power of two,
which the input hash function for that table requires. The root cause of
the problem is that two values need to be computed (one, the allocation
order of the storage requries, as passed to __get_free_pages, and two the
number of entries for the hash table). Both need to be ^2, but for
different reasons, and the existing code is simply computing one order
value, and using it as the basis for both, which is wrong (i.e. it assumes
that ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still ^2 when its not).
To fix this, we change the logic slightly. We start by computing a goal
allocation order (which is limited by the maximum size hash table we want
to support. Then we attempt to allocate that size table, decreasing the
order until a successful allocation is made. Then, with the resultant
successful order we compute the number of buckets that hash table supports,
which we then round down to the nearest power of two, giving us the number
of entries the table actually supports.
I've tested this locally here, using non-debug and spinlock-debug kernels,
and the number of entries in the hashtable consistently work out to be
powers of two in all cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
CC: Dmitry Vyukov <dvyukov@google.com>
CC: Vladislav Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 29e73269aa4d36f92b35610c25f8b01c789b0dc8 ]
Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 919483096bfe75dda338e98d56da91a263746a0a ]
Dmitry reported memory leaks of IP options allocated in
ip_cmsg_send() when/if this function returns an error.
Callers are responsible for the freeing.
Many thanks to Dmitry for the report and diagnostic.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 8013d1d7eafb0589ca766db6b74026f76b7f5cb4 ]
Commit 6fd99094de2b ("ipv6: Don't reduce hop limit for an interface")
disabled accept hop limit from RA if it is smaller than the current hop
limit for security stuff. But this behavior kind of break the RFC definition.
RFC 4861, 6.3.4. Processing Received Router Advertisements
A Router Advertisement field (e.g., Cur Hop Limit, Reachable Time,
and Retrans Timer) may contain a value denoting that it is
unspecified. In such cases, the parameter should be ignored and the
host should continue using whatever value it is already using.
If the received Cur Hop Limit value is non-zero, the host SHOULD set
its CurHopLimit variable to the received value.
So add sysctl option accept_ra_min_hop_limit to let user choose the minimum
hop limit value they can accept from RA. And set default to 1 to meet RFC
standards.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Adjust filename, context
- Number DEVCONF enumerators explicitly to match upstream]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 1cdda91871470f15e79375991bd2eddc6e86ddb1 ]
Currently, the egress interface index specified via IPV6_PKTINFO
is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
can be subverted when the user space application calls connect()
before sendmsg().
Fix it by initializing properly flowi6_oif in connect() before
performing the route lookup.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 81e8f2e930fe76b9814c71b9d87c30760b5eb705 ]
PHY status frames are not reliable, the PHY may not be able to send them
during heavy receive traffic. This overflow condition is signaled by the
PHY in the next status frame, but the driver did not make use of it.
Instead it always reported wrong tx timestamps to user space after an
overflow happened because it assigned newly received tx timestamps to old
packets in the queue.
This commit fixes this issue by clearing the tx timestamp queue every time
an overflow happens, so that no timestamps are delivered for overflow
packets. This way time stamping will continue correctly after an overflow.
Signed-off-by: Manfred Rudigier <manfred.rudigier@omicron.at>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 52a82e23b9f2a9e1d429c5207f8575784290d008 ]
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Evgeny Cherkashin <Eugene.Crosser@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 34ae6a1aa0540f0f781dd265366036355fdc8930 ]
When a tunnel decapsulates the outer header, it has to comply
with RFC 6080 and eventually propagate CE mark into inner header.
It turns out IP6_ECN_set_ce() does not correctly update skb->csum
for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
messages and stack traces.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Adjust context
- Add skb argument to other callers of IP6_ECN_set_ce()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 7aaed57c5c2890634cfadf725173c7c68ea4cb4f ]
Ivaylo Dimitrov reported a regression caused by commit 7866a621043f
("dev: add per net_device packet type chains").
skb->dev becomes NULL and we crash in __netif_receive_skb_core().
Before above commit, different kind of bugs or corruptions could happen
without major crash.
But the root cause is that phonet_rcv() can queue skb without checking
if skb is shared or not.
Many thanks to Ivaylo Dimitrov for his help, diagnosis and tests.
Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Remi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 83d15e70c4d8909d722c0d64747d8fb42e38a48f ]
For tcp_yeah, use an ssthresh floor of 2, the same floor used by Reno
and CUBIC, per RFC 5681 (equation 4).
tcp_yeah_ssthresh() was sometimes returning a 0 or negative ssthresh
value if the intended reduction is as big or bigger than the current
cwnd. Congestion control modules should never return a zero or
negative ssthresh. A zero ssthresh generally results in a zero cwnd,
causing the connection to stall. A negative ssthresh value will be
interpreted as a u32 and will set a target cwnd for PRR near 4
billion.
Oleksandr Natalenko reported that a system using tcp_yeah with ECN
could see a warning about a prior_cwnd of 0 in
tcp_cwnd_reduction(). Testing verified that this was due to
tcp_yeah_ssthresh() misbehaving in this way.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit ff62198553e43cdffa9d539f6165d3e83f8a42bc ]
[I stole this patch from Eric Biederman. He wrote:]
> There is no defined mechanism to pass network namespace information
> into /sbin/bridge-stp therefore don't even try to invoke it except
> for bridge devices in the initial network namespace.
>
> It is possible for unprivileged users to cause /sbin/bridge-stp to be
> invoked for any network device name which if /sbin/bridge-stp does not
> guard against unreasonable arguments or being invoked twice on the
> same network device could cause problems.
[Hannes: changed patch using netns_eq]
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
[ Upstream commit 55285bf09427c5abf43ee1d54e892f352092b1f1 ]
Dmitry reports memleak with syskaller program.
Problem is that connector bumps skb usecount but might not invoke callback.
So move skb_get to where we invoke the callback.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
sctp_close
[ Upstream commit 068d8bd338e855286aea54e70d1c101569284b21 ]
In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().
So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".
But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 5e1021f2b6dff1a86a468a1424d59faae2bc63c1 upstream.
ext4_reserve_inode_write() in ext4_mark_inode_dirty() could fail on
error (e.g. EIO) and iloc.bh can be NULL in this case. But the error is
ignored in the following "if" condition and ext4_expand_extra_isize()
might be called with NULL iloc.bh set, which triggers NULL pointer
dereference.
This is uncovered by commit 8b4953e13f4c ("ext4: reserve code points for
the project quota feature"), which enlarges the ext4_inode size, and
run the following script on new kernel but with old mke2fs:
#/bin/bash
mnt=/mnt/ext4
devname=ext4-error
dev=/dev/mapper/$devname
fsimg=/home/fs.img
trap cleanup 0 1 2 3 9 15
cleanup()
{
umount $mnt >/dev/null 2>&1
dmsetup remove $devname
losetup -d $backend_dev
rm -f $fsimg
exit 0
}
rm -f $fsimg
fallocate -l 1g $fsimg
backend_dev=`losetup -f --show $fsimg`
devsize=`blockdev --getsz $backend_dev`
good_tab="0 $devsize linear $backend_dev 0"
error_tab="0 $devsize error $backend_dev 0"
dmsetup create $devname --table "$good_tab"
mkfs -t ext4 $dev
mount -t ext4 -o errors=continue,strictatime $dev $mnt
dmsetup load $devname --table "$error_tab" && dmsetup resume $devname
echo 3 > /proc/sys/vm/drop_caches
ls -l $mnt
exit 0
[ Patch changed to simplify the function a tiny bit. -- Ted ]
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit fbd40ea0180a2d328c5adc61414dc8bab9335ce2 upstream.
When an inetdev is destroyed, every address assigned to the interface
is removed. And in this scenerio we do two pointless things which can
be very expensive if the number of assigned interfaces is large:
1) Address promotion. We are deleting all addresses, so there is no
point in doing this.
2) A full nf conntrack table purge for every address. We only need to
do this once, as is already caught by the existing
masq_dev_notifier so masq_inet_event() can skip this.
Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit b348d7dddb6c4fbfc810b7a0626e8ec9e29f7cbb upstream.
Fix potential out-of-bounds write to urb->transfer_buffer
usbip handles network communication directly in the kernel. When receiving a
packet from its peer, usbip code parses headers according to protocol. As
part of this parsing urb->actual_length is filled. Since the input for
urb->actual_length comes from the network, it should be treated as untrusted.
Any entity controlling the network may put any value in the input and the
preallocated urb->transfer_buffer may not be large enough to hold the data.
Thus, the malicious entity is able to write arbitrary data to kernel memory.
Signed-off-by: Ignat Korchagin <ignat.korchagin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 1666984c8625b3db19a9abc298931d35ab7bc64b upstream.
In case bind() works, but a later error forces bailing
in probe() in error cases work and a timer may be scheduled.
They must be killed. This fixes an error case related to
the double free reported in
http://www.spinics.net/lists/netdev/msg367669.html
and needs to go on top of Linus' fix to cdc-ncm.
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8b8addf891de8a00e4d39fc32f93f7c5eb8feceb upstream.
Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
the stack and the executable are randomized but not other mmapped files
(libraries, vDSO, etc.). This patch enables randomization for the
libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.
By default on i386 there are 8 bits for the randomization of the libraries,
vDSO and mmaps which only uses 1MB of VA.
This patch preserves the original randomness, using 1MB of VA out of 3GB or
4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.
The first obvious security benefit is that all objects are randomized (not
only the stack and the executable) in legacy mode which highly increases
the ASLR effectiveness, otherwise the attackers may use these
non-randomized areas. But also sensitive setuid/setgid applications are
more secure because currently, attackers can disable the randomization of
these applications by setting the ulimit stack to "unlimited". This is a
very old and widely known trick to disable the ASLR in i386 which has been
allowed for too long.
Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE
personality flag, but fortunately this doesn't work on setuid/setgid
applications because there is security checks which clear Security-relevant
flags.
This patch always randomizes the mmap_legacy_base address, removing the
possibility to disable the ASLR by setting the stack to "unlimited".
Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 82168140bc4cec7ec9bad39705518541149ff8b7 upstream.
In preparation for splitting out ET_DYN ASLR, this refactors the use of
mmap_rnd() to be used similarly to arm, and extracts the checking of
PF_RANDOMIZE.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 6e94e0cfb0887e4013b3b930fa6ab1fe6bb6ba91 upstream.
Otherwise this function may read data beyond the ruleset blob.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit bdf533de6968e9686df777dc178486f600c6e617 upstream.
We should check that e->target_offset is sane before
mark_source_chains gets called since it will fetch the target entry
for loop detection.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 2ef4dfd9d9f288943e249b78365a69e3ea3ec072 upstream.
Handling exceptions from modules never worked on parisc.
It was just masked by the fact that exceptions from modules
don't happen during normal use.
When a module triggers an exception in get_user() we need to load the
main kernel dp value before accessing the exception_data structure, and
afterwards restore the original dp value of the module on exit.
Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ef72f3110d8b19f4c098a0bff7ed7d11945e70c6 upstream.
The kernel module testcase (lib/test_user_copy.c) exhibited a kernel
crash on parisc if the parameters for copy_from_user were reversed
("illegal reversed copy_to_user" testcase).
Fix this potential crash by checking the fault handler if the faulting
address is in the exception table.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e3893027a300927049efc1572f852201eb785142 upstream.
We want to avoid the kernel module loader to create function pointers
for the kernel fixup routines of get_user() and put_user(). Changing
the external reference from function type to int type fixes this.
This unbreaks exception handling for get_user() and put_user() when
called from a kernel module.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cddc9434e3dcc37a85c4412fb8e277d3a582e456 upstream.
The CP2105 is used in the GE Healthcare Remote Alarm Box, with the
Manufacturer ID of 0x1901 and Product ID of 0x0194.
Signed-off-by: Martyn Welch <martyn.welch@collabora.co.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ea6db90e750328068837bed34cb1302b7a177339 upstream.
A Fedora user reports that the ftdi_sio driver works properly for the
ICP DAS I-7561U device. Further, the user manual for these devices
instructs users to load the driver and add the ids using the sysfs
interface.
Add support for these in the driver directly so that the devices work
out of the box instead of needing manual configuration.
Reported-by: <thesource@mail.ru>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ff1e22e7a638a0782f54f81a6c9cb139aca2da35 upstream.
Moving an unmasked irq may result in irq handler being invoked on both
source and target CPUs.
With 2-level this can happen as follows:
On source CPU:
evtchn_2l_handle_events() ->
generic_handle_irq() ->
handle_edge_irq() ->
eoi_pirq():
irq_move_irq(data);
/***** WE ARE HERE *****/
if (VALID_EVTCHN(evtchn))
clear_evtchn(evtchn);
If at this moment target processor is handling an unrelated event in
evtchn_2l_handle_events()'s loop it may pick up our event since target's
cpu_evtchn_mask claims that this event belongs to it *and* the event is
unmasked and still pending. At the same time, source CPU will continue
executing its own handle_edge_irq().
With FIFO interrupt the scenario is similar: irq_move_irq() may result
in a EVTCHNOP_unmask hypercall which, in turn, may make the event
pending on the target CPU.
We can avoid this situation by moving and clearing the event while
keeping event masked.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[bwh: Backported to 3.2:
- Adjust filename
- Add a suitable definition of test_and_set_mask()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 4a07083ed613644c96c34a7dd2853dc5d7c70902 upstream.
ALSA system timer backend stops the timer via del_timer() without sync
and leaves del_timer_sync() at the close instead. This is because of
the restriction by the design of ALSA timer: namely, the stop callback
may be called from the timer handler, and calling the sync shall lead
to a hangup. However, this also triggers a kernel BUG() when the
timer is rearmed immediately after stopping without sync:
kernel BUG at kernel/time/timer.c:966!
Call Trace:
<IRQ>
[<ffffffff8239c94e>] snd_timer_s_start+0x13e/0x1a0
[<ffffffff8239e1f4>] snd_timer_interrupt+0x504/0xec0
[<ffffffff8122fca0>] ? debug_check_no_locks_freed+0x290/0x290
[<ffffffff8239ec64>] snd_timer_s_function+0xb4/0x120
[<ffffffff81296b72>] call_timer_fn+0x162/0x520
[<ffffffff81296add>] ? call_timer_fn+0xcd/0x520
[<ffffffff8239ebb0>] ? snd_timer_interrupt+0xec0/0xec0
....
It's the place where add_timer() checks the pending timer. It's clear
that this may happen after the immediate restart without sync in our
cases.
So, the workaround here is just to use mod_timer() instead of
add_timer(). This looks like a band-aid fix, but it's a right move,
as snd_timer_interrupt() takes care of the continuous rearm of timer.
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 321c5658c5e9192dea0d58ab67cf1791e45b2b26 upstream.
Non maskable interrupts (NMI) are preferred to interrupts in current
implementation. If a NMI is pending and NMI is blocked by the result
of nmi_allowed(), pending interrupt is not injected and
enable_irq_window() is not executed, even if interrupts injection is
allowed.
In old kernel (e.g. 2.6.32), schedule() is often called in NMI context.
In this case, interrupts are needed to execute iret that intends end
of NMI. The flag of blocking new NMI is not cleared until the guest
execute the iret, and interrupts are blocked by pending NMI. Due to
this, iret can't be invoked in the guest, and the guest is starved
until block is cleared by some events (e.g. canceling injection).
This patch injects pending interrupts, when it's allowed, even if NMI
is blocked. And, If an interrupts is pending after executing
inject_pending_event(), enable_irq_window() is executed regardless of
NMI pending counter.
Signed-off-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2:
- vcpu_enter_guest() is simpler because inject_pending_event() can't fail
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit f08bb1e0dbdd0297258d0b8cd4dbfcc057e57b2a upstream.
During revalidate we check whether device capacity has changed before we
decide whether to output disk information or not.
The check for old capacity failed to take into account that we scaled
sdkp->capacity based on the reported logical block size. And therefore
the capacity test would always fail for devices with sectors bigger than
512 bytes and we would print several copies of the same discovery
information.
Avoid scaling sdkp->capacity and instead adjust the value on the fly
when setting the block device capacity and generating fake C/H/S
geometry.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[bwh: Backported to 3.2:
- logical_to_sectors() is a new function
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 5a07975ad0a36708c6b0a5b9fea1ff811d0b0c1f upstream.
The driver can be crashed with devices that expose crafted descriptors
with too few endpoints.
See: http://seclists.org/bugtraq/2016/Mar/61
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
[johan: fix OOB endpoint check and add error messages ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c55aee1bf0e6b6feec8b2927b43f7a09a6d5f754 upstream.
An attack using missing endpoints exists.
CVE-2016-3137
Signed-off-by: Oliver Neukum <ONeukum@suse.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|