|
commit 993f88f1cc7f0879047ff353e824e5cc8f10adfc upstream.
Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of
ce_noinfo_count.
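The fix amounts to a one-line counter swap; a minimal sketch of the corrected
helper, assuming the 3.16-era struct mem_ctl_info field names:

static void edac_inc_ue_error(struct mem_ctl_info *mci,
                              bool enable_per_layer_report,
                              const int pos[EDAC_MAX_LAYERS],
                              const u16 count)
{
    mci->ue_mc += count;

    if (!enable_per_layer_report) {
        mci->ue_noinfo_count += count;  /* was mistakenly ce_noinfo_count */
        return;
    }

    /* per-layer accounting continues unchanged */
}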
Signed-off-by: Emmanouil Maroudas <emmanouil.maroudas@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 4275be635597 ("edac: Change internal representation to work with layers")
Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0ae3aeefabbeef26294e7a349b51f1c761d46c9f upstream.
As pm_runtime_set_active() may fail because the device's parent isn't
active, we can end up executing the ->runtime_resume() callback for the
device when it isn't allowed.
Fix this by invoking pm_runtime_set_active() before running the callback
and deal with its error code as well.
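A minimal sketch of the reordered resume path, assuming the 3.16-era
pm_runtime_force_resume() structure; the point is that
pm_runtime_set_active() runs, and is error-checked, before the callback:

    ret = pm_runtime_set_active(dev);
    if (ret)
        goto out;   /* e.g. the parent isn't active: skip the callback */

    ret = callback(dev);    /* ->runtime_resume() */
    if (ret) {
        pm_runtime_set_suspended(dev);
        goto out;
    }

    pm_runtime_mark_last_busy(dev);
out:
    pm_runtime_enable(dev);
    return ret;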
Fixes: 37f204164dfb (PM: Add pm_runtime_suspend|resume_force functions)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8ed8ab40047a570fdd8043a40c104a57248dd3fd upstream.
Some of the interrupt vectors on 64-bit POWER server processors are only
32 bytes long (8 instructions), which is not enough for the full
first-level interrupt handler. For these we need to branch to an
out-of-line (OOL) handler. But when we are running a relocatable kernel,
interrupt vectors up to the __end_interrupts marker are copied down to real
address 0x100. So, branching to labels (i.e. OOL handlers) outside this
section must be handled differently when running a relocatable kernel (see
LOAD_HANDLER()), and needs at least 4 instructions.
However, branching from interrupt vector means that we corrupt the
CFAR (come-from address register) on POWER7 and later processors as
mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
that contains the part up to the point where the CFAR is saved in the
PACA should be part of the short interrupt vectors before we branch out
to OOL handlers.
But as mentioned already, there are interrupt vectors on 64-bit POWER
server processors that are only 32 bytes long (like vectors 0x4f00,
0x4f20, etc.), which cannot accommodate the above two cases at the same
time owing to space constraint. Currently, in these interrupt vectors,
we simply branch out to OOL handlers, without using LOAD_HANDLER(),
which leaves us vulnerable when running a relocatable kernel (eg. kdump
case). While this has been the case for some time now and kdump is used
widely, we were fortunate not to see any problems so far, for three
reasons:
1. In almost all cases, production kernel (relocatable) is used for
kdump as well, which would mean that the crashed kernel's OOL handler
would be at the same place where we end up branching to from the short
interrupt vector of the kdump kernel.
2. Also, the OOL handler was unlikely to be the reason for the crash in
almost all kdump scenarios, which meant we had a sane OOL handler from
the crashed kernel that we branched to.
3. On most 64-bit POWER server processors, page size is large enough
that marking interrupt vector code as executable (see commit
429d2e83) leads to marking the OOL handler code from the crashed kernel,
which sits right below the interrupt vector code from the kdump kernel,
as executable as well.
Let us fix this by moving the __end_interrupts marker down past OOL
handlers to make sure that we also copy OOL handlers to real address
0x100 when running a relocatable kernel.
This fix has been tested successfully in kdump scenario, on an LPAR with
4K page size by using different default/production kernel and kdump
kernel.
Also tested by manually corrupting the OOL handlers in the first kernel
and then kdump'ing, and then causing the OOL handlers to fire - mpe.
Fixes: c1fb6816fb1b ("powerpc: Add relocation on exception vector handlers")
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c7c999cb18da88a881e10e07f0724ad0bfaff770 upstream.
hci_vhci driver creates a hci device object dynamically upon each
HCI_VENDOR_PKT write. Although it checks the already created object
and returns an error, it's still racy and may build multiple hci_dev
objects concurrently when parallel writes are performed, as the device
tracks only a single hci_dev object.
This patch introduces a mutex to protect against the concurrent device
creations.
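A minimal sketch of the serialisation, with the creation body split into a
locked wrapper (the open_mutex name follows the upstream change):

static int vhci_create_device(struct vhci_data *data, __u8 opcode)
{
    int err;

    mutex_lock(&data->open_mutex);      /* one creator at a time */
    err = __vhci_create_device(data, opcode);
    mutex_unlock(&data->open_mutex);

    return err;
}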
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d124fd3bb36ceb40438f10c897ce642386b74b72 upstream.
Commit 834392a7d92677ff ("serial: doc: Un-document non-existing
uart_write_console()") removed a paragraph about a helper function that
seemed to never exist.
Peter Hurley pointed out that the function does exist, but is called
differently. Re-add the paragraph, with the function name corrected.
Fixes: 834392a7d92677ff ("serial: doc: Un-document non-existing uart_write_console()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 305c2e71b3d733ec065cb716c76af7d554bd5571 upstream.
Now that we've done a more comprehensive fix with the intermediate
target state we can remove the previous hack introduced with commit
90a88d6ef88e ("scsi: fix soft lockup in scsi_remove_target() on module
removal").
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit f05795d3d771f30a7bdc3a138bf714b06d42aa95 upstream.
Add intermediate STARGET_REMOVE state to scsi_target_state to avoid
running into the BUG_ON() in scsi_target_reap(). The STARGET_REMOVE
state is only valid in the path from scsi_remove_target() to
scsi_target_destroy() indicating this target is going to be removed.
This re-fixes the problem introduced in commits bc3f02a795d3 ("[SCSI]
scsi_remove_target: fix softlockup regression on hot remove") and
40998193560d ("scsi: restart list search after unlock in
scsi_remove_target") in a more comprehensive way.
[mkp: Included James' fix for scsi_target_destroy()]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 40998193560d ("scsi: restart list search after unlock in scsi_remove_target")
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e9135c4f08d9acb0f3da3ad2643b669dee3217c2 upstream.
Two concurrent writes into the same register cacheline have the chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser. The MI_SET_CONTEXT itself serves as a
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault. However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.
This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko.
The issue was introduced with
commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 16 10:02:27 2014 +0000
drm/i915: Disable PSMI sleep messages on all rings around context switches
Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-11-git-send-email-chris@chris-wilson.co.uk
[bwh: Backported to 3.16:
- Pass ring, not engine, to intel_ring_emit()
- Register type is u32 not i915_reg_t
- MI_STORE_REGISTER_MEM is a function-macro]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 47e27d5e92c46a3a62d4dfd8895b1ddb8613f531 upstream.
The original tokenized iid support implemented via f53adae4eae5 ("net: ipv6:
add tokenized interface identifier support") didn't allow for clearing a
device token as it was intended that this addressing mode was the only one
active for globally scoped IPv6 addresses. Later we relaxed that restriction
via 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses"),
and we should also allow for clearing tokens as there's no good reason why
it shouldn't be allowed.
Fixes: 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses")
Reported-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 60587bd0680507f48ae3a7360983228fd207de8a upstream.
The "handled" variable could be uninitialized if the
interrupt_service_routine() call back hasn't been implimented or if it
has been implemented but doesn't initialize "handled" to zero at the
start. For example, adv76xx_isr() only sets "handled" to true.
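The defensive pattern is to initialise the flag before invoking the
callback; a hedged sketch (the v4l2_subdev_call form is illustrative):

    bool handled = false;   /* some ISRs only ever set this to true */

    err = v4l2_subdev_call(sd, core, interrupt_service_routine,
                           status, &handled);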
Fixes: 44b153ca639f ('[media] m5mols: Add ISO sensitivity controls')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 22aab38e7b59fd79ce1045006be69a9abab58e5a upstream.
Instead of being true/false, "handled" is true/uninitialized.
Presumably this doesn't cause that many problems in real life because
normally we handle the IRQ.
Fixes: eea6b7cc53aa ('mfd: Add lp8788 mfd driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Milo Kim <milo.kim@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c20c8f750d9f8f8617f07ee2352d3ff560e66bc2 upstream.
The omap_hwmod _enable() function can return success without setting
the hwmod state to _HWMOD_STATE_ENABLED for IPs with reset lines when
all of the reset lines are asserted. The omap_hwmod _idle() function
also performs a similar check, but after checking for the hwmod state
first. This triggers the WARN when pm_runtime_get and pm_runtime_put
are invoked on IPs with all reset lines asserted. Reverse the checks
for hwmod state and reset lines status to fix this.
The issue was found during an unbind operation on a device with reset
lines still asserted; an example backtrace is below:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 879 at arch/arm/mach-omap2/omap_hwmod.c:2207 _idle+0x1e4/0x240()
omap_hwmod: mmu_dsp: idle state can only be entered from enabled state
Modules linked in:
CPU: 1 PID: 879 Comm: sh Not tainted 4.4.0-00008-ga989d951331a #3
Hardware name: Generic OMAP5 (Flattened Device Tree)
[<c0018e60>] (unwind_backtrace) from [<c0014dc4>] (show_stack+0x10/0x14)
[<c0014dc4>] (show_stack) from [<c037ac28>] (dump_stack+0x90/0xc0)
[<c037ac28>] (dump_stack) from [<c003f420>] (warn_slowpath_common+0x78/0xb4)
[<c003f420>] (warn_slowpath_common) from [<c003f48c>] (warn_slowpath_fmt+0x30/0x40)
[<c003f48c>] (warn_slowpath_fmt) from [<c0028c20>] (_idle+0x1e4/0x240)
[<c0028c20>] (_idle) from [<c0029080>] (omap_hwmod_idle+0x28/0x48)
[<c0029080>] (omap_hwmod_idle) from [<c002a5a4>] (omap_device_idle+0x3c/0x90)
[<c002a5a4>] (omap_device_idle) from [<c0427a90>] (__rpm_callback+0x2c/0x60)
[<c0427a90>] (__rpm_callback) from [<c0427ae4>] (rpm_callback+0x20/0x80)
[<c0427ae4>] (rpm_callback) from [<c0427f84>] (rpm_suspend+0x138/0x74c)
[<c0427f84>] (rpm_suspend) from [<c0428b78>] (__pm_runtime_idle+0x78/0xa8)
[<c0428b78>] (__pm_runtime_idle) from [<c041f514>] (__device_release_driver+0x64/0x100)
[<c041f514>] (__device_release_driver) from [<c041f5d0>] (device_release_driver+0x20/0x2c)
[<c041f5d0>] (device_release_driver) from [<c041d85c>] (unbind_store+0x78/0xf8)
[<c041d85c>] (unbind_store) from [<c0206df8>] (kernfs_fop_write+0xc0/0x1c4)
[<c0206df8>] (kernfs_fop_write) from [<c018a120>] (__vfs_write+0x20/0xdc)
[<c018a120>] (__vfs_write) from [<c018a9cc>] (vfs_write+0x90/0x164)
[<c018a9cc>] (vfs_write) from [<c018b1f0>] (SyS_write+0x44/0x9c)
[<c018b1f0>] (SyS_write) from [<c0010420>] (ret_fast_syscall+0x0/0x1c)
---[ end trace a4182013c75a9f50 ]---
While at this, fix the sequence in _shutdown() as well, though there
is no easy reproducible scenario.
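A sketch of the reordered test in _idle() (the same reordering applies to
_shutdown()); the helper name is assumed from the omap_hwmod code of that era:

static int _idle(struct omap_hwmod *oh)
{
    /* check the hardreset lines first: an IP with all lines asserted
     * was never marked _HWMOD_STATE_ENABLED, so bail out quietly
     * instead of tripping the state WARN below */
    if (oh->rst_lines_cnt > 0 && _are_all_hardreset_lines_asserted(oh))
        return 0;

    if (oh->_state != _HWMOD_STATE_ENABLED) {
        WARN(1, "omap_hwmod: %s: idle state can only be entered from enabled state\n",
             oh->name);
        return -EINVAL;
    }

    /* normal idle sequence follows */
    return 0;
}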
Fixes: 747834ab8347 ("ARM: OMAP2+: hwmod: revise hardreset behavior")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c998c07836f985b24361629dc98506ec7893e7a0 upstream.
Currently the 'registered' member of the cpuidle_device struct is set
to 1 during cpuidle_register_device. In this same function there are
checks to see if the device is already registered to prevent duplicate
calls to register the device, but this value is never set to 0 even on
unregister of the device. Because of this, any attempt to call
cpuidle_register_device after a call to cpuidle_unregister_device will
fail, which shouldn't be the case.
To prevent this, set registered to 0 when the device is unregistered.
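A minimal sketch of the fix in the unregister path:

static void __cpuidle_unregister_device(struct cpuidle_device *dev)
{
    /* existing list removal and sysfs teardown ... */

    dev->registered = 0;    /* allow a later cpuidle_register_device() */
}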
Fixes: c878a52d3c7c (cpuidle: Check if device is already registered)
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 13407376b255325fa817798800117a839f3aa055 upstream.
The write handler allocates skbs and queues them into data->readq.
The read side should read them, if there are any. If there are none, skbs
should be dropped by hdev->flush. But this happens only if the device
is HCI_UP, i.e. hdev->power_on work was triggered already. When it was
not, skbs stay allocated in the queue when /dev/vhci is closed. So
purge the queue in ->release.
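A minimal sketch of the ->release change against the 3.16-era driver:

static int vhci_release(struct inode *inode, struct file *file)
{
    struct vhci_data *data = file->private_data;
    struct hci_dev *hdev = data->hdev;

    if (hdev) {
        hci_unregister_dev(hdev);
        hci_free_dev(hdev);
    }

    skb_queue_purge(&data->readq);  /* drop skbs never read nor flushed */

    file->private_data = NULL;
    kfree(data);

    return 0;
}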
Program to reproduce:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
int main()
{
    char buf[] = { 0xff, 0 };
    struct iovec iov = {
        .iov_base = buf,
        .iov_len = sizeof(buf),
    };
    int fd;

    while (1) {
        fd = open("/dev/vhci", O_RDWR);
        if (fd < 0)
            err(1, "open");
        usleep(50);
        if (writev(fd, &iov, 1) < 0)
            err(1, "writev");
        usleep(50);
        close(fd);
    }
    return 0;
}
Result:
kmemleak: 4609 new suspected memory leaks
unreferenced object 0xffff88059f4d5440 (size 232):
comm "vhci", pid 1084, jiffies 4294912542 (age 37569.296s)
hex dump (first 32 bytes):
20 f0 23 87 05 88 ff ff 20 f0 23 87 05 88 ff ff .#..... .#.....
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
...
[<ffffffff81ece010>] __alloc_skb+0x0/0x5a0
[<ffffffffa021886c>] vhci_create_device+0x5c/0x580 [hci_vhci]
[<ffffffffa0219436>] vhci_write+0x306/0x4c8 [hci_vhci]
Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 373a32c848ae3a1c03618517cce85f9211a6facf upstream.
Both vhci_get_user and vhci_release race with open_timeout work. They
both contain cancel_delayed_work_sync, but do not test whether the
work actually created hdev or not. Since the work can be in progress
and _sync will wait for it to finish, we can have data->hdev allocated
when cancel_delayed_work_sync returns. But the call sites do 'if
(data->hdev)' *before* cancel_delayed_work_sync.
As a result:
* vhci_get_user allocates a second hdev and puts it into
data->hdev. The former is leaked.
* vhci_release does not release data->hdev properly as it thinks there
is none.
Fix both cases by moving the actual test *after* the call to
cancel_delayed_work_sync.
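A minimal sketch of the corrected ordering in vhci_get_user()'s
HCI_VENDOR_PKT path (vhci_release is fixed the same way; variable names
illustrative):

    /* wait for any in-flight open_timeout work first ... */
    cancel_delayed_work_sync(&data->open_timeout);

    /* ... and only then is data->hdev stable enough to test */
    if (!data->hdev)
        ret = vhci_create_device(data, opcode);
    else
        ret = -EINVAL;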
This can be hit by this program:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
int main(int argc, char **argv)
{
    int fd;

    srand(time(NULL));

    while (1) {
        const int delta = (rand() % 200 - 100) * 100;
        fd = open("/dev/vhci", O_RDWR);
        if (fd < 0)
            err(1, "open");
        usleep(1000000 + delta);
        close(fd);
    }
    return 0;
}
And the result is:
BUG: KASAN: use-after-free in skb_queue_tail+0x13e/0x150 at addr ffff88006b0c1228
Read of size 8 by task kworker/u13:1/32068
=============================================================================
BUG kmalloc-192 (Tainted: G E ): kasan: bad access detected
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in vhci_open+0x50/0x330 [hci_vhci] age=260 cpu=3 pid=32040
...
kmem_cache_alloc_trace+0x150/0x190
vhci_open+0x50/0x330 [hci_vhci]
misc_open+0x35b/0x4e0
chrdev_open+0x23b/0x510
...
INFO: Freed in vhci_release+0xa4/0xd0 [hci_vhci] age=9 cpu=2 pid=32040
...
__slab_free+0x204/0x310
vhci_release+0xa4/0xd0 [hci_vhci]
...
INFO: Slab 0xffffea0001ac3000 objects=16 used=13 fp=0xffff88006b0c1e00 flags=0x5fffff80004080
INFO: Object 0xffff88006b0c1200 @offset=4608 fp=0xffff88006b0c0600
Bytes b4 ffff88006b0c11f0: 09 df 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c1200: 00 06 0c 6b 00 88 ff ff 00 00 00 00 00 00 00 00 ...k............
Object ffff88006b0c1210: 10 12 0c 6b 00 88 ff ff 10 12 0c 6b 00 88 ff ff ...k.......k....
Object ffff88006b0c1220: c0 46 c2 6b 00 88 ff ff c0 46 c2 6b 00 88 ff ff .F.k.....F.k....
Object ffff88006b0c1230: 01 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 ................
Object ffff88006b0c1240: 40 12 0c 6b 00 88 ff ff 40 12 0c 6b 00 88 ff ff @..k....@..k....
Object ffff88006b0c1250: 50 0d 6e a0 ff ff ff ff 00 02 00 00 00 00 ad de P.n.............
Object ffff88006b0c1260: 00 00 00 00 00 00 00 00 ab 62 02 00 01 00 00 00 .........b......
Object ffff88006b0c1270: 90 b9 19 81 ff ff ff ff 38 12 0c 6b 00 88 ff ff ........8..k....
Object ffff88006b0c1280: 03 00 20 00 ff ff ff ff ff ff ff ff 00 00 00 00 .. .............
Object ffff88006b0c1290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c12a0: 00 00 00 00 00 00 00 00 00 80 cd 3d 00 88 ff ff ...........=....
Object ffff88006b0c12b0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 . ..............
Redzone ffff88006b0c12c0: bb bb bb bb bb bb bb bb ........
Padding ffff88006b0c13f8: 00 00 00 00 00 00 00 00 ........
CPU: 3 PID: 32068 Comm: kworker/u13:1 Tainted: G B E 4.4.6-0-default #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
Workqueue: hci0 hci_cmd_work [bluetooth]
00000000ffffffff ffffffff81926cfa ffff88006be37c68 ffff88006bc27180
ffff88006b0c1200 ffff88006b0c1234 ffffffff81577993 ffffffff82489320
ffff88006bc24240 0000000000000046 ffff88006a100000 000000026e51eb80
Call Trace:
...
[<ffffffff81ec8ebe>] ? skb_queue_tail+0x13e/0x150
[<ffffffffa06e027c>] ? vhci_send_frame+0xac/0x100 [hci_vhci]
[<ffffffffa0c61268>] ? hci_send_frame+0x188/0x320 [bluetooth]
[<ffffffffa0c61515>] ? hci_cmd_work+0x115/0x310 [bluetooth]
[<ffffffff811a1375>] ? process_one_work+0x815/0x1340
[<ffffffff811a1f85>] ? worker_thread+0xe5/0x11f0
[<ffffffff811a1ea0>] ? process_one_work+0x1340/0x1340
[<ffffffff811b3c68>] ? kthread+0x1c8/0x230
...
Memory state around the buggy address:
ffff88006b0c1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88006b0c1180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88006b0c1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88006b0c1280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88006b0c1300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 7ccca1d5bf69fdd1d3c5fcf84faf1659a6e0ad11 upstream.
Fix a possible out-of-bounds read by adding the missing comma.
The code may read past the end of the dsi_errors array
when the most significant bit (bit #31) in the intr_stat register
is set.
This bug has been detected using CppCheck (static analysis tool).
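This bug class is easy to illustrate: with a missing comma, the compiler
silently concatenates two adjacent string literals, so the array ends up
one entry short. A hedged illustration (not the driver's actual strings):

static const char * const dsi_errors[] = {
    "RX SOT Error",
    "RX SOT Sync Error",
    /* missing comma below: the two literals merge into ONE entry,
     * so indexing by the highest bit reads past the end of the array */
    "RX EOT Sync Error"
    "RX Escape Mode Entry Error",
};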
Signed-off-by: Itai Handler <itai_handler@hotmail.com>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit d0a58e833931234c44e515b5b8bede32bd4e6eed upstream.
Today, a kernel which refuses to mount a filesystem read-write
due to unknown ro-compat features can still transition to read-write
via the remount path. The old kernel is most likely none the wiser,
because it's unaware of the new feature, and isn't using it. However,
writing to the filesystem may well corrupt metadata related to that
new feature, and moving to a newer kernel which understands the feature
will have problems.
Right now the only ro-compat feature we have is the free inode btree,
which showed up in v3.16. It would be good to push this back to
all the active stable kernels, I think, so that if anyone is using
newer mkfs (which enables the finobt feature) with older kernel
releases, they'll be protected.
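A sketch of the remount guard, assuming the helper and flag names from that
era's xfs_super.c:

    /* ro -> rw transition */
    if ((mp->m_flags & XFS_MOUNT_RDONLY) && !(*flags & MS_RDONLY)) {
        if (xfs_sb_has_ro_compat_feature(sbp,
                    XFS_SB_FEAT_RO_COMPAT_UNKNOWN)) {
            xfs_warn(mp,
"ro->rw transition prohibited on unknown (0x%x) ro-compat filesystem",
                (sbp->sb_features_ro_compat &
                    XFS_SB_FEAT_RO_COMPAT_UNKNOWN));
            return -EINVAL;
        }
        /* proceed with the rw remount */
    }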
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit a0fe14d7dcf5db2f101b4fe8689ecabb255ab7d3 upstream.
Remove stray newlines in error logs and avoid a duplicate, explicit pr_fmt.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57af ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c43fce4eebae257ca413733690e2076757282093 upstream.
Fault rates can easily overwhelm the console and make the system
unresponsive. Ratelimit to allow an opportunity for maintenance.
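A sketch of the approach, assuming a DEFINE_RATELIMIT_STATE guard in the
fault handler (variable names illustrative):

    static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
                                  DEFAULT_RATELIMIT_BURST);

    if (__ratelimit(&rs))   /* suppress output once the limit trips */
        pr_err("DMAR: [%s] fault addr %llx [fault reason %02d]\n",
               type ? "DMA Read" : "DMA Write", addr, fault_reason);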
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 0ac2491f57af ('x86, dmar: move page fault handling code to dmar.c')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 7b9bc799a445aea95f64f15e0083cb19b5789abe upstream.
BugLink: http://bugs.launchpad.net/bugs/972604
Commit 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin
configuration for compaq c700 laptop") added a pin configuration for the Compaq
c700 laptop. However, the polarity of the led pin is reversed. It should be
red for wifi off and blue for wifi on, but it is the opposite. This bug was
reported in the following bug report:
http://pad.lv/972604
Fixes: 09c9bae26b0d3c9472cb6ae45010460a2cee8b8d ("ath5k: add led pin configuration for compaq c700 laptop")
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 834392a7d92677ff2bdc1c709b1171ee585b55c9 upstream.
uart_write_console() never existed, not even when the "new
uart_write_console function" was documented.
Fixes: 67ab7f596b6adbae ("[SERIAL] Update serial driver documentation")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 9ec423ed62b8278412400fae6c064edb6ce1bb51 upstream.
Commit be3d7d023b87 ("ARM: kirkwood: Add DTS file for NSA320")
created the new file kirkwood-nsa320.dts but did not
add it to the Makefile.
Fixes: be3d7d023b87 ("ARM: kirkwood: Add DTS file for NSA320")
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit fc5c796e12511a7c027b5a4438719dde2f796208 upstream.
Commit 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology
NAS devices") created the new file kirkwood-ds112.dts but did not
add it to the Makefile.
Fixes: 2d0a7addbd10 ("ARM: Kirkwood: Add support for many Synology NAS devices")
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 267c85860308d36bc163c5573308cd024f659d7c upstream.
Setting the flag 'cache_bypass' will bypass the cache, not the hardware.
Fix this comment here.
Fixes: 0eef6b0415f5 ("regmap: Fix doc comment")
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks
commit 9916e214998a4a363b152b637245e5c958067350 upstream.
Remove the direct {push,pull} balancing operations from
switched_{from,to}_dl() / prio_changed_dl() and use the balance
callback queue.
Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.968262663@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/deadline.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
[bwh: Backported to 3.16:
- The check_resched / !CONFIG_SMP case in switched_to_dl() is different
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0ea60c2054fc3b0c3eb68ac4f6884f3ee78d9925 upstream.
In order to be able to use pull_dl_task() from a callback, we need to
do away with the return value.
Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to dl balancing.
Too many reschedules are not a correctness issue; too few are.
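A sketch of the new shape: the function keeps the decision to itself and
reschedules directly (the 3.16 backport uses resched_task(), per the note
below):

static void pull_dl_task(struct rq *this_rq)
{
    bool resched = false;

    /* try to pull -deadline tasks from overloaded runqueues,
     * setting resched = true whenever one is pulled ... */

    if (resched)
        resched_task(this_rq->curr);    /* was: return 1 to the caller */
}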
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.859398977@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/deadline.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
[bwh: Backported to 3.16: use resched_task() instead of resched_curr()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks
commit 9916e214998a4a363b152b637245e5c958067350 upstream.
Remove the direct {push,pull} balancing operations from
switched_{from,to}_rt() / prio_changed_rt() and use the balance
callback queue.
Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.766832367@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/rt.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 8046d6806247088de5725eaf8a2580b29e50ac5a upstream.
In order to be able to use pull_rt_task() from a callback, we need to
do away with the return value.
Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to rt balancing.
Too many reschedules are not a correctness issue; too few are.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.679002000@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/rt.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
[bwh: Backported to 3.16: use resched_task() instead of resched_curr()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 4c9a4bc89a9cca8128bce67d6bc8870d6b7ee0b2 upstream.
In order to remove dropping rq->lock from the
switched_{to,from}()/prio_changed() sched_class methods, run the
balance callbacks after it.
We need to remove dropping rq->lock because it's buggy: suppose we use
sched_setattr()/sched_setscheduler() to change a running
task from FIFO to OTHER.
By the time we get to switched_from_rt() the task is already enqueued
on the cfs runqueues. If switched_from_rt() does pull_rt_task() and
drops rq->lock, load-balancing can come in and move our task @p to
another rq.
The subsequent switched_to_fair() still assumes @p is on @rq and bad
things will happen.
By using balance callbacks we delay the load-balancing operations
{rt,dl}x{push,pull} until we've done all the important work and the
task is fully set up.
Furthermore, the balance callbacks do not know about @p, therefore
they cannot get confused like this.
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Link: http://lkml.kernel.org/r/20150611124742.615343911@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/core.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e3fca9e7cbfb72694a21c886fcdf9f059cfded9c upstream.
Generalize the post_schedule() stuff into a balance callback list.
This allows us to more easily use it outside of schedule() and cross
sched_class.
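The queueing helper is a small intrusive list guarded by rq->lock; a sketch
close to the upstream kernel/sched/sched.h version:

static inline void
queue_balance_callback(struct rq *rq, struct callback_head *head,
                       void (*func)(struct rq *rq))
{
    lockdep_assert_held(&rq->lock);

    if (unlikely(head->next))   /* already queued */
        return;

    head->func = (void (*)(struct callback_head *))func;
    head->next = rq->balance_callback;
    rq->balance_callback = head;
}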
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: ktkhai@parallels.com
Cc: rostedt@goodmis.org
Cc: juri.lelli@gmail.com
Cc: pang.xunlei@linaro.org
Cc: oleg@redhat.com
Cc: wanpeng.li@linux.intel.com
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.424032725@infradead.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Conflicts: kernel/sched/core.c]
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit daaf40e53b5dbdf75255d58a45ce8ac65ca511a8 upstream.
Fixes: f7d11e93ee97 ("locking,arch,arc: Fold atomic_ops")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
|
|
commit f5e0a12ca2d939e47995f73428d9bf1ad372b289 upstream.
An arm64 allmodconfig fails to build with GCC 5 due to __asmeq
assertions in the PSCI firmware calling code firing due to mcount
preambles breaking our assumptions about register allocation of function
arguments:
/tmp/ccDqJsJ6.s: Assembler messages:
/tmp/ccDqJsJ6.s:60: Error: .err encountered
/tmp/ccDqJsJ6.s:61: Error: .err encountered
/tmp/ccDqJsJ6.s:62: Error: .err encountered
/tmp/ccDqJsJ6.s:99: Error: .err encountered
/tmp/ccDqJsJ6.s:100: Error: .err encountered
/tmp/ccDqJsJ6.s:101: Error: .err encountered
This patch fixes the issue by moving the PSCI calls out-of-line into
their own assembly files, which are safe from the compiler's meddling
fingers.
Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Guenter Roeck <linux@roeck-us.net>
|
|
commit 4d88e6f7d5ffc84e6094a47925870f4a130555c2 upstream.
If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages
with the balloon and doesn't set the PagePrivate flag; as a result,
balloon_page_dequeue() cannot get any pages because it thinks that all
of them are isolated. Without balloon compaction nobody can isolate
ballooned pages. It's safe to remove this check.
Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management").
Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Reported-by: Matt Mullins <mmullins@mmlx.us>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: jian wang <wangjian@bytedance.com>
|
|
commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2 upstream.
Sasha Levin reported a KASAN splat inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page->mapping->flags. This function has no protection
against anonymous pages. As a result it tried to check address space flags
inside struct anon_vma.
Further investigation shows more problems in current implementation:
* Special branch in __unmap_and_move() never works:
balloon_page_movable() checks page flags and page_count. In
__unmap_and_move() page is locked, reference counter is elevated, thus
balloon_page_movable() always fails. As a result execution goes to the
normal migration path. virtballoon_migratepage() returns
MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
move_to_new_page() thinks this is an error code and assigns
newpage->mapping to NULL. The newly migrated page loses connectivity with
the balloon and all ability for further migration.
* lru_lock erroneously required in isolate_migratepages_range() for
isolating a ballooned page. This function releases lru_lock periodically,
which makes migration mostly impossible for some pages.
* balloon_page_dequeue has a tight race with balloon_page_isolate:
balloon_page_isolate could be executed in parallel with dequeue between
picking the page from the list and taking the page lock. The race is rare
because they use trylock_page() for locking.
This patch fixes all of them.
Instead of fake mapping with special flag this patch uses special state of
page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256. Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose. Storing mark
directly in struct page makes everything safer and easier.
PagePrivate is used to mark pages present in page list (i.e. not
isolated, like PageLRU for normal pages). It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages. This flag is protected by page_lock together with link to
the balloon device.
Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.16:
- Remove an additional check for MIGRATEPAGE_BALLOON_SUCCESS in
__unmap_and_move()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: jian wang <wangjian@bytedance.com>
|
|
commit f436b2ac90a095746beb6729b8ee8ed87c9eaede upstream.
The Performance Monitors extension is an optional feature of the
AArch64 architecture, therefore, in order to access Performance
Monitors registers safely, the kernel should detect the architected
PMU unit presence through the ID_AA64DFR0_EL1 register PMUVer field
before accessing them.
This patch implements a guard by reading the ID_AA64DFR0_EL1 register
PMUVer field to detect the architected PMU presence and prevent accessing
PMU system registers if the Performance Monitors extension is not
implemented in the core.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Fixes: 60792ad349f3 ("arm64: kernel: enforce pmuserenr_el0 initialization and restore")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 79e48650320e6fba48369fccf13fd045315b19b8 upstream.
Stack object "dte_facilities" is allocated in x25_rx_call_request(),
which is supposed to be initialized in x25_negotiate_facilities.
However, 5 fields (8 bytes in total) are not initialized. This
object is then copied to userland via copy_to_user, and thus an infoleak
occurs.
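The standard remedy for this leak class is zeroing the whole object before
any field is filled in; a minimal sketch:

    struct x25_dte_facilities dte_facilities;

    /* clear everything, including the fields the negotiation code may
     * never touch, so no stack garbage reaches copy_to_user() later */
    memset(&dte_facilities, 0, sizeof(dte_facilities));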
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 upstream.
The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by the compiler. These padding bytes are
not initialized and sent out via “nla_put”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context, indentation]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit b8670c09f37bdf2847cc44f36511a53afc6161fd upstream.
The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.
clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty, later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.
No actual problems seen with this procedure recently, but if you look into
what the clear_page_dirty_for_io(page)+set_page_dirty(newpage) is actually
achieving, it turns out to be nothing more than moving the PageDirty flag,
and its NR_FILE_DIRTY stat from one zone to another.
It would be good to avoid a pile of irrelevant decrementations and
incrementations, and improper event counting, and unnecessary descent of
the radix_tree under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).
Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts still disabled in migrate_page_move_mapping(); and don't even
bother if the zone is the same. Do the PageDirty movement there under
tree_lock too, where old page is frozen and newpage not yet visible:
bearing in mind that as soon as newpage becomes visible in radix_tree, an
un-page-locked set_page_dirty() might interfere (or perhaps that's just
not possible: anything doing so should already hold an additional
reference to the old page, preventing its migration; but play safe).
But we do still need to transfer PageDirty in migrate_page_copy(), for
those who don't go the mapping route through migrate_page_move_mapping().
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.16: adjust context. This is not just an optimisation,
but turned out to fix a possible oops (CVE-2016-3070).]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 23c8a812dc3c621009e4f0e5342aa4e2ede1ceaa upstream.
This fixes CVE-2016-0758.
In the ASN.1 decoder, when the length field of an ASN.1 value is extracted,
it isn't validated against the remaining amount of data before being added
to the cursor. With a sufficiently large size indicated, the check:
datalen - dp < 2
may then fail due to integer overflow.
Fix this by checking the length indicated against the amount of remaining
data in both places a definite length is determined.
Whilst we're at it, make the following changes:
(1) Check that the maximum size of extended length does not exceed the capacity
of the variable it's being stored in (len) rather than the type that
variable is assumed to be (size_t).
(2) Compare the EOC tag to the symbolic constant ASN1_EOC rather than the
integer 0.
(3) To reduce confusion, move the initialisation of len outside of:
for (len = 0; n > 0; n--) {
since it doesn't have anything to do with the loop counter n.
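A sketch of the core bounds check added on the definite-length paths, using
the decoder's cursor (dp) and total size (datalen):

    /* definite short form: a single length octet */
    len = data[dp++];
    if (len > datalen - dp)     /* the value would overrun the buffer */
        goto data_overrun_error;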
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit cb984d101b30eb7478d32df56a0023e4603cba7f upstream.
As gcc major version numbers are going to advance rather rapidly in the
future, there's no real value in separate files for each compiler
version.
Deduplicate some of the macros #defined in each file too.
Neaten comments using normal kernel commenting style.
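The consolidation hinges on a single computed version number, so per-version
features become ordinary conditionals; a sketch of the compiler-gcc.h idiom:

#define GCC_VERSION (__GNUC__ * 10000       \
                     + __GNUC_MINOR__ * 100 \
                     + __GNUC_PATCHLEVEL__)

#if GCC_VERSION >= 40600    /* formerly lived in compiler-gcc4.h */
#define __visible __attribute__((externally_visible))
#endif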
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Alan Modra <amodra@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ philm: backport to 3.16-stable ]
Signed-off-by: Philip Müller <philm@manjaro.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit ec56b1f1fdc69599963574ce94cc5693d535dd64 upstream.
Lock ordering for the new mmap lock needs to be:
  mmap_sem
    sb_start_pagefault
      i_mmap_lock
        page lock
          <fault processing>
Right now xfs_vm_page_mkwrite gets this the wrong way around.
While technically it cannot deadlock due to the current freeze
ordering, it's still a landmine that might explode if we change
anything in future. Hence we need to nest the locks correctly.
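A sketch of the corrected nesting in xfs_vm_page_mkwrite(), following the
ordering above (abbreviated; error handling elided):

STATIC int
xfs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
{
    struct inode *inode = file_inode(vma->vm_file);
    int          ret;

    sb_start_pagefault(inode->i_sb);        /* freeze protection first */
    file_update_time(vma->vm_file);

    xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);   /* then i_mmap_lock */
    ret = block_page_mkwrite(vma, vmf, xfs_get_blocks);
    xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);

    sb_end_pagefault(inode->i_sb);
    return block_page_mkwrite_return(ret);
}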
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: xfs@oss.sgi.com
|
|
commit 723cac48473358939759885a18e8df113ea96138 upstream.
Extent swap operations are another extent manipulation operation
that we need to ensure does not race against mmap page faults. The
current code returns if the file is mapped prior to the swap being
done, but it could potentially race against new page faults while
the swap is in progress. Hence we should use the XFS_MMAPLOCK_EXCL
for this operation, too.
While there, fix the error path handling that can result in double
unlocks of the inodes when cancelling the swapext transaction.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16:
- The obsoleted check for mmap'd files was directly in xfs_swap_extents()
and used VN_MAPPED
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0f9160b444e4de33b65dfcd3b901358a3129461a upstream.
Now that truncate locks out new page faults, we no longer need to do
special writeback hacks in truncate to work around potential races
between page faults, page cache truncation and file size updates to
ensure we get write page faults for extending truncates on sub-page
block size filesystems. Hence we can remove the code in
xfs_setattr_size() that handles this and update the comments around
the code that handles page cache truncation and size updates to
reflect the new reality.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: we never had the previous hack, so just update the
comment]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e8e9ad42c1f1e1bfbe0e8c32c8cac02e9ebfb7ef upstream.
Now we have the i_mmap_lock being held across the page fault IO
path, we now add extent manipulation operation exclusion by adding
the lock to the paths that directly modify extent maps. This
includes truncate, hole punching and other fallocate based
operations. The operations will now take both the i_iolock and the
i_mmaplock in exclusive mode, thereby ensuring that all IO and page
faults block without holding any page locks while the extent
manipulation is in progress.
This gives us the lock order during truncate of i_iolock ->
i_mmaplock -> page_lock -> i_lock, hence providing the same
lock order as the iolock provides the normal IO path without
involving the mmap_sem.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16:
- We never need to break layouts, so take both i_iolock and i_mmaplock at the
same time
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: xfs@oss.sgi.com
|
|
commit 075a924d45cc69c75a35f20b4912b85aa98b180a upstream.
Take the i_mmaplock over write page faults. These come through the
->page_mkwrite callout, so we need to wrap those calls with the
i_mmaplock.
This gives us a lock order of mmap_sem -> i_mmaplock -> page_lock
-> i_lock.
Also, move the page_mkwrite wrapper to the same region of xfs_file.c
as the read fault wrappers and add a tracepoint.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: xfs@oss.sgi.com
|
|
commit de0e8c20ba3a65b0f15040aabbefdc1999876e6b upstream.
Take the i_mmaplock over read page faults. These come through the
->fault callout, so we need to wrap the generic implementation
with the i_mmaplock. While there, add tracepoints for the read
fault as it passes through XFS.
This gives us a lock order of mmap_sem -> i_mmaplock -> page_lock
-> i_lock.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: xfs@oss.sgi.com
|
|
commit 653c60b633a9019a54a80d64b5ed33ecb214823c upstream.
Right now we cannot serialise mmap against truncate or hole punch
sanely. ->page_mkwrite is not able to take locks that the read IO
path normally takes (i.e. the inode iolock) because that could
result in lock inversions (read - iolock - page fault - page_mkwrite
- iolock) and so we cannot use an IO path lock to serialise page
write faults against truncate operations.
Instead, introduce a new lock that is used *only* in the
->page_mkwrite path that is the equivalent of the iolock. The lock
ordering in a page fault is i_mmaplock -> page lock -> i_ilock,
and so in truncate we can i_iolock -> i_mmaplock and so lock out
new write faults during the process of truncation.
Because i_mmap_lock is outside the page lock, we can hold it across
all the same operations we hold the i_iolock for. The only
difference is that we never hold the i_mmaplock in the normal IO
path and so do not ever have the possibility that we can page fault
inside it. Hence there are no recursion issues on the i_mmap_lock
and so we can use it to serialise page fault IO against inode
modification operations that affect the IO path.
This patch introduces the i_mmaplock infrastructure, lockdep
annotations and initialisation/destruction code. Use of the new lock
will be in subsequent patches.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: xfs@oss.sgi.com
|
|
commit 812176832169c77b4bacddd01edc3e55340263fd upstream.
xfs_swap_extents() holds the ilock over a call to
filemap_write_and_wait(), which can then try to write data and take
the ilock. That causes a self-deadlock.
Fix the deadlock and clean up the code by separating the locking
appropriately. Add a lockflags variable to track what locks we are
holding as we gain and drop them and cleanup the error handling to
always use "out_unlock" with the lockflags variable.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|