summaryrefslogtreecommitdiff
path: root/drivers/xen/xenbus
AgeCommit message (Collapse)AuthorFilesLines
2021-02-23arm/xen: Don't probe xenbus as part of an early initcallJulien Grall2-2/+1
commit c4295ab0b485b8bc50d2264bcae2acd06f25caaf upstream. After Commit 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI"), xenbus_probe() will be called too early on Arm. This will recent to a guest hang during boot. If the hang wasn't there, we would have ended up to call xenbus_probe() twice (the second time is in xenbus_probe_initcall()). We don't need to initialize xenbus_probe() early for Arm guest. Therefore, the call in xen_guest_init() is now removed. After this change, there is no more external caller for xenbus_probe(). So the function is turned to a static one. Interestingly there were two prototypes for it. Cc: stable@vger.kernel.org Fixes: 3499ba8198cad ("xen: Fix event channel callback via INTX/GSI") Reported-by: Ian Jackson <iwj@xenproject.org> Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20210210170654.5377-1-julien@xen.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-04xen: Fix XenStore initialisation for XS_LOCALDavid Woodhouse1-0/+31
commit 5f46400f7a6a4fad635d5a79e2aa5a04a30ffea1 upstream. In commit 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI") I reworked the triggering of xenbus_probe(). I tried to simplify things by taking out the workqueue based startup triggered from wake_waiting(); the somewhat poorly named xenbus IRQ handler. I missed the fact that in the XS_LOCAL case (Dom0 starting its own xenstored or xenstore-stubdom, which happens after the kernel is booted completely), that IRQ-based trigger is still actually needed. So... put it back, except more cleanly. By just spawning a xenbus_probe thread which waits on xb_waitq and runs the probe the first time it gets woken, just as the workqueue-based hack did. This is actually a nicer approach for *all* the back ends with different interrupt methods, and we can switch them all over to that without the complex conditions for when to trigger it. But not in -rc6. This is the minimal fix for the regression, although it's a step in the right direction instead of doing a partial revert and actually putting the workqueue back. It's also simpler than the workqueue. Fixes: 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI") Reported-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/4c9af052a6e0f6485d1de43f2c38b1461996db99.camel@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com> Cc: Salvatore Bonaccorso <carnil@debian.org> Cc: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-30xen: Fix event channel callback via INTX/GSIDavid Woodhouse3-22/+68
[ Upstream commit 3499ba8198cad47b731792e5e56b9ec2a78a83a2 ] For a while, event channel notification via the PCI platform device has been broken, because we attempt to communicate with xenstore before we even have notifications working, with the xs_reset_watches() call in xs_init(). We tend to get away with this on Xen versions below 4.0 because we avoid calling xs_reset_watches() anyway, because xenstore might not cope with reading a non-existent key. And newer Xen *does* have the vector callback support, so we rarely fall back to INTX/GSI delivery. To fix it, clean up a bit of the mess of xs_init() and xenbus_probe() startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM case, deferring it to be called from xenbus_probe() in the XS_HVM case instead. Then fix up the invocation of xenbus_probe() to happen either from its device_initcall if the callback is available early enough, or when the callback is finally set up. This means that the hack of calling xenbus_probe() from a workqueue after the first interrupt, or directly from the PCI platform device setup, is no longer needed. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20210113132606.422794-2-dwmw2@infradead.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-29xenbus/xenbus_backend: Disallow pending watch messagesSeongJae Park1-0/+7
commit 9996bd494794a2fe393e97e7a982388c6249aa76 upstream. 'xenbus_backend' watches 'state' of devices, which is writable by guests. Hence, if guests intensively updates it, dom0 will have lots of pending events that exhausting memory of dom0. In other words, guests can trigger dom0 memory pressure. This is known as XSA-349. However, the watch callback of it, 'frontend_changed()', reads only 'state', so doesn't need to have the pending events. To avoid the problem, this commit disallows pending watch messages for 'xenbus_backend' using the 'will_handle()' watch callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29xen/xenbus: Count pending messages for each watchSeongJae Park1-11/+18
commit 3dc86ca6b4c8cfcba9da7996189d1b5a358a94fc upstream. This commit adds a counter of pending messages for each watch in the struct. It is used to skip unnecessary pending messages lookup in 'unregister_xenbus_watch()'. It could also be used in 'will_handle' callback. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29xen/xenbus/xen_bus_type: Support will_handle watch callbackSeongJae Park2-1/+4
commit be987200fbaceaef340872841d4f7af2c5ee8dc3 upstream. This commit adds support of the 'will_handle' watch callback for 'xen_bus_type' users. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29xen/xenbus: Add 'will_handle' callback support in xenbus_watch_path()SeongJae Park2-3/+8
commit 2e85d32b1c865bec703ce0c962221a5e955c52c2 upstream. Some code does not directly make 'xenbus_watch' object and call 'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This commit adds support of 'will_handle' callback in the 'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29xen/xenbus: Allow watches discard events before queueingSeongJae Park2-1/+5
commit fed1755b118147721f2c87b37b9d66e62c39b668 upstream. If handling logics of watch events are slower than the events enqueue logic and the events can be created from the guests, the guests could trigger memory pressure by intensively inducing the events, because it will create a huge number of pending events that exhausting the memory. Fortunately, some watch events could be ignored, depending on its handler callback. For example, if the callback has interest in only one single path, the watch wouldn't want multiple pending events. Or, some watches could ignore events to same path. To let such watches to volutarily help avoiding the memory pressure situation, this commit introduces new watch callback, 'will_handle'. If it is not NULL, it will be called for each new event just before enqueuing it. Then, if the callback returns false, the event will be discarded. No watch is using the callback for now, though. This is part of XSA-349 Cc: stable@vger.kernel.org Signed-off-by: SeongJae Park <sjpark@amazon.de> Reported-by: Michael Kurth <mku@amazon.de> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-09xen/xenbus: Fix granting of vmalloc'd memorySimon Leiner1-2/+8
[ Upstream commit d742db70033c745e410523e00522ee0cfe2aa416 ] On some architectures (like ARM), virt_to_gfn cannot be used for vmalloc'd memory because of its reliance on virt_to_phys. This patch introduces a check for vmalloc'd addresses and obtains the PFN using vmalloc_to_pfn in that case. Signed-off-by: Simon Leiner <simon@leiner.me> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Link: https://lore.kernel.org/r/20200825093153.35500-1-simon@leiner.me Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-02xen/xenbus: ensure xenbus_map_ring_valloc() returns proper grant statusJuergen Gross1-1/+8
[ Upstream commit 6b51fd3f65a22e3d1471b18a1d56247e246edd46 ] xenbus_map_ring_valloc() maps a ring page and returns the status of the used grant (0 meaning success). There are Xen hypervisors which might return the value 1 for the status of a failed grant mapping due to a bug. Some callers of xenbus_map_ring_valloc() test for errors by testing the returned status to be less than zero, resulting in no error detected and crashing later due to a not available ring page. Set the return value of xenbus_map_ring_valloc() to GNTST_general_error in case the grant status reported by Xen is greater than zero. This is part of XSA-316. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wl@xen.org> Link: https://lore.kernel.org/r/20200326080358.1018-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-02xenbus: req->err should be updated before req->stateDongli Zhang1-0/+2
[ Upstream commit 8130b9d5b5abf26f9927b487c15319a187775f34 ] This patch adds the barrier to guarantee that req->err is always updated before req->state. Otherwise, read_reply() would not return ERR_PTR(req->err) but req->body, when process_writes()->xb_write() is failed. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20200303221423.21962-2-dongli.zhang@oracle.com Reviewed-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-02xenbus: req->body should be updated before req->stateDongli Zhang2-3/+8
[ Upstream commit 1b6a51e86cce38cf4d48ce9c242120283ae2f603 ] The req->body should be updated before req->state is updated and the order should be guaranteed by a barrier. Otherwise, read_reply() might return req->body = NULL. Below is sample callstack when the issue is reproduced on purpose by reordering the updates of req->body and req->state and adding delay in code between updates of req->state and req->body. [ 22.356105] general protection fault: 0000 [#1] SMP PTI [ 22.361185] CPU: 2 PID: 52 Comm: xenwatch Not tainted 5.5.0xen+ #6 [ 22.366727] Hardware name: Xen HVM domU, BIOS ... [ 22.372245] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 ... ... [ 22.392163] RSP: 0018:ffffb2d64023fdf0 EFLAGS: 00010246 [ 22.395933] RAX: 0000000000000000 RBX: 75746e7562755f6d RCX: 0000000000000000 [ 22.400871] RDX: 0000000000000000 RSI: ffffb2d64023fdfc RDI: 75746e7562755f6d [ 22.405874] RBP: 0000000000000000 R08: 00000000000001e8 R09: 0000000000cdcdcd [ 22.410945] R10: ffffb2d6402ffe00 R11: ffff9d95395eaeb0 R12: ffff9d9535935000 [ 22.417613] R13: ffff9d9526d4a000 R14: ffff9d9526f4f340 R15: ffff9d9537654000 [ 22.423726] FS: 0000000000000000(0000) GS:ffff9d953bc80000(0000) knlGS:0000000000000000 [ 22.429898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.434342] CR2: 000000c4206a9000 CR3: 00000001ea3fc002 CR4: 00000000001606e0 [ 22.439645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 22.444941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 22.450342] Call Trace: [ 22.452509] simple_strtoull+0x27/0x70 [ 22.455572] xenbus_transaction_start+0x31/0x50 [ 22.459104] netback_changed+0x76c/0xcc1 [xen_netfront] [ 22.463279] ? find_watch+0x40/0x40 [ 22.466156] xenwatch_thread+0xb4/0x150 [ 22.469309] ? wait_woken+0x80/0x80 [ 22.472198] kthread+0x10e/0x130 [ 22.474925] ? kthread_park+0x80/0x80 [ 22.477946] ret_from_fork+0x35/0x40 [ 22.480968] Modules linked in: xen_kbdfront xen_fbfront(+) xen_netfront xen_blkfront [ 22.486783] ---[ end trace a9222030a747c3f7 ]--- [ 22.490424] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 The virt_rmb() is added in the 'true' path of test_reply(). The "while" is changed to "do while" so that test_reply() is used as a read memory barrier. Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Link: https://lore.kernel.org/r/20200303221423.21962-1-dongli.zhang@oracle.com Reviewed-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11xen/xenbus: fix self-deadlock after killing user processJuergen Gross1-2/+18
commit a8fabb38525c51a094607768bac3ba46b3f4a9d5 upstream. In case a user process using xenbus has open transactions and is killed e.g. via ctrl-C the following cleanup of the allocated resources might result in a deadlock due to trying to end a transaction in the xenbus worker thread: [ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds. [ 2551.492215] Tainted: P OE 5.0.0-29-generic #5 [ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 2551.528585] xenbus D 0 37 2 0x80000080 [ 2551.528590] Call Trace: [ 2551.528603] __schedule+0x2c0/0x870 [ 2551.528606] ? _cond_resched+0x19/0x40 [ 2551.528632] schedule+0x2c/0x70 [ 2551.528637] xs_talkv+0x1ec/0x2b0 [ 2551.528642] ? wait_woken+0x80/0x80 [ 2551.528645] xs_single+0x53/0x80 [ 2551.528648] xenbus_transaction_end+0x3b/0x70 [ 2551.528651] xenbus_file_free+0x5a/0x160 [ 2551.528654] xenbus_dev_queue_reply+0xc4/0x220 [ 2551.528657] xenbus_thread+0x7de/0x880 [ 2551.528660] ? wait_woken+0x80/0x80 [ 2551.528665] kthread+0x121/0x140 [ 2551.528667] ? xb_read+0x1d0/0x1d0 [ 2551.528670] ? kthread_park+0x90/0x90 [ 2551.528673] ret_from_fork+0x35/0x40 Fix this by doing the cleanup via a workqueue instead. Reported-by: James Dingwall <james@dingwall.me.uk> Fixes: fd8aa9095a95c ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Cc: <stable@vger.kernel.org> # 4.11 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11fs: stream_open - opener for stream-like files so that read and write can ↵Kirill Smelkov1-3/+1
run simultaneously without deadlock commit 10dce8af34226d90fa56746a934f8da5dcdba3df upstream. Commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") added locking for file.f_pos access and in particular made concurrent read and write not possible - now both those functions take f_pos lock for the whole run, and so if e.g. a read is blocked waiting for data, write will deadlock waiting for that read to complete. This caused regression for stream-like files where previously read and write could run simultaneously, but after that patch could not do so anymore. See e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") which fixes such regression for particular case of /proc/xen/xenbus. The patch that added f_pos lock in 2014 did so to guarantee POSIX thread safety for read/write/lseek and added the locking to file descriptors of all regular files. In 2014 that thread-safety problem was not new as it was already discussed earlier in 2006. However even though 2006'th version of Linus's patch was adding f_pos locking "only for files that are marked seekable with FMODE_LSEEK (thus avoiding the stream-like objects like pipes and sockets)", the 2014 version - the one that actually made it into the tree as 9c225f2655e3 - is doing so irregardless of whether a file is seekable or not. See https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/ https://lwn.net/Articles/180387 https://lwn.net/Articles/180396 for historic context. The reason that it did so is, probably, that there are many files that are marked non-seekable, but e.g. their read implementation actually depends on knowing current position to correctly handle the read. Some examples: kernel/power/user.c snapshot_read fs/debugfs/file.c u32_array_read fs/fuse/control.c fuse_conn_waiting_read + ... drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read arch/s390/hypfs/inode.c hypfs_read_iter ... Despite that, many nonseekable_open users implement read and write with pure stream semantics - they don't depend on passed ppos at all. And for those cases where read could wait for something inside, it creates a situation similar to xenbus - the write could be never made to go until read is done, and read is waiting for some, potentially external, event, for potentially unbounded time -> deadlock. Besides xenbus, there are 14 such places in the kernel that I've found with semantic patch (see below): drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write() drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write() drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write() drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write() net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write() drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write() drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write() drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write() net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write() drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write() drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write() drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write() drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write() drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write() In addition to the cases above another regression caused by f_pos locking is that now FUSE filesystems that implement open with FOPEN_NONSEEKABLE flag, can no longer implement bidirectional stream-like files - for the same reason as above e.g. read can deadlock write locking on file.f_pos in the kernel. FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f715 ("fuse: implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and write routines not depending on current position at all, and with both read and write being potentially blocking operations: See https://github.com/libfuse/osspd https://lwn.net/Articles/308445 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477 https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510 Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as "somewhat pipe-like files ..." with read handler not using offset. However that test implements only read without write and cannot exercise the deadlock scenario: https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163 https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216 I've actually hit the read vs write deadlock for real while implementing my FUSE filesystem where there is /head/watch file, for which open creates separate bidirectional socket-like stream in between filesystem and its user with both read and write being later performed simultaneously. And there it is semantically not easy to split the stream into two separate read-only and write-only channels: https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169 Let's fix this regression. The plan is: 1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS - doing so would break many in-kernel nonseekable_open users which actually use ppos in read/write handlers. 2. Add stream_open() to kernel to open stream-like non-seekable file descriptors. Read and write on such file descriptors would never use nor change ppos. And with that property on stream-like files read and write will be running without taking f_pos lock - i.e. read and write could be running simultaneously. 3. With semantic patch search and convert to stream_open all in-kernel nonseekable_open users for which read and write actually do not depend on ppos and where there is no other methods in file_operations which assume @offset access. 4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via steam_open if that bit is present in filesystem open reply. It was tempting to change fs/fuse/ open handler to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481 so if we would do such a change it will break a real user. 5. Add stream_open and FOPEN_STREAM handling to stable kernels starting from v3.14+ (the kernel where 9c225f2655 first appeared). This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in their open handler and this way avoid the deadlock on all kernel versions. This should work because fs/fuse/ ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock. This patch adds stream_open, converts /proc/xen/xenbus to it and adds semantic patch to automatically locate in-kernel places that are either required to be converted due to read vs write deadlock, or that are just safe to be converted because read and write do not use ppos and there are no other funky methods in file_operations. Regarding semantic patch I've verified each generated change manually - that it is correct to convert - and each other nonseekable_open instance left - that it is either not correct to convert there, or that it is not converted due to current stream_open.cocci limitations. The script also does not convert files that should be valid to convert, but that currently have .llseek = noop_llseek or generic_file_llseek for unknown reason despite file being opened with nonseekable_open (e.g. drivers/input/mousedev.c) Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Yongzhi Pan <panyongzhi@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Tejun Heo <tj@kernel.org> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Nikolaus Rath <Nikolaus@rath.org> Cc: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-20xen: xenbus_dev_frontend: Really return response stringSimon Gaiser1-1/+2
[ Upstream commit ebf04f331fa15a966262341a7dc6b1a0efd633e4 ] xenbus_command_reply() did not actually copy the response string and leaked stack content instead. Fixes: 9a6161fe73bd ("xen: return xenstore command failures via response instead of rc") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30xen: xenbus: use put_device() instead of kfree()Arvind Yadav1-1/+4
[ Upstream commit 351b2bccede1cb673ec7957b35ea997ea24c8884 ] Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-19xen: xenbus_dev_frontend: Fix XS_TRANSACTION_END handlingSimon Gaiser1-1/+1
commit 2a22ee6c3ab1d761bc9c04f1e4117edd55b82f09 upstream. Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") made a subtle change to the semantic of xenbus_dev_request_and_reply() and xenbus_transaction_end(). Before on an error response to XS_TRANSACTION_END xenbus_dev_request_and_reply() would not decrement the active transaction counter. But xenbus_transaction_end() has always counted the transaction as finished regardless of the response. The new behavior is that xenbus_dev_request_and_reply() and xenbus_transaction_end() will always count the transaction as finished regardless the response code (handled in xs_request_exit()). But xenbus_dev_frontend tries to end a transaction on closing of the device if the XS_TRANSACTION_END failed before. Trying to close the transaction twice corrupts the reference count. So fix this by also considering a transaction closed if we have sent XS_TRANSACTION_END once regardless of the return code. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22xenbus: track caller request idJoao Martins3-0/+5
commit 29fee6eed2811ff1089b30fc579a2d19d78016ab upstream. Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") optimized xenbus concurrent accesses but in doing so broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in charge of xenbus message exchange with the correct header and body. Now, after the mentioned commit the replies received by application will no longer have the header req_id echoed back as it was on request (see specification below for reference), because that particular field is being overwritten by kernel. struct xsd_sockmsg { uint32_t type; /* XS_??? */ uint32_t req_id;/* Request identifier, echoed in daemon's response. */ uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */ uint32_t len; /* Length of data following this. */ /* Generally followed by nul-terminated string(s). */ }; Before there was only one request at a time so req_id could simply be forwarded back and forth. To allow simultaneous requests we need a different req_id for each message thus kernel keeps a monotonic increasing counter for this field and is written on every request irrespective of userspace value. Forwarding again the req_id on userspace requests is not a solution because we would open the possibility of userspace-generated req_id colliding with kernel ones. So this patch instead takes another route which is to artificially keep user req_id while keeping the xenbus logic as is. We do that by saving the original req_id before xs_send(), use the private kernel counter as req_id and then once reply comes and was validated, we restore back the original req_id. Cc: <stable@vger.kernel.org> # 4.11 Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses") Reported-by: Bhavesh Davda <bhavesh.davda@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman2-0/+2
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-18xen: don't compile pv-specific parts if XEN_PV isn't configuredJuergen Gross1-63/+67
xenbus_client.c contains some functions specific for pv guests. Enclose them with #ifdef CONFIG_XEN_PV to avoid compiling them when they are not needed (e.g. on ARM). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-11xen: avoid deadlock in xenbusJuergen Gross1-1/+2
When starting the xenwatch thread a theoretical deadlock situation is possible: xs_init() contains: task = kthread_run(xenwatch_thread, NULL, "xenwatch"); if (IS_ERR(task)) return PTR_ERR(task); xenwatch_pid = task->pid; And xenwatch_thread() does: mutex_lock(&xenwatch_mutex); ... event->handle->callback(); ... mutex_unlock(&xenwatch_mutex); The callback could call unregister_xenbus_watch() which does: ... if (current->pid != xenwatch_pid) mutex_lock(&xenwatch_mutex); ... In case a watch is firing before xenwatch_pid could be set and the callback of that watch unregisters a watch, then a self-deadlock would occur. Avoid this by setting xenwatch_pid in xenwatch_thread(). Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-25xen: avoid deadlock in xenbus driverJuergen Gross1-11/+10
There has been a report about a deadlock in the xenbus driver: [ 247.979498] ====================================================== [ 247.985688] WARNING: possible circular locking dependency detected [ 247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted [ 247.997040] ------------------------------------------------------ [ 248.003232] xenbus/91 is trying to acquire lock: [ 248.007875] (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>] xenbus_dev_queue_reply+0x3c/0x230 [ 248.017163] [ 248.017163] but task is already holding lock: [ 248.023096] (xb_write_mutex){+.+...}, at: [<ffff00000863a940>] xenbus_thread+0x5f0/0x798 [ 248.031267] [ 248.031267] which lock already depends on the new lock. [ 248.031267] [ 248.039615] [ 248.039615] the existing dependency chain (in reverse order) is: [ 248.047176] [ 248.047176] -> #1 (xb_write_mutex){+.+...}: [ 248.052943] __lock_acquire+0x1728/0x1778 [ 248.057498] lock_acquire+0xc4/0x288 [ 248.061630] __mutex_lock+0x84/0x868 [ 248.065755] mutex_lock_nested+0x3c/0x50 [ 248.070227] xs_send+0x164/0x1f8 [ 248.074015] xenbus_dev_request_and_reply+0x6c/0x88 [ 248.079427] xenbus_file_write+0x260/0x420 [ 248.084073] __vfs_write+0x48/0x138 [ 248.088113] vfs_write+0xa8/0x1b8 [ 248.091983] SyS_write+0x54/0xb0 [ 248.095768] el0_svc_naked+0x24/0x28 [ 248.099897] [ 248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}: [ 248.106088] print_circular_bug+0x80/0x2e0 [ 248.110730] __lock_acquire+0x1768/0x1778 [ 248.115288] lock_acquire+0xc4/0x288 [ 248.119417] __mutex_lock+0x84/0x868 [ 248.123545] mutex_lock_nested+0x3c/0x50 [ 248.128016] xenbus_dev_queue_reply+0x3c/0x230 [ 248.133005] xenbus_thread+0x788/0x798 [ 248.137306] kthread+0x110/0x140 [ 248.141087] ret_from_fork+0x10/0x40 It is rather easy to avoid by dropping xb_write_mutex before calling xenbus_dev_queue_reply(). Fixes: fd8aa9095a95c02dcc35540a263267c29b8fda9d ("xen: optimize xenbus driver for multiple concurrent xenstore accesses"). Cc: <stable@vger.kernel.org> # 4.11 Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2017-04-04xenbus: remove transaction holder from list before freeingJan Beulich1-1/+3
After allocation the item is being placed on the list right away. Consequently it needs to be taken off the list before freeing in the case xenbus_dev_request_and_reply() failed, as in that case the callback (xenbus_dev_queue_reply()) is not being called (and if it was called, it should do both). Fixes: 5584ea250ae44f929feb4c7bd3877d1c5edbf813 Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-02-27xenbus: Remove duplicate inclusion of linux/init.hMasanari Iida1-1/+0
This patch remove duplicate inclusion of linux/init.h in xenbus_dev_frontend.c. Confirm successfully compile after remove the line. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com>
2017-02-09xen: optimize xenbus driver for multiple concurrent xenstore accessesJuergen Gross4-391/+672
Handling of multiple concurrent Xenstore accesses through xenbus driver either from the kernel or user land is rather lame today: xenbus is capable to have one access active only at one point of time. Rewrite xenbus to handle multiple requests concurrently by making use of the request id of the Xenstore protocol. This requires to: - Instead of blocking inside xb_read() when trying to read data from the xenstore ring buffer do so only in the main loop of xenbus_thread(). - Instead of doing writes to the xenstore ring buffer in the context of the caller just queue the request and do the write in the dedicated xenbus thread. - Instead of just forwarding the request id specified by the caller of xenbus to xenstore use a xenbus internal unique request id. This will allow multiple outstanding requests. - Modify the locking scheme in order to allow multiple requests being active in parallel. - Instead of waiting for the reply of a user's xenstore request after writing the request to the xenstore ring buffer return directly to the caller and do the waiting in the read path. Additionally signal handling was optimized by avoiding waking up the xenbus thread or sending an event to Xenstore in case the addressed entity is known to be running already. As a result communication with Xenstore is sped up by a factor of up to 5: depending on the request type (read or write) and the amount of data transferred the gain was at least 20% (small reads) and went up to a factor of 5 for large writes. In the end some more rough edges of xenbus have been smoothed: - Handling of memory shortage when reading from xenstore ring buffer in the xenbus driver was not optimal: it was busy looping and issuing a warning in each loop. - In case of xenstore not running in dom0 but in a stubdom we end up with two xenbus threads running as the initialization of xenbus in dom0 expecting a local xenstored will be redone later when connecting to the xenstore domain. Up to now this was no problem as locking would prevent the two xenbus threads interfering with each other, but this was just a waste of kernel resources. - An out of memory situation while writing to or reading from the xenstore ring buffer no longer will lead to a possible loss of synchronization with xenstore. - The user read and write part are now interruptible by signals. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-02-09xen: modify xenstore watch event interfaceJuergen Gross7-51/+42
Today a Xenstore watch event is delivered via a callback function declared as: void (*callback)(struct xenbus_watch *, const char **vec, unsigned int len); As all watch events only ever come with two parameters (path and token) changing the prototype to: void (*callback)(struct xenbus_watch *, const char *path, const char *token); is the natural thing to do. Apply this change and adapt all users. Cc: konrad.wilk@oracle.com Cc: roger.pau@citrix.com Cc: wei.liu2@citrix.com Cc: paul.durrant@citrix.com Cc: netdev@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-02-09xen: clean up xenbus internal headersJuergen Gross10-91/+45
The xenbus driver has an awful mixture of internally and globally visible headers: some of the internally used only stuff is defined in the global header include/xen/xenbus.h while some stuff defined in internal headers is used by other drivers, too. Clean this up by moving the externally used symbols to include/xen/xenbus.h and the symbols used internally only to a new header drivers/xen/xenbus/xenbus.h replacing xenbus_comms.h and xenbus_probe.h Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-02-08xenbus: Neaten xenbus_va_dev_errorJoe Perches1-29/+10
This function error patch can be simplified, so do so. Remove fail: label and somewhat obfuscating, used once "error_path" function. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2016-12-23xen: remove stale xs_input_avail() from headerJuergen Gross1-1/+0
In drivers/xen/xenbus/xenbus_comms.h there is a stale declaration of xs_input_avail(). Remove it. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-23xen: return xenstore command failures via response instead of rcJuergen Gross1-20/+27
When the xenbus driver does some special handling for a Xenstore command any error condition related to the command should be returned via an error response instead of letting the related write operation fail. Otherwise the user land handler might take wrong decisions assuming the connection to Xenstore is broken. While at it try to return the same error values xenstored would return for those cases. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-23xen: xenbus driver must not accept invalid transaction idsJuergen Gross1-1/+1
When accessing Xenstore in a transaction the user is specifying a transaction id which he normally obtained from Xenstore when starting the transaction. Xenstore is validating a transaction id against all known transaction ids of the connection the request came in. As all requests of a domain not being the one where Xenstore lives share one connection, validation of transaction ids of different users of Xenstore in that domain should be done by the kernel of that domain being the multiplexer between the Xenstore users in that domain and Xenstore. In order to prohibit one Xenstore user "hijacking" a transaction from another user the xenbus driver has to verify a given transaction id against all known transaction ids of the user before forwarding it to Xenstore. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-12xenbus: fix deadlock on writes to /proc/xen/xenbusDavid Vrabel1-0/+2
/proc/xen/xenbus does not work correctly. A read blocked waiting for a xenstore message holds the mutex needed for atomic file position updates. This blocks any writes on the same file handle, which can deadlock if the write is needed to unblock the read. Clear FMODE_ATOMIC_POS when opening this device to always get character device like sematics. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-12-08xen: xenbus: set error code on failurePan Bian1-1/+1
Variable err is initialized with 0. As a result, the return value may be 0 even if get_zeroed_page() fails to allocate memory. This patch fixes the bug, initializing err with "-ENOMEM". Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-11-17xenfs: Use proc_create_mount_point() to create /proc/xenSeth Forshee1-1/+1
Mounting proc in user namespace containers fails if the xenbus filesystem is mounted on /proc/xen because this directory fails the "permanently empty" test. proc_create_mount_point() exists specifically to create such mountpoints in proc but is currently proc-internal. Export this interface to modules, then use it in xenbus when creating /proc/xen. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-11-07xen: make use of xenbus_read_unsigned() in xenbusJuergen Gross2-11/+4
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible. This requires to change the type of the reads from int to unsigned, but these cases have been wrong before: negative values are not allowed for the modified cases. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
2016-11-07xen: introduce xenbus_read_unsigned()Juergen Gross1-0/+15
There are multiple instances of code reading an optional unsigned parameter from Xenstore via xenbus_scanf(). Instead of repeating the same code over and over add a service function doing the job. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
2016-10-25Merge tag 'for-linus-4.9-rc2-tag' of ↵Linus Torvalds2-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from David Vrabel: - advertise control feature flags in xenstore - fix x86 build when XEN_PVHVM is disabled * tag 'for-linus-4.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: check return value of xenbus_scanf() xenbus: prefer list_for_each() x86: xen: move cpu_up functions out of ifdef xenbus: advertise control feature flags
2016-10-24xenbus: check return value of xenbus_scanf()Jan Beulich1-1/+3
Don't ignore errors here: Set backend state to unknown when unsuccessful. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-10-24xenbus: prefer list_for_each()Jan Beulich1-2/+2
This is more efficient than list_for_each_safe() when list modification is accompanied by breaking out of the loop. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24xenbus: don't look up transaction IDs for ordinary writesJan Beulich1-1/+1
This should really only be done for XS_TRANSACTION_END messages, or else at least some of the xenstore-* tools don't work anymore. Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-27Merge tag 'for-linus-4.8-rc0-tag' of ↵Linus Torvalds1-14/+1
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.8-rc0: - ACPI support for guests on ARM platforms. - Generic steal time support for arm and x86. - Support cases where kernel cpu is not Xen VCPU number (e.g., if in-guest kexec is used). - Use the system workqueue instead of a custom workqueue in various places" * tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits) xen: add static initialization of steal_clock op to xen_time_ops xen/pvhvm: run xen_vcpu_setup() for the boot CPU xen/evtchn: use xen_vcpu_id mapping xen/events: fifo: use xen_vcpu_id mapping xen/events: use xen_vcpu_id mapping in events_base x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op xen: introduce xen_vcpu_id mapping x86/acpi: store ACPI ids from MADT for future usage x86/xen: update cpuid.h from Xen-4.7 xen/evtchn: add IOCTL_EVTCHN_RESTRICT xen-blkback: really don't leak mode property xen-blkback: constify instance of "struct attribute_group" xen-blkfront: prefer xenbus_scanf() over xenbus_gather() xen-blkback: prefer xenbus_scanf() over xenbus_gather() xen: support runqueue steal time on xen arm/xen: add support for vm_assist hypercall xen: update xen headers xen-pciback: drop superfluous variables xen-pciback: short-circuit read path used for merging write values ...
2016-07-08xenbus: simplify xenbus_dev_request_and_reply()Jan Beulich1-4/+3
No need to retain a local copy of the full request message, only the type is really needed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-08xenbus: don't bail early from xenbus_dev_request_and_reply()Jan Beulich1-3/+0
xenbus_dev_request_and_reply() needs to track whether a transaction is open. For XS_TRANSACTION_START messages it calls transaction_start() and for XS_TRANSACTION_END messages it calls transaction_end(). If sending an XS_TRANSACTION_START message fails or responds with an an error, the transaction is not open and transaction_end() must be called. If sending an XS_TRANSACTION_END message fails, the transaction is still open, but if an error response is returned the transaction is closed. Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart") introduced a regression where failed XS_TRANSACTION_START messages were leaving the transaction open. This can cause problems with suspend (and migration) as all transactions must be closed before suspending. It appears that the problematic change was added accidentally, so just remove it. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-07xenbus: don't BUG() on user mode induced conditionJan Beulich1-6/+8
Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-06xen: xenbus: Remove create_workqueueBhaktipriya Shridhar1-14/+1
System workqueues have been able to handle high level of concurrency for a long time now and there's no reason to use dedicated workqueues just to gain concurrency. Replace dedicated xenbus_frontend_wq with the use of system_wq. Unlike a dedicated per-cpu workqueue created with create_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantees unless the target CPU is explicitly specified and the increase of local concurrency shouldn't make any difference. In this case, there is only a single work item, increase of concurrency level by switching to system_wq should not make any difference. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-03-22Merge tag 'for-linus-4.6-rc0-tag' of ↵Linus Torvalds3-23/+4
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.6: - Make earlyprintk=xen work for HVM guests - Remove module support for things never built as modules" * tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: drivers/xen: make platform-pci.c explicitly non-modular drivers/xen: make sys-hypervisor.c explicitly non-modular drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular drivers/xen: make [xen-]ballon explicitly non-modular xen: audit usages of module.h ; remove unnecessary instances xen/x86: Drop mode-selecting ifdefs in startup_xen() xen/x86: Zero out .bss for PV guests hvc_xen: make early_printk work with HVM guests hvc_xen: fix xenboot for DomUs hvc_xen: add earlycon support
2016-03-21drivers/xen: make xenbus_dev_[front/back]end explicitly non-modularPaul Gortmaker2-22/+4
The Makefile / Kconfig currently controlling compilation here is: obj-y += xenbus_dev_frontend.o [...] obj-$(CONFIG_XEN_BACKEND) += xenbus_dev_backend.o ...with: drivers/xen/Kconfig:config XEN_BACKEND drivers/xen/Kconfig: bool "Backend driver support" ...meaning that they currently are not being built as modules by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-03-21xen: audit usages of module.h ; remove unnecessary instancesPaul Gortmaker1-1/+0
Code that uses no modular facilities whatsoever should not be sourcing module.h at all, since that header drags in a bunch of other headers with it. Similarly, code that is not explicitly using modular facilities like module_init() but only is declaring module_param setup variables should be using moduleparam.h and not the larger module.h file for that. In making this change, we also uncover an implicit use of BUG() in inline fcns within arch/arm/include/asm/xen/hypercall.h so we explicitly source <linux/bug.h> for that file now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-02-23Merge tag 'for-linus-4.5-rc5-tag' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Two scsiback fixes (resource leak and spurious warning). - Fix DMA mapping of compound pages on arm/arm64. - Fix some pciback regressions in MSI-X handling. - Fix a pcifront crash due to some uninitialize state. * tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. xen/pcifront: Report the errors better. xen/pciback: Save the number of MSI-X entries to be copied later. xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY xen: fix potential integer overflow in queue_reply xen/arm: correctly handle DMA mapping of compound pages xen/scsiback: avoid warnings when adding multiple LUNs to a domain xen/scsiback: correct frontend counting
2016-02-15xen: fix potential integer overflow in queue_replyInsu Yun1-0/+2
When len is greater than UINT_MAX - sizeof(*rb), in next allocation, it can overflow integer range and allocates small size of heap. After that, memcpy will overflow the allocated heap. Therefore, it needs to check the size of given length. Signed-off-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>