summaryrefslogtreecommitdiff
path: root/arch/x86/xen/suspend.c
AgeCommit message (Collapse)AuthorFilesLines
2024-07-11x86/xen: eliminate some private header filesJuergen Gross1-2/+0
Under arch/x86/xen there is one large private header file xen-ops.h containing most of the Xen-private x86 related declarations, and then there are several small headers with a handful of declarations each. Merge the small headers into xen-ops.h. While doing that, move the declaration of xen_fifo_events from xen-ops.h into include/xen/events.h where it should have been from the beginning. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Message-ID: <20240710093718.14552-3-jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-02-28x86/xen: Zero MSR_IA32_SPEC_CTRL before suspendJuergen Gross1-0/+16
Older Xen versions (4.5 and before) might have problems migrating pv guests with MSR_IA32_SPEC_CTRL having a non-zero value. So before suspending zero that MSR and restore it after being resumed. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Beulich <jbeulich@suse.com> Cc: stable@vger.kernel.org Cc: xen-devel@lists.xenproject.org Cc: boris.ostrovsky@oracle.com Link: https://lkml.kernel.org/r/20180226140818.4849-1-jgross@suse.com
2017-11-17Merge tag 'for-linus-4.15-rc1-tag' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "Xen features and fixes for v4.15-rc1 Apart from several small fixes it contains the following features: - a series by Joao Martins to add vdso support of the pv clock interface - a series by Juergen Gross to add support for Xen pv guests to be able to run on 5 level paging hosts - a series by Stefano Stabellini adding the Xen pvcalls frontend driver using a paravirtualized socket interface" * tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (34 commits) xen/pvcalls: fix potential endless loop in pvcalls-front.c xen/pvcalls: Add MODULE_LICENSE() MAINTAINERS: xen, kvm: track pvclock-abi.h changes x86/xen/time: setup vcpu 0 time info page x86/xen/time: set pvclock flags on xen_time_init() x86/pvclock: add setter for pvclock_pvti_cpu0_va ptp_kvm: probe for kvm guest availability xen/privcmd: remove unused variable pageidx xen: select grant interface version xen: update arch/x86/include/asm/xen/cpuid.h xen: add grant interface version dependent constants to gnttab_ops xen: limit grant v2 interface to the v1 functionality xen: re-introduce support for grant v2 interface xen: support priv-mapping in an HVM tools domain xen/pvcalls: remove redundant check for irq >= 0 xen/pvcalls: fix unsigned less than zero error check xen/time: Return -ENODEV from xen_get_wallclock() xen/pvcalls-front: mark expected switch fall-through xen: xenbus_probe_frontend: mark expected switch fall-throughs xen/time: do not decrease steal time after live migration on xen ...
2017-11-09x86/xen/time: setup vcpu 0 time info pageJoao Martins1-0/+4
In order to support pvclock vdso on xen we need to setup the time info page for vcpu 0 and register the page with Xen using the VCPUOP_register_vcpu_time_memory_area hypercall. This hypercall will also forcefully update the pvti which will set some of the necessary flags for vdso. Afterwards we check if it supports the PVCLOCK_TSC_STABLE_BIT flag which is mandatory for having vdso/vsyscall support. And if so, it will set the cpu 0 pvti that will be later on used when mapping the vdso image. The xen headers are also updated to include the new hypercall for registering the secondary vcpu_time_info struct. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-02x86/xen: split suspend.c for PV and PVHVM guestsVitaly Kuznetsov1-54/+0
Slit the code in suspend.c into suspend_pv.c and suspend_hvm.c. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2016-01-13Merge tag 'for-linus-4.5-rc0-tag' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Xen features and fixes for 4.5-rc0: - Stolen ticks and PV wallclock support for arm/arm64 - Add grant copy ioctl to gntdev device" * tag 'for-linus-4.5-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/gntdev: add ioctl for grant copy x86/xen: don't reset vcpu_info on a cancelled suspend xen/gntdev: constify mmu_notifier_ops structures xen/grant-table: constify gnttab_ops structure xen/time: use READ_ONCE xen/x86: convert remaining timespec to timespec64 in xen_pvclock_gtod_notify xen/x86: support XENPF_settime64 xen/arm: set the system time in Xen via the XENPF_settime64 hypercall xen/arm: introduce xen_read_wallclock arm: extend pvclock_wall_clock with sec_hi xen: introduce XENPF_settime64 xen/arm: introduce HYPERVISOR_platform_op on arm and arm64 xen: rename dom0_op to platform_op xen/arm: account for stolen ticks arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops missing include asm/paravirt.h in cputime.c xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c
2016-01-04x86/xen: don't reset vcpu_info on a cancelled suspendOuyang Zhaowei (Charles)1-1/+2
On a cancelled suspend the vcpu_info location does not change (it's still in the per-cpu area registered by xen_vcpu_setup()). So do not call xen_hvm_init_shared_info() which would make the kernel think its back in the shared info. With the wrong vcpu_info, events cannot be received and the domain will hang after a cancelled suspend. Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-12-30arch/x86/xen/suspend.c: include xen/xen.hAndrew Morton1-0/+1
Fix the build warning: arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend': arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration] if (xen_pv_domain()) ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-02xen: Resume PMU from non-atomic contextBoris Ostrovsky1-10/+10
Resuming PMU currently triggers a warning from ___might_sleep() (assuming CONFIG_DEBUG_ATOMIC_SLEEP is set) when xen_pmu_init() allocates GFP_KERNEL page because we are in state resembling atomic context. Move resuming PMU to xen_arch_resume() which is called in regular context. For symmetry move suspending PMU to xen_arch_suspend() as well. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> # 4.3 Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-20xen/PMU: Initialization code for Xen PMUBoris Ostrovsky1-6/+17
Map shared data structure that will hold CPU registers, VPMU context, V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills this information in its handler and passes it to the guest for further processing. Set up PMU VIRQ. Now that perf infrastructure will assume that PMU is available on a PV guest we need to be careful and make sure that accesses via RDPMC instruction don't cause fatal traps by the hypervisor. Provide a nop RDPMC handler. For the same reason avoid issuing a warning on a write to APIC's LVTPC. Both of these will be made functional in later patches. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-04-29xen: Suspend ticks on all CPUs during suspendBoris Ostrovsky1-0/+10
Commit 77e32c89a711 ("clockevents: Manage device's state separately for the core") decouples clockevent device's modes from states. With this change when a Xen guest tries to resume, it won't be calling its set_mode op which needs to be done on each VCPU in order to make the hypervisor aware that we are in oneshot mode. This happens because clockevents_tick_resume() (which is an intermediate step of resuming ticks on a processor) doesn't call clockevents_set_state() anymore and because during suspend clockevent devices on all VCPUs (except for the one doing the suspend) are left in ONESHOT state. As result, during resume the clockevents state machine will assume that device is already where it should be and doesn't need to be updated. To avoid this problem we should suspend ticks on all VCPUs during suspend. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-04-01tick/xen: Provide and use tick_suspend_local() and tick_resume_local()Thomas Gleixner1-1/+1
Xen calls on every cpu into tick_resume() which is just wrong. tick_resume() is for the syscore global suspend/resume invocation. What XEN really wants is a per cpu local resume function. Provide a tick_resume_local() function and use it in XEN. Also provide a complementary tick_suspend_local() and modify tick_unfreeze() and tick_freeze(), respectively, to use the new local tick resume/suspend functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Combined two patches, rebased, modified subject/changelog. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan [ Merged to latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01clockevents: Make suspend/resume calls explicitThomas Gleixner1-7/+4
clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the suspend/resume() calls and invoke them directly from the call sites. No locking required at this point because these calls happen with interrupts disabled and a single cpu online. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebased on top of 4.0-rc5. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/713674030.jVm1qaHuPf@vostro.rjw.lan [ Rebased on top of latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-05-12xen: refactor suspend pre/post hooksDavid Vrabel1-3/+20
New architectures currently have to provide implementations of 5 different functions: xen_arch_pre_suspend(), xen_arch_post_suspend(), xen_arch_hvm_post_suspend(), xen_mm_pin_all(), and xen_mm_unpin_all(). Refactor the suspend code to only require xen_arch_pre_suspend() and xen_arch_post_suspend(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2013-02-15Revert "xen PVonHVM: use E820_Reserved area for shared_info"Konrad Rzeszutek Wilk1-1/+1
This reverts commit 9d02b43dee0d7fb18dfb13a00915550b1a3daa9f. We are doing this b/c on 32-bit PVonHVM with older hypervisors (Xen 4.1) it ends up bothing up the start_info. This is bad b/c we use it for the time keeping, and the timekeeping code loops forever - as the version field never changes. Olaf says to revert it, so lets do that. Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-11-02xen PVonHVM: use E820_Reserved area for shared_infoOlaf Hering1-1/+1
This is a respin of 00e37bdb0113a98408de42db85be002f21dbffd3 ("xen PVonHVM: move shared_info to MMIO before kexec"). Currently kexec in a PVonHVM guest fails with a triple fault because the new kernel overwrites the shared info page. The exact failure depends on the size of the kernel image. This patch moves the pfn from RAM into an E820 reserved memory area. The pfn containing the shared_info is located somewhere in RAM. This will cause trouble if the current kernel is doing a kexec boot into a new kernel. The new kernel (and its startup code) can not know where the pfn is, so it can not reserve the page. The hypervisor will continue to update the pfn, and as a result memory corruption occours in the new kernel. The toolstack marks the memory area FC000000-FFFFFFFF as reserved in the E820 map. Within that range newer toolstacks (4.3+) will keep 1MB starting from FE700000 as reserved for guest use. Older Xen4 toolstacks will usually not allocate areas up to FE700000, so FE700000 is expected to work also with older toolstacks. In Xen3 there is no reserved area at a fixed location. If the guest is started on such old hosts the shared_info page will be placed in RAM. As a result kexec can not be used. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-08-16Revert "xen PVonHVM: move shared_info to MMIO before kexec"Konrad Rzeszutek Wilk1-1/+1
This reverts commit 00e37bdb0113a98408de42db85be002f21dbffd3. During shutdown of PVHVM guests with more than 2VCPUs on certain machines we can hit the race where the replaced shared_info is not replaced fast enough and the PV time clock retries reading the same area over and over without any any success and is stuck in an infinite loop. Acked-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-07-19xen PVonHVM: move shared_info to MMIO before kexecOlaf Hering1-1/+1
Currently kexec in a PVonHVM guest fails with a triple fault because the new kernel overwrites the shared info page. The exact failure depends on the size of the kernel image. This patch moves the pfn from RAM into MMIO space before the kexec boot. The pfn containing the shared_info is located somewhere in RAM. This will cause trouble if the current kernel is doing a kexec boot into a new kernel. The new kernel (and its startup code) can not know where the pfn is, so it can not reserve the page. The hypervisor will continue to update the pfn, and as a result memory corruption occours in the new kernel. One way to work around this issue is to allocate a page in the xen-platform pci device's BAR memory range. But pci init is done very late and the shared_info page is already in use very early to read the pvclock. So moving the pfn from RAM to MMIO is racy because some code paths on other vcpus could access the pfn during the small window when the old pfn is moved to the new pfn. There is even a small window were the old pfn is not backed by a mfn, and during that time all reads return -1. Because it is not known upfront where the MMIO region is located it can not be used right from the start in xen_hvm_init_shared_info. To minimise trouble the move of the pfn is done shortly before kexec. This does not eliminate the race because all vcpus are still online when the syscore_ops will be called. But hopefully there is no work pending at this point in time. Also the syscore_op is run last which reduces the risk further. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25xen: suspend: add "arch" to pre/post suspend hooksIan Campbell1-3/+3
xen_pre_device_suspend is unused on ia64. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-25xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabledStefano Stabellini1-0/+2
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-12-02xen: unplug the emulated devices at resume timeStefano Stabellini1-0/+1
Early after being resumed we need to unplug again the emulated devices. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-07-27x86: Use xen_vcpuop_clockevent, xen_clocksource and xen wallclock.Stefano Stabellini1-0/+6
Use xen_vcpuop_clockevent instead of hpet and APIC timers as main clockevent device on all vcpus, use the xen wallclock time as wallclock instead of rtc and use xen_clocksource as clocksource. The pv clock algorithm needs to work correctly for the xen_clocksource and xen wallclock to be usable, only modern Xen versions offer a reliable pv clock in HVM guests (XENFEAT_hvm_safe_pvclock). Using the hpet as clocksource means a VMEXIT every time we read/write to the hpet mmio addresses, pvclock give us a better rating without VMEXITs. Same goes for the xen wallclock and xen_vcpuop_clockevent Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-07-23xen: Add suspend/resume support for PV on HVM guests.Stefano Stabellini1-0/+6
Suspend/resume requires few different things on HVM: the suspend hypercall is different; we don't need to save/restore memory related settings; except the shared info page and the callback mechanism. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-06-03xen: ensure timer tick is resumed even on CPU driving the resumeIan Campbell1-2/+2
The core suspend/resume code is run from stop_machine on CPU0 but parts of the suspend/resume machinery (including xen_arch_resume) are run on whichever CPU happened to schedule the xenwatch kernel thread. As part of the non-core resume code xen_arch_resume is called in order to restart the timer tick on non-boot processors. The boot processor itself is taken care of by core timekeeping code. xen_arch_resume uses smp_call_function which does not call the given function on the current processor. This means that we can end up with one CPU not receiving timer ticks if the xenwatch thread happened to be scheduled on CPU > 0. Use on_each_cpu instead of smp_call_function to ensure the timer tick is resumed everywhere. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Stable Kernel <stable@kernel.org> # .32.x
2009-12-03xen: call clock resume notifier on all CPUsIan Campbell1-1/+14
tick_resume() is never called on secondary processors. Presumably this is because they are offlined for suspend on native and so this is normally taken care of in the CPU onlining path. Under Xen we keep all CPUs online over a suspend. This patch papers over the issue for me but I will investigate a more generic, less hacky, way of doing to the same. tick_suspend is also only called on the boot CPU which I presume should be fixed too. Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2009-12-03xen: correctly restore pfn_to_mfn_list_list after resumeIan Campbell1-0/+2
pvops kernels >= 2.6.30 can currently only be saved and restored once. The second attempt to save results in: ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000 ERROR Internal error: Failed to map/save the p2m frame list I finally narrowed it down to: commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Date: Fri Feb 27 15:34:59 2009 -0800 xen: split construction of p2m mfn tables from registration Build the p2m_mfn_list_list early with the rest of the p2m table, but register it later when the real shared_info structure is in place. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> The unforeseen side-effect of this change was to cause the mfn list list to not be rebuilt on resume. Prior to this change it would have been rebuilt via xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list(). Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend(). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org>
2009-01-23x86, xen: fix hardirq.h merge falloutIngo Molnar1-0/+1
Impact: build fix This build error: arch/x86/xen/suspend.c:22: error: implicit declaration of function 'fix_to_virt' arch/x86/xen/suspend.c:22: error: 'FIX_PARAVIRT_BOOTMAP' undeclared (first use in this function) arch/x86/xen/suspend.c:22: error: (Each undeclared identifier is reported only once arch/x86/xen/suspend.c:22: error: for each function it appears in.) triggers because the hardirq.h unification removed an implicit fixmap.h include - on which arch/x86/xen/suspend.c depended. Add the fixmap.h include explicitly. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-17xen: convert to cpumask_var_t and new cpumask primitives.Mike Travis1-1/+2
Simple change, and eventual space saving when NR_CPUS >> nr_cpu_ids. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
2008-07-16xen: add xen_arch_resume()/xen_timer_resume hook for ia64 supportIsaku Yamahata1-1/+4
add xen_timer_resume() hook. Timer resume should be done after event channel is resumed. add xen_arch_resume() hook when ipi becomes usable after resume. After resume, some cpu specific resource must be reinitialized on ia64 that can't be set by another cpu. However available hooks is run once on only one cpu so that ipi has to be used. During stop_machine_run() ipi can't be used because interrupt is masked. So add another hook after stop_machine_run(). Another approach might be use resume hook which is run by device_resume(). However device_resume() may be executed on suspend error recovery path. So it is necessary to determine whether it is executed on real resume path or error recovery path. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Stephen Tweedie <sct@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-02xen: resume timers on all vcpusJeremy Fitzhardinge1-0/+1
On resume, the vcpu timer modes will not be restored. The timer infrastructure doesn't do this for us, since it assumes the cpus are offline. We can just poke the other vcpus into the right mode directly though. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-02xen: restore vcpu_info mappingJeremy Fitzhardinge1-1/+3
If we're using vcpu_info mapping, then make sure its restored on all processors before relasing them from stop_machine. The only complication is that if this fails, we can't continue because we've already made assumptions that the mapping is available (baked in calls to the _direct versions of the functions, for example). Fortunately this can only happen with a 32-bit hypervisor, which may possibly run out of mapping space. On a 64-bit hypervisor, this is a non-issue. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-27xen: implement save/restoreJeremy Fitzhardinge1-0/+42
This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>