Age | Commit message (Collapse) | Author | Files | Lines |
|
Commit 31bc3858ea3e ("add automatic onlining policy for the newly added
memory") provides the capability to have added memory automatically
onlined during add, but this appears to be slightly broken.
The current implementation uses walk_memory_range() to call
online_memory_block, which uses memory_block_change_state() to online
the memory. Instead, we should be calling device_online() for the
memory block in online_memory_block(). This would online the memory
(the memory bus online routine memory_subsys_online() called from
device_online calls memory_block_change_state()) and properly update the
device struct offline flag.
As a result of the current implementation, attempting to remove a memory
block after adding it using auto online fails. This is because doing a
remove, for instance
echo offline > /sys/devices/system/memory/memoryXXX/state
uses device_offline() which checks the dev->offline flag.
Link: http://lkml.kernel.org/r/20170222220744.8119.19687.stgit@ltcalpine2-lp14.aus.stglabs.ibm.com
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The callers of the DMA alloc functions already provide the proper
context GFP flags. Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.
Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Most users of this interface just want to use it with the default
GFP_KERNEL flags, but for cases where DMA memory is allocated it may be
called from a different context.
No functional change yet, just passing through the flag to the
underlying alloc_contig_range function.
Link: http://lkml.kernel.org/r/20170127172328.18574-2-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
mem_hotplug_begin() assumes that it can set mem_hotplug.active_writer
and run the hotplug process without racing another thread. Validate
this assumption with a lockdep assertion.
Link: http://lkml.kernel.org/r/148693886229.16345.1770484669403334689.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the "small" driver core patches for 4.11-rc1.
Not much here, some firmware documentation and self-test updates, a
debugfs code formatting issue, and a new feature for call_usermodehelper
to make it more robust on systems that want to lock it down in a more
secure way.
All of these have been linux-next for a while now with no reported
issues"
* tag 'driver-core-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kernfs: handle null pointers while printing node name and path
Introduce STATIC_USERMODEHELPER to mediate call_usermodehelper()
Make static usermode helper binaries constant
kmod: make usermodehelper path a const string
firmware: revamp firmware documentation
selftests: firmware: send expected errors to /dev/null
selftests: firmware: only modprobe if driver is missing
platform: Print the resource range if device failed to claim
kref: prefer atomic_inc_not_zero to atomic_add_unless
debugfs: improve formatting of debugfs_real_fops()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device property updates from Rafael J. Wysocki:
"Generic device properties framework updates for v4.11-rc1
Allow built-in (static) device properties to be declared as constant,
make it possible to save memory by discarding alternative (but unused)
built-in (static) property sets and add support for automatic handling
of built-in properties to the I2C code"
* tag 'device-properties-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
i2c: allow specify device properties in i2c_board_info
device property: export code duplicating array of property entries
device property: constify property arrays values
device property: allow to constify properties
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The majority of changes go into the Operating Performance Points (OPP)
framework and cpufreq this time, followed by devfreq and some
scattered updates all over.
The OPP changes are mostly related to switching over from RCU-based
synchronization, that turned out to be overly complicated and
problematic, to reference counting using krefs.
In the cpufreq land there are core cleanups, documentation updates, a
new driver for Broadcom BMIPS SoCs, a new cpufreq-dt sub-driver for TI
SoCs that require special handling, ARM64 SoCs support for the qoriq
driver, intel_pstate updates, powernv driver update and assorted
fixes.
The devfreq changes are mostly fixes related to the sysfs interface
and some Exynos drivers updates.
Apart from that, the cpuidle menu governor will support per-CPU PM QoS
constraints for the wakeup latency now, some bugs in the wakeup IRQs
framework are fixed, the generic power domains framework should handle
asynchronous invocations of *noirq suspend/resume callbacks from now
on, the analyze_suspend.py script is updated and there is a new tool
for intel_pstate diagnostics.
Specifics:
- Operating Performance Points (OPP) framework fixes, cleanups and
switch over from RCU-based synchronization to reference counting
using krefs (Viresh Kumar, Wei Yongjun, Dave Gerlach)
- cpufreq core cleanups and documentation updates (Viresh Kumar,
Rafael Wysocki)
- New cpufreq driver for Broadcom BMIPS SoCs (Markus Mayer)
- New cpufreq-dt sub-driver for TI SoCs requiring special handling,
like in the AM335x, AM437x, DRA7x, and AM57x families, along with
new DT bindings for it (Dave Gerlach, Paul Gortmaker)
- ARM64 SoCs support for the qoriq cpufreq driver (Tang Yuantian)
- intel_pstate driver updates including a new sysfs knob to control
the driver's operation mode and fixes related to the no_turbo sysfs
knob and the hardware-managed P-states feature support (Rafael
Wysocki, Srinivas Pandruvada)
- New interface to export ultra-turbo frequencies for the powernv
cpufreq driver (Shilpasri Bhat)
- Assorted fixes for cpufreq drivers (Arnd Bergmann, Dan Carpenter,
Wei Yongjun)
- devfreq core fixes, mostly related to the sysfs interface exported
by it (Chanwoo Choi, Chris Diamand)
- Updates of the exynos-bus and exynos-ppmu devfreq drivers (Chanwoo
Choi)
- Device PM QoS extension to support CPUs and support for per-CPU
wakeup (device resume) latency constraints in the cpuidle menu
governor (Alex Shi)
- Wakeup IRQs framework fixes (Grygorii Strashko)
- Generic power domains framework update including a fix to make it
handle asynchronous invocations of *noirq suspend/resume callbacks
correctly (Ulf Hansson, Geert Uytterhoeven)
- Assorted fixes and cleanups in the core suspend/hibernate code, PM
QoS framework and x86 ACPI idle support code (Corentin Labbe, Geert
Uytterhoeven, Geliang Tang, John Keeping, Nick Desaulniers)
- Update of the analyze_suspend.py script is updated to version 4.5
offering multiple improvements (Todd Brandt)
- New tool for intel_pstate diagnostics using the pstate_sample
tracepoint (Doug Smythies)"
* tag 'pm-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (85 commits)
MAINTAINERS: cpufreq: add bmips-cpufreq.c
PM / QoS: Fix memory leak on resume_latency.notifiers
PM / Documentation: Spelling s/wrtie/write/
PM / sleep: Fix test_suspend after sleep state rework
cpufreq: CPPC: add ACPI_PROCESSOR dependency
cpufreq: make ti-cpufreq explicitly non-modular
cpufreq: Do not clear real_cpus mask on policy init
tools/power/x86: Debug utility for intel_pstate driver
AnalyzeSuspend: fix drag and zoom bug in javascript
PM / wakeirq: report a wakeup_event on dedicated wekup irq
PM / wakeirq: Fix spurious wake-up events for dedicated wakeirqs
PM / wakeirq: Enable dedicated wakeirq for suspend
cpufreq: dt: Don't use generic platdev driver for ti-cpufreq platforms
cpufreq: ti: Add cpufreq driver to determine available OPPs at runtime
Documentation: dt: add bindings for ti-cpufreq
PM / OPP: Expose _of_get_opp_desc_node as dev_pm_opp API
cpufreq: qoriq: Don't look at clock implementation details
cpufreq: qoriq: add ARM64 SoCs support
PM / Domains: Provide dummy governors if CONFIG_PM_GENERIC_DOMAINS=n
cpufreq: brcmstb-avs-cpufreq: remove unnecessary platform_set_drvdata()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"For v4.11 activity on the regmap API has literally doubled, there are
two patches this release:
- fixes from Charles Keepax to make the kerneldoc generate correctly
- a cleanup from Geliang Tang using rb_entry() rather than open
coding it with container_of()"
* tag 'regmap-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fixup the kernel-doc comments on functions/structures
regmap: use rb_entry()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"This update provides:
- Yet another two irq controller chip drivers
- A few updates and fixes for GICV3
- A resource managed function for interrupt allocation
- Fixes, updates and enhancements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/qcom: Fix error handling
genirq: Clarify logic calculating bogus irqreturn_t values
genirq/msi: Add stubs for get_cached_msi_msg/pci_write_msi_msg
genirq/devres: Use dev_name(dev) as default for devname
genirq: Fix /proc/interrupts output alignment
irqdesc: Add a resource managed version of irq_alloc_descs()
irqchip/gic-v3-its: Zero command on allocation
irqchip/gic-v3-its: Fix command buffer allocation
irqchip/mips-gic: Fix local interrupts
irqchip: Add a driver for Cortina Gemini
irqchip: DT bindings for Cortina Gemini irqchip
irqchip/gic-v3: Remove duplicate definition of GICD_TYPER_LPIS
irqchip/gic-v3-its: Rename MAPVI to MAPTI
irqchip/gic-v3-its: Drop deprecated GITS_BASER_TYPE_CPU
irqchip/gic-v3-its: Refactor command encoding
irqchip/gic-v3-its: Enable cacheable attribute Read-allocate hints
irqchip/qcom: Add IRQ combiner driver
ACPI: Add support for ResourceSource/IRQ domain mapping
ACPI: Generic GSI: Do not attempt to map non-GSI IRQs during bus scan
irq/platform-msi: Fix comment about maximal MSIs
|
|
* pm-core:
PM / wakeirq: report a wakeup_event on dedicated wekup irq
PM / wakeirq: Fix spurious wake-up events for dedicated wakeirqs
PM / wakeirq: Enable dedicated wakeirq for suspend
* pm-qos:
PM / QoS: Fix memory leak on resume_latency.notifiers
PM / QoS: Remove unneeded linux/miscdevice.h include
* pm-domains:
PM / Domains: Provide dummy governors if CONFIG_PM_GENERIC_DOMAINS=n
PM / Domains: Fix asynchronous execution of *noirq() callbacks
PM / Domains: Correct comment in irq_safe_dev_in_no_sleep_domain()
PM / Domains: Rename functions in genpd for power on/off
|
|
* pm-cpuidle:
CPU / PM: expose pm_qos_resume_latency for CPUs
cpuidle/menu: add per CPU PM QoS resume latency consideration
cpuidle/menu: stop seeking deeper idle if current state is deep enough
ACPI / idle: small formatting fixes
|
|
* pm-opp: (24 commits)
PM / OPP: Expose _of_get_opp_desc_node as dev_pm_opp API
PM / OPP: Make _find_opp_table_unlocked() static
PM / OPP: Update Documentation to remove RCU specific bits
PM / OPP: Simplify dev_pm_opp_get_max_volt_latency()
PM / OPP: Simplify _opp_set_availability()
PM / OPP: Move away from RCU locking
PM / OPP: Take kref from _find_opp_table()
PM / OPP: Update OPP users to put reference
PM / OPP: Add 'struct kref' to struct dev_pm_opp
PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table()
PM / OPP: Take reference of the OPP table while adding/removing OPPs
PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
PM / OPP: Add 'struct kref' to OPP table
PM / OPP: Add per OPP table mutex
PM / OPP: Split out part of _add_opp_table() and _remove_opp_table()
PM / OPP: Don't expose srcu_head to register notifiers
PM / OPP: Rename dev_pm_opp_get_suspend_opp() and return OPP rate
PM / OPP: Don't allocate OPP table from _opp_allocate()
PM / OPP: Rename and split _dev_pm_opp_remove_table()
PM / OPP: Add light weight _opp_free() routine
...
|
|
Since commit 2d984ad132a8 (PM / QoS: Introcuce latency tolerance device
PM QoS type) we reassign "c" to point at qos->latency_tolerance before
freeing c->notifiers, but the notifiers field of latency_tolerance is
never used.
Restore the original behaviour of freeing the notifiers pointer on
qos->resume_latency, which is used, and fix the following kmemleak
warning.
unreferenced object 0xed9dba00 (size 64):
comm "kworker/0:1", pid 36, jiffies 4294670128 (age 15202.983s)
hex dump (first 32 bytes):
00 00 00 00 04 ba 9d ed 04 ba 9d ed 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<c06f6084>] kmemleak_alloc+0x74/0xb8
[<c011c964>] kmem_cache_alloc_trace+0x170/0x25c
[<c035f448>] dev_pm_qos_constraints_allocate+0x3c/0xe4
[<c035f574>] __dev_pm_qos_add_request+0x84/0x1a0
[<c035f6cc>] dev_pm_qos_add_request+0x3c/0x54
[<c03c3fc4>] usb_hub_create_port_device+0x110/0x2b8
[<c03b2a60>] hub_probe+0xadc/0xc80
[<c03bb050>] usb_probe_interface+0x1b4/0x260
[<c035773c>] driver_probe_device+0x198/0x40c
[<c0357b14>] __device_attach_driver+0x8c/0x98
[<c0355bbc>] bus_for_each_drv+0x8c/0x9c
[<c0357494>] __device_attach+0x98/0x138
[<c0357c64>] device_initial_probe+0x14/0x18
[<c03569dc>] bus_probe_device+0x30/0x88
[<c0354c54>] device_add+0x430/0x554
[<c03b92d8>] usb_set_configuration+0x660/0x6fc
Fixes: 2d984ad132a8 (PM / QoS: Introcuce latency tolerance device PM QoS type)
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There are two reasons for reporting wakeup event when dedicated wakeup
IRQ is triggered:
- wakeup events accounting, so proper statistical data will be
displayed in sysfs and debugfs;
- there are small window when System is entering suspend during which
dedicated wakeup IRQ can be lost:
dpm_suspend_noirq()
|- device_wakeup_arm_wake_irqs()
|- dev_pm_arm_wake_irq(X)
|- IRQ is enabled and marked as wakeup source
[1]...
|- suspend_device_irqs()
|- suspend_device_irq(X)
|- irqd_set(X, IRQD_WAKEUP_ARMED);
|- wakup IRQ armed
The wakeup IRQ can be lost if it's triggered at point [1]
and not armed yet.
Hence, fix above cases by adding simple pm_wakeup_event() call in
handle_threaded_wake_irq().
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
[ tony@atomide.com: added missing return to avoid warnings ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Dedicated wakeirq is a one time event to wake-up the system from
low-power state and then call pm_runtime_resume() on the device wired
with the dedicated wakeirq.
Sometimes dedicated wakeirqs can get deferred if they trigger after we
call disable_irq_nosync() in dev_pm_disable_wake_irq(). This can happen
if pm_runtime_get() is called around the same time a wakeirq fires.
If an interrupt fires after disable_irq_nosync(), by default it will get
tagged with IRQS_PENDING and will run later on when the interrupt is
enabled again.
Deferred wakeirqs usually just produce pointless wake-up events. But they
can also cause suspend to fail if the deferred wakeirq fires during
dpm_suspend_noirq() for example. So we really don't want to see the
deferred wakeirqs triggering after the device has resumed.
Let's fix the issue by setting IRQ_DISABLE_UNLAZY flag for the dedicated
wakeirqs. The other option would be to implement irq_disable() in the
dedicated wakeirq controller, but that's not a generic solution.
For reference below is what happens with a IRQ_TYPE_EDGE_BOTH IRQ
type wakeirq:
- resume by dedicated IRQ (EDGE_FALLING)
- suspend_enter()
....
- arch_suspend_enable_irqs()
|- dedicated IRQ armed and fired
|- irq_pm_check_wakeup()
|- disarm, disable IRQ and mark as IRQS_PENDING
....
- dpm_resume_noirq()
|- resume_device_irqs()
|- __enable_irq()
|- check_irq_resend()
|- handle_threaded_wake_irq()
|- dedicated IRQ processed
|- device_wakeup_disarm_wake_irqs()
|- disable_irq_wake()
....
!-> dedicated IRQ (EDGE_RISING)
-| handle_edge_irq()
|- IRQ disabled: mask_ack_irq and mark as IRQS_PENDING
....
- subsequent suspend
....
|- dpm_suspend_noirq()
|- device_wakeup_arm_wake_irqs()
|- __enable_irq()
|- check_irq_resend()
(a) |- handle_threaded_wake_irq()
|- pm_wakeup_event() --> abort suspend
....
|- suspend_device_irqs()
|- suspend_device_irq()
|- dedicated IRQ armed
....
(b) |- resend_irqs
|- irq_pm_check_wakeup()
|- IRQ armed -> abort suspend
because of pending IRQ System suspend can be aborted at points
(a)-not armed or (b)-armed.
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[ tony@atomide.com: added a comment, updated the description ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We currently rely on runtime PM to enable dedicated wakeirq for suspend.
This assumption fails in the following two cases:
1. If the consumer driver does not have runtime PM implemented, the
dedicated wakeirq never gets enabled for suspend
2. If the consumer driver has runtime PM implemented, but does not idle
in suspend
Let's fix the issue by always enabling the dedicated wakeirq during
suspend.
Depends-on: bed570307ed7 (PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend)
Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[ tony@atomide.com: updated based on bed570307ed7, added description ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rename _of_get_opp_desc_node to dev_pm_opp_of_get_opp_desc_node and add it
to include/linux/pm_opp.h to allow other drivers, such as platform OPP
and cpufreq drivers, to make use of it.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates for 4.11 from Marc Zyngier
- A number of gic-v3-its cleanups and fixes
- A fix for the MIPS GIC
- One new interrupt controller for the Cortina Gemini platform
- Support for the Qualcomm interrupt combiner, together with
its ACPI goodness
|
|
As the PM core may invoke the *noirq() callbacks asynchronously, the
current lock-less approach in genpd doesn't work. The consequence is that
we may find concurrent operations racing to power on/off the PM domain.
As of now, no immediate errors has been reported, but it's probably only a
matter time. Therefor let's fix the problem now before this becomes a real
issue, by deploying the locking scheme to the relevant functions.
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The earlier comment stated that the dev_warn_once() was going to be printed
once per device. Let's fix that, as dev_warn_once() is printed only once,
no matter of the device.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fixes the following sparse warning:
drivers/base/power/opp/core.c:49:18: warning:
symbol '_find_opp_table_unlocked' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When augmenting ACPI-enumerated devices with additional property data based
on DMI info, a module has often several potential property sets, with only
one being active on a given box. In order to save memory it should be
possible to mark everything and __initdata or __initconst, execute DMI
match early, and duplicate relevant properties. Then kernel will discard
the rest of them.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Data that is fed into property arrays should not be modified, so let's mark
relevant pointers as const. This will allow us making source arrays as
const/__initconst.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is no reason why statically defined properties should be modifiable,
so let's make device_add_properties() and the rest of pset_*() functions to
take const pointers to properties.
This will allow us to mark properties as const/__initconst at definition
sites.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
* pm-core-fixes:
PM / runtime: Avoid false-positive warnings from might_sleep_if()
* pm-cpufreq-fixes:
cpufreq: intel_pstate: Disable energy efficiency optimization
cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend
cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are two bugfixes that resolve some reported issues. One in the
firmware loader, that should fix the much-reported problem of crashes
with it. The other is a hyperv fix for a reported regression.
Both have been in linux-next for a week or so with no reported issues"
* tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()
firmware: fix NULL pointer dereference in __fw_load_abort()
|
|
The might_sleep_if() assertions in __pm_runtime_idle(),
__pm_runtime_suspend() and __pm_runtime_resume() may generate
false-positive warnings in some situations. For example, that
happens if a nested pm_runtime_get_sync()/pm_runtime_put() pair
is executed with disabled interrupts within an outer
pm_runtime_get_sync()/pm_runtime_put() section for the same device.
[Generally, pm_runtime_get_sync() may sleep, so it should not be
called with disabled interrupts, but in this particular case the
previous pm_runtime_get_sync() guarantees that the device will not
be suspended, so the inner pm_runtime_get_sync() will return
immediately after incrementing the device's usage counter.]
That started to happen in the i915 driver in 4.10-rc, leading to
the following splat:
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1032
in_atomic(): 1, irqs_disabled(): 0, pid: 1500, name: Xorg
1 lock held by Xorg/1500:
#0: (&dev->struct_mutex){+.+.+.}, at:
[<ffffffffa0680c13>] i915_mutex_lock_interruptible+0x43/0x140 [i915]
CPU: 0 PID: 1500 Comm: Xorg Not tainted
Call Trace:
dump_stack+0x85/0xc2
___might_sleep+0x196/0x260
__might_sleep+0x53/0xb0
__pm_runtime_resume+0x7a/0x90
intel_runtime_pm_get+0x25/0x90 [i915]
aliasing_gtt_bind_vma+0xaa/0xf0 [i915]
i915_vma_bind+0xaf/0x1e0 [i915]
i915_gem_execbuffer_relocate_entry+0x513/0x6f0 [i915]
i915_gem_execbuffer_relocate_vma.isra.34+0x188/0x250 [i915]
? trace_hardirqs_on+0xd/0x10
? i915_gem_execbuffer_reserve_vma.isra.31+0x152/0x1f0 [i915]
? i915_gem_execbuffer_reserve.isra.32+0x372/0x3a0 [i915]
i915_gem_do_execbuffer.isra.38+0xa70/0x1a40 [i915]
? __might_fault+0x4e/0xb0
i915_gem_execbuffer2+0xc5/0x260 [i915]
? __might_fault+0x4e/0xb0
drm_ioctl+0x206/0x450 [drm]
? i915_gem_execbuffer+0x340/0x340 [i915]
? __fget+0x5/0x200
do_vfs_ioctl+0x91/0x6f0
? __fget+0x111/0x200
? __fget+0x5/0x200
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc6
even though the code triggering it is correct.
Unfortunately, the might_sleep_if() assertions in question are
too coarse-grained to cover such cases correctly, so make them
a bit less sensitive in order to avoid the false-positives.
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
systems is desribed below. 0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ACPI extended IRQ resources may contain a ResourceSource to specify
an alternate interrupt controller. Introduce acpi_irq_get and use it
to implement ResourceSource/IRQ domain mapping.
The new API is similar to of_irq_get and allows re-initialization
of a platform resource from the ACPI extended IRQ resource, and
provides proper behavior for probe deferral when the domain is not
yet present when called.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
into regmap-next
|
|
regmap: Fix for v4.10
The only change for regmap this merge window is a single fix for an
unused variable.
# gpg: Signature made Mon 12 Dec 2016 17:03:19 CET
# gpg: using RSA key ADE668AA675718B59FE29FEA24D68B725D5487D0
# gpg: issuer "broonie@kernel.org"
# gpg: key 0D9EACE2CD7BEEBC: no public key for trusted key - skipped
# gpg: key 0D9EACE2CD7BEEBC marked as ultimately trusted
# gpg: key CCB0A420AF88CD16: no public key for trusted key - skipped
# gpg: key CCB0A420AF88CD16 marked as ultimately trusted
# gpg: key 162614E316005C11: no public key for trusted key - skipped
# gpg: key 162614E316005C11 marked as ultimately trusted
# gpg: key A730C53A5621E907: no public key for trusted key - skipped
# gpg: key A730C53A5621E907 marked as ultimately trusted
# gpg: key 276568D75C6153AD: no public key for trusted key - skipped
# gpg: key 276568D75C6153AD marked as ultimately trusted
# gpg: Good signature from "Mark Brown <broonie@sirena.org.uk>" [ultimate]
# gpg: aka "Mark Brown <broonie@debian.org>" [ultimate]
# gpg: aka "Mark Brown <broonie@kernel.org>" [ultimate]
# gpg: aka "Mark Brown <broonie@tardis.ed.ac.uk>" [ultimate]
# gpg: aka "Mark Brown <broonie@linaro.org>" [ultimate]
# gpg: aka "Mark Brown <Mark.Brown@linaro.org>" [ultimate]
|
|
The cpu-dma PM QoS constraint impacts all the cpus in the system. There
is no way to let the user to choose a PM QoS constraint per cpu.
The following patch exposes to the userspace a per cpu based sysfs file
in order to let the userspace to change the value of the PM QoS latency
constraint.
This change is inoperative in its form and the cpuidle governors have to
take into account the per cpu latency constraint in addition to the
global cpu-dma latency constraint in order to operate properly.
BTW
The pm_qos_resume_latency usage defined in
Documentation/ABI/testing/sysfs-devices-power
The /sys/devices/.../power/pm_qos_resume_latency_us attribute
contains the PM QoS resume latency limit for the given device,
which is the maximum allowed time it can take to resume the
device, after it has been suspended at run time, from a resume
request to the moment the device will be ready to process I/O,
in microseconds. If it is equal to 0, however, this means that
the PM QoS resume latency may be arbitrary.
Signed-off-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
dev_pm_opp_get_max_volt_latency() calls _find_opp_table() two times
effectively.
Merge _get_regulator_count() into dev_pm_opp_get_max_volt_latency() to
avoid that.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As we don't use RCU locking anymore, there is no need to replace an
earlier OPP node with a new one. Just update the existing one.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The RCU locking isn't well suited for the OPP core. The RCU locking fits
better for reader heavy stuff, while the OPP core have at max one or two
readers only at a time.
Over that, it was getting very confusing the way RCU locking was used
with the OPP core. The individual OPPs are mostly well handled, i.e. for
an update a new structure was created and then that replaced the older
one. But the OPP tables were updated directly all the time from various
parts of the core. Though they were mostly used from within RCU locked
region, they didn't had much to do with RCU and were governed by the
mutex instead.
And that mixed with the 'opp_table_lock' has made the core even more
confusing.
Now that we are already managing the OPPs and the OPP tables with kernel
reference infrastructure, we can get rid of RCU locking completely and
simplify the code a lot.
Remove all RCU references from code and comments.
Acquire opp_table->lock while parsing the list of OPPs though.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Take reference of the OPP table from within _find_opp_table(). Also
update the callers of _find_opp_table() to call
dev_pm_opp_put_opp_table() after they have used the OPP table.
Note that _find_opp_table() increments the reference under the
opp_table_lock.
Now that the OPP table wouldn't get freed until the callers of
_find_opp_table() call dev_pm_opp_put_opp_table(), there is no need to
take the opp_table_lock or rcu_read_lock() around it. Drop them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch updates dev_pm_opp_find_freq_*() routines to get a reference
to the OPPs returned by them.
Also updates the users of dev_pm_opp_find_freq_*() routines to call
dev_pm_opp_put() after they are done using the OPPs.
As it is guaranteed the that OPPs wouldn't get freed while being used,
the RCU read side locking present with the users isn't required anymore.
Drop it as well.
This patch also updates all users of devfreq_recommended_opp() which was
returning an OPP received from the OPP core.
Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [Devfreq]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add kref to struct dev_pm_opp for easier accounting of the OPPs.
Note that the OPPs are freed under the opp_table->lock mutex only.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Migrate all users of _add_opp_table() to use dev_pm_opp_get_opp_table()
to guarantee that the OPP table doesn't get freed while being used.
Also update _managed_opp() to get the reference to the OPP table.
Now that the OPP table wouldn't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop them as well.
Now that _add_opp_table(), _remove_opp_table() and the unlocked release
routines aren't used anymore, remove them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Take reference of the OPP table while adding and removing OPPs, that
helps us remove special checks in _remove_opp_table().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Now that we have proper kernel reference infrastructure in place for OPP
tables, use it to guarantee that the OPP table isn't freed while being
used by the callers of dev_pm_opp_set_*() APIs.
Make them all return the pointer to the OPP table after taking its
reference and put the reference back with dev_pm_opp_put_*() APIs.
Now that the OPP table wouldn't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop them as well.
Remove the rcu specific comments from these routines as they aren't
relevant anymore.
Note that prototypes of dev_pm_opp_{set|put}_regulators() were already
updated by another patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add kref to struct opp_table for easier accounting of the OPP table.
Note that the new routine dev_pm_opp_get_opp_table() takes the reference
from under the opp_table_lock, which guarantees that the OPP table
doesn't get freed unless dev_pm_opp_put_opp_table() is called for the
OPP table.
Two separate release mechanisms are added: locked and unlocked. In
unlocked version the routines aren't required to take/drop
opp_table_lock as the callers have already done that. This is required
to avoid breaking git bisect, otherwise we may get lockdeps between
commits. Once all the users of OPP table are updated the unlocked
version shall be removed.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add per OPP table lock to protect opp_table->opp_list.
Note that at few places opp_list is used under the rcu_read_lock() and
so a mutex can't be added there for now. This will be fixed by a later
patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Split out parts of _add_opp_table() and _remove_opp_table() into
separate routines. This improves readability as well.
Should result in no functional changes.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Let the OPP core provide helpers to register notifiers for any device,
instead of exposing srcu_head outside of the core.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is only one user of dev_pm_opp_get_suspend_opp() and that uses it
to get the OPP rate for the suspend_opp.
Rename dev_pm_opp_get_suspend_opp() as dev_pm_opp_get_suspend_opp_freq()
and return the rate directly from it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There is no point in trying to find/allocate the table for every OPP
that is added for a device. It would be far more efficient to allocate
the table only once and pass its pointer to the routines that add the
OPP entry.
Locking is removed from _opp_add_static_v2() and _opp_add_v1() now as
the callers call them with that lock already held.
Call to _remove_opp_table() routine is also removed from _opp_free()
now, as opp_table isn't allocated from within _opp_allocate(). This is
handled by the routines which created the OPP table in the first place.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Later patches would want to remove OPP table (and its OPPs) using the
opp_table pointer instead of 'dev'.
In order to prepare for that, rename _dev_pm_opp_remove_table() as
_dev_pm_opp_find_and_remove_table() split out part of it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The OPPs which are never successfully added using _opp_add() are not
required to be freed with the _opp_remove() routine, as a simple kfree()
is enough for them.
Introduce a new light weight routine _opp_free(), which will do that.
That also helps us removing the 'notify' parameter to _opp_remove(),
which isn't required anymore.
Note that _opp_free() contains a call to _remove_opp_table() as the OPP
table might have been added for this very OPP only. The
_remove_opp_table() routine returns quickly if there are more OPPs in
the table. This will be simplified in later patches though.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The code adding static OPPs for V2 bindings already does so. Make the V1
bindings specific code behave the same.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|