Age | Commit message (Collapse) | Author | Files | Lines |
|
Most of the drivers do following in their ->target_index() routines:
struct cpufreq_freqs freqs;
freqs.old = old freq...
freqs.new = new freq...
cpufreq_notify_transition(policy, &freqs, CPUFREQ_PRECHANGE);
/* Change rate here */
cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE);
This is replicated over all cpufreq drivers today and there doesn't exists a
good enough reason why this shouldn't be moved to cpufreq core instead.
There are few special cases though, like exynos5440, which doesn't do everything
on the call to ->target_index() routine and call some kind of bottom halves for
doing this work, work/tasklet/etc..
They may continue doing notification from their own code as flag:
CPUFREQ_ASYNC_NOTIFICATION is already set for them.
All drivers are also modified in this patch to avoid breaking 'git bisect', as
double notification would happen otherwise.
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
There are few special cases like exynos5440 which doesn't send POSTCHANGE
notification from their ->target() routine and call some kind of bottom halves
for doing this work, work/tasklet/etc.. From which they finally send POSTCHANGE
notification.
Its better if we distinguish them from other cpufreq drivers in some way so that
core can handle them specially. So this patch introduces another flag:
CPUFREQ_ASYNC_NOTIFICATION, which will be set by such drivers.
This also changes exynos5440-cpufreq.c and powernow-k8 in order to set this
flag.
Acked-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fixes warnings reported by kbuild test robot
sparse warnings: (new ones prefixed by >>)
drivers/cpufreq/intel_pstate.c:729:6: sparse: symbol 'copy_pid_params' was not declared. Should it be static?
drivers/cpufreq/intel_pstate.c:739:6: sparse: symbol 'copy_cpu_funcs' was not declared. Should it be static?
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The b.L switcher can be turned on/off at run time. It is therefore
necessary to change the cpufreq driver behavior accordingly.
The driver must be unregistered/registered with the cpufreq core
to reconfigure freq tables for the virtual or actual CPUs. This is
accomplished via the b.L switcher notifier callback.
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch adds IKS (In Kernel Switcher) support to cpufreq driver.
This creates a combined freq table for A7-A15 CPU pairs. A7 frequencies
are virtualized and scaled down to half the actual frequencies to
approximate a linear scale across the combined A7+A15 range. When the
requested frequency change crosses the A7-A15 boundary a cluster switch
is invoked.
Based on earlier work from Sudeep KarkadaNagesha.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch adds vexpress-spc platform device to enables the vexpress
SPC cpufreq interface driver.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Acked-by: Pawel Moll <Pawel.Moll@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The TC2(i.e. CA15_A7) Versatile Express has external Cortex M3 based
power controller which is responsible for CPU DVFS and SPC provides
the interface for the same.
This patch adds a tiny interface driver to check if OPPs are
initialised by SPC platform code and register the arm_big_little
cpufreq driver.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
On TC2, the cpu clocks are controlled by the external M3 microcontroller
and SPC provides the interface between the CPU and the power controller.
The generic cpufreq drivers use the clock APIs to get the cpu clocks.
This patch add virtual spc clocks for all the cpus to control the cpu
operating frequency via the clock framework.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Pawel Moll <Pawel.Moll@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
SPC(Serial Power Controller) on TC2 also controls the CPU performance
operating points which is essential to provide CPU DVFS. The M3
microcontroller provides two sets of eight performance values, one set
for each cluster (CA15 or CA7). Each of this value contains the
frequency(kHz) and voltage(mV) at that performance level. It expects
these performance level to be passed through the SPC PERF_LVL registers.
This patch adds support to populate these performance levels from M3,
build the mapping to CPU OPPs at the boot and then use it to get and
set the CPU performance level runtime.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Pawel Moll <Pawel.Moll@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We have per-CPU cpu_policy_rwsem for cpufreq core, but we never use
all of them. We always use rwsem of policy->cpu and so we can
actually make this rwsem per policy instead.
This patch does this change. With this change other tricky situations
are also avoided now, like which lock to take while we are changing
policy->cpu, etc.
Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add support for the Baytrail processor.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Non-core processors have a different MSR layout to commumicate P state
information. Refactor the driver to use CPU dependent accessors to
P state information.
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently, the prototype of cpufreq_drivers target routines is:
int target(struct cpufreq_policy *policy, unsigned int target_freq,
unsigned int relation);
And most of the drivers call cpufreq_frequency_table_target() to get a valid
index of their frequency table which is closest to the target_freq. And they
don't use target_freq and relation after that.
So, it makes sense to just do this work in cpufreq core before calling
cpufreq_frequency_table_target() and simply pass index instead. But this can be
done only with drivers which expose their frequency table with cpufreq core. For
others we need to stick with the old prototype of target() until those drivers
are converted to expose frequency tables.
This patch implements the new light weight prototype for target_index() routine.
It looks like this:
int target_index(struct cpufreq_policy *policy, unsigned int index);
CPUFreq core will call cpufreq_frequency_table_target() before calling this
routine and pass index to it. Because CPUFreq core now requires to call routines
present in freq_table.c CONFIG_CPU_FREQ_TABLE must be enabled all the time.
This also marks target() interface as deprecated. So, that new drivers avoid
using it. And Documentation is updated accordingly.
It also converts existing .target() to newly defined light weight
.target_index() routine for many driver.
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rjw@rjwysocki.net>
|
|
Conflicts:
drivers/cpufreq/omap-cpufreq.c
|
|
Since Operating Performance Points (OPP) functions are specific
to device specific power management, be specific and rename opp.h
to pm_opp.h
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since Operating Performance Points (OPP) data structures are specific
to device specific power management, be specific and rename opp_* data
structures in OPP library with dev_pm_opp_* equivalent.
Affected structures are:
struct opp
enum opp_event
Minor checkpatch warning resulting of this change was fixed as well.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Since Operating Performance Points (OPP) functions are specific to
device specific power management, be specific and rename opp_*
accessors in OPP library with dev_pm_opp_* equivalent.
Affected functions are:
opp_get_voltage
opp_get_freq
opp_get_opp_count
opp_find_freq_exact
opp_find_freq_floor
opp_find_freq_ceil
opp_add
opp_enable
opp_disable
opp_get_notifier
opp_init_cpufreq_table
opp_free_cpufreq_table
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
"Sage hit a deadlock with ceph on btrfs, and Josef tracked it down to a
regression in our initial rc1 pull. When doing nocow writes we were
sometimes starting a transaction with locks held"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: release path before starting transaction in can_nocow_extent
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- intel_pstate fix for misbehavior after system resume if sysfs
attributes are set in a specific way before the corresponding suspend
from Dirk Brandewie.
- A recent intel_pstate fix has no effect if unsigned long is 32-bit,
so fix it up to cover that case as well.
- The s3c64xx cpufreq driver was not updated when the index field of
struct cpufreq_frequency_table was replaced with driver_data, so
update it now. From Charles Keepax.
- The Kconfig help text for ACPI_BUTTON still refers to
/proc/acpi/event that has been dropped recently, so modify it to
remove that reference. From Krzysztof Mazur.
- A Lan Tianyu's change adds a missing mutex unlock to an error code
path in acpi_resume_power_resources().
- Some code related to ACPI power resources, whose very purpose is
questionable to put it lightly, turns out to cause problems to happen
during testing on real systems, so remove it completely (we may
revisit that in the future if there's a compelling enough reason).
From Rafael J Wysocki and Aaron Lu.
* tag 'pm+acpi-3.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PM: Drop two functions that are not used any more
ATA / ACPI: remove power dependent device handling
cpufreq: s3c64xx: Rename index to driver_data
ACPI / power: Drop automaitc resume of power resource dependent devices
intel_pstate: Fix type mismatch warning
cpufreq / intel_pstate: Fix max_perf_pct on resume
ACPI: remove /proc/acpi/event from ACPI_BUTTON help
ACPI / power: Release resource_lock after acpi_power_get_state() return error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two fixlets:
- fix a (rare-config) build bug
- fix a next-gen SGI/UV hw/firmware enumeration bug"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Update UV3 hub revision ID
x86/microcode: Correct Kconfig dependencies
|
|
We can't be holding tree locks while we try to start a transaction, we will
deadlock. Thanks,
Reported-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
* acpi-fixes:
ACPI / PM: Drop two functions that are not used any more
ATA / ACPI: remove power dependent device handling
ACPI / power: Drop automaitc resume of power resource dependent devices
ACPI: remove /proc/acpi/event from ACPI_BUTTON help
ACPI / power: Release resource_lock after acpi_power_get_state() return error
|
|
* pm-fixes:
cpufreq: s3c64xx: Rename index to driver_data
intel_pstate: Fix type mismatch warning
cpufreq / intel_pstate: Fix max_perf_pct on resume
|
|
Pull CIFS fixes from Steve French:
"Five small cifs fixes (includes fixes for: unmount hang, 2 security
related, symlink, large file writes)"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
cifs: ntstatus_to_dos_map[] is not terminated
cifs: Allow LANMAN auth method for servers supporting unencapsulated authentication methods
cifs: Fix inability to write files >2GB to SMB2/3 shares
cifs: Avoid umount hangs with smb2 when server is unresponsive
do not treat non-symlink reparse points as valid symlinks
|
|
cpufreq_set_policy() has been changed to origin __cpufreq_set_policy()
and policy->lock has been converted to rewrite lock by commit 5a01f2.
So remove the comment.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is one fix for the hotplug memory path that resolves a regression
when removing memory that showed up in 3.12-rc1"
* tag 'driver-core-3.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Release device_hotplug_lock when store_mem_state returns EINVAL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB fixes and new device ids for 3.12-rc6
The largest change here is a bunch of new device ids for the option
USB serial driver for new Huawei devices. Other than that, just some
small bug fixes for issues that people have reported (run-time and
build-time), nothing major"
* tag 'usb-3.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: usb_phy_gen: refine conditional declaration of usb_nop_xceiv_register
usb: misc: usb3503: Fix compile error due to incorrect regmap depedency
usb/chipidea: fix oops on memory allocation failure
usb-storage: add quirk for mandatory READ_CAPACITY_16
usb: serial: option: blacklist Olivetti Olicard200
USB: quirks: add touchscreen that is dazzeled by remote wakeup
Revert "usb: musb: gadget: fix otg active status flag"
USB: quirks.c: add one device that cannot deal with suspension
USB: serial: option: add support for Inovia SEW858 device
USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well.
USB: support new huawei devices in option.c
usb: musb: start musb on the udc side, too
xhci: Fix spurious wakeups after S5 on Haswell
xhci: fix write to USB3_PSSEN and XUSB2PRM pci config registers
xhci: quirk for extra long delay for S4
xhci: Don't enable/disable RWE on bus suspend/resume.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fixes from Greg KH:
"Here are two serial driver fixes for your tree. One is a revert of a
patch that causes a build error, the other is a fix to provide the
correct brace placement which resolves a bug where the driver was not
working properly"
* tag 'tty-3.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: vt8500: add missing braces
Revert "serial: i.MX: evaluate linux,stdout-path property"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small iio and w1 driver fixes for 3.12-rc6.
There is also a hyper-v fix in here, which turned out to be incorrect,
so it was reverted. That will probably have to wait unto 3.13-rc1 to
get accepted as it's still being discussed"
* tag 'char-misc-3.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Revert "Drivers: hv: vmbus: Fix a bug in channel rescind code"
Drivers: hv: vmbus: Fix a bug in channel rescind code
iio:buffer: Free active scan mask in iio_disable_all_buffers()
iio: frequency: adf4350: add missing clk_disable_unprepare() on error in adf4350_probe()
w1 - call request_module with w1 master mutex unlocked
w1 - fix fops in w1_bus_notify
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All reasonably small fixes as rc6: a HD-audio mic fix, a us122l mmap
regression fix, and kernel memory leak fix in hdsp driver. Hopefully
this will be the last pull request for 3.12..."
* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hdsp - info leak in snd_hdsp_hwdep_ioctl()
ALSA: us122l: Fix pcm_usb_stream mmapping regression
ALSA: hda - Fix inverted internal mic not indicated on some machines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull apparmor fixes from James Morris:
"A couple more regressions fixed"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
apparmor: fix bad lock balance when introspecting policy
apparmor: fix memleak of the profile hash
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus
Jonathan writes:
Third set of IIO fixes for the 3.12 cycle.
Two little ones this time:
1) A missing clk_unprepare in adf4350.
2) A missing free of the active_scan_mask when iio_disable_all_buffers is
called during an unexpected device removal. This leak was introduced by
the fix
a87c82e454f184a9473f8cdfd4d304205f585f65 iio: Stop sampling when the device is removed
and hence is a regression fix.
|
|
Commit 3fa4d734 (usb: phy: rename nop_usb_xceiv => usb_phy_gen_xceiv)
changed the conditional around the declaration of usb_nop_xceiv_register
from
#if defined(CONFIG_NOP_USB_XCEIV) ||
(defined(CONFIG_NOP_USB_XCEIV_MODULE) && defined(MODULE))
to
#if IS_ENABLED(CONFIG_NOP_USB_XCEIV)
While that looks the same, it is semantically different. The first expression
is true if CONFIG_NOP_USB_XCEIV is built as module and if the including
code is built as module. The second expression is true if code depending on
CONFIG_NOP_USB_XCEIV if built as module or into the kernel.
As a result, the arm:allmodconfig build fails with
arch/arm/mach-omap2/built-in.o: In function `omap3_evm_init':
arch/arm/mach-omap2/board-omap3evm.c:703: undefined reference to
`usb_nop_xceiv_register'
Fix the problem by reverting to the old conditional.
Cc: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 90d33f3ec519db19d785216299a4ee85ef58ec97 as it's not
the correct fix for this issue, and it causes a build warning to be
added to the kernel tree.
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Two functions defined in device_pm.c, acpi_dev_pm_add_dependent()
and acpi_dev_pm_remove_dependent(), have no callers and may be
dropped, so drop them.
Moreover, they are the only functions adding entries to and removing
entries from the power_dependent list in struct acpi_device, so drop
that list too.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Previously, we wanted SCSI devices corrsponding to ATA devices to
be runtime resumed when the power resource for those ATA device was
turned on by some other device, so we added the SCSI device to the
dependent device list of the ATA device's ACPI node. However, this
code has no effect after commit 41863fc (ACPI / power: Drop automaitc
resume of power resource dependent devices) and the mechanism it was
supposed to implement is regarded as a bad idea now, so drop it.
[rjw: Changelog]
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the exynos4210_set_apll() function, the APLL frequency is set with
direct register manipulation.
Such approach is not allowed in the common clock framework. The frequency
is changed, but the corresponding clock value is not updated. This causes
wrong frequency read from cpufreq's cpuinfo_cur_freq sysfs attribute.
Also direct manipulation with PLL's S parameter has been removed. It is
already done at PLL35xx code.
Tested at:
- Exynos4210 - Trats board (linux 3.12-rc4)
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In the exynos4x12_set_apll() function, the APLL frequency is set with
direct register manipulation.
Such approach is not allowed in the common clock framework. The frequency
is changed, but the corresponding clock value is not updated. This causes
wrong frequency read from cpufreq's cpuinfo_cur_freq sysfs attribute.
Also direct manipulation with PLL's S parameter has been removed. It is
already done at PLL35xx code.
Tested at:
- Exynos4412 - Trats2 board (linux 3.12-rc4)
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
mm: revert mremap pud_free anti-fix
mm: fix BUG in __split_huge_page_pmd
swap: fix set_blocksize race during swapon/swapoff
procfs: call default get_unmapped_area on MMU-present architectures
procfs: fix unintended truncation of returned mapped address
writeback: fix negative bdi max pause
percpu_refcount: export symbols
fs: buffer: move allocation failure loop into the allocator
mm: memcg: handle non-error OOM situations more gracefully
tools/testing/selftests: fix uninitialized variable
block/partitions/efi.c: treat size mismatch as a warning, not an error
mm: hugetlb: initialize PG_reserved for tail pages of gigantic compound pages
mm/zswap: bugfix: memory leak when re-swapon
mm: /proc/pid/pagemap: inspect _PAGE_SOFT_DIRTY only on present pages
mm: migration: do not lose soft dirty bit if page is in migration state
gcov: MAINTAINERS: Add an entry for gcov
mm/hugetlb.c: correct missing private flag clearing
mm/vmscan.c: don't forget to free shrinker->nr_deferred
ipc/sem.c: synchronize semop and semctl with IPC_RMID
ipc: update locking scheme comments
...
|
|
Revert commit 1ecfd533f4c5 ("mm/mremap.c: call pud_free() after fail
calling pmd_alloc()").
The original code was correct: pud_alloc(), pmd_alloc(), pte_alloc_map()
ensure that the pud, pmd, pt is already allocated, and seldom do they
need to allocate; on failure, upper levels are freed if appropriate by
the subsequent do_munmap(). Whereas commit 1ecfd533f4c5 did an
unconditional pud_free() of a most-likely still-in-use pud: saved only
by the near-impossiblity of pmd_alloc() failing.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Occasionally we hit the BUG_ON(pmd_trans_huge(*pmd)) at the end of
__split_huge_page_pmd(): seen when doing madvise(,,MADV_DONTNEED).
It's invalid: we don't always have down_write of mmap_sem there: a racing
do_huge_pmd_wp_page() might have copied-on-write to another huge page
before our split_huge_page() got the anon_vma lock.
Forget the BUG_ON, just go back and try again if this happens.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix race between swapoff and swapon. Swapoff used old_block_size from
swap_info outside of swapon_mutex so it could be overwritten by
concurrent swapon.
The race has visible effect only if more than one swap block device
exists with different block sizes (e.g. /dev/sda1 with block size 4096
and /dev/sdb1 with 512). In such case it leads to setting the blocksize
of swapped off device with wrong blocksize.
The bug can be triggered with multiple concurrent swapoff and swapon:
0. Swap for some device is on.
1. swapoff:
First the swapoff is called on this device and "struct swap_info_struct
*p" is assigned. This is done under swap_lock however this lock is
released for the call try_to_unuse().
2. swapon:
After the assignment above (and before acquiring swapon_mutex &
swap_lock by swapoff) the swapon is called on the same device.
The p->old_block_size is assigned to the value of block_size the device.
This block size should be the same as previous but sometimes it is not.
The swapon ends successfully.
3. swapoff:
Swapoff resumes, grabs the locks and mutex and continues to disable this
swap device. Now it sets the block size to value taken from swap_info
which was overwritten by swapon in 2.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: Weijie Yang <weijie.yang.kh@gmail.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Shaohua Li <shli@fusionio.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit c4fe24485729 ("sparc: fix PCI device proc file mmap(2)") added
proc_reg_get_unmapped_area in proc_reg_file_ops and
proc_reg_file_ops_no_compat, by which now mmap always returns EIO if
get_unmapped_area method is not defined for the target procfs file,
which causes regression of mmap on /proc/vmcore.
To address this issue, like get_unmapped_area(), call default
current->mm->get_unmapped_area on MMU-present architectures if
pde->proc_fops->get_unmapped_area, i.e. the one in actual file
operation in the procfs file, is not defined.
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, proc_reg_get_unmapped_area truncates upper 32-bit of the
mapped virtual address returned from get_unmapped_area method in
pde->proc_fops due to the variable rv of signed integer on x86_64. This
is too small to have vitual address of unsigned long on x86_64 since on
x86_64, signed integer is of 4 bytes while unsigned long is of 8 bytes.
To fix this issue, use unsigned long instead.
Fixes a regression added in commit c4fe24485729 ("sparc: fix PCI device
proc file mmap(2)").
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Toralf runs trinity on UML/i386. After some time it hangs and the last
message line is
BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521]
It's found that pages_dirtied becomes very large. More than 1000000000
pages in this case:
period = HZ * pages_dirtied / task_ratelimit;
BUG_ON(pages_dirtied > 2000000000);
BUG_ON(pages_dirtied > 1000000000); <---------
UML debug printf shows that we got negative pause here:
ick: pause : -984
ick: pages_dirtied : 0
ick: task_ratelimit: 0
pause:
+ if (pause < 0) {
+ extern int printf(char *, ...);
+ printf("ick : pause : %li\n", pause);
+ printf("ick: pages_dirtied : %lu\n", pages_dirtied);
+ printf("ick: task_ratelimit: %lu\n", task_ratelimit);
+ BUG_ON(1);
+ }
trace_balance_dirty_pages(bdi,
Since pause is bounded by [min_pause, max_pause] where min_pause is also
bounded by max_pause. It's suspected and demonstrated that the
max_pause calculation goes wrong:
ick: pause : -717
ick: min_pause : -177
ick: max_pause : -717
ick: pages_dirtied : 14
ick: task_ratelimit: 0
The problem lies in the two "long = unsigned long" assignments in
bdi_max_pause() which might go negative if the highest bit is 1, and the
min_t(long, ...) check failed to protect it falling under 0. Fix all of
them by using "unsigned long" throughout the function.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Tested-by: Toralf Förster <toralf.foerster@gmx.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Richard Weinberger <richard@nod.at>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Export the interface to be used within modules.
Signed-off-by: Matias Bjorling <m@bjorling.me>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Buffer allocation has a very crude indefinite loop around waking the
flusher threads and performing global NOFS direct reclaim because it can
not handle allocation failures.
The most immediate problem with this is that the allocation may fail due
to a memory cgroup limit, where flushers + direct reclaim might not make
any progress towards resolving the situation at all. Because unlike the
global case, a memory cgroup may not have any cache at all, only
anonymous pages but no swap. This situation will lead to a reclaim
livelock with insane IO from waking the flushers and thrashing unrelated
filesystem cache in a tight loop.
Use __GFP_NOFAIL allocations for buffers for now. This makes sure that
any looping happens in the page allocator, which knows how to
orchestrate kswapd, direct reclaim, and the flushers sensibly. It also
allows memory cgroups to detect allocations that can't handle failure
and will allow them to ultimately bypass the limit if reclaim can not
make progress.
Reported-by: azurIt <azurit@pobox.sk>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 3812c8c8f395 ("mm: memcg: do not trap chargers with full
callstack on OOM") assumed that only a few places that can trigger a
memcg OOM situation do not return VM_FAULT_OOM, like optional page cache
readahead. But there are many more and it's impractical to annotate
them all.
First of all, we don't want to invoke the OOM killer when the failed
allocation is gracefully handled, so defer the actual kill to the end of
the fault handling as well. This simplifies the code quite a bit for
added bonus.
Second, since a failed allocation might not be the abrupt end of the
fault, the memcg OOM handler needs to be re-entrant until the fault
finishes for subsequent allocation attempts. If an allocation is
attempted after the task already OOMed, allow it to bypass the limit so
that it can quickly finish the fault and invoke the OOM killer.
Reported-by: azurIt <azurit@pobox.sk>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The err variable is intended to receive the timer_create() return before
checking it
Signed-off-by: Felipe Pena <felipensp@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|