Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch replaces list_for_each_continue_rcu() with
list_for_each_entry_continue_rcu() to save a few lines
of code and allow removing list_for_each_continue_rcu().
Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The can_stop_idle_tick() function complains if a softirq vector is
raised too late in the idle-entry process, presumably in order to
prevent dangling softirq invocations from being delayed across the
full idle period, which might be indefinitely long -- and if softirq
was asserted any later than the call to this function, such a delay
might well happen.
However, RCU needs to be able to use softirq to stop idle entry in
order to be able to drain RCU callbacks from the current CPU, which in
turn enables faster entry into dyntick-idle mode, which in turn reduces
power consumption. Because RCU takes this action at a well-defined
point in the idle-entry path, it is safe for RCU to take this approach.
This commit therefore silences the error message that is sometimes
produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ
to process. The error message will continue to be issued for other
softirq vectors.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The use of raw_local_irq_save() is unnecessary, given that local_irq_save()
really does disable interrupts. Also, it appears to interfere with lockdep.
Therefore, this commit moves to local_irq_save().
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
|
|
The first memory barrier in __call_rcu() is supposed to order any
updates done beforehand by the caller against the actual queuing
of the callback. However, the second memory barrier (which is intended
to order incrementing the queue lengths before queuing the callback)
is also between the caller's updates and the queuing of the callback.
The second memory barrier can therefore serve both purposes.
This commit therefore removes the first memory barrier.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings. This commit fixes this issue by adding a check for
ongoing grace period to avoid these spurious stall warnings.
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The print_other_cpu_stall() function accesses a number of rcu_node
fields without protection from the ->lock. In theory, this is not
a problem because the fields accessed are all integers, but in
practice the compiler can get nasty. Therefore, the commit extends
the existing critical section to cover the entire loop body.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The rcu_print_detail_task_stall_rnp() function invokes
rcu_preempt_blocked_readers_cgp() to verify that there are some preempted
RCU readers blocking the current grace period outside of the protection
of the rcu_node structure's ->lock. This means that the last blocked
reader might exit its RCU read-side critical section and remove itself
from the ->blkd_tasks list before the ->lock is acquired, resulting in
a segmentation fault when the subsequent code attempts to dereference
the now-NULL gp_tasks pointer.
This commit therefore moves the test under the lock. This will not
have measurable effect on lock contention because this code is invoked
only when printing RCU CPU stall warnings, in other words, in the common
case, never.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The increment_cpu_stall_ticks() function listed each RCU flavor
explicitly, with an ifdef to handle preemptible RCU. This commit
therefore applies for_each_rcu_flavor() to save a line of code.
Because this commit switches from a code-based enumeration of the
flavors of RCU to an rcu_state-list-based enumeration, it is no longer
possible to apply __get_cpu_var() to the per-CPU rcu_data structures.
We instead use __this_cpu_var() on the rcu_state structure's ->rda field
that references the corresponding rcu_data structures.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Commit 1217ed1b (rcu: permit rcu_read_unlock() to be called while holding
runqueue locks) made rcu_initiate_boost() restore irq state when releasing
the rcu_node structure's ->lock, but failed to update the header comment
accordingly. This commit therefore brings the header comment up to date.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The rcu_implicit_offline_qs() function implicitly assumed that execution
would progress predictably when interrupts are disabled, which is of course
not guaranteed when running on a hypervisor. Furthermore, this function
is short, and is called from one place only in a short function.
This commit therefore ensures that the timing is checked before
checking the condition, which guarantees correct behavior even given
indefinite delays. It also inlines rcu_implicit_offline_qs() into
rcu_implicit_dynticks_qs().
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The rcu_preempt_offline_tasks() moves all tasks queued on a given leaf
rcu_node structure to the root rcu_node, which is done when the last CPU
corresponding the the leaf rcu_node structure goes offline. Now that
RCU-preempt's synchronize_rcu_expedited() implementation blocks CPU-hotplug
operations during the initialization of each rcu_node structure's
->boost_tasks pointer, rcu_preempt_offline_tasks() can do a better job
of setting the root rcu_node's ->boost_tasks pointer.
The key point is that rcu_preempt_offline_tasks() runs as part of the
CPU-hotplug process, so that a concurrent synchronize_rcu_expedited()
is guaranteed to either have not started on the one hand (in which case
there is no boosting on behalf of the expedited grace period) or to be
completely initialized on the other (in which case, in the absence of
other priority boosting, all ->boost_tasks pointers will be initialized).
Therefore, if rcu_preempt_offline_tasks() finds that the ->boost_tasks
pointer is equal to the ->exp_tasks pointer, it can be sure that it is
correctly placed.
In the case where there was boosting ongoing at the time that the
synchronize_rcu_expedited() function started, different nodes might start
boosting the tasks blocking the expedited grace period at different times.
In this mixed case, the root node will either be boosting tasks for
the expedited grace period already, or it will start as soon as it gets
done boosting for the normal grace period -- but in this latter case,
the root node's tasks needed to be boosted in any case.
This commit therefore adds a check of the ->boost_tasks pointer against
the ->exp_tasks pointer to the list that prevents updating ->boost_tasks.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
There is a need to use RCU from interrupt context, but either before
rcu_irq_enter() is called or after rcu_irq_exit() is called. If the
interrupt occurs from idle, then lockdep-RCU will complain about such
uses, as they appear to be illegal uses of RCU from the idle loop.
In other environments, RCU_NONIDLE() could be used to properly protect
the use of RCU, but RCU_NONIDLE() currently cannot be invoked except
from process context.
This commit therefore modifies RCU_NONIDLE() to permit its use more
globally.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
When rcu_preempt_offline_tasks() clears tasks from a leaf rcu_node
structure, it does not NULL out the structure's ->boost_tasks field.
This commit therefore fixes this issue.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
Because TINY_RCU's idle detection keys directly off of the nesting
level, rather than from a separate variable as in TREE_RCU, the
TINY_RCU dyntick-idle tracing on transition to idle must happen
before the change to the nesting level. This commit therefore makes
this change by passing the desired new value (rather than the old value)
of the nesting level in to rcu_idle_enter_common().
[ paulmck: Add fix for wrong-variable bug spotted by
Michael Wang <wangyun@linux.vnet.ibm.com>. ]
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
There have been some recent bugs that were triggered only when
preemptible RCU's __rcu_read_unlock() was preempted just after setting
->rcu_read_lock_nesting to INT_MIN, which is a low-probability event.
Therefore, reproducing those bugs (to say nothing of gaining confidence
in alleged fixes) was quite difficult. This commit therefore creates
a new debug-only RCU kernel config option that forces a short delay
in __rcu_read_unlock() to increase the probability of those sorts of
bugs occurring.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
Each grace period is supposed to have at least one callback waiting
for that grace period to complete. However, if CONFIG_NO_HZ=n, an
extra callback-free grace period is no big problem -- it will chew up
a tiny bit of CPU time, but it will complete normally. In contrast,
CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
sleep indefinitely, in turn indefinitely delaying completion of the
callback-free grace period. Given that nothing is waiting on this grace
period, this is also not a problem.
That is, unless RCU CPU stall warnings are also enabled, as they are
in recent kernels. In this case, if a CPU wakes up after at least one
minute of inactivity, an RCU CPU stall warning will result. The reason
that no one noticed until quite recently is that most systems have enough
OS noise that they will never remain absolutely idle for a full minute.
But there are some embedded systems with cut-down userspace configurations
that consistently get into this situation.
All this begs the question of exactly how a callback-free grace period
gets started in the first place. This can happen due to the fact that
CPUs do not necessarily agree on which grace period is in progress.
If a CPU still believes that the grace period that just completed is
still ongoing, it will believe that it has callbacks that need to wait for
another grace period, never mind the fact that the grace period that they
were waiting for just completed. This CPU can therefore erroneously
decide to start a new grace period. Note that this can happen in
TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system: Deadlock
considerations mean that the CPU that detected the end of the grace
period is not necessarily officially informed of this fact for some time.
Once this CPU notices that the earlier grace period completed, it will
invoke its callbacks. It then won't have any callbacks left. If no
other CPU has any callbacks, we now have a callback-free grace period.
This commit therefore makes CPUs check more carefully before starting a
new grace period. This new check relies on an array of tail pointers
into each CPU's list of callbacks. If the CPU is up to date on which
grace periods have completed, it checks to see if any callbacks follow
the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
follow the RCU_WAIT_TAIL segment. The reason that this works is that
the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
as soon as the CPU is officially notified that the old grace period
has ended.
This change is to cpu_needs_another_gp(), which is called in a number
of places. The only one that really matters is in rcu_start_gp(), where
the root rcu_node structure's ->lock is held, which prevents any
other CPU from starting or completing a grace period, so that the
comparison that determines whether the CPU is missing the completion
of a grace period is stable.
Reported-by: Becky Bruce <bgillbruce@gmail.com>
Reported-by: Subodh Nijsure <snijsure@grid-net.com>
Reported-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Paul Walmsley <paul@pwsan.com> # OMAP3730, OMAP4430
Cc: stable@vger.kernel.org
|
|
Tracepoints declare a static inline trace_*_rcuidle variant of the trace
function, to support safely generating trace events from the idle loop.
Module code never actually uses that variant of trace functions, because
modules don't run code that needs tracing with RCU idled. However, the
declaration of those otherwise unused functions causes the module to
reference rcu_idle_exit and rcu_idle_enter, which RCU does not export to
modules.
To avoid this, don't generate trace_*_rcuidle functions for tracepoints
declared in module code.
Link: http://lkml.kernel.org/r/20120905062306.GA14756@leaf
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping fixes from Marek Szyprowski:
"Another set of fixes for ARM dma-mapping subsystem.
Commit e9da6e9905e6 replaced custom consistent buffer remapping code
with generic vmalloc areas. It however introduced some regressions
caused by limited support for allocations in atomic context. This
series contains fixes for those regressions.
For some subplatforms the default, pre-allocated pool for atomic
allocations turned out to be too small, so a function for setting its
size has been added.
Another set of patches adds support for atomic allocations to
IOMMU-aware DMA-mapping implementation.
The last part of this pull request contains two fixes for Contiguous
Memory Allocator, which relax too strict requirements."
* 'fixes-for-3.6' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: Kirkwood: increase atomic coherent pool size
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: relax conditions required for enabling Contiguous Memory Allocator
mm: cma: fix alignment requirements for contiguous regions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem updates from Dmitry Torokhov.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - add support for EMR on Cintiq 24HD touch
Input: i8042 - add Gigabyte T1005 series netbooks to noloop table
Input: imx_keypad - reset the hardware before enabling
Input: edt-ft5x06 - fix build error when compiling wthout CONFIG_DEBUG_FS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
"It contains a fix for Eaton Ellipse MAX UPS from Alan Stern,
performance improvement (not processing debug data if noone is
interested), by Henrik Rydberg, and allowing tpkbd-driven devices to
work even with generic driver in a crippled mode, by Andres Freund."
* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
HID: Only dump input if someone is listening
HID: add NOGET quirk for Eaton Ellipse MAX UPS
|
|
c1dcad2d32d0252e8a3023d20311b52a187ecda3 added a new driver configured by
HID_LENOVO_TPKBD but made the hid_have_special_driver entry non-optional which
lead to a recognized but non-working device if the new driver wasn't
configured (which is the correct default).
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Fix proper FLR steps.
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Power management
- PCI/PM: Enable D3/D3cold by default for most devices
- PCI/PM: Keep parent bridge active when probing device
- PCI/PM: Fix config reg access for D3cold and bridge suspending
- PCI/PM: Add ABI document for sysfs file d3cold_allowed
Core
- PCI: Don't print anything while decoding is disabled"
* tag '3.6-pci-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Don't print anything while decoding is disabled
PCI/PM: Add ABI document for sysfs file d3cold_allowed
PCI/PM: Fix config reg access for D3cold and bridge suspending
PCI/PM: Keep parent bridge active when probing device
PCI/PM: Enable D3/D3cold by default for most devices
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Olof Johansson:
"Mostly Renesas and Atmel bugfixes this time, targeting boot and build
problems. A couple of patches for gemini and kirkwood as well. On a
whole nothing very controversial."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: gemini: fix the gemini build
ARM: shmobile: armadillo800eva: enable rw rootfs mount
ARM: Kirkwood: Fix 'SZ_1M' undeclared here for db88f6281-bp-setup.c
ARM: shmobile: mackerel: fixup usb module order
ARM: shmobile: armadillo800eva: fixup: sound card detection order
ARM: shmobile: marzen: fixup smsc911x id for regulator
ARM: at91/feature-removal-schedule: delay at91_mci removal
ARM: mach-shmobile: armadillo800eva: Enable power button as wakeup source
ARM: mach-shmobile: armadillo800eva: Fix GPIO buttons descriptions
ARM: at91/dts: remove partial parameter in at91sam9g25ek.dts
ARM: at91/clock: fix PLLA overclock warning
ARM: at91: fix rtc-at91sam9 irq issue due to sparse irq support
ARM: at91: fix system timer irq issue due to sparse irq support
ARM: shmobile: sh73a0: fixup RELOC_BASE of intca_irq_pins_desc
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"These are two fixes that should go into 3.6. The link-vmlinux.sh one
is obvious.
The other one fixes make firmware_install with certain configurations,
where a file in the toplevel firmware tree gets installed first, and
$(INSTALL_FW_PATH)/$$(dir <file>) results in /lib/firmware/./, which
confuses make 3.82 for some reason."
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
firmware: fix directory creation rule matching with make 3.82
link-vmlinux.sh: Fix stray "echo" in error message
|
|
Trivially triggerable, found by trinity:
kernel BUG at mm/mempolicy.c:2546!
Process trinity-child2 (pid: 23988, threadinfo ffff88010197e000, task ffff88007821a670)
Call Trace:
show_numa_map+0xd5/0x450
show_pid_numa_map+0x13/0x20
traverse+0xf2/0x230
seq_read+0x34b/0x3e0
vfs_read+0xac/0x180
sys_pread64+0xa2/0xc0
system_call_fastpath+0x1a/0x1f
RIP: mpol_to_str+0x156/0x360
Cc: stable@vger.kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we do FLR and save PCI config we did it in the wrong order.
The end result was that if a PCI device was unbind from
its driver, then binded to xen-pciback, and then back to its
driver we would get:
> lspci -s 04:00.0
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
13:42:12 # 4 :~/
> echo "0000:04:00.0" > /sys/bus/pci/drivers/pciback/unbind
> modprobe e1000e
e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
e1000e 0000:04:00.0: Disabling ASPM L0s L1
e1000e 0000:04:00.0: enabling device (0000 -> 0002)
xen: registering gsi 48 triggering 0 polarity 1
Already setup the GSI :48
e1000e 0000:04:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e: probe of 0000:04:00.0 failed with error -2
This fixes it by first saving the PCI configuration space, then
doing the FLR.
Reported-by: Ren, Yongjie <yongjie.ren@intel.com>
Reported-and-Tested-by: Tobias Geiger <tobias.geiger@vido.info>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- a firmware bug on several Samsung MoviNAND eMMC models causes
permanent corruption on the device when secure erase and secure trim
requests are made, so we disable those requests on these eMMC devices.
- atmel-mci: fix a hang with some SD cards by waiting for not-busy flag.
- dw_mmc: low-power mode breaks SDIO interrupts; fix PIO error handling;
fix handling of error interrupts.
- mxs-mmc: fix deadlocks; fix compile error due to dma.h arch change.
- omap: fix broken PIO mode causing memory corruption.
- sdhci-esdhc: fix card detection.
* tag 'mmc-fixes-for-3.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: omap: fix broken PIO mode
mmc: card: Skip secure erase on MoviNAND; causes unrecoverable corruption.
mmc: dw_mmc: Disable low power mode if SDIO interrupts are used
mmc: dw_mmc: fix error handling in PIO mode
mmc: dw_mmc: correct mishandling error interrupt
mmc: dw_mmc: amend using error interrupt status
mmc: atmel-mci: not busy flag has also to be used for read operations
mmc: sdhci-esdhc: break out early if clock is 0
mmc: mxs-mmc: fix deadlock caused by recursion loop
mmc: mxs-mmc: fix deadlock in SDIO IRQ case
mmc: bfin_sdh: fix dma_desc_array build error
|
|
Fix the following compile error on UML.
arch/um/os-Linux/time.c: In function 'deliver_alarm':
arch/um/os-Linux/time.c:117:3: error: too few arguments to function 'alarm_handler'
arch/um/os-Linux/internal.h:1:6: note: declared here
The error was introduced by commit d3c1cfcd ("um: pass siginfo to guest
process") in 3.6-rc1.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Allocate a structure not a pointer to it !
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a few fixes for 3.6 that were piling up while I was away or
busy (I was mostly MIA a week or two before San Diego).
Some fixes from Anton fixing up issues with our relatively new DSCR
control feature, and a few other fixes that are either regressions or
bugs nasty enough to warrant not waiting."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
powerpc/powernv: Always go into nap mode when CPU is offline
powerpc: Give hypervisor decrementer interrupts their own handler
powerpc/vphn: Fix arch_update_cpu_topology() return value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"These are some GPIO regression fixes for v3.6:
- Erroneous debug message from of_get_named_gpio_flags()
- Make sure the MC9S08DZ60 GPIO driver depend on I2C being compiled
in (not module) or allmodconfig breaks.
- Check return value from irq_alloc_descs() in the Emma Mobile GPIO
driver.
- Assign the owner field for the rdc321x driver so the module won't
be removed if it has active GPIOs."
* tag 'gpio-fixes-for-v3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rdc321x: Prevent removal of modules exporting active GPIOs
gpio: em: Fix checking return value of irq_alloc_descs
gpio: mc9s08dz60: Fix build error if I2C=m
gpio: Fix debug message in of_get_named_gpio_flags()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"There are nothing scaring, contains only small fixes for HD-audio and
USB-audio:
- EPSS regression fix and GPIO fix for HD-audio IDT codecs
- A series of USB-audio regression fixes that are found since 3.5
kernel"
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb: fix cross-interface streaming devices
ALSA: snd-usb: fix calls to next_packet_size
ALSA: snd-usb: restore delay information
ALSA: snd-usb: use list_for_each_safe for endpoint resources
ALSA: snd-usb: Fix URB cancellation at stream start
ALSA: hda - Don't trust codec EPSS bit for IDT 92HD83xx & co
ALSA: hda - Avoid unnecessary parameter read for EPSS
ALSA: hda - Do not set GPIOs for speakers on IDT if there are no speakers
|
|
Pull fbdev fixes from Florian Tobias Schandinat:
- a fix by Paul Cercueil to prevent a possible buffer overflow
- a fix by Bruno Prémont to prevent a rare sleep in invalid context
- a fix by Julia Lawall for a double free in auo_k190x
- a fix by Dan Carpenter to prevent a division by zero in mb862xxfb
- a regression fix by Tomi Valkeinen for the SDI output in OMAP
- a fix by Grazvydas Ignotas to fix the console colors in OMAP
* tag 'fbdev-fixes-for-3.6-1' of git://github.com/schandinat/linux-2.6:
OMAPFB: fix framebuffer console colors
OMAPDSS: Fix SDI PLL locking
video: mb862xxfb: prevent divide by zero bug
drivers/video/auo_k190x.c: drop kfree of devm_kzalloc's data
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
fbcon: prevent possible buffer overflow.
|
|
Pull ubi fix from Artem Bityutskiy:
"A single small fix for memory deallocation: we allocated memory using
'kmem_cache_alloc()' but were freeing it using 'kfree()' in some
cases. Now we fix this by using 'kmem_cache_free()' instead."
* tag 'upstream-3.6-rc5' of git://git.infradead.org/linux-ubi:
UBI: fix a horrible memory deallocation bug
|
|
Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
net/socket.c") introduced a bug where the helper functions to take
either a 64-bit or compat time[spec|val] got the arguments in the wrong
order, passing the kernel stack pointer off as a user pointer (and vice
versa).
Because of the user address range check, that in turn then causes an
EFAULT due to the user pointer range checking failing for the kernel
address. Incorrectly resuling in a failed system call for 32-bit
processes with a 64-bit kernel.
On odder architectures like HP-PA (with separate user/kernel address
spaces), it can be used read kernel memory.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When running 32-bit pvops-dom0 and a driver tries to allocate a coherent
DMA-memory the xen swiotlb-implementation returned memory beyond 4GB.
The underlaying reason is that if the supplied driver passes in a
DMA_BIT_MASK(64) ( hwdev->coherent_dma_mask is set to 0xffffffffffffffff)
our dma_mask will be u64 set to 0xffffffffffffffff even if we set it to
DMA_BIT_MASK(32) previously. Meaning we do not reset the upper bits.
By using the dma_alloc_coherent_mask function - it does the proper casting
and we get 0xfffffffff.
This caused not working sound on a system with 4 GB and a 64-bit
compatible sound-card with sets the DMA-mask to 64bit.
On bare-metal and the forward-ported xen-dom0 patches from OpenSuse a coherent
DMA-memory is always allocated inside the 32-bit address-range by calling
dma_alloc_coherent_mask.
This patch adds the same functionality to xen swiotlb and is a rebase of the
original patch from Ronny Hegewald which never got upstream b/c the
underlaying reason was not understood until now.
The original email with the original patch is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-02/msg00038.html
the original thread from where the discussion started is in:
http://old-list-archives.xen.org/archives/html/xen-devel/2010-01/msg00928.html
Signed-off-by: Ronny Hegewald <ronny.hegewald@online.de>
Signed-off-by: Stefano Panella <stefano.panella@citrix.com>
Acked-By: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
|
|
While TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not
be flushed from TLB.
This patch fixed this issue.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
stable/for-linus-3.6
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
bcma: fix invalid PMU chip control masks
[libata] pata_cmd64x: whitespace cleanup
libata-acpi: fix up for acpi_pm_device_sleep_state API
sata_dwc_460ex: device tree may specify dma_channel
ahci, trivial: fixed coding style issues related to braces
ahci_platform: add hibernation callbacks
libata-eh.c: local functions should not be exposed globally
libata-transport.c: local functions should not be exposed globally
sata_dwc_460ex: support hardreset
ata: use module_pci_driver
drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
[libata] Prevent interface errors with Seagate FreeAgent GoFlex
drivers/acpi/glue: revert accidental license-related 6b66d95895c bits
libata-acpi: add missing inlines in libata.h
i2c-omap: Add support for I2C_M_STOP message flag
i2c: Fall back to emulated SMBus if the operation isn't supported natively
i2c: Add SCCB support
i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
...
|
|
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.
Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:
BUG_ON(pfn >= MAX_P2M_PFN);
which we hit and saw this:
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000
(XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000
(XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000661795000 cr2: 0000000000000000
Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
With the four patches applied I can run a combination of all
test cases successfully at the same time:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
This was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_default_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
This issue was found with the following testcase:
http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> # 3.0+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|