|
The proc interface is not aware of sem_lock(); it calls
ipc_lock_object() directly instead. This means that simple semop()
operations can run in parallel with the proc interface. Right now, this
is harmless, because the implementation doesn't do anything that
requires proper synchronization.
But it is dangerous and should therefore be fixed.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Operations that need access to the whole array must guarantee that there
are no simple operations ongoing. Right now this is achieved by
spin_unlock_wait(sem->lock) on all semaphores.
If complex_count is nonzero, then this spin_unlock_wait() is not
necessary, because it was already performed in the past by the thread
that increased complex_count. Even though sem_perm.lock was dropped
in between, no simple operation could have started, because simple
operations cannot start when complex_count is nonzero.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Reviewed-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The exclusion of complex operations in sem_lock() is insufficient: after
acquiring the per-semaphore lock, a simple op must first check that
sem_perm.lock is not locked and only after that test check
complex_count. The current code does it the other way around - and that
creates a race. Details are below.
The patch is a complete rewrite of sem_lock(), based in part on the code
from Mike Galbraith. It removes all gotos and all loops and thus the
risk of livelocks.
I have tested the patch (together with the next one) on my i3 laptop and
it didn't cause any problems.
The bug is probably also present in 3.10 and 3.11, but for these kernels
it might be simpler just to move the test of sma->complex_count after
the spin_is_locked() test.
Details of the bug:
Assume:
- sma->complex_count = 0.
- Thread 1: semtimedop(complex op that must sleep)
- Thread 2: semtimedop(simple op).
Pseudo-Trace:
Thread 1: sem_lock(): acquire sem_perm.lock
Thread 1: sem_lock(): check for ongoing simple ops
          Nothing ongoing, thread 2 is still before sem_lock().
Thread 1: try_atomic_semop()
    <<< preempted.
Thread 2: sem_lock():
    static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
                               int nsops)
    {
        int locknum;
    again:
        if (nsops == 1 && !sma->complex_count) {
            struct sem *sem = sma->sem_base + sops->sem_num;
            /* Lock just the semaphore we are interested in. */
            spin_lock(&sem->lock);
            /*
             * If sma->complex_count was set while we were spinning,
             * we may need to look at things we did not lock here.
             */
            if (unlikely(sma->complex_count)) {
                spin_unlock(&sem->lock);
                goto lock_array;
            }
    <<<<<<<<<
    <<< complex_count is still 0.
    <<<
    <<< Here it is preempted
    <<<<<<<<<
Thread 1: try_atomic_semop() returns, notices that it must sleep.
Thread 1: increases sma->complex_count.
Thread 1: drops sem_perm.lock
Thread 2:
            /*
             * Another process is holding the global lock on the
             * sem_array; we cannot enter our critical section,
             * but have to wait for the global lock to be released.
             */
            if (unlikely(spin_is_locked(&sma->sem_perm.lock))) {
                spin_unlock(&sem->lock);
                spin_unlock_wait(&sma->sem_perm.lock);
                goto again;
            }
    <<< sem_perm.lock already dropped, thus no "goto again;"
            locknum = sops->sem_num;
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We've been getting warnings about an excessive amount of time spent
allocating pages for migration during memory compaction without
scheduling. isolate_freepages_block() already periodically checks for
contended locks or the need to schedule, but isolate_freepages() never
does.
When a zone is massively long and no suitable targets can be found, this
iteration can be quite expensive without ever doing cond_resched().
Check periodically for the need to reschedule while the compaction free
scanner iterates.
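A minimal userspace sketch of the idea, not the kernel code: a scanner walks a range block by block and offers to yield before each block. maybe_yield() is a hypothetical stand-in for cond_resched() that only counts calls, and BLOCK_PAGES is an assumed block size chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for cond_resched(); it only counts invocations. */
static unsigned long yield_calls;

static void maybe_yield(void)
{
    yield_calls++;
}

#define BLOCK_PAGES 512UL /* assumed block size, for illustration only */

/*
 * Walk page frame numbers from high_pfn down to low_pfn one block at a
 * time, offering to yield before each block is examined.
 * Returns the number of blocks visited.
 */
static unsigned long scan_free_blocks(unsigned long low_pfn,
                                      unsigned long high_pfn)
{
    unsigned long visited = 0;
    unsigned long pfn = high_pfn;

    while (pfn > low_pfn) {
        maybe_yield(); /* periodic reschedule point */
        visited++;
        /* Step down one block without wrapping below low_pfn. */
        pfn = (pfn - low_pfn > BLOCK_PAGES) ? pfn - BLOCK_PAGES : low_pfn;
    }
    return visited;
}
```

The point of the sketch is only that the yield check sits inside the outer loop, so a long scan with no suitable targets still reaches it once per block.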
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A high setting of max_map_count, and a process core-dumping with a large
enough vm_map_count could result in an NT_FILE note not being written,
and the kernel crashing immediately later because it has assumed
otherwise.
Reproduction of the oops-causing bug described here:
https://lkml.org/lkml/2013/8/30/50
The issue originated in commit 2aa362c49c31 ("coredump: extend core dump
note section to contain file names of mapped file") from Oct 4, 2012.
This patch makes that section optional in that case. fill_files_note()
should signal the error, and the info struct in elf_core_dump() is
zero-initialized so that we can check for the optionally written
note.
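The pattern can be sketched in plain C as follows. This is a simplified model, not the binfmt_elf code: the struct layouts, the size cap MAX_FILES_NOTE_SIZE, and the helper notes_size() are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (hypothetical layout). */
struct note {
    const char *name;
    void *data;
    size_t size;
};

struct note_info {
    struct note files; /* optional: stays zeroed if it could not be built */
};

#define MAX_FILES_NOTE_SIZE (4u * 1024u * 1024u) /* assumed cap */

/* Returns 0 on success, -1 if the note cannot be produced. */
static int fill_files_note(struct note *n, size_t needed)
{
    if (needed == 0 || needed > MAX_FILES_NOTE_SIZE)
        return -1; /* signal the error; the caller must not assume a note */
    n->data = calloc(1, needed);
    if (!n->data)
        return -1;
    n->name = "CORE";
    n->size = needed;
    return 0;
}

/* Because info was zero-initialized, data == NULL means "not written". */
static size_t notes_size(const struct note_info *info)
{
    return info->files.data ? info->files.size : 0;
}
```

A caller would zero-initialize struct note_info, ignore a failing fill_files_note(), and rely on the NULL check when sizing and writing the notes.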
[akpm@linux-foundation.org: avoid abusing E2BIG, remove a couple of not-really-needed local variables]
[akpm@linux-foundation.org: fix sparse warning]
Signed-off-by: Dan Aloni <alonid@stratoscale.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Reported-by: Martin MOKREJS <mmokrejs@gmail.com>
Tested-by: Martin MOKREJS <mmokrejs@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit cea27eb2a202 ("mm/memory-hotplug: fix lowmem count
overflow when offline pages").
The bug fixed by commit cea27eb2a202 was later fixed in another way by
commit 3dcc0571cd64 ("mm: correctly update zone->managed_pages"). That
commit enhances memory_hotplug.c to adjust totalhigh_pages when
hot-removing memory; for details please refer to:
http://marc.info/?l=linux-mm&m=136957578620221&w=2
As a result, commit cea27eb2a202 currently causes duplicated decreasing
of totalhigh_pages, thus the revert.
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
|
|
Remove the debugfs entries in iio_device_unregister(). Otherwise the debugfs
entries might still be accessible even though the device used in the debugfs
callback has already been freed.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
|
|
When a packet is passed from mac80211 to the driver with the
IEEE80211_TX_CTL_PS_RESPONSE flag set, it bypasses the normal driver
internal queueing and goes directly to the UAPSD queue.
When that happens, packets that are part of a BlockAck session still
need to be tracked as such inside the driver, otherwise it will create
discrepancies in the receiver BA reorder window, causing traffic stalls.
This only happens in AP mode with powersave-enabled clients.
This patch fixes the regression introduced in the commit
"ath9k: use software queues for un-aggregated data packets"
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
601216e "mwifiex: process RX packets in SDIO IRQ thread directly"
introduced a command timeout issue which can be reproduced easily on
an AM33xx platform using a test application written by Daniel Mack:
https://gist.github.com/zonque/6579314
mwifiex_main_process() is called from both the SDIO handler and
the workqueue. In case an interrupt occurs right after the
int_status check, but before updating the mwifiex_processing flag,
this interrupt gets lost, resulting in a command timeout and
consequently a card reset.
Let main_proc_lock protect both int_status and mwifiex_processing
flag. This fixes the interrupt lost issue.
Cc: <stable@vger.kernel.org> # 3.7+
Reported-by: Sven Neumann <s.neumann@raumfeld.com>
Reported-by: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
Tested-by: Daniel Mack <zonque@gmail.com>
Reviewed-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Paul Stewart <pstew@chromium.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
This reverts commit 9483f40d8d01918b399b4e24d0c1111db0afffeb.
Some devices stop connecting with the above commit, see:
https://bugzilla.kernel.org/show_bug.cgi?id=61621
Since there is no clear benefit to having MSI enabled, just revert the
change to fix the problem.
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
|
|
fq_reset() should drop all packets in the queue, including those of
throttled flows.
This patch moves code from fq_destroy() to fq_reset()
to do the cleaning.
fq_change() must stop calling fq_dequeue() if all remaining
packets are from throttled flows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 8ed781668dd49 ("flow_keys: include thoff into flow_keys for
later usage"), we missed that existing code was using nhoff as a
temporary variable that does not always contain the transport header
offset.
This is not a problem for TCP/UDP because port offset (@poff)
is 0 for these protocols.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the frontend state changes netback now specifies its desired state to
a new function, set_backend_state(), which transitions through any
necessary intermediate states.
This fixes an issue observed with some old Windows frontend drivers where
they failed to transition through the Closing state and netback would not
behave correctly.
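A toy model of "walk through the intermediate states until the desired one is reached" might look like this. The four states and their cyclic order are simplifying assumptions for illustration; the real XenBus state set and netback's transition rules are richer, and the names below are hypothetical.

```c
/* Hypothetical, reduced state set arranged in an assumed cyclic order. */
enum be_state {
    BE_CLOSED,
    BE_INIT_WAIT,
    BE_CONNECTED,
    BE_CLOSING,
    BE_NSTATES
};

struct backend {
    enum be_state state;
    enum be_state trace[8]; /* states entered, recorded for inspection */
    int ntrace;
};

/* The only legal single step from each state in this toy model. */
static enum be_state next_state(enum be_state s)
{
    return (enum be_state)((s + 1) % BE_NSTATES);
}

/*
 * The caller names only the desired state; every intermediate state is
 * entered on the way, so none can be skipped by accident.
 */
static void model_set_backend_state(struct backend *be, enum be_state desired)
{
    while (be->state != desired) {
        be->state = next_state(be->state);
        if (be->ntrace < 8)
            be->trace[be->ntrace++] = be->state;
    }
}
```

With this shape, a request to go from Connected straight to Closed still passes through Closing, which is the behaviour the message describes.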
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Consider the scenario where an IPv6 router is advertising a fixed
preferred_lft of 1800 seconds, while the valid_lft begins at 3600
seconds and counts down in realtime.
A client should reset its preferred_lft to 1800 every time the RA is
received, but a bug is causing Linux to ignore the update.
The core problem is here:
if (prefered_lft != ifp->prefered_lft) {
Note that ifp->prefered_lft is an offset, so it doesn't decrease over
time. Thus, the comparison is always (1800 != 1800), which fails to
trigger an update.
The most direct solution would be to compute a "stored_prefered_lft",
and use that value in the comparison. But I think that trying to filter
out unnecessary updates here is a premature optimization. In order for
the filter to apply, both of these would need to hold:
- The advertised valid_lft and preferred_lft are both declining in
real time.
- No clock skew exists between the router & client.
So in this patch, I've set "update_lft = 1" unconditionally, which
allows the surrounding code to be greatly simplified.
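The failure of the comparison can be reproduced with a small model. This is a sketch under assumptions: struct addr_lft and the helper names are hypothetical, and time is counted in plain seconds rather than jiffies.

```c
#include <stdint.h>

/* Simplified address record: the lifetime is an offset from tstamp. */
struct addr_lft {
    uint32_t prefered_lft; /* seconds, relative to tstamp */
    uint64_t tstamp;       /* time of last update, in seconds */
};

/* Seconds of preferred lifetime actually left at time now. */
static uint32_t remaining_prefered(const struct addr_lft *a, uint64_t now)
{
    uint64_t elapsed = now - a->tstamp;

    return elapsed >= a->prefered_lft
               ? 0
               : (uint32_t)(a->prefered_lft - elapsed);
}

/* The old filter: compares the advertised value with the stored offset. */
static int old_needs_update(const struct addr_lft *a, uint32_t advertised)
{
    return advertised != a->prefered_lft;
}

/* Unconditional update, as in "update_lft = 1". */
static void apply_ra(struct addr_lft *a, uint32_t advertised, uint64_t now)
{
    a->prefered_lft = advertised;
    a->tstamp = now;
}
```

With a stored offset of 1800 and an RA advertising 1800 arriving 600 seconds later, old_needs_update() reports no change even though only 1200 seconds remain; apply_ra() restores the full 1800.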
Signed-off-by: Paul Marks <pmarks@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While sending a packet, skb_cow_head() can change the skb header, which
invalidates the inner_iph pointer into that header. The following patch
avoids using it. Found by code inspection.
This bug was introduced by commit 0e6fbc5b6c6218 (ip_tunnels: extend
iptunnel_xmit()).
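The hazard has a direct userspace analogue: a pointer into a buffer goes stale once the buffer is reallocated. In the sketch below, grow_headroom() is a hypothetical stand-in for skb_cow_head(), and struct pkt is an assumed, much simplified packet; the safe pattern is to copy the needed field out before the call.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct pkt {
    uint8_t *head; /* start of header bytes */
    size_t len;
};

/* Hypothetical analogue of skb_cow_head(): always moves the buffer. */
static int grow_headroom(struct pkt *p, size_t extra)
{
    uint8_t *n = malloc(p->len + extra);

    if (!n)
        return -1;
    memset(n, 0, extra);
    memcpy(n + extra, p->head, p->len);
    free(p->head); /* any pointer into the old buffer is now dangling */
    p->head = n;
    p->len += extra;
    return 0;
}

/* Read the header field first, then reallocate; never the reverse. */
static int xmit(struct pkt *p, size_t extra, uint8_t *tos_out)
{
    uint8_t tos = p->head[1]; /* copy taken before the buffer can move */

    if (grow_headroom(p, extra))
        return -1;
    *tos_out = tos; /* uses the saved copy, not a stale pointer */
    return 0;
}
```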
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cinterion PLXX LTE devices have a 0x0060 product ID, not 0x12d1.
The blacklisting in the serial/option driver does actually use the correct PID,
as per commit 8ff10bdb14a52e3f25d4ce09e0582a8684c1a6db ('USB: Blacklisted
Cinterion's PLxx WWAN Interface').
CC: Hans-Christoph Schemmel <hans-christoph.schemmel@gemalto.com>
CC: Christian Schmiedl <christian.schmiedl@gemalto.com>
CC: Nicolaus Colberg <nicolaus.colberg@gemalto.com>
Signed-off-by: Aleksander Morgado <aleksander@lanedo.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Christian Schmiedl <christian.schmiedl@gemalto.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the IEEE 1588 timer reference clock source is determined by a
hard-coded value in the gianfar_ptp driver. This patch allows the PTP
clock source to be selected through a property in the device tree node.
For instance:
fsl,cksel = <0>;
for using external (TSEC_TMR_CLK input) high precision timer
reference clock.
Other acceptable values:
<1> : eTSEC system clock
<2> : eTSEC1 transmit clock
<3> : RTC clock input
When this attribute isn't used, eTSEC system clock will serve as
IEEE 1588 timer reference clock.
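The selection logic amounts to "use the property if present, otherwise default to the system clock". A sketch under assumptions: the enum names and select_cksel() are hypothetical, a NULL pointer models an absent property, and falling back to the default for out-of-range values is an assumption not stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Values as listed above; the identifiers themselves are hypothetical. */
enum {
    CKSEL_EXTERNAL = 0, /* TSEC_TMR_CLK input */
    CKSEL_SYSTEM = 1,   /* eTSEC system clock */
    CKSEL_TX = 2,       /* eTSEC1 transmit clock */
    CKSEL_RTC = 3       /* RTC clock input */
};

#define DEFAULT_CKSEL CKSEL_SYSTEM

/*
 * prop points at the value of "fsl,cksel", or is NULL if the property
 * is absent from the node.
 */
static uint32_t select_cksel(const uint32_t *prop)
{
    if (prop == NULL)
        return DEFAULT_CKSEL;
    if (*prop > CKSEL_RTC)
        return DEFAULT_CKSEL; /* assumed handling of invalid values */
    return *prop;
}
```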
Signed-off-by: Aida Mynzhasova <aida.mynzhasova@skitlab.ru>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__initdata tag should not be placed between "struct" and "resource"
because it prevents the variable from being placed in the intended
.init.data section. Fix it.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Convert printks to pr_* format. Additionally re-use PREFIX constant instead of
hardcoded strings.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Use of the RCU API makes the vxlan code easier to understand. It also
fixes a bug caused by a missing ACCESS_ONCE() on the sk_user_data
dereference.
In rare cases, without ACCESS_ONCE(), the compiler might optimize away
the local variable vs at the sk_user_data dereference.
The compiler can then treat vs as an alias for sk->sk_user_data,
resulting in multiple sk_user_data dereferences in RCU read context,
each of which could observe a different value.
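The single-load idiom can be sketched in userspace with a volatile cast. ONCE() below mirrors the shape of the classic ACCESS_ONCE() macro but is defined locally; the struct names are hypothetical, and __typeof__ is a GCC/Clang extension.

```c
#include <stddef.h>

/* Force exactly one load of x; the compiler may not re-read it later. */
#define ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct vsock_model {
    int port;
};

struct sock_model {
    void *user_data; /* may be changed concurrently by another context */
};

static int lookup_port(struct sock_model *sk)
{
    /* One load into a local; the NULL check and the use see the same value. */
    struct vsock_model *vs = ONCE(sk->user_data);

    if (!vs)
        return -1;
    return vs->port;
}
```

Without the forced single load, a compiler is free to read sk->user_data once for the NULL check and again for the dereference, which is the double read the message warns about.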
CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Quite a few fixes here, mostly small driver specific ones.
The stand out thing is a fix for errors generating the documentation
from Randy Dunlap, otherwise unless you're using the driver in
question there should be no impact"
* tag 'regulator-v3.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: ti-abb: Fix bias voltage glitch in transition to bypass mode
regulator: wm831x-ldo: Fix max_uV for gp_ldo and aldo linear range settings
regulator: wm8350: correct the max_uV of LDO
regulator: fix fatal kernel-doc error
regulator: palmas: Remove wrong comment for the equation calculating num_voltages
regulator: da9063: Fix PTR_ERR/ERR_PTR mismatch
regulator: palmas: configure enable time for LDOs
regulator: palmas: fix the n_voltages for smps to 122
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull apparmor fixes from James Morris:
"Bugfixes for the Apparmor code for regressions introduced in the 3.12
pull request"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
apparmor: fix suspicious RCU usage warning in policy.c/policy.h
apparmor: Use shash crypto API interface for profile hashes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull assorted vfs fixes from Al Viro:
"A couple of bug fixes + removal of dead code in afs ->d_revalidate()"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
afs: dget_parent() can't return a negative dentry
ocfs2: needs ->d_lock to poke in ->d_parent->d_inode from ->d_revalidate()
sysv: Add forgotten superblock lock init for v7 fs
|
|
Since the patch "cpufreq: cpufreq-cpu0: NULL is a valid regulator", cpu_reg
contains an error value if the regulator is not set, instead of NULL.
Accordingly, fix the remaining check for non-NULL cpu_reg.
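Why a non-NULL test no longer works can be shown with userspace imitations of the kernel's error-pointer helpers. err_ptr() and is_err() below are local re-creations written for this example, and struct regulator is a placeholder type.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Local imitations of ERR_PTR()/IS_ERR(), for illustration only. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct regulator {
    int uV;
};

/* Old check: an error pointer is non-NULL, so it wrongly passes. */
static int have_regulator_old(const struct regulator *r)
{
    return r != NULL;
}

/* New check: only a pointer that is not an error value passes. */
static int have_regulator_new(const struct regulator *r)
{
    return !is_err(r);
}
```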
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
'clk_round_rate' returns a negative error code upon failure. This
will never get detected by unsigned 'newfreq'. Make it signed.
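The effect is easy to demonstrate outside the kernel. In this sketch round_rate_stub() is a hypothetical stand-in for clk_round_rate() that returns -22 on failure; the other function names are likewise invented for the example.

```c
#include <limits.h>

/* Hypothetical stand-in: returns the rate, or a negative error code. */
static long round_rate_stub(long hz)
{
    return hz > 0 ? hz : -22;
}

/* Buggy shape: the negative error turns into a huge "frequency". */
static unsigned long pick_freq_unsigned(long hz)
{
    return (unsigned long)round_rate_stub(hz);
}

/* Fixed shape: a signed variable lets the error be detected. */
static int pick_freq_signed(long hz, long *out)
{
    long newfreq = round_rate_stub(hz);

    if (newfreq < 0)
        return (int)newfreq; /* propagate the error code */
    *out = newfreq;
    return 0;
}
```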
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Use kobject_init_and_add() since we have nothing special to do between
kobject_init() and kobject_add().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
kobject_add() sets the parent pointer, so we don't need to do it
explicitly.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Set the kobject name via kobject_add() instead of using kobject_set_name(),
which is deprecated per Documentation/kobject.txt.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cpu_hotplug_driver_lock() serializes CPU online/offline operations
when ARCH_CPU_PROBE_RELEASE is set. This lock interface is no longer
necessary for the following reasons:
- lock_device_hotplug() now protects CPU online/offline operations,
including the probe & release interfaces enabled by
ARCH_CPU_PROBE_RELEASE. The use of cpu_hotplug_driver_lock() is
redundant.
- cpu_hotplug_driver_lock() is only valid when ARCH_CPU_PROBE_RELEASE
is defined, which is misleading and is only enabled on powerpc.
This patch removes the cpu_hotplug_driver_lock() interface. As
a result, ARCH_CPU_PROBE_RELEASE only enables / disables the cpu
probe & release interface as intended. There is no functional change
in this patch.
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch only introduces indentation cleanups. No functional changes.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This (trivial) patch:
1. Deletes duplicate Kconfig dependency as there is "if IPMI_HANDLER"
around "IPMI_SI".
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This (trivial) patch:
1. Deletes several useless header inclusions.
2. Kernel codes should always include <linux/acpi.h> instead of
<acpi/acpi_bus.h> or <acpi/acpi_drivers.h> where many conditional
declarations are handled.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This (trivial) patch:
1. Changes dynamic mutex initialization to static initialization.
2. Removes one acpi_ipmi_init() variable initialization as it is not
needed.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This (trivial) patch:
1. Deletes a member of the acpi_ipmi_device, smi_data, which is not
actually used.
2. Updates a member of the acpi_ipmi_device, pnp_dev, which is only used
by dev_warn() invocations, so changes it to a struct device.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch adds reference counting for ACPI IPMI transfers to tune the
locking granularity of tx_msg_lock.
This patch also makes the whole acpi_ipmi module's coding style consistent
by using reference counting for all its objects (i.e., acpi_ipmi_device and
acpi_ipmi_msg).
The acpi_ipmi_msg handling is re-designed using reference counting.
1. tx_msg is always unlinked before complete(), so that it is safe to put
complete() outside of tx_msg_lock.
2. tx_msg reference counters are incremented before calling
ipmi_request_settime() and tx_msg_lock protection is added to
ipmi_cancel_tx_msg() so that a complete() can be safely called in
parallel with tx_msg unlinking in failure cases.
3. tx_msg holds a reference to acpi_ipmi_device so that it can be flushed
and freed in the contexts other than acpi_ipmi_space_handler().
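The get/put discipline behind these points can be sketched as below. This is a single-threaded model with a plain integer counter, whereas kernel code would use an atomic reference count; struct tx_msg_model and its freed_flag member are hypothetical and exist only so the release can be observed.

```c
#include <stdlib.h>

struct tx_msg_model {
    int refs;
    int *freed_flag; /* set to 1 when the object is released */
};

/* A new message starts with one reference, owned by its creator. */
static struct tx_msg_model *tx_msg_alloc(int *freed_flag)
{
    struct tx_msg_model *m = malloc(sizeof(*m));

    if (!m)
        return NULL;
    m->refs = 1;
    m->freed_flag = freed_flag;
    return m;
}

/* Take an extra reference before handing the message to another context. */
static struct tx_msg_model *tx_msg_get(struct tx_msg_model *m)
{
    m->refs++;
    return m;
}

/* Drop a reference; the last holder frees the message. */
static void tx_msg_put(struct tx_msg_model *m)
{
    if (--m->refs == 0) {
        if (m->freed_flag)
            *m->freed_flag = 1;
        free(m);
    }
}
```

Because each context holds its own reference, neither the completion path nor the failure path can free the message while the other is still using it.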
The lockdep_chains shows all acpi_ipmi locks are leaf locks after the
tuning:
1. ipmi_lock is always leaf:
irq_context: 0
[ffffffff81a943f8] smi_watchers_mutex
[ffffffffa06eca60] driver_data.ipmi_lock
irq_context: 0
[ffffffff82767b40] &buffer->mutex
[ffffffffa00a6678] s_active#103
[ffffffffa06eca60] driver_data.ipmi_lock
2. without this patch applied, lock used by complete() is held after
holding tx_msg_lock:
irq_context: 0
[ffffffff82767b40] &buffer->mutex
[ffffffffa00a6678] s_active#103
[ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
irq_context: 1
[ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
irq_context: 1
[ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
[ffffffffa06eccf0] &x->wait#25
irq_context: 1
[ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
[ffffffffa06eccf0] &x->wait#25
[ffffffff81e36620] &p->pi_lock
irq_context: 1
[ffffffffa06ecce8] &(&ipmi_device->tx_msg_lock)->rlock
[ffffffffa06eccf0] &x->wait#25
[ffffffff81e36620] &p->pi_lock
[ffffffff81e5d0a8] &rq->lock
3. with this patch applied, tx_msg_lock is always leaf:
irq_context: 0
[ffffffff82767b40] &buffer->mutex
[ffffffffa00a66d8] s_active#107
[ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock
irq_context: 1
[ffffffffa07ecdc8] &(&ipmi_device->tx_msg_lock)->rlock
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
On a real machine it was found that, in its ACPI namespace, the IPMI
OperationRegions (in the ACPI000D - ACPI power meter) are not defined under
the IPMI system interface device (the IPI0001 with KCS type returned from
the _IFT control method):
  Device (PMI0)
  {
      Name (_HID, "ACPI000D") // _HID: Hardware ID
      OperationRegion (SYSI, IPMI, 0x0600, 0x0100)
      Field (SYSI, BufferAcc, Lock, Preserve)
      {
          AccessAs (BufferAcc, 0x01),
          Offset (0x58),
          SCMD, 8,
          GCMD, 8
      }
      OperationRegion (POWR, IPMI, 0x3000, 0x0100)
      Field (POWR, BufferAcc, Lock, Preserve)
      {
          AccessAs (BufferAcc, 0x01),
          Offset (0xB3),
          GPMM, 8
      }
  }
  Device (PCI0)
  {
      Device (ISA)
      {
          Device (NIPM)
          {
              Name (_HID, EisaId ("IPI0001")) // _HID: Hardware ID
              Method (_IFT, 0, NotSerialized) // _IFT: IPMI Interface Type
              {
                  Return (0x01)
              }
          }
      }
  }
Current ACPI_IPMI code registers IPMI operation region handler on a
per-device basis, so for the above namespace the IPMI operation region
handler is registered only under the scope of \_SB.PCI0.ISA.NIPM. Thus
when an IPMI operation region field of \PMI0 is accessed, there are errors
reported on such platform:
ACPI Error: No handlers for Region [IPMI]
ACPI Error: Region IPMI(7) has no handler
The solution is to install an IPMI operation region handler from root node
so that every object that defines IPMI OperationRegion can get an address
space handler registered.
When an IPMI operation region field is accessed, the Network Function
(0x06 for SYSI and 0x30 for POWR) and the Command (SCMD, GCMD, GPMM) are
passed to the operation region handler, there is no system interface
specified by the BIOS. The patch tries to select one system interface by
monitoring the system interface notification. IPMI messages passed from
the ACPI codes are sent to this selected global IPMI system interface.
The ACPI_IPMI will always select the first registered IPMI interface
with an ACPI handle (i.e., defined in the ACPI namespace). It's hard to
determine the selection when there are multiple IPMI system interfaces
defined in the ACPI namespace. According to the IPMI specification:
A BMC device may make available multiple system interfaces, but only one
management controller is allowed to be 'active' BMC that provides BMC
functionality for the system (in case of a 'partitioned' system, there
can be only one active BMC per partition). Only the system interface(s)
for the active BMC allowed to respond to the 'Get Device Id' command.
According to the ipmi_si design:
The ipmi_si registration notifications can only happen after a
successful "Get Device ID" command.
Thus it should be OK for non-partitioned systems to do such selection.
However, we do not have much knowledge on 'partitioned' systems.
References: https://bugzilla.kernel.org/show_bug.cgi?id=46741
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch uses reference counting to fix the race caused by the
unprotected ACPI IPMI user.
There are two rules for using the ipmi_si APIs:
1. In ipmi_si, ipmi_destroy_user() can ensure that no ipmi_recv_msg will
be passed to ipmi_msg_handler(), but ipmi_request_settime() can not
use an invalid ipmi_user_t. This means the ipmi_si users must ensure
that there won't be any local references on ipmi_user_t before invoking
ipmi_destroy_user().
2. In ipmi_si, the smi_gone()/new_smi() callbacks are protected by
smi_watchers_mutex, so their execution is serialized. But as a
new smi can re-use a freed intf_num, it requires that the callback
implementation must not use intf_num as a means of identification, or it
must ensure that all references to the previous smi are dropped before
exiting the smi_gone() callback.
As the acpi_ipmi_device->user_interface check in acpi_ipmi_space_handler()
can happen before setting user_interface to NULL and codes after the check
in acpi_ipmi_space_handler() can happen after user_interface becomes NULL,
the on-going acpi_ipmi_space_handler() still can pass an invalid
acpi_ipmi_device->user_interface to ipmi_request_settime(). Such race
conditions are not allowed by the IPMI layer's API design as a crash will
happen in ipmi_request_settime() if something like that happens.
This patch follows the ipmi_devintf.c design:
1. Invoke ipmi_destroy_user() after the reference count of
acpi_ipmi_device drops to 0. References of acpi_ipmi_device dropping
to 0 also means tx_msg related to this acpi_ipmi_device are all freed.
This matches the IPMI layer's API calling rule on ipmi_destroy_user()
and ipmi_request_settime().
2. ipmi_flush_tx_msg() is performed so that no on-going tx_msg can still be
running in acpi_ipmi_space_handler(). And it is invoked after invoking
__ipmi_dev_kill() where acpi_ipmi_device is deleted from the list with a
"dead" flag set, and the "dead" flag check is also introduced to the
point where a tx_msg is going to be added to the tx_msg_list so that no
new tx_msg can be created after returning from the __ipmi_dev_kill().
3. The waiting code in ipmi_flush_tx_msg() is deleted because it is not
required since this patch ensures no acpi_ipmi reference is still held
for ipmi_user_t before calling ipmi_destroy_user() and
ipmi_destroy_user() can ensure no more ipmi_msg_handler() can happen
after returning from ipmi_destroy_user().
4. The flushing of tx_msg is also moved out of ipmi_lock in this patch.
The forthcoming IPMI operation region handler installation changes also
require acpi_ipmi_device to be handled in this style.
The header comment of the file is also updated due to this design change.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch fixes races caused by timed out ACPI IPMI transfers.
This patch uses the timeout mechanism provided by ipmi_si to avoid the
race in which the msg_done flag is set without any protection while the
message content may be invalid. Thanks to Corey Minyard for the suggestion.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch fixes races caused by unprotected ACPI IPMI transfers.
We can see that the following crashes may occur:
1. There is no tx_msg_lock held for iterating tx_msg_list in
ipmi_flush_tx_msg() while it may be unlinked on failure in
parallel in acpi_ipmi_space_handler() under tx_msg_lock.
2. There is no lock held for freeing tx_msg in acpi_ipmi_space_handler()
while it may be accessed in parallel in ipmi_flush_tx_msg() and
ipmi_msg_handler().
This patch enhances tx_msg_lock to protect all tx_msg accesses to solve
this issue. Then tx_msg_lock is always held around complete() and tx_msg
accesses.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This patch enhances sanity checks on message size to avoid potential buffer
overflow.
The kernel IPMI message size is IPMI_MAX_MSG_LENGTH (272 bytes) while the
IPMI message size defined by the ACPI specification is 64 bytes. The
difference is not handled by the original code. This may cause a crash in
the response handling code.
This patch closes this gap and also combines rx_data/tx_data to use a single
data/len pair since they need not be separate.
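The kind of check involved can be sketched as follows. This is an illustrative model, not the driver code: struct acpi_buf_model, ACPI_MSG_MAX, and copy_response() are hypothetical names, and rejecting an oversized response (rather than truncating it) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ACPI_MSG_MAX 64    /* ACPI-defined IPMI message size */
#define KERNEL_MSG_MAX 272 /* kernel IPMI_MAX_MSG_LENGTH */

/* Hypothetical destination buffer handed back to the ACPI side. */
struct acpi_buf_model {
    uint8_t status;
    uint8_t length;
    uint8_t data[ACPI_MSG_MAX];
};

/*
 * Copy a response of rsp_len bytes into the 64-byte ACPI buffer.
 * Returns 0 on success, -1 if the response does not fit.
 */
static int copy_response(struct acpi_buf_model *dst, const uint8_t *rsp,
                         size_t rsp_len)
{
    if (rsp_len > sizeof(dst->data))
        return -1; /* would overflow: a kernel message can be 272 bytes */
    memcpy(dst->data, rsp, rsp_len);
    dst->length = (uint8_t)rsp_len;
    return 0;
}
```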
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Recent commit 8fd37a4 (PM / hibernate: Create memory bitmaps after
freezing user space) broke the resume part of the user space driven
hibernation (s2disk), because I forgot that the resume utility
loaded the image into memory without freezing user space (it still
freezes tasks after loading the image). This means that during user
space driven resume we need to create the memory bitmaps at the
"device open" time rather than at the "freeze tasks" time, so make
that happen (that's a special case anyway, so it needs to be treated
in a special way).
Reported-and-tested-by: Ronald <ronald645@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 fixes from Hans-Christian Egtvedt.
Fix build warnings and use the Kbuild infrastructure for generic headers
rather than doing it by hand.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
avr32: cast syscall_return to silence compiler warning
avr32: fix clockevents kernel warning
avr32: use Kbuild infrastructure to handle the asm-generic headers
|
|
Pull S+core fixes from Lennox Wu:
"These updates include updating information of maintainers, fix some
trivial errors, and add a necessary function for supporting ipv6"
* tag 'for-linus-20130929' of git://github.com/sctscore/official-linux:
Score: Update the information of Score maintaners
Score: Modify the Makefile of Score, remove -mlong-calls for compiling
Score: Implement the function csum_ipv6_magic
Score: The commit is for compiling successfully
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC Fixes from Vineet Gupta:
- Handle unaligned access in zero delay loops
- spinlock livelock fix for SMP systemC model
- fix 32bit overflow in access_ok
- better setup of clockevents
* tag 'arc-fixes-for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Use clockevents_config_and_register over clockevents_register_device
ARC: Workaround spinlock livelock in SMP SystemC simulation
ARC: Fix 32-bit wrap around in access_ok()
ARC: Handle zero-overhead-loop in unaligned access handler
|
|
Like we are doing on DDR0 we need to cleanly shutdown DDR1 if it is
used before rebooting.
If DDR1 is not initialized, we check it and avoid dereferencing its address.
Even by adding two more instructions, we are able to complete the procedure
within a single cache line.
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Change my email to kernel.org which is easier for me to catch.
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Add more drivers to the maintained list for CSR SiRF SoC machines.
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|