Age | Commit message (Collapse) | Author | Files | Lines |
|
'torture.2015.10.06a' into HEAD
doc.2015.10.06a: Documentation updates.
percpu-rwsem.2015.10.06a: Optimization of per-CPU reader-writer semaphores.
torture.2015.10.06a: Torture-test updates.
|
|
exp.2015.10.07a: Reduce OS jitter of RCU-sched expedited grace periods.
fixes.2015.10.06a: Miscellaneous fixes.
|
|
Now that synchronize_sched_expedited() no longer uses it, there are
no users of try_get_online_cpus() in mainline. This commit therefore
removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
1. Rename __rcu_sync_is_idle() to rcu_sync_lockdep_assert() and
change it to use rcu_lockdep_assert().
2. Change rcu_sync_is_idle() to return rsp->gp_state == GP_IDLE
unconditonally, this way we can remove the same check from
rcu_sync_lockdep_assert() and clearly isolate the debugging
code.
Note: rcu_sync_enter()->wait_event(gp_state == GP_PASSED) needs
another CONFIG_PROVE_RCU check, the same as is done in ->sync(); but
this needs some simple preparations in the core RCU code to avoid the
code duplication.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
Currently down_write/up_write calls synchronize_sched_expedited()
twice, which is evil. Change this code to rely on rcu-sync primitives.
This avoids the _expedited "big hammer", and this can be faster in
the contended case or even in the case when a single thread does
down_write/up_write in a loop.
Of course, a single down_write() will take more time, but otoh it
will be much more friendly to the whole system.
To simplify the review this patch doesn't update the comments, fixed
by the next change.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
This commit allows rcu_sync structures to be safely deallocated,
The trick is to add a new ->wait field to the gp_ops array.
This field is a pointer to the rcu_barrier() function corresponding
to the flavor of RCU in question. This allows a new rcu_sync_dtor()
to wait for any outstanding callbacks before freeing the rcu_sync
structure.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
This commit validates that the caller of rcu_sync_is_idle() holds the
corresponding type of RCU read-side lock, but only in kernels built
with CONFIG_PROVE_RCU=y. This validation is carried out via a new
rcu_sync_ops->held() method that is checked within rcu_sync_is_idle().
Note that although this does add code to the fast path, it only does so
in kernels built with CONFIG_PROVE_RCU=y.
Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
This commit adds the new struct rcu_sync_ops which holds sync/call
methods, and turns the function pointers in rcu_sync_struct into an array
of struct rcu_sync_ops. This simplifies the "init" helpers by collapsing
a switch statement and explicit multiple definitions into a simple
assignment and a helper macro, respectively.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The rcu_sync infrastructure can be thought of as infrastructure to be
used to implement reader-writer primitives having extremely lightweight
readers during times when there are no writers. The first use is in
the percpu_rwsem used by the VFS subsystem.
This infrastructure is functionally equivalent to
struct rcu_sync_struct {
atomic_t counter;
};
/* Check possibility of fast-path read-side operations. */
static inline bool rcu_sync_is_idle(struct rcu_sync_struct *rss)
{
return atomic_read(&rss->counter) == 0;
}
/* Tell readers to use slowpaths. */
static inline void rcu_sync_enter(struct rcu_sync_struct *rss)
{
atomic_inc(&rss->counter);
synchronize_sched();
}
/* Allow readers to once again use fastpaths. */
static inline void rcu_sync_exit(struct rcu_sync_struct *rss)
{
synchronize_sched();
atomic_dec(&rss->counter);
}
The main difference is that it records the state and only calls
synchronize_sched() if required. At least some of the calls to
synchronize_sched() will be optimized away when rcu_sync_enter() and
rcu_sync_exit() are invoked repeatedly in quick succession.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The various RCU list-deletion macros (list_del_rcu(),
hlist_del_init_rcu(), hlist_del_rcu(), hlist_bl_del_init_rcu(),
hlist_bl_del_rcu(), hlist_nulls_del_init_rcu(), and hlist_nulls_del_rcu())
do plain stores into the ->next pointer of the preceding list elemment.
Unfortunately, the compiler is within its rights to (for example) use
byte-at-a-time writes to update the pointer, which would fatally confuse
concurrent readers. This patch therefore adds the needed WRITE_ONCE()
macros.
KernelThreadSanitizer (KTSAN) reported the __hlist_del() issue, which
is a problem when __hlist_del() is invoked by hlist_del_rcu().
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The old rcu_lockdep_assert() was retained to ease handling of incoming
patches, but any use will result in deprecated warnings. However, its
replacement, RCU_LOCKDEP_WARN(), is now upstream. It is therefore
time to remove rcu_lockdep_assert(), which this commit does.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The current list_entry_rcu() implementation copies the pointer to a stack
variable, then invokes rcu_dereference_raw() on it. This results in an
additional store-load pair. Now, most compilers will emit normal store
and load instructions, which might seem to be of negligible overhead,
but this results in a load-hit-store situation that can cause surprisingly
long pipeline stalls, even on modern microprocessors. The problem is
that it takes time for the store to get the store buffer updated, which
can delay the subsequent load, which immediately follows.
This commit therefore switches to the lockless_dereference() primitive,
which does not expect the __rcu annotations (that are anyway not present
in the list_head structure) and which, like rcu_dereference_raw(),
does not check for an enclosing RCU read-side critical section.
Most importantly, it does not copy the pointer, thus avoiding the
load-hit-store overhead.
Signed-off-by: Patrick Marlier <patrick.marlier@gmail.com>
[ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
This commit adds an rcu_pointer_handoff() that is intended to mark
situations where a structure's protection transitions from RCU to some
other mechanism (locking, reference counting, whatever). These markings
should allow external tools to more easily spot bugs involving leaking
pointers out of RCU read-side critical sections.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Currently, __srcu_read_lock() cannot be invoked from restricted
environments because it contains calls to preempt_disable() and
preempt_enable(), both of which can invoke lockdep, which is a bad
idea in some restricted execution modes. This commit therefore moves
the preempt_disable() and preempt_enable() from __srcu_read_lock()
to srcu_read_lock(). It also inserts the preempt_disable() and
preempt_enable() around the call to __srcu_read_lock() in do_exit().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
Because preempt_disable() maps to barrier() for non-debug builds,
it forces the compiler to spill and reload registers. Because Tree
RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these
barrier() instances generate needless extra code for each instance of
rcu_read_lock() and rcu_read_unlock(). This extra code slows down Tree
RCU and bloats Tiny RCU.
This commit therefore removes the preempt_disable() and preempt_enable()
from the non-preemptible implementations of __rcu_read_lock() and
__rcu_read_unlock(), respectively. However, for debug purposes,
preempt_disable() and preempt_enable() are still invoked if
CONFIG_PREEMPT_COUNT=y, because this allows detection of sleeping inside
atomic sections in non-preemptible kernels.
However, Tiny and Tree RCU operates by coalescing all RCU read-side
critical sections on a given CPU that lie between successive quiescent
states. It is therefore necessary to compensate for removing barriers
from __rcu_read_lock() and __rcu_read_unlock() by adding them to a
couple of the RCU functions invoked during quiescent states, namely to
rcu_all_qs() and rcu_note_context_switch(). However, note that the latter
is more paranoia than necessity, at least until link-time optimizations
become more aggressive.
This is based on an earlier patch by Paul E. McKenney, fixing
a bug encountered in kernels built with CONFIG_PREEMPT=n and
CONFIG_PREEMPT_COUNT=y.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
As we now have rcu_callback_t typedefs as the type of rcu callbacks, we
should use it in call_rcu*() and friends as the type of parameters. This
could save us a few lines of code and make it clear which function
requires an rcu callbacks rather than other callbacks as its argument.
Besides, this can also help cscope to generate a better database for
code reading.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
|
The current preemptible-RCU expedited grace-period algorithm invokes
synchronize_sched_expedited() to enqueue all tasks currently running
in a preemptible-RCU read-side critical section, then waits for all the
->blkd_tasks lists to drain. This works, but results in both an IPI and
a double context switch even on CPUs that do not happen to be running
in a preemptible RCU read-side critical section.
This commit implements a new algorithm that causes less OS jitter.
This new algorithm IPIs all online CPUs that are not idle (from an
RCU perspective), but refrains from self-IPIs. If a CPU receiving
this IPI is not in a preemptible RCU read-side critical section (or
is just now exiting one), it pushes quiescence up the rcu_node tree,
otherwise, it sets a flag that will be handled by the upcoming outermost
rcu_read_unlock(), which will then push quiescence up the tree.
The expedited grace period must of course wait on any pre-existing blocked
readers, and newly blocked readers must be queued carefully based on
the state of both the normal and the expedited grace periods. This
new queueing approach also avoids the need to update boost state,
courtesy of the fact that blocked tasks are no longer ever migrated to
the root rcu_node structure.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
Code like this in inline functions confuses some recent versions of gcc:
const int n = const-expr;
whatever_t array[n];
For more details, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67055#c13
This compiler bug results in the following failure after 114b7fd4b (rcu:
Create rcu_sync infrastructure):
In file included from include/linux/rcupdate.h:429:0,
from include/linux/rcu_sync.h:5,
from kernel/rcu/sync.c:1:
include/linux/rcutiny.h: In function 'rcu_barrier_sched':
include/linux/rcutiny.h:55:20: internal compiler error: Segmentation fault
static inline void rcu_barrier_sched(void)
This commit therefore eliminates the constant local variable in favor of
direct use of the expression.
Reported-and-tested-by: Mark Salter <msalter@redhat.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"Included are: a somewhat late devfreq update which however is mostly
fixes and cleanups with one new thing only (the PPMUv2 support on
Exynos5433), an ACPI cpufreq driver fixup and two ACPI core cleanups
related to preprocessor directives.
Specifics:
- Fix a memory allocation size in the devfreq core (Xiaolong Ye).
- Fix a mistake in the exynos-ppmu DT binding (Javier Martinez
Canillas).
- Add support for PPMUv2 ((Platform Performance Monitoring Unit
version 2.0) on the Exynos5433 SoCs (Chanwoo Choi).
- Fix a type casting bug in the Exynos PPMU code (MyungJoo Ham).
- Assorted devfreq code cleanups and optimizations (Javi Merino,
MyungJoo Ham, Viresh Kumar).
- Fix up the ACPI cpufreq driver to use a more lightweight way to get
to its private data in the ->get() callback (Rafael J Wysocki).
- Fix a CONFIG_ prefix bug in one of the ACPI drivers and make the
ACPI subsystem use IS_ENABLED() instead of #ifdefs in function
bodies (Sudeep Holla)"
* tag 'pm+acpi-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
ACPI: int340x_thermal: add missing CONFIG_ prefix
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"The new hfi1 driver in staging/rdma has had a number of fixup patches
since being added to the tree. This is the first batch of those fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hfi: Properly set permissions for user device files
IB/hfi1: mask vs shift confusion
IB/hfi1: clean up some defines
IB/hfi1: info leak in get_ctxt_info()
IB/hfi1: fix a locking bug
IB/hfi1: checking for NULL instead of IS_ERR
IB/hfi1: fix sdma_descq_cnt parameter parsing
IB/hfi1: fix copy_to/from_user() error handling
IB/hfi1: fix pstateinfo from returning improperly byteswapped value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
- a boot regression (since v4.2) fix for some ARM configurations from
Tyler
- regression (since v4.1) fixes for mkfs.xfs on a DAX enabled device
from Jeff. These are tagged for -stable.
- a pair of locking fixes from Axel that are hidden from lockdep since
they involve device_lock(). The "btt" one is tagged for -stable, the
other only applies to the new "pfn" mechanism in v4.3.
- a fix for the pmem ->rw_page() path to use wmb_pmem() from Ross.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
mm: fix type cast in __pfn_to_phys()
pmem: add proper fencing to pmem_rw_page()
libnvdimm: pfn_devs: Fix locking in namespace_store
libnvdimm: btt_devs: Fix locking in namespace_store
blockdev: don't set S_DAX for misaligned partitions
dax: fix O_DIRECT I/O to the last block of a blockdev
|
|
Pull block updates from Jens Axboe:
"This is a bit bigger than it should be, but I could (did) not want to
send it off last week due to both wanting extra testing, and expecting
a fix for the bounce regression as well. In any case, this contains:
- Fix for the blk-merge.c compilation warning on gcc 5.x from me.
- A set of back/front SG gap merge fixes, from me and from Sagi.
This ensures that we honor SG gapping for integrity payloads as
well.
- Two small fixes for null_blk from Matias, fixing a leak and a
capacity propagation issue.
- A blkcg fix from Tejun, fixing a NULL dereference.
- A fast clone optimization from Ming, fixing a performance
regression since the arbitrarily sized bio's were introduced.
- Also from Ming, a regression fix for bouncing IOs"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: fix bounce_end_io
block: blk-merge: fast-clone bio when splitting rw bios
block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg
block: Copy a user iovec if it includes gaps
block: Refuse adding appending a gapped integrity page to a bio
block: Refuse request/bio merges with gaps in the integrity payload
block: Check for gaps on front and back merges
null_blk: fix wrong capacity when bs is not 512 bytes
null_blk: fix memory leak on cleanup
block: fix bogus compiler warnings in blk-merge.c
|
|
The various definitions of __pfn_to_phys() have been consolidated to
use a generic macro in include/asm-generic/memory_model.h. This hit
mainline in the form of 012dcef3f058 "mm: move __phys_to_pfn and
__pfn_to_phys to asm/generic/memory_model.h". When the generic macro
was implemented the type cast to phys_addr_t was dropped which caused
boot regressions on ARM platforms with more than 4GB of memory and
LPAE enabled.
It was suggested to use PFN_PHYS() defined in include/linux/pfn.h
as provides the correct logic and avoids further duplication.
Reported-by: kernelci.org bot <bot@kernelci.org>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
* acpi-bus:
ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED()
ACPI: int340x_thermal: add missing CONFIG_ prefix
|
|
* pm-cpufreq:
cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get()
* pm-devfreq:
PM / devfreq: Fix incorrect type issue.
PM / devfreq: tegra: Update governor to use devfreq_update_stats()
PM / devfreq: comments for get_dev_status usage updated
PM / devfreq: drop comment about thermal setting max_freq
PM / devfreq: cache the last call to get_dev_status()
PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL)
PM / devfreq: exynos-ppmu: bit-wise operation bugfix.
PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2
PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433
PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
|
|
Pull KVM fixes from Paolo Bonzini:
"Mostly stable material, a lot of ARM fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
sched: access local runqueue directly in single_task_running
arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'
arm64: KVM: Remove all traces of the ThumbEE registers
arm: KVM: Disable virtual timer even if the guest is not using it
arm64: KVM: Disable virtual timer even if the guest is not using it
arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources
KVM: s390: Replace incorrect atomic_or with atomic_andnot
arm: KVM: Fix incorrect device to IPA mapping
arm64: KVM: Fix user access for debug registers
KVM: vmx: fix VPID is 0000H in non-root operation
KVM: add halt_attempted_poll to VCPU stats
kvm: fix zero length mmio searching
kvm: fix double free for fast mmio eventfd
kvm: factor out core eventfd assign/deassign logic
kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd
KVM: make the declaration of functions within 80 characters
KVM: arm64: add workaround for Cortex-A57 erratum #852523
KVM: fix polling for guest halt continued even if disable it
arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores
arm64: KVM: set {v,}TCR_EL2 RES1 bits
...
|
|
Byteswap link_width_downgrade_*_active values before sending on the wire. In
addition properly define the Port State Info structure.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Christian Gomez <christian.gomez@intel.com>
Signed-off-by: Rimmer, Todd <todd.rimmer@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"This is a rather large update post rc1 due to the final steps of
cleanups and API changes which had to wait for the preparatory patches
to hit your tree.
- Regression fixes for ARM GIC irqchips
- Regression fixes and lockdep anotations for renesas irq chips
- The leftovers of the cleanup and preparatory patches which have
been ignored by maintainers
- Final conversions of the newly merged users of obsolete APIs
- Final removal of obsolete APIs
- Final removal of ARM artifacts which had been introduced during the
conversion of ARM to the generic interrupt code.
- Final split of the irq_data into chip specific and common data to
reflect the needs of hierarchical irq domains.
- Treewide removal of the first argument of interrupt flow handlers,
i.e. the irq number, which is not used by the majority of handlers
and simple to retrieve from the other argument the irq descriptor.
- A few comment updates and build warning fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
arm64: Remove ununsed set_irq_flags
ARM: Remove ununsed set_irq_flags
sh: Kill off set_irq_flags usage
irqchip: Kill off set_irq_flags usage
gpu/drm: Kill off set_irq_flags usage
genirq: Remove irq argument from irq flow handlers
genirq: Move field 'msi_desc' from irq_data into irq_common_data
genirq: Move field 'affinity' from irq_data into irq_common_data
genirq: Move field 'handler_data' from irq_data into irq_common_data
genirq: Move field 'node' from irq_data into irq_common_data
irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag
irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag
genirq: Provide IRQD_FORWARDED_TO_VCPU status flag
genirq: Simplify irq_data_to_desc()
genirq: Remove __irq_set_handler_locked()
pinctrl/pistachio: Use irq_set_handler_locked
gpio: vf610: Use irq_set_handler_locked
powerpc/mpc8xx: Use irq_set_handler_locked()
powerpc/ipic: Use irq_set_handler_locked()
powerpc/cpm2: Use irq_set_handler_locked()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"These are both fixes to the new and improved keepalive2 behavior"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: advertise support for keepalive2
libceph: don't access invalid memory in keepalive2 path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"A fix for an abs()/abs64() bug that caused too slow NTP convergence on
32-bit kernels, plus a removal of an obsolete clockevents driver
facility after all users got converted during the merge window"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clockevents: Remove unused set_mode() callback
time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"A migrate_tasks() locking fix, and a late-coming nohz change plus a
nohz debug check"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: 'Annotate' migrate_tasks()
nohz: Assert existing housekeepers when nohz full enabled
nohz: Affine unpinned timers to housekeepers
|
|
We are the client, but advertise keepalive2 anyway - for consistency,
if nothing else. In the future the server might want to know whether
its clients support keepalive2.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
|
|
This
struct ceph_timespec ceph_ts;
...
con_out_kvec_add(con, sizeof(ceph_ts), &ceph_ts);
wraps ceph_ts into a kvec and adds it to con->out_kvec array, yet
ceph_ts becomes invalid on return from prepare_write_keepalive(). As
a result, we send out bogus keepalive2 stamps. Fix this by encoding
into a ceph_timespec member, similar to how acks are read and written.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"Spinlock performance regression fix, plus documentation fixes"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_keys: Fix up the static keys documentation
locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support
locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
locking/static_keys: Fix a silly typo
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
Second set of KVM/ARM changes for 4.3-rc2
- Workaround for a Cortex-A57 erratum
- Bug fix for the debugging infrastructure
- Fix for 32bit guests with more than 4GB of address space
on a 32bit host
- A number of fixes for the (unusual) case when we don't use
the in-kernel GIC emulation
- Removal of ThumbEE handling on arm64, since these have been
dropped from the architecture before anyone actually ever
built a CPU
- Remove the KVM_ARM_MAX_VCPUS limitation which has become
fairly pointless
|
|
This patch removes config option of KVM_ARM_MAX_VCPUS,
and like other ARCHs, just choose the maximum allowed
value from hardware, and follows the reasons:
1) from distribution view, the option has to be
defined as the max allowed value because it need to
meet all kinds of virtulization applications and
need to support most of SoCs;
2) using a bigger value doesn't introduce extra memory
consumption, and the help text in Kconfig isn't accurate
because kvm_vpu structure isn't allocated until request
of creating VCPU is sent from QEMU;
3) the main effect is that the field of vcpus[] in 'struct kvm'
becomes a bit bigger(sizeof(void *) per vcpu) and need more cache
lines to hold the structure, but 'struct kvm' is one generic struct,
and it has worked well on other ARCHs already in this way. Also,
the world switch frequecy is often low, for example, it is ~2000
when running kernel building load in VM from APM xgene KVM host,
so the effect is very small, and the difference can't be observed
in my test at all.
Cc: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.
Remove the argument.
Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
|
|
MSI descriptors are per-irq instead of per irqchip, so move it into
struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-35-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Irq affinity mask is per-irq instead of per irqchip, so move it into
struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433303281-27688-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Handler data (handler_data) is per-irq instead of per irqchip, so move
it into struct irq_common_data.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1433145945-789-13-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
NUMA node information is per-irq instead of per-irqchip, so move it into
struct irq_common_data. Also use CONFIG_NUMA to guard irq_common_data.node.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1433145945-789-8-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Provide a irq data flag to mark an irq forwarded to a VCPU along with
the accessor functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
Avoid the lookup of irq_desc and use the same mechanism for
hierarchical and flat irqdomains.
Based-on-a-patch-from: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
All users converted to irq_set_handler_locked()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
All users converted to irq_set_chip_handler_name_locked()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
cpufreq_cpu_get() called by get_cur_freq_on_cpu() is overkill,
because the ->get() callback is always invoked in a context in
which all of the conditions checked by cpufreq_cpu_get() are
guaranteed to be satisfied.
Use cpufreq_cpu_get_raw() instead of it and drop the
corresponding cpufreq_cpu_put() from get_cur_freq_on_cpu().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
__do_IRQ() was removed by commit 1c77ff2 "genirq: Remove __do_IRQ",
but the comment referring to __do_IRQ() was left.
Update the comment for generic_handle_irq_desc().
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Cc: jiang.liu@linux.intel.com
Cc: peterz@infradead.org
Cc: rafael.j.wysocki@intel.com
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Link: http://lkml.kernel.org/r/1441074950-3893-1-git-send-email-shijie.huang@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Fix a few small mistakes in the static key documentation and
delete an unneeded sentence.
Suggested-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150914171105.511e1e21@lwn.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
This commit removes all CONFIG_.*{,_MODULE} in ACPI code, replacing it
with IS_ENABLED().
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|