Age | Commit message (Collapse) | Author | Files | Lines |
|
time
An interrupt behaves with a burst of activity with periodic interval of time
followed by one or two peaks of longer interval.
As the time intervals are periodic, statistically speaking they follow a normal
distribution and each interrupts can be tracked individually.
Add a mechanism to compute the statistics on all interrupts, except the
timers which are deterministic from a prediction point of view, as their
expiry time is known.
The goal is to extract the periodicity for each interrupt, with the last
timestamp and sum them, so the next event can be predicted to a certain
extent.
Taking the earliest prediction gives the expected wakeup on the system
(assuming a timer won't expire before).
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-2-git-send-email-daniel.lezcano@linaro.org
|
|
The interrupt framework gives a lot of information about each interrupt. It
does not keep track of when those interrupts occur though, which is a
prerequisite for estimating the next interrupt arrival for power management
purposes.
Add a mechanism to record the timestamp for each interrupt occurrences in a
per-CPU circular buffer to help with the prediction of the next occurrence
using a statistical model.
Each CPU can store up to IRQ_TIMINGS_SIZE events <irq, timestamp>, the
current value of IRQ_TIMINGS_SIZE is 32.
Each event is encoded into a single u64, where the high 48 bits are used
for the timestamp and the low 16 bits are for the irq number.
A static key is introduced so when the irq prediction is switched off at
runtime, the overhead is near to zero.
It results in most of the code in internals.h for inline reasons and a very
few in the new file timings.c. The latter will contain more in the next patch
which will provide the statistical model for the next event prediction.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1498227072-5980-1-git-send-email-daniel.lezcano@linaro.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates for v4.13 from Marc Zyngier
- support for the new Marvell wire-to-MSI bridge
- support for the Aspeed I2C irqchip
- Armada XP370 per-cpu interrupt fixes
- GICv3 ITS ACPI NUMA support
- sunxi-nmi cleanup and updates for new platform support
- various GICv3 ITS cleanups and fixes
- some constifying in various places
|
|
The Marvell ICU unit is found in the CP110 block of the Marvell Armada
7K and 8K SoCs. It collects the wired interrupts of the devices located
in the CP110 and turns them into SPI interrupts in the GIC located in
the AP806 side of the SoC, by using a memory transaction.
Until now, the ICU was configured in a static fashion by the firmware,
and Linux was relying on this static configuration. By having Linux
configure the ICU, we are more flexible, and we can allocate dynamically
the GIC SPI interrupts only for devices that are actually in use.
The driver was initially written by Hanna Hawa <hannah@marvell.com>.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
|
|
It did seem like a good idea at the time, but it never really
caught on, and auto-recursive domains remain unused 3 years after
having been introduced.
Oh well, time for a late spring cleanup.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
We can have irq domains that are identified by the same fwnode
(because they are serviced by the same HW), and yet have different
functionnality (because they serve different busses, for example).
This is what we use the bus_token field.
Since we don't use this field when generating the domain name,
all the aliasing domains will get the same name, and the debugfs
file creation fails. Also, bus_token is updated by individual drivers,
and the core code is unaware of that update.
In order to sort this mess, let's introduce a helper that takes care
of updating bus_token, and regenerate the debugfs file.
A separate patch will update all the individual users.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Many interrupt chips allow only a single CPU as interrupt target. The core
code has no knowledge about that. That's unfortunate as it could avoid
trying to readd a newly online CPU to the effective affinity mask.
Add the status flag and the necessary accessors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.352343969@linutronix.de
|
|
If a CPU goes offline, interrupts affine to the CPU are moved away. If the
outgoing CPU is the last CPU in the affinity mask the migration code breaks
the affinity and sets it it all online cpus.
This is a problem for affinity managed interrupts as CPU hotplug is often
used for power management purposes. If the affinity is broken, the
interrupt is not longer affine to the CPUs to which it was allocated.
The affinity spreading allows to lay out multi queue devices in a way that
they are assigned to a single CPU or a group of CPUs. If the last CPU goes
offline, then the queue is not longer used, so the interrupt can be
shutdown gracefully and parked until one of the assigned CPUs comes online
again.
Add a graceful shutdown mechanism into the irq affinity breaking code path,
mark the irq as MANAGED_SHUTDOWN and leave the affinity mask unmodified.
In the online path, scan the active interrupts for managed interrupts and
if the interrupt is functional and the newly online CPU is part of the
affinity mask, restart the interrupt if it is marked MANAGED_SHUTDOWN or if
the interrupts is started up, try to add the CPU back to the effective
affinity mask.
Originally-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619235447.273417334@linutronix.de
|
|
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently and set these interrupts into a managed shutdown
state when the last CPU of the assigned affinity mask goes offline. The
interrupt will be restarted when one of the CPUs in the assigned affinity
mask comes back online.
Add the necessary logic to irq_startup(). If an interrupt is requested and
started up, the code checks whether it is affinity managed and if so, it
checks whether a CPU in the interrupts affinity mask is online. If not, it
puts the interrupt into managed shutdown state.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235447.189851170@linutronix.de
|
|
Affinity managed interrupts should keep their assigned affinity accross CPU
hotplug. To avoid magic hackery in device drivers, the core code shall
manage them transparently. This will set these interrupts into a managed
shutdown state when the last CPU of the assigned affinity mask goes
offline. The interrupt will be restarted when one of the CPUs in the
assigned affinity mask comes back online.
Introduce the necessary state flag and the accessor functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.954523476@linutronix.de
|
|
There is currently no way to evaluate the effective affinity mask of a
given interrupt. Many irq chips allow only a single target CPU or a subset
of CPUs in the affinity mask.
Updating the mask at the time of setting the affinity to the subset would
be counterproductive because information for cpu hotplug about assigned
interrupt affinities gets lost. On CPU hotplug it's also pointless to force
migrate an interrupt, which is not targeted at the CPU effectively. But
currently the information is not available.
Provide a seperate mask to be updated by the irq_chip->irq_set_affinity()
implementations. Implement the read only proc files so the user can see the
effective mask as well w/o trying to deduce it from /proc/interrupts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.247834245@linutronix.de
|
|
Now that x86 uses the generic code, the function declaration and inline
stub can move to the core internal header.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.928156166@linutronix.de
|
|
In order to move x86 to the generic hotplug migration code, add support for
cleaning up move in progress bits.
On architectures which have this x86 specific (mis)feature not enabled,
this is optimized out by the compiler.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235445.525817311@linutronix.de
|
|
If an CPU goes offline, the interrupts are migrated away, but a eventually
pending interrupt move, which has not yet been made effective is kept
pending even if the outgoing CPU is the sole target of the pending affinity
mask. What's worse is, that the pending affinity mask is discarded even if
it would contain a valid subset of the online CPUs.
Implement a helper function which allows to avoid these issues.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.691345468@linutronix.de
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.614913014@linutronix.de
|
|
Debugging (hierarchical) interupt domains is tedious as there is no
information about the hierarchy and no information about states of
interrupts in the various domain levels.
Add a debugfs directory 'irq' and subdirectories 'domains' and 'irqs'.
The domains directory contains the domain files. The content is information
about the domain. If the domain is part of a hierarchy then the parent
domains are printed as well.
# ls /sys/kernel/debug/irq/domains/
default INTEL-IR-2 INTEL-IR-MSI-2 IO-APIC-IR-2 PCI-MSI
DMAR-MSI INTEL-IR-3 INTEL-IR-MSI-3 IO-APIC-IR-3 unknown-1
INTEL-IR-0 INTEL-IR-MSI-0 IO-APIC-IR-0 IO-APIC-IR-4 VECTOR
INTEL-IR-1 INTEL-IR-MSI-1 IO-APIC-IR-1 PCI-HT
# cat /sys/kernel/debug/irq/domains/VECTOR
name: VECTOR
size: 0
mapped: 216
flags: 0x00000041
# cat /sys/kernel/debug/irq/domains/IO-APIC-IR-0
name: IO-APIC-IR-0
size: 24
mapped: 19
flags: 0x00000041
parent: INTEL-IR-3
name: INTEL-IR-3
size: 65536
mapped: 167
flags: 0x00000041
parent: VECTOR
name: VECTOR
size: 0
mapped: 216
flags: 0x00000041
Unfortunately there is no per cpu information about the VECTOR domain (yet).
The irqs directory contains detailed information about mapped interrupts.
# cat /sys/kernel/debug/irq/irqs/3
handler: handle_edge_irq
status: 0x00004000
istate: 0x00000000
ddepth: 1
wdepth: 0
dstate: 0x01018000
IRQD_IRQ_DISABLED
IRQD_SINGLE_TARGET
IRQD_MOVE_PCNTXT
node: 0
affinity: 0-143
effectiv: 0
pending:
domain: IO-APIC-IR-0
hwirq: 0x3
chip: IR-IO-APIC
flags: 0x10
IRQCHIP_SKIP_SET_WAKE
parent:
domain: INTEL-IR-3
hwirq: 0x20000
chip: INTEL-IR
flags: 0x0
parent:
domain: VECTOR
hwirq: 0x3
chip: APIC
flags: 0x0
This was developed to simplify the debugging of the managed affinity
changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.537566163@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add a map counter instead of counting radix tree entries for
diagnosis. That also gives correct information for linear domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235444.459397746@linutronix.de
|
|
In order to provide proper debug interface it's required to have domain
names available when the domain is added. Non fwnode based architectures
like x86 have no way to do so.
It's not possible to use domain ops or host data for this as domain ops
might be the same for several instances, but the names have to be unique.
Extend the irqchip fwnode to allow transporting the domain name. If no node
is supplied, create a 'unknown-N' placeholder.
Warn if an invalid node is supplied and treat it like no node. This happens
e.g. with i2 devices on x86 which hand in an ACPI type node which has no
interface for retrieving the name.
[ Folded a fix from Marc to make DT name parsing work ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235443.588784933@linutronix.de
|
|
Provide a resource managed variant of irq_setup_generic_chip().
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-6-git-send-email-brgl@bgdev.pl
|
|
Provide a resource managed variant of irq_alloc_generic_chip().
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-5-git-send-email-brgl@bgdev.pl
|
|
Most users of irq_alloc_generic_chip() call irq_setup_generic_chip()
too. To simplify the cleanup provide a function that both removes a
generic chip and frees its memory.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-3-git-send-email-brgl@bgdev.pl
|
|
Currently there's no way for users of irq_alloc_generic_chip() to free
the allocated memory other than calling kfree() manually on the
returned pointer. This may lead to errors if the internals of
irq_alloc_generic_chip() ever change. Provide a routine to free the
generic chip.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1496246820-13250-2-git-send-email-brgl@bgdev.pl
|
|
Get upstream changes so pending patches won't conflict.
|
|
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Pull configfs updates from Christoph Hellwig:
"A fix from Nic for a race seen in production (including a stable tag).
And while I'm sending you this I'm also sneaking in a trivial new
helper from Bart so that we don't need inter-tree dependencies for the
next merge window"
* tag 'configfs-for-4.12' of git://git.infradead.org/users/hch/configfs:
configfs: Introduce config_item_get_unless_zero()
configfs: Fix race between create_link and configfs_rmdir
|
|
Pull block layer fix from Jens Axboe:
"Just a single fix this week, fixing a regression introduced in this
release.
When we put the final reference to the queue, we may need to block.
Ensure that we can safely do so. From Bart"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Fix a blk_exit_rl() regression
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fixes from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Check DMI structure length
firmware: dmi: Fix permissions of product_family
firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes
firmware: dmi_scan: Look for SMBIOS 3 entry point first
|
|
Currently they return -1 on error, which will confuse callers if
they try to interpret it as a normal negative error code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
|
|
Pull networking fixes from David Miller:
1) The netlink attribute passed in to dev_set_alias() is not
necessarily NULL terminated, don't use strlcpy() on it. From
Alexander Potapenko.
2) Fix implementation of atomics in arm64 bpf JIT, from Daniel
Borkmann.
3) Correct the release of netdevs and driver private data in certain
circumstances.
4) Sanitize netlink message length properly in decnet, from Mateusz
Jurczyk.
5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From
Yuval Mintz.
6) Hash secret is never initialized in ipv6 ILA translation code, from
Arnd Bergmann. I guess those clang warnings about unused inline
functions are useful for something!
7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.
8) Sanitize sockaddr length before dereferncing any fields in AF_UNIX
and CAIF. From Mateusz Jurczyk.
9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario
Molitor.
10) Do not leak netdev on dev_alloc_name() errors in mac80211, from
Johannes Berg.
11) Fix locking in sctp_for_each_endpoint(), from Xin Long.
12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.
13) Fix use after free in ip_mc_clear_src(), from WANG Cong.
14) Fix regressions caused by ICMP rate limiting changes in 4.11, from
Jesper Dangaard Brouer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
i40e: Fix a sleep-in-atomic bug
net: don't global ICMP rate limit packets originating from loopback
net/act_pedit: fix an error code
net: update undefined ->ndo_change_mtu() comment
net_sched: move tcf_lock down after gen_replace_estimator()
caif: Add sockaddr length check before accessing sa_family in connect handler
qed: fix dump of context data
qmi_wwan: new Telewell and Sierra device IDs
net: phy: Fix MDIO_THUNDER dependencies
netconsole: Remove duplicate "netconsole: " logging prefix
igmp: acquire pmc lock for ip_mc_clear_src()
r8152: give the device version
net: rps: fix uninitialized symbol warning
mac80211: don't send SMPS action frame in AP mode when not needed
mac80211/wpa: use constant time memory comparison for MACs
mac80211: set bss_info data before configuring the channel
mac80211: remove 5/10 MHz rate code from station MLME
mac80211: Fix incorrect condition when checking rx timestamp
mac80211: don't look at the PM bit of BAR frames
i40e: fix handling of HW ATR eviction
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These revert an ACPICA commit from the 4.11 cycle that causes problems
to happen on some systems and add a protection against possible kernel
crashes due to table reference counter imbalance.
Specifics:
- Revert a 4.11 ACPICA change that made assumptions which are not
satisfied on some systems and caused the enumeration of resources
to fail on them (Rafael Wysocki).
- Add a mechanism to prevent tables from being unmapped prematurely
due to reference counter overflows (Lv Zheng)"
* tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- some build dependency issues at CEC core with randconfigs
- fix an off by one error at vb2
- a race fix at cec core
- driver fixes at tc358743, sir_ir and rainshadow-cec
* tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] media/cec.h: use IS_REACHABLE instead of IS_ENABLED
[media] cec: race fix: don't return -ENONET in cec_receive()
[media] sir_ir: infinite loop in interrupt handler
[media] cec-notifier.h: handle unreachable CONFIG_CEC_CORE
[media] cec: improve MEDIA_CEC_RC dependencies
[media] vb2: Fix an off by one error in 'vb2_plane_vaddr'
[media] rainshadow-cec: Fix missing spin_lock_init()
[media] tc358743: fix register i2c_rd/wr function fix
|
|
* acpica-fixes:
ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
|
|
Avoid that the following complaint is reported:
BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
1 lock held by rcuop/3/41:
#0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
Call Trace:
dump_stack+0x86/0xcf
___might_sleep+0x174/0x260
__might_sleep+0x4a/0x80
flush_work+0x7e/0x2e0
__cancel_work_timer+0x143/0x1c0
cancel_work_sync+0x10/0x20
blk_throtl_exit+0x25/0x60
blkcg_exit_queue+0x35/0x40
blk_release_queue+0x42/0x130
kobject_put+0xa9/0x190
This happens since we invoke callbacks that need to block from the
queue release handler. Fix this by pushing the final release to
a workqueue.
Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e5049258 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Update ->ndo_change_mtu() callback comment to remove text
about returning error in case of undefined callback. This
change makes the comment match the existing code behavior.
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Considering this case:
1. A program opens a sysfs table file 65535 times, it can increase
validation_count and first increment cause the table to be mapped:
validation_count = 65535
2. AML execution causes "Load" to be executed on the same
table, this time it cannot increase validation_count, so
validation_count remains:
validation_count = 65535
3. The program closes sysfs table file 65535 times, it can decrease
validation_count and the last decrement cause the table to be
unmapped:
validation_count = 0
4. AML code still accessing the loaded table, kernel crash can be
observed.
To prevent that from happening, add a validation_count threashold.
When it is reached, the validation_count can no longer be
incremented/decremented to invalidate the table descriptor (means
preventing table unmappings)
Note that code added in acpi_tb_put_table() is actually a no-op but
changes the warning message into a "warn once" one. Lv Zheng.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
[hch: minor style tweak]
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull key subsystem fixes from James Morris:
"Here are a bunch of fixes for Linux keyrings, including:
- Fix up the refcount handling now that key structs use the
refcount_t type and the refcount_t ops don't allow a 0->1
transition.
- Fix a potential NULL deref after error in x509_cert_parse().
- Don't put data for the crypto algorithms to use on the stack.
- Fix the handling of a null payload being passed to add_key().
- Fix incorrect cleanup an uninitialised key_preparsed_payload in
key_update().
- Explicit sanitisation of potentially secure data before freeing.
- Fixes for the Diffie-Helman code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
KEYS: fix refcount_inc() on zero
KEYS: Convert KEYCTL_DH_COMPUTE to use the crypto KPP API
crypto : asymmetric_keys : verify_pefile:zero memory content before freeing
KEYS: DH: add __user annotations to keyctl_kdf_params
KEYS: DH: ensure the KDF counter is properly aligned
KEYS: DH: don't feed uninitialized "otherinfo" into KDF
KEYS: DH: forbid using digest_null as the KDF hash
KEYS: sanitize key structs before freeing
KEYS: trusted: sanitize all key material
KEYS: encrypted: sanitize all key material
KEYS: user_defined: sanitize key payloads
KEYS: sanitize add_key() and keyctl() key payloads
KEYS: fix freeing uninitialized memory in key_update()
KEYS: fix dereferencing NULL payload with nonzero length
KEYS: encrypted: use constant-time HMAC comparison
KEYS: encrypted: fix race causing incorrect HMAC calculations
KEYS: encrypted: fix buffer overread in valid_master_desc()
KEYS: encrypted: avoid encrypting/decrypting stack buffers
KEYS: put keyring if install_session_keyring_to_cred() fails
KEYS: Delete an error message for a failed memory allocation in get_derived_key()
...
|
|
Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
static inline functions") just caused more warnings due to re-defining
the 'inline' macro.
So undef it before re-defining it, and also add the 'notrace' attribute
like the gcc version that this is overriding does.
Maybe this makes clang happier.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Fix various bug fixes in ext4 caused by races and memory allocation
failures"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix fdatasync(2) after extent manipulation operations
ext4: fix data corruption for mmap writes
ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO
ext4: fix quota charging for shared xattr blocks
ext4: remove redundant check for encrypted file on dio write path
ext4: remove unused d_name argument from ext4_search_dir() et al.
ext4: fix off-by-one error when writing back pages before dio read
ext4: fix off-by-one on max nr_pages in ext4_find_unwritten_pgoff()
ext4: keep existing extra fields when inode expands
ext4: handle the rest of ext4_mb_load_buddy() ENOMEM errors
ext4: fix off-by-in in loop termination in ext4_find_unwritten_pgoff()
ext4: fix SEEK_HOLE
jbd2: preserve original nofs flag during journal restart
ext4: clear lockdep subtype for quota files on quota off
|
|
Pull KVM fixes from Paolo Bonzini:
"Bug fixes (ARM, s390, x86)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: async_pf: avoid async pf injection when in guest mode
KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation
arm: KVM: Allow unaligned accesses at HYP
arm64: KVM: Allow unaligned accesses at EL2
arm64: KVM: Preserve RES1 bits in SCTLR_EL2
KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages
KVM: nVMX: Fix exception injection
kvm: async_pf: fix rcu_irq_enter() with irqs enabled
KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
KVM: s390: fix ais handling vs cpu model
KVM: arm/arm64: Fix isues with GICv2 on GICv3 migration
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU fixes from Ingo Molnar:
"Fix an SRCU bug affecting KVM IRQ injection"
* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
srcu: Allow use of Classic SRCU from both process and interrupt context
srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- another compile-fix for my header cleanup
- a couple of fixes for the recently merged IOMMU probe deferal code
- fixes for ACPI/IORT code necessary with IOMMU probe deferal
* tag 'iommu-fixes-v4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
arm: dma-mapping: Reset the device's dma_ops
ACPI/IORT: Move the check to get iommu_ops from translated fwspec
ARM: dma-mapping: Don't tear down third-party mappings
ACPI/IORT: Ignore all errors except EPROBE_DEFER
iommu/of: Ignore all errors except EPROBE_DEFER
iommu/of: Fix check for returning EPROBE_DEFER
iommu/dma: Fix function declaration
|
|
Pull block fixes from Jens Axboe:
"A set of fixes in the area of block IO, that should go into the next
-rc release. This contains:
- An OOPS fix from Dmitry, fixing a regression with the bio integrity
code in this series.
- Fix truncation of elevator io context cache name, from Eric
Biggers.
- NVMe pull from Christoph includes FC fixes from James, APST
fixes/tweaks from Kai-Heng, removal fix from Rakesh, and an RDMA
fix from Sagi.
- Two tweaks for the block throttling code. One from Joseph Qi,
fixing an oops from the timer code, and one from Shaohua, improving
the behavior on rotatonal storage.
- Two blk-mq fixes from Ming, fixing corner cases with the direct
issue code.
- Locking fix for bfq cgroups from Paolo"
* 'for-linus' of git://git.kernel.dk/linux-block:
block, bfq: access and cache blkg data only when safe
Fix loop device flush before configure v3
blk-throttle: set default latency baseline for harddisk
blk-throttle: fix NULL pointer dereference in throtl_schedule_pending_timer
nvme: relax APST default max latency to 100ms
nvme: only consider exit latency when choosing useful non-op power states
nvme-fc: fix missing put reference on controller create failure
nvme-fc: on lldd/transport io error, terminate association
nvme-rdma: fast fail incoming requests while we reconnect
nvme-pci: fix multiple ctrl removal scheduling
nvme: fix hang in remove path
elevator: fix truncation of icq_cache_name
blk-mq: fix direct issue
blk-mq: pass correct hctx to blk_mq_try_issue_directly
bio-integrity: Do not allocate integrity context for bio w/o data
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into rcu/urgent
Pull RCU fix from Paul E. McKenney:
" This series enables srcu_read_lock() and srcu_read_unlock() to be used from
interrupt handlers, which fixes a bug in KVM's use of SRCU in delivery
of interrupts to guest OSes. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:
"Having a payload is not required; and the payload can, in fact,
just be a value stored in the struct key itself."
In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it. We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.
This is safe because the key_jar cache does not use a constructor.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These revert one problematic commit related to system sleep and fix
one recent intel_pstate regression.
Specifics:
- Revert a recent commit that attempted to avoid spurious wakeups
from suspend-to-idle via ACPI SCI, but introduced regressions on
some systems (Rafael Wysocki).
We will get back to the problem it tried to address in the next
cycle.
- Fix a possible division by 0 during intel_pstate initialization
due to a missing check (Rafael Wysocki)"
* tag 'pm-4.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()
|
|
* intel_pstate:
cpufreq: intel_pstate: Avoid division by 0 in min_perf_pct_min()
* pm-sleep:
Revert "ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle"
|
|
Each time a new speed is added, the bonding 802.3ad isn't updated. Add a
comment to remind the developer to update this driver.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The first netlink attribute (value 0) must always be defined
as none/unspec.
Because we cannot change an existing UAPI, I add a comment to point the
mistake and avoid to propagate it in a new ovs API in the future.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|