Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is an assorted set of stragglers into the merge window with
driver updates for qla2xxx, megaraid_sas, storvsc and ufs.
It also includes pulls of the uapi tree (all the remaining SCSI
pieces) and the fcoe tree (updates to fcoe and libfc)"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (81 commits)
[SCSI] ufs: Separate PCI code into glue driver
[SCSI] ufs: Segregate PCI Specific Code
[SCSI] scsi: fix lpfc build when wmb() is defined as mb()
[SCSI] storvsc: Handle dynamic resizing of the device
[SCSI] storvsc: Restructure error handling code on command completion
[SCSI] storvsc: avoid usage of WRITE_SAME
[SCSI] aacraid: suppress two GCC warnings
[SCSI] hpsa: check for dma_mapping_error in hpsa_passthru ioctls
[SCSI] hpsa: reorganize error handling in hpsa_passthru_ioctl
[SCSI] hpsa: check for dma_mapping_error in hpsa_map_sg_chain_block
[SCSI] hpsa: Check for dma_mapping_error for all code paths using fill_cmd
[SCSI] hpsa: Check for dma_mapping_error in hpsa_map_one
[SCSI] dc395x: uninitialized variable in device_alloc()
[SCSI] Fix range check in scsi_host_dif_capable()
[SCSI] storvsc: Initialize the sglist
[SCSI] mpt2sas: Add support for OEM specific controller
[SCSI] ipr: Fix oops while resetting an ipr adapter
[SCSI] fnic: Fnic Trace Utility
[SCSI] fnic: New debug flags and debug log messages
[SCSI] fnic: fnic driver may hit BUG_ON on device reset
...
|
|
Tim found:
WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x80()
Hardware name: S2600CP
sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
smpboot: Booting Node 1, Processors #1
Modules linked in:
Pid: 0, comm: swapper/1 Not tainted 3.9.0-0-generic #1
Call Trace:
set_cpu_sibling_map+0x279/0x449
start_secondary+0x11d/0x1e5
Don Morris reproduced on a HP z620 workstation, and bisected it to
commit e8d195525809 ("acpi, memory-hotplug: parse SRAT before memblock
is ready")
It turns out movable_map has some problems, and it breaks several things
1. numa_init is called several times, NOT just for srat. so those
nodes_clear(numa_nodes_parsed)
memset(&numa_meminfo, 0, sizeof(numa_meminfo))
can not be just removed. Need to consider sequence is: numaq, srat, amd, dummy.
and make fall back path working.
2. simply split acpi_numa_init to early_parse_srat.
a. that early_parse_srat is NOT called for ia64, so you break ia64.
b. for (i = 0; i < MAX_LOCAL_APIC; i++)
set_apicid_to_node(i, NUMA_NO_NODE)
still left in numa_init. So it will just clear result from early_parse_srat.
it should be moved before that....
c. it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
early before override from INITRD is settled.
3. that patch TITLE is total misleading, there is NO x86 in the title,
but it changes critical x86 code. It caused x86 guys did not
pay attention to find the problem early. Those patches really should
be routed via tip/x86/mm.
4. after that commit, following range can not use movable ram:
a. real_mode code.... well..funny, legacy Node0 [0,1M) could be hot-removed?
b. initrd... it will be freed after booting, so it could be on movable...
c. crashkernel for kdump...: looks like we can not put kdump kernel above 4G
anymore.
d. init_mem_mapping: can not put page table high anymore.
e. initmem_init: vmemmap can not be high local node anymore. That is
not good.
If node is hotplugable, the mem related range like page table and
vmemmap could be on the that node without problem and should be on that
node.
We have workaround patch that could fix some problems, but some can not
be fixed.
So just remove that offending commit and related ones including:
f7210e6c4ac7 ("mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to
protect movablecore_map in memblock_overlaps_region().")
01a178a94e8e ("acpi, memory-hotplug: support getting hotplug info from
SRAT")
27168d38fa20 ("acpi, memory-hotplug: extend movablemem_map ranges to
the end of node")
e8d195525809 ("acpi, memory-hotplug: parse SRAT before memblock is
ready")
fb06bc8e5f42 ("page_alloc: bootmem limit with movablecore_map")
42f47e27e761 ("page_alloc: make movablemem_map have higher priority")
6981ec31146c ("page_alloc: introduce zone_movable_limit[] to keep
movable limit for nodes")
34b71f1e04fc ("page_alloc: add movable_memmap kernel parameter")
4d59a75125d5 ("x86: get pg_data_t's memory from other node")
Later we should have patches that will make sure kernel put page table
and vmemmap on local node ram instead of push them down to node0. Also
need to find way to put other kernel used ram to local node ram.
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Reported-by: Don Morris <don.morris@hp.com>
Bisected-by: Don Morris <don.morris@hp.com>
Tested-by: Don Morris <don.morris@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal/compat fixes from Al Viro:
"Fixes for several regressions introduced in the last signal.git pile,
along with fixing bugs in truncate and ftruncate compat (on just about
anything biarch at least one of those two had been done wrong)."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
compat: restore timerfd settime and gettime compat syscalls
[regression] braino in "sparc: convert to ksignal"
fix compat truncate/ftruncate
switch lseek to COMPAT_SYSCALL_DEFINE
lseek() and truncate() on sparc really need sign extension
|
|
If CONFIG_IIO_TRIGGER is defined but CONFIG_IIO_BUFFER is not, the following
build error is seen.
drivers/iio/common/st_sensors/st_sensors_trigger.c:21:5: error:
redefinition of ‘st_sensors_allocate_trigger’
In file included from
drivers/iio/common/st_sensors/st_sensors_trigger.c:18:0:
include/linux/iio/common/st_sensors.h:239:19: note: previous
definition of ‘st_sensors_allocate_trigger’ was here
drivers/iio/common/st_sensors/st_sensors_trigger.c:65:6: error:
redefinition of ‘st_sensors_deallocate_trigger’
In file included from
drivers/iio/common/st_sensors/st_sensors_trigger.c:18:0:
include/linux/iio/common/st_sensors.h:244:20: note: previous
definition of ‘st_sensors_deallocate_trigger’ was here
This occurs because st_sensors_deallocate_trigger is built if CONFIG_IIO_TRIGGER
is defined, but the dummy function is compiled if CONFIG_IIO_BUFFER is defined.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull new ARC architecture from Vineet Gupta:
"Initial ARC Linux port with some fixes on top for 3.9-rc1:
I would like to introduce the Linux port to ARC Processors (from
Synopsys) for 3.9-rc1. The patch-set has been discussed on the public
lists since Nov and has received a fair bit of review, specially from
Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb...
The arch bits are in arch/arc, some asm-generic changes (acked by
Arnd), a minor change to PARISC (acked by Helge).
The series is a touch bigger for a new port for 2 main reasons:
1. It enables a basic kernel in first sub-series and adds
ptrace/kgdb/.. later
2. Some of the fallout of review (DeviceTree support, multi-platform-
image support) were added on top of orig series, primarily to
record the revision history.
This updated pull request additionally contains
- fixes due to our GNU tools catching up with the new syscall/ptrace
ABI
- some (minor) cross-arch Kconfig updates."
* tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits)
ARC: split elf.h into uapi and export it for userspace
ARC: Fixup the current ABI version
ARC: gdbserver using regset interface possibly broken
ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window
ARC: make a copy of flat DT
ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed"
ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled
ARC: Fix pt_orig_r8 access
ARC: [3.9] Fallout of hlist iterator update
ARC: 64bit RTSC timestamp hardware issue
ARC: Don't fiddle with non-existent caches
ARC: Add self to MAINTAINERS
ARC: Provide a default serial.h for uart drivers needing BASE_BAUD
ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux
ARC: [Review] Multi-platform image #8: platform registers SMP callbacks
ARC: [Review] Multi-platform image #7: SMP common code to use callbacks
ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional
ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core
ARC: [Review] Multi-platform image #4: Isolate platform headers
ARC: [Review] Multi-platform image #3: switch to board callback
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Note that this thing does *not* contribute to inode refcount;
it's pinned down by dentry.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add a num_write_bios function to struct target.
If an instance of a target sets this, it will be queried before the
target's mapping function is called on a write bio, and the response
controls the number of copies of the write bio that the target will
receive.
This provides a convenient way for a target to send the same data to
more than one device. The new cache target uses this in writethrough
mode, to send the data both to the cache and the backing device.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
This patch allows the administrator to reduce the rate at which kcopyd
issues I/O.
Each module that uses kcopyd acquires a throttle parameter that can be
set in /sys/module/*/parameters.
We maintain a history of kcopyd usage by each module in the variables
io_period and total_period in struct dm_kcopyd_throttle. The actual
kcopyd activity is calculated as a percentage of time equal to
"(100 * io_period / total_period)". This is compared with the user-defined
throttle percentage threshold and if it is exceeded, we sleep.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
This patch introduces enhanced message support that allows the
device-mapper core to recognise messages that are common to all devices,
and for messages to return data to userspace.
Core messages are processed by the function "message_for_md". If the
device mapper doesn't support the message, it is passed to the target
driver.
If the message returns data, the kernel sets the flag
DM_MESSAGE_OUT_FLAG.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Device-mapper ioctls receive and send data in a buffer supplied
by userspace. The buffer has two parts. The first part contains
a 'struct dm_ioctl' and has a fixed size. The second part depends
on the ioctl and has a variable size.
This patch recognises the specific ioctls that do not use the variable
part of the buffer and skips allocating memory for it.
In particular, when a device is suspended and a resume ioctl is sent,
this now avoid memory allocation completely.
The variable "struct dm_ioctl tmp" is moved from the function
copy_params to its caller ctl_ioctl and renamed to param_kernel.
It is used directly when the ioctl function doesn't need any arguments.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Use 'bio' in the name of variables and functions that deal with
bios rather than 'request' to avoid confusion with the normal
block layer use of 'request'.
No functional changes.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.
When processing a table or status request, the function retrieve_status
calls ti->type->status. If ti->type->status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.
However, targets don't return non-zero values from their status method
on overflow. Most targets returns always zero.
If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.
In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
This is incorrect because retrieve_status doesn't check the error
code, it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.
This patch changes the ti->type->status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Fix kernel-doc warnings in hsi files:
Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'e_handler' description in 'hsi_client'
Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'pclaimed' description in 'hsi_client'
Warning(include/linux/hsi/hsi.h:136): Excess struct/union/enum/typedef member 'nb' description in 'hsi_client'
Warning(drivers/hsi/hsi.c:434): No description found for parameter 'handler'
Warning(drivers/hsi/hsi.c:434): Excess function parameter 'cb' description in 'hsi_register_port_event'
Don't document "private:" fields with kernel-doc notation.
If you want to leave them fully documented, that's OK, but
then don't mark them as "private:".
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Carlos Chinea <carlos.chinea@nokia.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add support for watchdog drivers to initialize/set the timeout field
of the watchdog_device structure. The timeout field is initialised
either with the module timeout parameter value (if valid) or with the
timeout-sec dt property (if valid). If both are invalid the initial
value is unchanged.
Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
|
|
Instead of accessing the function to set the watchdog timer directly,
register a platform driver the platform could register to use this
watchdog driver.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
|
|
This RTC also includes a watchdog timer. Provide an accessor function
for setting the watchdog timeout value which will be picked up by a
watchdog driver. Also register the platform_device for the watchdog here
to get the boot-time dependencies right.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
|
|
FCoE Updates for 3.9
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
UAPI Disintegration 2012-12-19
This is the remaining SCSI part of the UAPI
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
commit df367931 (regulator: core: Provide regmap get/set bypass
operations) introduced regulator_[gs]et_bypass_regmap
However structure documentation for regulator_desc needs an update.
./scripts/kernel-doc include/linux/regulator/driver.h >/dev/null
generates:
Warning(include/linux/regulator/driver.h:233): No description found for parameter 'bypass_reg'
Warning(include/linux/regulator/driver.h:233): No description found for parameter 'bypass_mask'
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
When the system is under memory pressure, ext4_es_srhink() will get
called very often. So optimize returning the number of items in the
file system's extent status cache by keeping a per-filesystem count,
instead of calculating it each time by scanning all of the inodes in
the extent status cache.
Also rename the slab used for the extent status cache to be
"ext4_extent_status" so it's obviousl the slab in question is created
by ext4.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
|
|
git://github.com/markus-oberhumer/linux
Pull LZO compression update from Markus Oberhumer:
"Summary:
========
Update the Linux kernel LZO compression and decompression code to the
current upstream version which features significant performance
improvements on modern machines.
Some *synthetic* benchmarks:
============================
x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:
compression speed decompression speed
LZO-2005 : 150 MB/sec 468 MB/sec
LZO-2012 : 434 MB/sec 1210 MB/sec
i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:
compression speed decompression speed
LZO-2005 : 143 MB/sec 409 MB/sec
LZO-2012 : 372 MB/sec 1121 MB/sec
armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:
compression speed decompression speed
LZO-2005 : 27 MB/sec 84 MB/sec
LZO-2012 : 44 MB/sec 117 MB/sec
**LZO-2013-UA : 47 MB/sec 167 MB/sec
Legend:
LZO-2005 : LZO version in current 3.8 kernel (which is based on
the LZO 2.02 release from 2005)
LZO-2012 : updated LZO version available in linux-next
**LZO-2013-UA : updated LZO version available in linux-next plus experimental
ARM Unaligned Access patch. This needs approval
from some ARM maintainer ist NOT YET INCLUDED."
Andrew Morton <akpm@linux-foundation.org> acks it and says:
"There's a new LZ4 on the block which is even faster than the sped-up
LZO, but various filesystems and things use LZO"
* tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux:
crypto: testmgr - update LZO compression test vectors
lib/lzo: Update LZO compression to current upstream version
lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull EDAC fixes and ghes-edac from Mauro Carvalho Chehab:
"For:
- Some fixes at edac drivers (i7core_edac, sb_edac, i3200_edac);
- error injection support for i5100, when EDAC debug is enabled;
- fix edac when it is loaded builtin (early init for the subsystem);
- a "Firmware First" EDAC driver, allowing ghes to report errors via
EDAC (ghes-edac).
With regards to ghes-edac, this fixes a longstanding BZ at Red Hat
that happens with Nehalem and Sandy Bridge CPUs: when both GHES and
i7core_edac or sb_edac are running, the error reports are
unpredictable, as both BIOS and OS race to access the registers. With
ghes-edac, the EDAC core will refuse to register any other concurrent
memory error driver.
This patchset moves the ghes struct definitions to a separate header
file (include/acpi/ghes.h) and adds 3 hooks at apei/ghes.c to
register/unregister and to report errors via ghes-edac. Those changes
were acked by ghes driver maintainer (Huang)."
* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (30 commits)
i5100_edac: convert to use simple_open()
ghes_edac: fix to use list_for_each_entry_safe() when delete list items
ghes_edac: Fix RAS tracing
ghes_edac: Make it compliant with UEFI spec 2.3.1
ghes_edac: Improve driver's printk messages
ghes_edac: Don't credit the same memory dimm twice
ghes_edac: do a better job of filling EDAC DIMM info
ghes_edac: add support for reporting errors via EDAC
ghes_edac: Register at EDAC core the BIOS report
ghes: add the needed hooks for EDAC error report
ghes: move structures/enum to a header file
edac: add support for error type "Info"
edac: add support for raw error reports
edac: reduce stack pressure by using a pre-allocated buffer
edac: lock module owner to avoid error report conflicts
edac: remove proc_name from mci structure
edac: add a new memory layer type
edac: initialize the core earlier
edac: better report error conditions in debug mode
i5100_edac: Remove two checkpatch warnings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC late OMAP changes from Olof Johansson:
"This branch contains changes for OMAP that came in late during the
release staging, close to when the merge window opened.
It contains, among other things:
- OMAP PM fixes and some patches for audio device integration
- OMAP clock fixes related to common clock conversion
- A set of patches cleaning up WFI entry and blocking.
- A set of fixes and IP block support for PM on TI AM33xx SoCs
(Beaglebone, etc)
- A set of smaller fixes and cleanups around AM33xx restart and
revision detection, as well as removal of some dead code
(CONFIG_32K_TIMER_HZ)"
* tag 'late-omap' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (34 commits)
ARM: omap2: include linux/errno.h in hwmod_reset
ARM: OMAP2+: fix some omap_device_build() calls that aren't compiled by default
ARM: OMAP4: hwmod data: Enable AESS hwmod device
ARM: OMAP4: hwmod data: Update AESS data with memory bank area
ARM: OMAP4+: AESS: enable internal auto-gating during initial setup
ASoC: TI AESS: add autogating-enable function, callable from architecture code
ARM: OMAP2+: hwmod: add enable_preprogram hook
ARM: OMAP4: clock data: Add missing clkdm association for dpll_usb
ARM: OMAP2+: PM: Fix the dt return condition in pm_late_init()
ARM: OMAP2: am33xx-hwmod: Fix "register offset NULL check" bug
ARM: OMAP2+: AM33xx: hwmod: add missing HWMOD_NO_IDLEST flags
ARM: OMAP: AM33xx hwmod: Add parent-child relationship for PWM subsystem
ARM: OMAP: AM33xx hwmod: Corrects PWM subsystem HWMOD entries
ARM: DTS: AM33XX: Add nodes for OCMC RAM and WKUP-M3
ARM: OMAP2+: AM33XX: Update the hardreset API
ARM: OMAP2+: AM33XX: hwmod: Update the WKUP-M3 hwmod with reset status bit
ARM: OMAP2+: AM33XX: hwmod: Fixup cpgmac0 hwmod entry
ARM: OMAP2+: AM33XX: hwmod: Update TPTC0 hwmod with the right flags
ARM: OMAP2+: AM33XX: hwmod: Register OCMC RAM hwmod
ARM: OMAP2+: AM33XX: CM/PRM: Use __ASSEMBLER__ macros in header files
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
"Highlights:
- introduction of Dove thermal sensor driver.
- introduction of Kirkwood thermal sensor driver.
- introduction of intel_powerclamp thermal cooling device driver.
- add interrupt and DT support for rcar thermal driver.
- add thermal emulation support which allows platform thermal driver
to do software/hardware emulation for thermal issues."
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (36 commits)
thermal: rcar: remove __devinitconst
thermal: return an error on failure to register thermal class
Thermal: rename thermal governor Kconfig option to avoid generic naming
thermal: exynos: Use the new thermal trend type for quick cooling action.
Thermal: exynos: Add support for temperature falling interrupt.
Thermal: Dove: Add Themal sensor support for Dove.
thermal: Add support for the thermal sensor on Kirkwood SoCs
thermal: rcar: add Device Tree support
thermal: rcar: remove machine_power_off() from rcar_thermal_notify()
thermal: rcar: add interrupt support
thermal: rcar: add read/write functions for common/priv data
thermal: rcar: multi channel support
thermal: rcar: use mutex lock instead of spin lock
thermal: rcar: enable CPCTL to use hardware TSC deciding
thermal: rcar: use parenthesis on macro
Thermal: fix a build warning when CONFIG_THERMAL_EMULATION cleared
Thermal: fix a wrong comment
thermal: sysfs: Add a new sysfs node emul_temp for thermal emulation
PM: intel_powerclamp: off by one in start_power_clamp()
thermal: exynos: Miscellaneous fixes to support falling threshold interrupt
...
|
|
git://git.linaro.org/people/sumitsemwal/linux-dma-buf
Pull dma-buf framework updates from Sumit Semwal:
"Refcounting implemented for vmap in core dma-buf"
* tag 'tag-for-linus-3.9' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf:
CHROMIUM: dma-buf: restore args on failure of dma_buf_mmap
dma-buf: implement vmap refcounting in the interface logic
|
|
Pull nfsd changes from J Bruce Fields:
"Miscellaneous bugfixes, plus:
- An overhaul of the DRC cache by Jeff Layton. The main effect is
just to make it larger. This decreases the chances of intermittent
errors especially in the UDP case. But we'll need to watch for any
reports of performance regressions.
- Containerized nfsd: with some limitations, we now support
per-container nfs-service, thanks to extensive work from Stanislav
Kinsbursky over the last year."
Some notes about conflicts, since there were *two* non-data semantic
conflicts here:
- idr_remove_all() had been added by a memory leak fix, but has since
become deprecated since idr_destroy() does it for us now.
- xs_local_connect() had been added by this branch to make AF_LOCAL
connections be synchronous, but in the meantime Trond had changed the
calling convention in order to avoid a RCU dereference.
There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.
* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
SUNRPC: make AF_LOCAL connect synchronous
nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
svcrpc: fix rpc server shutdown races
svcrpc: make svc_age_temp_xprts enqueue under sv_lock
lockd: nlmclnt_reclaim(): avoid stack overflow
nfsd: enable NFSv4 state in containers
nfsd: disable usermode helper client tracker in container
nfsd: use proper net while reading "exports" file
nfsd: containerize NFSd filesystem
nfsd: fix comments on nfsd_cache_lookup
SUNRPC: move cache_detail->cache_request callback call to cache_read()
SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
SUNRPC: rework cache upcall logic
SUNRPC: introduce cache_detail->cache_request callback
NFS: simplify and clean cache library
NFS: use SUNRPC cache creation and destruction helper for DNS cache
nfsd4: free_stid can be static
nfsd: keep a checksum of the first 256 bytes of request
sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
sunrpc: fix comment in struct xdr_buf definition
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
"A few groups of patches here. Alex has been hard at work improving
the RBD code, layout groundwork for understanding the new formats and
doing layering. Most of the infrastructure is now in place for the
final bits that will come with the next window.
There are a few changes to the data layout. Jim Schutt's patch fixes
some non-ideal CRUSH behavior, and a set of patches from me updates
the client to speak a newer version of the protocol and implement an
improved hashing strategy across storage nodes (when the server side
supports it too).
A pair of patches from Sam Lang fix the atomicity of open+create
operations. Several patches from Yan, Zheng fix various mds/client
issues that turned up during multi-mds torture tests.
A final set of patches expose file layouts via virtual xattrs, and
allow the policies to be set on directories via xattrs as well
(avoiding the awkward ioctl interface and providing a consistent
interface for both kernel mount and ceph-fuse users)."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (143 commits)
libceph: add support for HASHPSPOOL pool flag
libceph: update osd request/reply encoding
libceph: calculate placement based on the internal data types
ceph: update support for PGID64, PGPOOL3, OSDENC protocol features
ceph: update "ceph_features.h"
libceph: decode into cpu-native ceph_pg type
libceph: rename ceph_pg -> ceph_pg_v1
rbd: pass length, not op for osd completions
rbd: move rbd_osd_trivial_callback()
libceph: use a do..while loop in con_work()
libceph: use a flag to indicate a fault has occurred
libceph: separate non-locked fault handling
libceph: encapsulate connection backoff
libceph: eliminate sparse warnings
ceph: eliminate sparse warnings in fs code
rbd: eliminate sparse warnings
libceph: define connection flag helpers
rbd: normalize dout() calls
rbd: barriers are hard
rbd: ignore zero-length requests
...
|
|
The client will currently try LAYOUTGETs forever if a server is returning
NFS4ERR_LAYOUTTRYLATER or NFS4ERR_RECALLCONFLICT - even if the client no
longer needs the layout (ie process killed, unmounted).
This patch uses the DS timeout value (module parameter 'dataserver_timeo'
via rpc layer) to set an upper limit of how long the client tries LATOUTGETs
in this situation. Once the timeout is reached, IO is redirected to the MDS.
This also changes how the client checks if a layout is on the clp list
to avoid a double list_add.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Returns the configured timeout for the xprt of the rpc client.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux
Pull writeback fixes from Wu Fengguang:
"Two writeback fixes
- fix negative (setpoint - dirty) in 32bit archs
- use down_read_trylock() in writeback_inodes_sb(_nr)_if_idle()"
* tag 'writeback-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
Negative (setpoint-dirty) in bdi_position_ratio()
vfs: re-implement writeback_inodes_sb(_nr)_if_idle() and rename them
|
|
Pull block driver bits from Jens Axboe:
"After the block IO core bits are in, please grab the driver updates
from below as well. It contains:
- Fix ancient regression in dac960. Nobody must be using that
anymore...
- Some good fixes from Guo Ghao for loop, fixing both potential
oopses and deadlocks.
- Improve mtip32xx for NUMA systems, by being a bit more clever in
distributing work.
- Add IBM RamSan 70/80 driver. A second round of fixes for that is
pending, that will come in through for-linus during the 3.9 cycle
as per usual.
- A few xen-blk{back,front} fixes from Konrad and Roger.
- Other minor fixes and improvements."
* 'for-3.9/drivers' of git://git.kernel.dk/linux-block:
loopdev: ignore negative offset when calculate loop device size
loopdev: remove an user triggerable oops
loopdev: move common code into loop_figure_size()
loopdev: update block device size in loop_set_status()
loopdev: fix a deadlock
xen-blkback: use balloon pages for persistent grants
xen-blkfront: drop the use of llist_for_each_entry_safe
xen/blkback: Don't trust the handle from the frontend.
xen-blkback: do not leak mode property
block: IBM RamSan 70/80 driver fixes
rsxx: add slab.h include to dma.c
drivers/block/mtip32xx: add missing GENERIC_HARDIRQS dependency
block: remove new __devinit/exit annotations on ramsam driver
block: IBM RamSan 70/80 device driver
drivers/block/mtip32xx/mtip32xx.c:1726:5: sparse: symbol 'mtip_send_trim' was not declared. Should it be static?
drivers/block/mtip32xx/mtip32xx.c:4029:1: sparse: symbol 'mtip_workq_sdbf0' was not declared. Should it be static?
dac960: return success instead of -ENOTTY
mtip32xx: add trim support
mtip32xx: Add workqueue and NUMA support
block: delete super ancient PC-XT driver for 1980's hardware
|
|
Pull block IO core bits from Jens Axboe:
"Below are the core block IO bits for 3.9. It was delayed a few days
since my workstation kept crashing every 2-8h after pulling it into
current -git, but turns out it is a bug in the new pstate code (divide
by zero, will report separately). In any case, it contains:
- The big cfq/blkcg update from Tejun and and Vivek.
- Additional block and writeback tracepoints from Tejun.
- Improvement of the should sort (based on queues) logic in the plug
flushing.
- _io() variants of the wait_for_completion() interface, using
io_schedule() instead of schedule() to contribute to io wait
properly.
- Various little fixes.
You'll get two trivial merge conflicts, which should be easy enough to
fix up"
Fix up the trivial conflicts due to hlist traversal cleanups (commit
b67bfe0d42ca: "hlist: drop the node parameter from iterators").
* 'for-3.9/core' of git://git.kernel.dk/linux-block: (39 commits)
block: remove redundant check to bd_openers()
block: use i_size_write() in bd_set_size()
cfq: fix lock imbalance with failed allocations
drivers/block/swim3.c: fix null pointer dereference
block: don't select PERCPU_RWSEM
block: account iowait time when waiting for completion of IO request
sched: add wait_for_completion_io[_timeout]
writeback: add more tracepoints
block: add block_{touch|dirty}_buffer tracepoint
buffer: make touch_buffer() an exported function
block: add @req to bio_{front|back}_merge tracepoints
block: add missing block_bio_complete() tracepoint
block: Remove should_sort judgement when flush blk_plug
block,elevator: use new hashtable implementation
cfq-iosched: add hierarchical cfq_group statistics
cfq-iosched: collect stats from dead cfqgs
cfq-iosched: separate out cfqg_stats_reset() from cfq_pd_reset_stats()
blkcg: make blkcg_print_blkgs() grab q locks instead of blkcg lock
block: RCU free request_queue
blkcg: implement blkg_[rw]stat_recursive_sum() and blkg_[rw]stat_merge()
...
|
|
TCP prequeue mechanism purpose is to let incoming packets
being processed by the thread currently blocked in tcp_recvmsg(),
instead of behalf of the softirq handler, to better adapt flow
control on receiver host capacity to schedule the consumer.
But in typical request/answer workloads, we send request, then
block to receive the answer. And before the actual answer, TCP
stack receives the ACK packets acknowledging the request.
Processing pure ACK on behalf of the thread blocked in tcp_recvmsg()
is a waste of resources, as thread has to immediately sleep again
because it got no payload.
This patch avoids the extra context switches and scheduler overhead.
Before patch :
a:~# echo 0 >/proc/sys/net/ipv4/tcp_low_latency
a:~# perf stat ./super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k
231676
Performance counter stats for './super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k':
116251.501765 task-clock # 11.369 CPUs utilized
5,025,463 context-switches # 0.043 M/sec
1,074,511 CPU-migrations # 0.009 M/sec
216,923 page-faults # 0.002 M/sec
311,636,972,396 cycles # 2.681 GHz
260,507,138,069 stalled-cycles-frontend # 83.59% frontend cycles idle
155,590,092,840 stalled-cycles-backend # 49.93% backend cycles idle
100,101,255,411 instructions # 0.32 insns per cycle
# 2.60 stalled cycles per insn
16,535,930,999 branches # 142.243 M/sec
646,483,591 branch-misses # 3.91% of all branches
10.225482774 seconds time elapsed
After patch :
a:~# echo 0 >/proc/sys/net/ipv4/tcp_low_latency
a:~# perf stat ./super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k
233297
Performance counter stats for './super_netperf 300 -t TCP_RR -l 10 -H 7.7.7.84 -- -r 8k,8k':
91084.870855 task-clock # 8.887 CPUs utilized
2,485,916 context-switches # 0.027 M/sec
815,520 CPU-migrations # 0.009 M/sec
216,932 page-faults # 0.002 M/sec
245,195,022,629 cycles # 2.692 GHz
202,635,777,041 stalled-cycles-frontend # 82.64% frontend cycles idle
124,280,372,407 stalled-cycles-backend # 50.69% backend cycles idle
83,457,289,618 instructions # 0.34 insns per cycle
# 2.43 stalled cycles per insn
13,431,472,361 branches # 147.461 M/sec
504,470,665 branch-misses # 3.76% of all branches
10.249594448 seconds time elapsed
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge third patch-bumb from Andrew Morton:
"This wraps me up for -rc1.
- Lots of misc stuff and things which were deferred/missed from
patchbombings 1 & 2.
- ocfs2 things
- lib/scatterlist
- hfsplus
- fatfs
- documentation
- signals
- procfs
- lockdep
- coredump
- seqfile core
- kexec
- Tejun's large IDR tree reworkings
- ipmi
- partitions
- nbd
- random() things
- kfifo
- tools/testing/selftests updates
- Sasha's large and pointless hlist cleanup"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (163 commits)
hlist: drop the node parameter from iterators
kcmp: make it depend on CHECKPOINT_RESTORE
selftests: add a simple doc
tools/testing/selftests/Makefile: rearrange targets
selftests/efivarfs: add create-read test
selftests/efivarfs: add empty file creation test
selftests: add tests for efivarfs
kfifo: fix kfifo_alloc() and kfifo_init()
kfifo: move kfifo.c from kernel/ to lib/
arch Kconfig: centralise CONFIG_ARCH_NO_VIRT_TO_BUS
w1: add support for DS2413 Dual Channel Addressable Switch
memstick: move the dereference below the NULL test
drivers/pps/clients/pps-gpio.c: use devm_kzalloc
Documentation/DMA-API-HOWTO.txt: fix typo
include/linux/eventfd.h: fix incorrect filename is a comment
mtd: mtd_stresstest: use prandom_bytes()
mtd: mtd_subpagetest: convert to use prandom library
mtd: mtd_speedtest: use prandom_bytes
mtd: mtd_pagetest: convert to use prandom library
mtd: mtd_oobtest: convert to use prandom library
...
|
|
The original device tree binding for this driver, from Viresh Kumar
unfortunately conflicted with the generic DMA binding, and did not allow
to completely seperate slave device configuration from the controller.
This is an attempt to replace it with an implementation of the generic
binding, but it is currently completely untested, because I do not have
any hardware with this particular controller.
The patch applies on top of the slave-dma tree, which contains both the base
support for the generic DMA binding, as well as the earlier attempt from
Viresh. Both of these are currently not merged upstream however.
This version incorporates feedback from Viresh Kumar, Andy Shevchenko
and Russell King.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Vinod Koul <vinod.koul@linux.intel.com>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Comment in eventfd.h referred to 'include/asm-generic/fcntl.h'
while the correct path is 'include/uapi/asm-generic/fcntl.h'.
Signed-off-by: Martin Sustrik <sustrik@250bpm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, the NBD device does not accept flush requests from the Linux
block layer. If the NBD server opened the target with neither O_SYNC nor
O_DSYNC, however, the device will be effectively backed by a writeback
cache. Without issuing flushes properly, operation of the NBD device will
not be safe against power losses.
The NBD protocol has support for both a cache flush command and a FUA
command flag; the server will also pass a flag to note its support for
these features. This patch adds support for the cache flush command and
flag. In the kernel, we receive the flags via the NBD_SET_FLAGS ioctl,
and map NBD_FLAG_SEND_FLUSH to the argument of blk_queue_flush. When the
flag is active the block layer will send REQ_FLUSH requests, which we
translate to NBD_CMD_FLUSH commands.
FUA support is not included in this patch because all free software
servers implement it with a full fdatasync; thus it has no advantage over
supporting flush only. Because I [Paolo] cannot really benchmark it in a
realistic scenario, I cannot tell if it is a good idea or not. It is also
not clear if it is valid for an NBD server to support FUA but not flush.
The Linux block layer gives a warning for this combination, the NBD
protocol documentation says nothing about it.
The patch also fixes a small problem in the handling of flags: nbd->flags
must be cleared at the end of NBD_DO_IT, but the driver was not doing
that. The bug manifests itself as follows. Suppose you two different
client/server pairs to start the NBD device. Suppose also that the first
client supports NBD_SET_FLAGS, and the first server sends
NBD_FLAG_SEND_FLUSH; the second pair instead does neither of these two
things. Before this patch, the second invocation of NBD_DO_IT will use a
stale value of nbd->flags, and the second server will issue an error every
time it receives an NBD_CMD_FLUSH command.
This bug is pre-existing, but it becomes much more important after this
patch; flush failures make the device pretty much unusable, unlike
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Acked-by: Paul Clements <Paul.Clements@steeleye.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Given the obvious distinction between kernel and userspace supported
by uapi/, it seems unnecessary to comment on that.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While idr lookup isn't a particularly heavy operation, it still is too
substantial to use in hot paths without worrying about the performance
implications. With recent changes, each idr_layer covers 256 slots
which should be enough to cover most use cases with single idr_layer
making lookup hint very attractive.
This patch adds idr->hint which points to the idr_layer which
allocated an ID most recently and the fast path lookup becomes
if (look up target's prefix matches that of the hinted layer)
return hint->ary[ID's offset in the leaf layer];
which can be inlined.
idr->hint is set to the leaf node on idr_fill_slot() and cleared from
free_layer().
[andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add a field which carries the prefix of ID the idr_layer covers. This
will be used to implement lookup hint.
This patch doesn't make use of the new field and doesn't introduce any
behavior difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With recent preloading changes, idr no longer keeps full layer cache per
each idr instance (used to be ~6.5k per idr on 64bit) and the previous
patch removed restriction on the bitmap size. Both now allow us to have
larger layers.
Increase IDR_BITS to 8 regardless of BITS_PER_LONG. Each layer is
slightly larger than 2k on 64bit and 1k on 32bit and carries 256 entries.
The size isn't too large, especially compared to what we used to waste on
per-idr caches, and 256 entries should be able to serve most use cases
with single layer. The max tree depth is 4 which is much better than the
previous 6 on 64bit and 7 on 32bit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, idr->bitmap is declared as an unsigned long which restricts
the number of bits an idr_layer can contain. All bitops can handle
arbitrary positive integer bit number and there's no reason for this
restriction.
Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
unsigned long.
* idr_layer->bitmap is now an array. '&' dropped from params to
bitops.
* Replaced "== IDR_FULL" tests with bitmap_full() and removed
IDR_FULL.
* Replaced find_next_bit() on ~bitmap with find_next_zero_bit().
* Replaced "bitmap = 0" with bitmap_clear().
This patch doesn't (or at least shouldn't) introduce any behavior
changes.
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
MAX_IDR_MASK is another weirdness in the idr interface. As idr covers
whole positive integer range, it's defined as 0x7fffffff or INT_MAX.
Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
They basically mask off the sign bit and operate on the rest, so if
the caller, by accident, passes in a negative number, the sign bit
will be masked off and the remaining part will be used as if that was
the input, which is worse than crashing.
The constant is visible in idr.h and there are several users in the
kernel.
* drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()
Basically used to test if adap->nr is a negative number which isn't
-1 and returns -EINVAL if so. idr_alloc() already has negative
@start checking (w/ WARN_ON_ONCE), so this can go away.
* drivers/infiniband/core/cm.c:cm_alloc_id()
drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
Used to wrap cyclic @start. Can be replaced with max(next, 0).
Note that this type of cyclic allocation using idr is buggy. These
are prone to spurious -ENOSPC failure after the first wraparound.
* fs/super.c:get_anon_bdev()
The ID allocated from ida is masked off before being tested whether
it's inside valid range. ida allocated ID can never be a negative
number and the masking is unnecessary.
Update idr_*() functions to fail with -EINVAL when negative @id is
specified and update other MAX_IDR_MASK users as described above.
This leaves MAX_IDR_MASK without any user, remove it and relocate
other MAX_IDR_* constants to lib/idr.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Wolfram Sang <wolfram@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current idr interface is very cumbersome.
* For all allocations, two function calls - idr_pre_get() and
idr_get_new*() - should be made.
* idr_pre_get() doesn't guarantee that the following idr_get_new*()
will not fail from memory shortage. If idr_get_new*() returns
-EAGAIN, the caller is expected to retry pre_get and allocation.
* idr_get_new*() can't enforce upper limit. Upper limit can only be
enforced by allocating and then freeing if above limit.
* idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping
around MAX_IDR_FREE idr_layers. The memory consumed per idr is
under two pages but it makes it difficult to make idr_layer larger.
This patch implements the following new set of allocation functions.
* idr_preload[_end]() - Similar to radix preload but doesn't fail.
The first idr_alloc() inside preload section can be treated as if it
were called with @gfp_mask used for idr_preload().
* idr_alloc() - Allocate an ID w/ lower and upper limits. Takes
@gfp_flags and can be used w/o preloading. When used inside
preloaded section, the allocation mask of preloading can be assumed.
If idr_alloc() can be called from a context which allows sufficiently
relaxed @gfp_mask, it can be used by itself. If, for example,
idr_alloc() is called inside spinlock protected region, preloading can
be used like the following.
idr_preload(GFP_KERNEL);
spin_lock(lock);
id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);
spin_unlock(lock);
idr_preload_end();
if (id < 0)
error;
which is much simpler and less error-prone than idr_pre_get and
idr_get_new*() loop.
The new interface uses per-pcu idr_layer buffer and thus the number of
idr's in the system doesn't affect the amount of memory used for
preloading.
idr_layer_alloc() is introduced to handle idr_layer allocations for
both old and new ID allocation paths. This is a bit hairy now but the
new interface is expected to replace the old and the internal
implementation eventually will become simpler.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
exception conditions internally. The return value is later translated
to errno values using _idr_rc_to_errno().
This is confusing. Drop the custom ones and consistently use -EAGAIN
for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
"ran out of ID space".
Due to the weird memory preloading mechanism, [ra]_get_new*() return
-EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
-EAGAIN on those interface functions. They'll eventually be cleaned
up and the translations will go away.
This patch doesn't introduce any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* Move idr_for_each_entry() definition next to other idr related
definitions.
* Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().
This changes the implementation of idr_get_new() but the new
implementation is trivial. This patch doesn't introduce any
functional change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* Tab align fields like a normal person.
* Drop the unnecessary 0 inits from IDR_INIT().
This patch is purely cosmetic.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|