With the routing cache removal we lost the "noref" code paths on
input, and this can kill some routing workloads.
Reinstate the noref path when we hit a cached route in the FIB
nexthops.
With help from Eric Dumazet.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking changes from David S Miller:
1) Remove the ipv4 routing cache. Now lookups go directly into the FIB
trie and use prebuilt routes cached there.
No more garbage collection, no more rDOS attacks on the routing
cache. Instead we now get predictable and consistent performance,
no matter what the pattern of traffic we service.
This has been almost 2 years in the making. Special thanks to
Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who
have helped along the way.
I'm sure that with a change of this magnitude there will be some
kind of fallout, but such things ought to be simple to fix at this
point. Luckily I'm not European so I'll be around all of August to
fix things :-)
The major stages of this work here are each fronted by a forced
merge commit whose commit message contains a top-level description
of the motivations and implementation issues.
2) Pre-demux of established ipv4 TCP sockets, saves a route demux on
input.
3) TCP SYN/ACK performance tweaks from Eric Dumazet.
4) Add namespace support for netfilter L4 conntrack helpers, from Gao
Feng.
5) Add config mechanism for Energy Efficient Ethernet to ethtool, from
Yuval Mintz.
6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet.
7) Support for connection tracker helpers in userspace, from Pablo
Neira Ayuso.
8) Allow userspace driven TX load balancing functions in TEAM driver,
from Jiri Pirko.
9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with
embedded gotos.
10) TCP Small Queues, essentially minimize the amount of TCP data queued
up in the packet scheduler layer. Whereas the existing BQL (Byte
Queue Limits) limits the pkt_sched --> netdevice queuing levels,
this controls the TCP --> pkt_sched queueing levels.
From Eric Dumazet.
11) Reduce the number of get_page/put_page ops done on SKB fragments,
from Alexander Duyck.
12) Implement protection against blind resets in TCP (RFC 5961), from
Eric Dumazet.
13) Support the client side of TCP Fast Open, basically the ability to
send data in the SYN exchange, from Yuchung Cheng.
Basically, the sender queues up data with a sendmsg() call using
MSG_FASTOPEN, then they do the connect() which emits the queued up
fastopen data.
14) Avoid all the problems we get into in TCP when timers or PMTU events
hit a locked socket. The TCP Small Queues changes added a
tcp_release_cb() that allows us to queue work up to the
release_sock() caller, and that's what we use here too. From Eric
Dumazet.
15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits)
genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP
r8169: revert "add byte queue limit support".
ipv4: Change rt->rt_iif encoding.
net: Make skb->skb_iif always track skb->dev
ipv4: Prepare for change of rt->rt_iif encoding.
ipv4: Remove all RTCF_DIRECTSRC handliing.
ipv4: Really ignore ICMP address requests/replies.
decnet: Don't set RTCF_DIRECTSRC.
net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse.
ipv4: Remove redundant assignment
rds: set correct msg_namelen
openvswitch: potential NULL deref in sample()
tcp: dont drop MTU reduction indications
bnx2x: Add new 57840 device IDs
tcp: avoid oops in tcp_metrics and reset tcpm_stamp
niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value
niu: Fix to check for dma mapping errors.
net: Fix references to out-of-scope variables in put_cmsg_compat()
net: ethernet: davinci_emac: add pm_runtime support
net: ethernet: davinci_emac: Remove unnecessary #include
...
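The blind-reset protection in item 12 can be pictured with the RST handling rule from RFC 5961, section 3.2. The sketch below is a standalone model of that rule, not the kernel code from this series; the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* What to do with an incoming RST segment (illustrative names). */
enum rst_action {
    RST_DROP,          /* sequence number outside the receive window */
    RST_ACCEPT,        /* sequence number is exactly RCV.NXT */
    RST_CHALLENGE_ACK  /* in window but not exact: ask the peer to confirm */
};

/*
 * Model of the RFC 5961 section 3.2 check.  All arithmetic is modulo
 * 2^32, so the unsigned subtraction handles sequence wraparound.
 */
static enum rst_action rfc5961_rst_check(uint32_t seg_seq, uint32_t rcv_nxt,
                                         uint32_t rcv_wnd)
{
    uint32_t offset = seg_seq - rcv_nxt;

    /* A scaled TCP window never exceeds 65535 << 14, which is below 2^30. */
    assert(rcv_wnd <= (UINT32_C(1) << 30));

    if (offset == 0)
        return RST_ACCEPT;
    if (offset < rcv_wnd)
        return RST_CHALLENGE_ACK;
    return RST_DROP;
}
```

An off-path attacker who can only guess a sequence number somewhere inside the window therefore triggers a challenge ACK instead of tearing the connection down.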
|
|
lockdep_is_held() is defined when CONFIG_LOCKDEP, not CONFIG_PROVE_LOCKING.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull arch/tile updates from Chris Metcalf:
"These changes provide support for PCIe root complex and USB host mode
for tilegx's on-chip I/Os.
In addition, this pull provides the required underpinning for the
on-chip networking support that was pulled into 3.5. The changes have
all been through LKML (with several rounds for PCIe RC) and on
linux-next."
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: updates to pci root complex from community feedback
bounce: allow use of bounce pool via config option
usb: add host support for the tilegx architecture
arch/tile: provide kernel support for the tilegx USB shim
tile pci: enable IOMMU to support DMA for legacy devices
arch/tile: enable ZONE_DMA for tilegx
tilegx pci: support I/O to arbitrarily-cached pages
tile: remove unused header
arch/tile: tilegx PCI root complex support
arch/tile: provide kernel support for the tilegx TRIO shim
arch/tile: break out the "csum a long" function to <asm/checksum.h>
arch/tile: provide kernel support for the tilegx mPIPE shim
arch/tile: common DMA code for the GXIO IORPC subsystem
arch/tile: support MMIO-based readb/writeb etc.
arch/tile: introduce GXIO IORPC framework for tilegx
|
|
Pull SuperH updates from Paul Mundt:
- Migration off of old-style dynamic IRQ API.
- irqdomain and generic irq chip propagation.
- div4/6 clock consolidation, another step towards co-existing with the
common struct clk infrastructure.
- Extensive PFC rework
- Decoupling GPIO from pin state.
- Initial pinctrl support to facilitate incremental migration off of
legacy pinmux.
- gpiolib support made optional, and made pinctrl-backed.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: (38 commits)
sh: pfc: pin config get/set support.
sh: pfc: Prefer DRV_NAME over KBUILD_MODNAME.
sh: pfc: pinctrl legacy group support.
sh: pfc: Ignore pinmux GPIOs with invalid enum IDs.
sh: pfc: Export pinctrl binding init symbol.
sh: pfc: Error out on pinctrl init resolution failure.
sh: pfc: Make pr_fmt consistent across pfc drivers.
sh: pfc: pinctrl legacy function support.
sh: pfc: Rudimentary pinctrl-backed GPIO support.
sh: pfc: Dumb GPIO stringification.
sh: pfc: Shuffle PFC support core.
sh: pfc: Verify pin type encoding size at build time.
sh: pfc: Kill off unused pinmux bias flags.
sh: pfc: Make gpio chip support optional where possible.
sh: pfc: Split out gpio chip support.
sh64: Fix up section mismatch warnings.
sh64: Attempt to make reserved insn trap handler resemble C.
sh: Consolidate die definitions for trap handlers.
sh64: Kill off old exception debugging helpers.
sh64: Use generic unaligned access control/counters.
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Benjamin Herrenschmidt:
"Notable highlights:
- iommu improvements from Anton removing the per-iommu global lock in
favor of dividing the DMA space into pools, each with its own lock,
and hashed on the CPU number. Along with making the locking more
fine grained, this gives significant improvements in multiqueue
networking scalability.
- Still from Anton, we now provide a vdso based variant of getcpu
which makes sched_getcpu with the appropriate glibc patch something
like 18 times faster.
- More anton goodness (he's been busy !) in other areas such as a
faster __clear_user and copy_page on P7, various perf fixes to
improve sampling quality, etc...
- One more step toward removing legacy i2c interfaces by using new
device-tree based probing of platform devices for the AOA audio
drivers
- A nice series of patches from Michael Neuling that helps avoid
confusion between register numbers and literals in assembly code,
trying to enforce the use of "%rN" register names in gas rather
than plain numbers.
- A pile of FSL updates
- The usual bunch of small fixes, cleanups etc...
You may spot a change to drivers/char/mem. The patch got no comment
or ack from outside, it's a trivial patch to allow the architecture to
skip creating /dev/port, which we use to disable it on ppc64 systems
that don't have a legacy bridge. On those, IO ports 0...64K are not mapped
in kernel space at all, so accesses to /dev/port cause oopses (and
yes, distros -still- ship userspace that bangs hard coded ports such
as kbdrate)."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (106 commits)
powerpc/mpic: Create a revmap with enough entries for IPIs and timers
Remove stale .rej file
powerpc/iommu: Fix iommu pool initialization
powerpc/eeh: Check handle_eeh_events() return value
powerpc/85xx: Add phy nodes in SGMII mode for MPC8536/44/72DS & P2020DS
powerpc/e500: add paravirt QEMU platform
powerpc/mpc85xx_ds: convert to unified PCI init
powerpc/fsl-pci: get PCI init out of board files
powerpc/85xx: Update corenet64_smp_defconfig
powerpc/85xx: Update corenet32_smp_defconfig
powerpc/85xx: Rename P1021RDB-PC device trees to be consistent
powerpc/watchdog: move booke watchdog param related code to setup-common.c
sound/aoa: Adapt to new i2c probing scheme
i2c/powermac: Improve detection of devices from device-tree
powerpc: Disable /dev/port interface on systems without an ISA bridge
of: Improve prom_update_property() function
powerpc: Add "memory" attribute for mfmsr()
powerpc/ftrace: Fix assembly trampoline register usage
powerpc/hw_breakpoints: Fix incorrect pointer access
powerpc: Put the gpr save/restore functions in their own section
...
|
|
Pull arm-soc power management changes from Arnd Bergmann:
"These are various power management related changes, mainly concerning
cpuidle on i.MX and OMAP, as well as the move of the omap
smartreflex driver to live in the power subsystem."
Fix up conflicts in arch/arm/mach-{imx/mach-imx6q.c,omap2/prm2xxx_3xxx.h}
* tag 'pm' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (37 commits)
ARM: OMAP2+: PM: fix IRQ_NOAUTOEN removal by mis-merge
ARM: OMAP2+: do not allow SmartReflex to be built as a module
ARM: OMAP2: Use hwmod to initialize mmc for 2420
ARM: OMAP3: PM: cpuidle: optimize the clkdm idle latency in C1 state
ARM: OMAP3: PM: cpuidle: optimize the PER latency in C1 state
ARM: OMAP3: PM: cpuidle: default to C1 in next_valid_state
ARM: OMAP3: PM: cleanup cam_pwrdm leftovers
ARM: OMAP3: PM: call pre/post transition per powerdomain
ARM: OMAP2+: powerdomain: allow pre/post transtion to be per pwrdm
ARM: OMAP3: PM: Remove IO Daisychain control from cpuidle
ARM: OMAP3PLUS: hwmod: reconfigure IO Daisychain during hwmod mux
ARM: OMAP3+: PRM: Enable IO wake up
ARM: OMAP4: PRM: Add IO Daisychain support
ARM: OMAP3: PM: Move IO Daisychain function to omap3 prm file
ARM: OMAP3: PM: correct enable/disable of daisy io chain
ARM: OMAP2+: PRM: fix compile for OMAP4-only build
W1: OMAP HDQ1W: use runtime PM
ARM: OMAP2+: HDQ1W: use omap_device
W1: OMAP HDQ1W: use 32-bit register accesses
W1: OMAP HDQ1W: allow driver to be built on all OMAP2+
...
|
|
Pull arm-soc board specific updates from Arnd Bergmann:
"These changes are all for individual board files. In the long run,
those files will largely go away, and the amount of changes appears to
be continuously decreasing, which is a good sign. This time around,
changes are focused on tegra, omap and samsung."
Fix conflicts in arch/arm/mach-{omap2/common-board-devices.c,tegra/Makefile.boot}
as per the 'for-linus' branch.
* tag 'boards' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
ARM: EXYNOS: Add leds status1 and status2 on Origen board
ARM: S3C64XX: Mark most Cragganmore initdata devinitdata
ARM: EXYNOS: Add missing .reserve field to SMDKC210
ARM: EXYNOS: Add DRM device to SMDK4X12 board
ARM: S3C64XX: Clean up after SPI driver platform data updates
ARM: SAMSUNG: no need to set the value for clk_xusbxti when it is 24Mhz
ARM: EXYNOS: Add framebuffer support for SMDK4X12
ARM: EXYNOS: Add HSOTG support to SMDK4X12
ARM: S5PV210: Add audio platform device in Goni board
ARM: S5PV210: Add audio platform device in Aquila board
ARM: EXYNOS: Add audio platform device in SMDKV310 board
ARM: S3C64XX: Don't specify an irq_base for WM1192-EV1 board
ARM: OMAP3: Fix omap3evm randconfig error introduced by VBUS support
ARM: OMAP: board-omap4panda: MUX configuration for sys_nirq2
ARM: OMAP: board-4430sdp: MUX configuration for sys_nirq2
ARM: OMAP3530evm: set pendown_state and debounce time for ads7846
ARM: omap3evm: enable VBUS switch for EHCI tranceiver
ARM: OMAP3EVM: Adding USB internal LDOs board file
ARM: OMAP3EVM: Add NAND flash definition
ARM: OMAP3: cm-t35: add tvp5150 decoder support
...
|
|
On input packet processing, rt->rt_iif will be zero if we should
use skb->dev->ifindex.
Since we access rt->rt_iif consistently via inet_iif(), that is
the only spot whose interpretation has to be adjusted.
Signed-off-by: David S. Miller <davem@davemloft.net>
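The encoding described in this commit message can be modelled in a few lines of userspace C. This is a simplified sketch: the *_model structures and inet_iif_model() are stand-ins invented for illustration and carry only the fields the message mentions, not the real kernel layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel structures named in the text. */
struct net_device_model { int ifindex; };
struct rtable_model     { int rt_iif; };
struct skb_model {
    const struct net_device_model *dev; /* device the packet arrived on */
    const struct rtable_model *rt;      /* route attached to the packet */
};

/*
 * Resolve the input interface: a non-zero rt_iif is used as is, while
 * zero means "use the index of the device the packet arrived on".
 */
static int inet_iif_model(const struct skb_model *skb)
{
    assert(skb != NULL && skb->dev != NULL && skb->rt != NULL);

    if (skb->rt->rt_iif)
        return skb->rt->rt_iif;
    return skb->dev->ifindex;
}
```

Because a route with rt_iif set to zero no longer names one particular interface, the same route object can be attached to packets arriving on different devices.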
|
|
Use inet_iif() consistently, and for TCP record the input interface of
cached RX dst in inet sock.
rt->rt_iif is going to be encoded differently, so that we can
legitimately cache input routes in the FIB info more aggressively.
When the input interface is "use SKB device index" the rt->rt_iif will
be set to zero.
This forces us to move the TCP RX dst cache installation into the ipv4
specific code, and as well it should since doing the route caching for
ipv6 is pointless at the moment since it is not inspected in the ipv6
input paths yet.
Also, remove the unlikely on dst->obsolete, all ipv4 dsts have
obsolete set to a non-zero value to force invocation of the check
callback.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull support for three new arm SoC types from Arnd Bergmann:
- The mvebu platform includes Marvell's Armada XP and Armada 370 chips,
made by the mvebu business unit inside of Marvell. Since the same
group also made the older but similar platforms we call "orion5x",
"kirkwood", "mv78xx0" and "dove", we plan to move all of them into
the mach-mvebu directory in the future.
- socfpga is Altera's platform based on Cortex-A9 cores and a lot of
FPGA space. This is similar to the Xilinx zynq platform we already
support. The code is particularly clean, which is helped by the fact
that the hardware doesn't do much besides the parts that are expected
to get added in the FPGA.
- The OMAP subarchitecture gains support for the latest generation, the
OMAP5 based on the new Cortex-A15 core. Support is rather
rudimentary for now, but will be extended in the future.
* tag 'newsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (25 commits)
ARM: socfpga: initial support for Altera's SOCFPGA platform
arm: mvebu: generate DTBs for supported SoCs
ARM: mvebu: MPIC: read number of interrupts from control register
arm: mach-mvebu: add entry to MAINTAINERS
arm: mach-mvebu: add compilation/configuration change
arm: mach-mvebu: add defconfig
arm: mach-mvebu: add documentation for new device tree bindings
arm: mach-mvebu: add support for Armada 370 and Armada XP with DT
arm: mach-mvebu: add source files
arm: mach-mvebu: add header
clocksource: time-armada-370-xp: Marvell Armada 370/XP SoC timer driver
ARM: Kconfig update to support additional GPIOs in OMAP5
ARM: OMAP5: Add the build support
arm/dts: OMAP5: Add omap5 dts files
ARM: OMAP5: board-generic: Add device tree support
ARM: omap2+: board-generic: clean up the irq data from board file
ARM: OMAP5: Add SMP support
ARM: OMAP5: Add the WakeupGen IP updates
ARM: OMAP5: l3: Add l3 error handler support for omap5
ARM: OMAP5: gpmc: Update gpmc_init()
...
Conflicts:
Documentation/devicetree/bindings/arm/omap/omap.txt
arch/arm/mach-omap2/Makefile
drivers/clocksource/Kconfig
drivers/clocksource/Makefile
|
|
Pull arm-soc device tree description updates from Arnd Bergmann:
"This branch contains two kinds of updates: Some platforms in the
process of getting converted to device tree based booting, and the
platform specific patches necessary for that are included here.
Other platforms are already converted, so we just need to update the
actual device tree source files and the binding documents to add
support for new boards and new drivers.
In the future we will probably separate those into two branches, and
in the long run, the plan is to move the device tree source files out
of the kernel repository, but that has to wait until we have completed
a much larger portion of the binding documents."
Fix up trivial conflicts in arch/arm/mach-imx/clk-imx6q.c due to newly
added clkdev registers next to a few removed unnecessary ones.
* tag 'dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (119 commits)
ARM: LPC32xx: Add PWM to base dts file
ARM: EXYNOS: mark the DMA channel binding for SPI as preliminary
ARM: dts: Add nodes for spi controllers for SAMSUNG EXYNOS5 platforms
ARM: EXYNOS: Enable platform support for SPI controllers for EXYNOS5
ARM: EXYNOS: Add spi clock support for EXYNOS5
ARM: dts: Add nodes for spi controllers for SAMSUNG EXYNOS4 platforms
ARM: EXYNOS: Enable platform support for SPI controllers for EXYNOX4
ARM: EXYNOS: Fix the incorrect hierarchy of spi controller bus clock
ARM: ux500: Remove PMU platform registration when booting with DT
ARM: ux500: Remove temporary snowball_of_platform_devs enablement structure
ARM: ux500: Ensure vendor specific properties have the vendor's identifier
pinctrl: pinctrl-nomadik: Append sleepmode property with vendor specific prefixes
ARM: ux500: Move rtc-pl031 registration to Device Tree when enabled
ARM: ux500: Enable the AB8500 RTC for all DT:ed DB8500 based devices
ARM: ux500: Correctly reference IRQs supplied by the AB8500 from Device Tree
ARM: ux500: Apply ab8500-debug node do the db8500 DT structure
ARM: ux500: Add a ab8500-usb Device Tree node for db8500 based devices
ARM: ux500: Add db8500 Device Tree node for misc/ab8500-pwm
ARM: ux500: Add db8500 Device Tree node for ab8500-sysctrl
ARM: ux500: Enable LED heartbeat functionality on Snowbal via DT
...
|
|
Pull arm soc-specific updates from Arnd Bergmann:
"This is stuff that does not fit well into another category and in
particular is not related to a particular board. The largest part in
here is extending the am33xx support in the omap platform."
Fix up trivial conflicts in arch/arm/mach-{imx/mach-mx35_3ds.c, tegra/Makefile}
* tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (74 commits)
ARM: LPC32xx: Add PWM support
ARM: LPC32xx: Add PWM clock
ARM: LPC32xx: Set system serial based on cpu unique id
ARM: vexpress: Config option for early printk console
ARM: vexpress: Add Device Tree for V2P-CA15_CA7 core tile
ARM: vexpress: Convert V2P-CA15 Device Tree to 64 bit addresses
ARM: vexpress: Add fixed regulator for SMSC
ARM: vexpress: Add missing SP804 interrupt in motherboard's DTS files
ARM: vexpress: Initial common clock support
ARM: SAMSUNG: Introduce Kconfig variable for Samsung custom clk API
ARM: EXYNOS: Add missing static storage class specifier in pmu.c file
ARM: EXYNOS: Make combiner_init function static
ARM: EXYNOS: Update HSOTG PHY clock setting for EXYNOS4X12
ARM: versatile: Make plat-versatile clock optional
ARM: vexpress: Check master site in daughterboard's sysctl operations
ARM: vexpress: remove automatic errata workaround selection
ARM: LPC32xx: Adjust to pl08x DMA interface changes
ARM: EXYNOS: Clear SYS_WDTRESET bit to use watchdog reset
ARM: imx: fix mx51 ehci setup errors
ARM: imx: make ehci power/oc polarities configurable
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull non-critical arm-soc bug fixes from Arnd Bergmann:
"These were submitted as bug fixes before v3.5 but not considered
important enough to be included in it."
* tag 'fixes-non-critical' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
ARM:vt8500: Convert to use .restart and remove arch_reset()
ARM: davinci: da8xx: fix interrupt handling
ARM: OMAP2+: fix CONFIG_CPU_IDLE dependency on CONFIG_PM
ARM: mxs/tx28: fix odd include
ARM: OMAP: remove unused cpu detection macros
ARM: OMAP: fix typos related to OMAP330
ARM: OMAP7XX: Remove omap730.h and omap850.h
ARM: OMAP2+: fix naming collision of variable nr_irqs
ARM: OMAP: omap2plus_defconfig: Enable EXT4 support
ARM: OMAP depends on MMU
arm: omap3: am35x: Set proper powerdomain states
ARM: OMAP AM35x: clockdomain data: Fix clockdomain dependencies
ARM: OMAP AM35x: EMAC/MDIO integration: Add Davinci EMAC/MDIO hwmod support
ARM: OMAP: AM35xx: fix UART4 softreset
ARM: OMAP AM35xx: clock and hwmod data: fix UART4 data
ARM: OMAP AM35xx: clock and hwmod data: fix AM35xx HSOTGUSB hwmod
ARM: OMAP: Fix dts files w/ status property: "disable" -> "disabled"
ARM: OMAP: beagle: Set USB Host Port 1 to OMAP_USBHS_PORT_MODE_UNUSED
ARM: OMAP2: twl-common: Fix compiler warning
ARM: OMAP: fix the ads7846 init code
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull the big VFS changes from Al Viro:
"This one is *big* and changes quite a few things around VFS. What's in there:
- the first of two really major architecture changes - death to open
intents.
The former is finally there; it was very long in the making, but with
Miklos getting through the really hard and messy final push in
fs/namei.c, we finally have it. Unlike his variant, this one
doesn't introduce struct opendata; what we have instead is
->atomic_open() taking preallocated struct file * and passing
everything via its fields.
Instead of returning struct file *, it returns -E... on error, 0
on success and 1 in "deal with it yourself" case (e.g. symlink
found on server, etc.).
See comments before fs/namei.c:atomic_open(). That made a lot of
goodies finally possible and quite a few are in that pile:
->lookup(), ->d_revalidate() and ->create() do not get struct
nameidata * anymore; ->lookup() and ->d_revalidate() get lookup
flags instead, ->create() gets "do we want it exclusive" flag.
With the introduction of new helper (kern_path_locked()) we are rid
of all struct nameidata instances outside of fs/namei.c; it's still
visible in namei.h, but not for long. Come the next cycle,
declaration will move either to fs/internal.h or to fs/namei.c
itself. [me, miklos, hch]
- The second major change: behaviour of final fput(). Now we have
__fput() done without any locks held by caller *and* not from deep
in call stack.
That obviously lifts a lot of constraints on the locking in there.
Moreover, it's legal now to call fput() from atomic contexts (which
has immediately simplified life for aio.c). We also don't need
anti-recursion logics in __scm_destroy() anymore.
There is a price, though - the damn thing has become partially
asynchronous. For fput() from normal process we are guaranteed
that pending __fput() will be done before the caller returns to
userland, exits or gets stopped for ptrace.
For kernel threads and atomic contexts it's done via
schedule_work(), so theoretically we might need a way to make sure
it's finished; so far only one such place had been found, but there
might be more.
There's flush_delayed_fput() (do all pending __fput()) and there's
__fput_sync() (fput() analog doing __fput() immediately). I hope
we won't need them often; see warnings in fs/file_table.c for
details. [me, based on task_work series from Oleg merged last
cycle]
- sync series from Jan
- large part of "death to sync_supers()" work from Artem; the only
bits missing here are exofs and ext4 ones. As far as I understand,
those are going via the exofs and ext4 trees resp.; once they are
in, we can put ->write_super() to the rest, along with the thread
calling it.
- preparatory bits from unionmount series (from dhowells).
- assorted cleanups and fixes all over the place, as usual.
This is not the last pile for this cycle; there's at least jlayton's
ESTALE work and fsfreeze series (the latter - in dire need of fixes,
so I'm not sure it'll make the cut this cycle). I'll probably throw
symlink/hardlink restrictions stuff from Kees into the next pile, too.
Plus there's a lot of misc patches I hadn't thrown into that one -
it's large enough as it is..."
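The ->atomic_open() return convention described above (-E... on error, 0 on success, 1 for "deal with it yourself") can be sketched as a small classifier. This is a userspace model for illustration only; the enum and function names are hypothetical and do not come from fs/namei.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative outcomes a caller has to distinguish. */
enum open_outcome {
    OPEN_DONE,     /* 0: the filesystem opened the file */
    OPEN_FALLBACK, /* 1: the caller finishes the job itself */
    OPEN_FAILED    /* negative errno value */
};

/*
 * Classify a return value that follows the convention in the text.
 * On failure the negative errno is stored in *err; otherwise *err is 0.
 * The text only names 1 as the fallback value; treating every positive
 * value that way is an assumption of this sketch.
 */
static enum open_outcome classify_atomic_open(int ret, int *err)
{
    assert(err != NULL);

    *err = 0;
    if (ret < 0) {
        *err = ret;
        return OPEN_FAILED;
    }
    if (ret == 0)
        return OPEN_DONE;
    return OPEN_FALLBACK;
}
```

Returning an int rather than a struct file * is what makes the third, "not handled here" outcome expressible without a special pointer value.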
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits)
ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()
btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()
switch dentry_open() to struct path, make it grab references itself
spufs: shift dget/mntget towards dentry_open()
zoran: don't bother with struct file * in zoran_map
ecryptfs: don't reinvent the wheels, please - use struct completion
don't expose I_NEW inodes via dentry->d_inode
tidy up namei.c a bit
unobfuscate follow_up() a bit
ext3: pass custom EOF to generic_file_llseek_size()
ext4: use core vfs llseek code for dir seeks
vfs: allow custom EOF in generic_file_llseek code
vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes
vfs: Remove unnecessary flushing of block devices
vfs: Make sys_sync writeout also block device inodes
vfs: Create function for iterating over block devices
vfs: Reorder operations during sys_sync
quota: Move quota syncing to ->sync_fs method
quota: Split dquot_quota_sync() to writeback and cache flushing part
vfs: Move noop_backing_dev_info check from sync into writeback
...
|
|
ICMP messages generated in the output path when the frame length exceeds
the MTU are actually lost because the socket is owned by the user (doing
the xmit).
One example is the ipgre_tunnel_xmit() calling
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
We had a similar case fixed in commit a34a101e1e6 (ipv6: disable GSO on
sockets hitting dst_allfrag).
The problem with that fix is that it relied on retransmit timers, so short
TCP sessions paid too large a latency price.
This patch uses the tcp_release_cb() infrastructure so that MTU
reduction messages (ICMP messages) are not lost, and no extra delay
is added in TCP transmits.
Reported-by: Maciej Żenczykowski <maze@google.com>
Diagnosed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ipv4 routing cache is non-deterministic, performance wise, and is
subject to reasonably easy to launch denial of service attacks.
The routing cache works great for well behaved traffic, and the world
was a much friendlier place when the tradeoffs that led to the routing
cache's design were considered.
What it boils down to is that the performance of the routing cache is
a product of the traffic patterns seen by a system rather than being a
product of the contents of the routing tables. The former of which is
controllable by external entities.
Even for "well behaved" legitimate traffic, high volume sites can see
hit rates in the routing cache of only ~10%.
The general flow of this patch series is that first the routing cache
is removed. We build a completely new rtable entry every lookup
request.
Next we make some simplifications due to the fact that removing the
routing cache causes several members of struct rtable to become no
longer necessary.
Then we need to make some amends such that we can legally cache
pre-constructed routes in the FIB nexthops. Firstly, we need to
invalidate routes which are hit with nexthop exceptions. Secondly we
have to change the semantics of rt->rt_gateway such that zero means
that the destination is on-link and non-zero otherwise.
Now that the preparations are ready, we start caching precomputed
routes in the FIB nexthops. Output and input routes need different
kinds of care when determining if we can legally do such caching or
not. The details are in the commit log messages for those changes.
The patch series then winds down with some more struct rtable
simplifications and other tidy ups that remove unnecessary overhead.
On a SPARC-T3 output route lookups are ~876 cycles. Input route
lookups are ~1169 cycles with rpfilter disabled, and ~1468
cycles with rpfilter enabled.
These measurements were taken with the kbench_mod test module in the
net_test_tools GIT tree:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git
That GIT tree also includes a udpflood tester tool, which stresses
route lookups on packet output.
For example, on the same SPARC-T3 system we can run:
time ./udpflood -l 10000000 10.2.2.11
with routing cache:
real 1m21.955s user 0m6.530s sys 1m15.390s
without routing cache:
real 1m31.678s user 0m6.520s sys 1m25.140s
Performance undoubtedly can easily be improved further.
For example fib_table_lookup() performs a lot of excessive
computations with all the masking and shifting, some of it
conditionalized to deal with edge cases.
Also, Eric's no-ref optimization for input route lookups can be
re-instated for the FIB nexthop caching code path. I would be really
pleased if someone would work on that.
In fact anyone suitably motivated can just fire up perf on the loading
of the net_test_tools benchmark kernel module. I spend much of
my time going:
bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2
bash# perf report
Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben
Hutchings, and others.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC updates from Chris Ball:
"MMC highlights for 3.6:
Core:
- Rename cd-gpio to slot-gpio and extend it to support more slot GPIO
functions, such as write-protect.
- Add a function to get regulators (Vdd and Vccq) for a host.
Drivers:
- sdhci-pxav2, sdhci-pxav3: Add device tree support.
- sdhi: Add device tree support.
- sh_mmcif: Add support for regulators, device tree, slot-gpio.
- tmio: Add regulator support, use slot-gpio."
* tag 'mmc-merge-for-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (62 commits)
mmc: sdhci-dove: Prepare for common clock framework
mmc: sdhci-dove: Add SDHCI_QUIRK_NO_HISPD_BIT
mmc: omap_hsmmc: ensure probe returns error upon resource failure
mmc: mxs-mmc: Add wp-inverted property
mmc: esdhc: Fix DMA_MASK to not break mx25 DMA access
mmc: core: reset signal voltage on power up
mmc: sd: Fix sd current limit setting
mmc: omap_hsmmc: add clk_prepare and clk_unprepare
mmc: sdhci: When a UHS switch fails, cycle power if regulator is used
mmc: atmel-mci: modify CLKDIV displaying in debugfs
mmc: atmel-mci: fix incorrect setting of host->data to NULL
mmc: sdhci: poll for card even when card is logically unremovable
mmc: sdhci: Introduce new flag SDHCI_USING_RETUNING_TIMER
mmc: sdio: Change pr_warning to pr_warn_ratelimited
mmc: core: Simplify and fix for SD switch processing
mmc: sdhci: restore host settings when card is removed
mmc: sdhci: fix incorrect command used in tuning
mmc: sdhci-pci: CaFe has broken card detection
mmc: sdhci: Report failure reasons for all cases in sdhci_add_host()
mmc: s3cmci: Convert s3cmci driver to gpiolib API
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/mce changes from Ingo Molnar:
"This tree improves the AMD thresholding bank code and includes a
memory fault signal handling fixlet."
* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults
x86, MCE, AMD: Update copyrights and boilerplate
x86, MCE, AMD: Give proper names to the thresholding banks
x86, MCE, AMD: Make error_count read only
x86, MCE, AMD: Cleanup reading of error_count
x86, MCE, AMD: Print decimal thresholding values
x86, MCE, AMD: Move shared bank to node descriptor
x86, MCE, AMD: Remove local_allocate_... wrapper
x86, MCE, AMD: Remove shared banks sysfs linking
x86, amd_nb: Export model 0x10 and later PCI id
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
- ACPI conversion to PM handling based on struct dev_pm_ops.
- Conversion of a number of platform drivers to PM handling based on
struct dev_pm_ops and removal of empty legacy PM callbacks from a
couple of PCI drivers.
- Suspend-to-both for in-kernel hibernation from Bojan Smojver.
- cpuidle fixes and cleanups from ShuoX Liu, Daniel Lezcano and Preeti
Murthy.
- cpufreq bug fixes from Jonghwa Lee and Stephen Boyd.
- Suspend and hibernate fixes from Srivatsa Bhat and Colin Cross.
- Generic PM domains framework updates.
- RTC CMOS wakeup signaling update from Paul Fox.
- sparse warnings fixes from Sachin Kamat.
- Build warnings fixes for the generic PM domains framework and PM
sysfs code.
- sysfs switch for printing device suspend times from Sameer Nanda.
- Documentation fix from Oskar Schirmer.
* tag 'pm-for-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits)
cpufreq: Fix sysfs deadlock with concurrent hotplug/frequency switch
EXYNOS: bugfix on retrieving old_index from freqs.old
PM / Sleep: call early resume handlers when suspend_noirq fails
PM / QoS: Use NULL pointer instead of plain integer in qos.c
PM / QoS: Use NULL pointer instead of plain integer in pm_qos.h
PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock
PM / Sleep: Add missing static storage class specifiers in main.c
cpuilde / ACPI: remove time from acpi_processor_cx structure
cpuidle / ACPI: remove usage from acpi_processor_cx structure
cpuidle / ACPI : remove latency_ticks from acpi_processor_cx structure
rtc-cmos: report wakeups from interrupt handler
PM / Sleep: Fix build warning in sysfs.c for CONFIG_PM_SLEEP unset
PM / Domains: Fix build warning for CONFIG_PM_RUNTIME unset
olpc-xo15-sci: Use struct dev_pm_ops for power management
PM / Domains: Replace plain integer with NULL pointer in domain.c file
PM / Domains: Add missing static storage class specifier in domain.c file
PM / crypto / ux500: Use struct dev_pm_ops for power management
PM / IPMI: Remove empty legacy PCI PM callbacks
tpm_nsc: Use struct dev_pm_ops for power management
tpm_tis: Use struct dev_pm_ops for power management
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull target updates from Nicholas Bellinger:
"There have been lots of work in a number of areas this past round.
The highlights include:
- Break out target_core_cdb.c emulation into SPC/SBC ops (hch)
- Add a parse_cdb method to target backend drivers (hch)
- Move sync_cache + write_same + unmap into spc_ops (hch)
- Use target_execute_cmd for WRITEs in iscsi_target + srpt (hch)
- Offload WRITE I/O backend submission in tcm_qla2xxx + tcm_fc (hch +
nab)
- Refactor core_update_device_list_for_node() into enable/disable
funcs (agrover)
- Replace the TCM processing thread with a TMR work queue (hch)
- Fix regression in transport_add_device_to_core_hba from TMR
conversion (DanC)
- Remove racy, now-redundant check of sess_tearing_down with qla2xxx
(roland)
- Add range checking, fix reading of data len + possible underflow in
UNMAP (roland)
- Allow for target_submit_cmd() returning errors + convert fabrics
(roland + nab)
- Drop bogus struct file usage for iSCSI/SCTP (viro)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (54 commits)
iscsi-target: Drop bogus struct file usage for iSCSI/SCTP
target: NULL dereference on error path
target: Allow for target_submit_cmd() returning errors
target: Check number of unmap descriptors against our limit
target: Fix possible integer underflow in UNMAP emulation
target: Fix reading of data length fields for UNMAP commands
target: Add range checking to UNMAP emulation
target: Add generation of LOGICAL BLOCK ADDRESS OUT OF RANGE
target: Make unnecessarily global se_dev_align_max_sectors() static
target: Remove se_session.sess_wait_list
qla2xxx: Remove racy, now-redundant check of sess_tearing_down
target: Check sess_tearing_down in target_get_sess_cmd()
sbp-target: Consolidate duplicated error path code in sbp_handle_command()
target: Un-export target_get_sess_cmd()
qla2xxx: Get rid of redundant qla_tgt_sess.tearing_down
target: Make core_disable_device_list_for_node use pre-refactoring lock ordering
target: refactor core_update_device_list_for_node()
target: Eliminate else using boolean logic
target: Misc retval cleanups
target: Remove hba param from core_dev_add_lun
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"Lots and lots of fixes from Axel and some others here, plus some
framework enhancements which continue the theme of factoring code out
of the drivers and into the core.
- Initial framework support for GPIO controlled enable signals,
saving a bunch of code in drivers.
- Move fixed regulator enable time and voltage mapping table
specifications to data.
- Used some of the recent framework enhancements to make voltage
change notifications more useful, passing the voltage in as an
argument to the notification.
- Fixed the pattern used for finding individual regulators on a
device to not rely on the node name, supporting the use of multiple
PMICs of the same type in the system.
- New drivers for Maxim MAX77686, TI LP872x and LP8788, Samsung
S2MPS11, and Wolfson Arizona microphone supplies and LDOs."
* tag 'regulator-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (176 commits)
regulator: add new lp8788 regulator driver
regulator: mc13xxx: Remove extern function declaration for mc13xxx_sw_regulator
regulator: tps65910: set input_supply on desc unconditionally
regulator: palmas: Fix calcuating selector in palmas_map_voltage_smps
regulator: lp872x: Simplify implementation of lp872x_find_regulator_init_data()
regulator: twl: Fix list_voltate for twl6030ldo_ops
regulator: twl: Convert twl6030ldo_ops to [get|set]_voltage_sel
regulator: twl: Fix the formula to calculate vsel and voltage for twl6030ldo
regulator: s5m8767: Properly handle gpio_request failure
regulator: max8997: Properly handle gpio_request failure
regulator: tps62360: use devm_* for gpio request
regulator: tps6586x: add support for input supply
regulator: tps65217: Add device tree support
regulator: aat2870: Remove unused min_uV and max_uV from struct aat2870_regulator
regulator: aat2870: Convert to regulator_list_voltage_table
regulator: da9052: initialize of_node param for regulator register
regulator: Add REGULATOR_STATUS_UNDEFINED.
regulator: Fix a typo in regulator_mode_to_status() core function.
regulator: s2mps11: Use sec_reg_write rather than sec_reg_update when mask is 0xff
regulator: s2mps11: Fix wrong setting for config.dev
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A few fixes plus a few features, the most generally useful thing being
the register paging support which can be used by quite a few devices:
- Support for wake IRQs in regmap-irq
- Support for register paging
- Support for explicitly specified endianness, mostly for MMIO."
* tag 'regmap-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix incorrect arguments to kzalloc() call
regmap: Add hook for printk logging for debugging during early init
regmap: Fix work_buf switching for page update during virtual range access.
regmap: Add support for register indirect addressing.
regmap: Move lock out from internal function _regmap_update_bits().
regmap: mmio: Staticize regmap_mmio_gen_context()
regmap: Remove warning on stubbed dev_get_regmap()
regmap: Implement support for wake IRQs
regmap: Don't try to map non-existant IRQs
regmap: Constify regmap_irq_chip
regmap: mmio: request native endian formatting
regmap: allow busses to request formatting with specific endianness
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
For ext3/4 htree directories, using the vfs llseek function with
SEEK_END goes to i_size like for any other file, but in reality
we want the maximum possible hash value. Recent changes
in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
but replicating this core code seems like a bad idea, especially
since the copy has already diverged from the vfs.
This patch updates generic_file_llseek_size to accept
both a custom maximum offset, and a custom EOF position. With this
in place, ext4_dir_llseek can pass in the appropriate maximum hash
position for both maxsize and eof, and get what it wants.
As far as I know, this does not fix any bugs - nfs in the kernel
doesn't use SEEK_END, and I don't know of any user who does. But
some ext4 folks seem keen on doing the right thing here, and I can't
really argue.
(Patch also fixes up some comments slightly)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
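
The separation of "maximum offset" from "EOF position" described above can be sketched in userspace C. This is only an illustration under stated assumptions: sketch_llseek_size() and the sketch_whence values are hypothetical stand-ins, not the kernel's generic_file_llseek_size() signature, and overflow checking is left out.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SEEK_SET/SEEK_CUR/SEEK_END whence values. */
enum sketch_whence { SKETCH_SEEK_SET, SKETCH_SEEK_CUR, SKETCH_SEEK_END };

/*
 * Compute a new file position.
 *   eof     - position that SEEK_END is relative to
 *   maxsize - largest offset that is accepted at all
 * Returns the new position, or -EINVAL if it falls outside [0, maxsize].
 * Overflow of pos + offset is not checked in this sketch.
 */
int64_t sketch_llseek_size(int64_t pos, int64_t offset,
                           enum sketch_whence whence,
                           int64_t maxsize, int64_t eof)
{
    int64_t target;

    switch (whence) {
    case SKETCH_SEEK_SET:
        target = offset;
        break;
    case SKETCH_SEEK_CUR:
        target = pos + offset;
        break;
    case SKETCH_SEEK_END:
        target = eof + offset;
        break;
    default:
        return -EINVAL;
    }

    if (target < 0 || target > maxsize)
        return -EINVAL;
    return target;
}
```

A directory that seeks by hash would pass its maximum hash position as both maxsize and eof, so SEEK_END with offset 0 lands on the maximum hash rather than on i_size.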
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Split off part of dquot_quota_sync() which writes dquots into a quota file
to a separate function. In the next patch we will use the function from
filesystems and we do not want to abuse ->quota_sync quotactl callback more
than necessary.
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
recursion in __scm_destroy() will be cut by delaying final fput()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and schedule_work() for interrupt/kernel_thread callers
(and yes, now it *is* OK to call from interrupt).
We are guaranteed that __fput() will be done before we return
to userland (or exit). Note that for fput() from a kernel
thread we get an async behaviour; it's almost always OK, but
sometimes you might need to have __fput() completed before
you do anything else. There are two mechanisms for that -
a general barrier (flush_delayed_fput()) and explicit
__fput_sync(). Both should be used with care (as was the
case for fput() from kernel threads all along). See comments
in fs/file_table.c for details.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
layout based on Oleg's suggestion; single-linked list,
task->task_works points to the last element, forward pointer
from said last element points to head. I'd still prefer
much more regular scheme with two pointers in task_work,
but...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
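
The layout described above (single-linked list, one pointer to the last element, whose forward pointer leads back to the head) can be sketched as follows. The names work_item, work_add() and work_take_all() are hypothetical and exist only for this illustration; locking is ignored.

```c
#include <stddef.h>

/* Hypothetical list node; the real structure carries a callback instead of an id. */
struct work_item {
    struct work_item *next;
    int id;
};

/*
 * Append an item. *tail points at the last element (or is NULL when the
 * list is empty), and last->next always points at the head.
 */
void work_add(struct work_item **tail, struct work_item *item)
{
    struct work_item *last = *tail;

    if (last) {
        item->next = last->next;  /* new last element points at the head */
        last->next = item;
    } else {
        item->next = item;        /* a single element is its own head */
    }
    *tail = item;
}

/*
 * Detach the whole list and return its head as a NULL-terminated
 * chain in insertion (FIFO) order.
 */
struct work_item *work_take_all(struct work_item **tail)
{
    struct work_item *last = *tail;
    struct work_item *head;

    if (!last)
        return NULL;

    head = last->next;
    last->next = NULL;            /* break the cycle */
    *tail = NULL;
    return head;
}
```

With only one pointer stored per task, both appending and finding the head stay O(1), which is the point of keeping the pointer on the last element instead of the first.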
|
|
get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Instead of updating the sk_cgrp_prioidx struct field on every send
this only updates the field when a task is moved via cgroup
infrastructure.
This allows sockets that may be used by a kernel worker thread
to be managed. For example in the iscsi case today a user can
put iscsid in a netprio cgroup and control traffic will be sent
with the correct sk_cgrp_prioidx value set but as soon as data
is sent the kernel worker thread issues a send and sk_cgrp_prioidx
is updated with the kernel worker thread's value which is the
default case.
It seems more correct to only update the field when the user
explicitly sets it via control group infrastructure. This allows
the users to manage sockets that may be used with other threads.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Many places do
	if ((skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY))
		skb_copy_ubufs(skb, gfp_mask);
to copy and invoke frag destructors if necessary.
Add an inline helper for this.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
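
A minimal userspace sketch of what such a helper might look like. Everything here is a mock: struct sketch_buf, SKETCH_TX_ZEROCOPY, sketch_copy_ubufs() and sketch_orphan_frags() are hypothetical names standing in for the skb types and functions, and the gfp_mask argument is dropped.

```c
/* Stand-in for SKBTX_DEV_ZEROCOPY (value chosen arbitrarily for the sketch). */
#define SKETCH_TX_ZEROCOPY 0x1u

/* Mock buffer: only the fields the helper needs. */
struct sketch_buf {
    unsigned int tx_flags;
    int copies;              /* counts how often the frags were copied */
};

/*
 * Mock of the copy step: pretend to copy the user fragments and clear
 * the zerocopy flag. Returns 0 on success.
 */
int sketch_copy_ubufs(struct sketch_buf *buf)
{
    buf->copies++;
    buf->tx_flags &= ~SKETCH_TX_ZEROCOPY;
    return 0;
}

/*
 * The helper itself: callers no longer open-code the flag test,
 * they just call this and check the return value.
 */
static inline int sketch_orphan_frags(struct sketch_buf *buf)
{
    if (!(buf->tx_flags & SKETCH_TX_ZEROCOPY))
        return 0;            /* nothing to copy */
    return sketch_copy_ubufs(buf);
}
```

Calling the helper twice on the same buffer copies only once in this mock, because the first call clears the flag.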
|
|
The host has different current capabilities at different voltages, so we
need to record these settings separately. The defined voltages are
1.8/3.0/3.3 V. For other voltages, we do not touch the current limit setting.
Before we set the current limit for the sd card, find out the host's
operating voltage first and then find out the current capabilities of
the host at that voltage to set the current limit.
Signed-off-by: Aaron Lu <aaron.lu@amd.com>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
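
The two-step lookup described above (operating voltage first, then the host's capability at that voltage) could look roughly like this. The struct, the function names, and the 200/400/600/800 mA steps are assumptions made for this sketch, not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical per-voltage host capabilities, in milliamps. */
struct sketch_host_caps {
    unsigned int max_ma_180;   /* at 1.8 V */
    unsigned int max_ma_300;   /* at 3.0 V */
    unsigned int max_ma_330;   /* at 3.3 V */
};

/* Host capability at the given voltage (in millivolts), 0 if the voltage is not a defined one. */
unsigned int sketch_host_max_ma(const struct sketch_host_caps *caps,
                                unsigned int mv)
{
    switch (mv) {
    case 1800: return caps->max_ma_180;
    case 3000: return caps->max_ma_300;
    case 3300: return caps->max_ma_330;
    default:   return 0;
    }
}

/*
 * Pick the largest assumed current-limit step the host can supply.
 * Returns 0 to mean "leave the card's current limit untouched".
 */
unsigned int sketch_pick_limit_ma(const struct sketch_host_caps *caps,
                                  unsigned int mv)
{
    static const unsigned int steps[] = { 800, 600, 400, 200 };
    unsigned int max_ma = sketch_host_max_ma(caps, mv);
    size_t i;

    for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
        if (max_ma >= steps[i])
            return steps[i];
    return 0;
}
```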
|
|
Add a new flag of SDHCI_USING_RETUNING_TIMER to represent if the host
is using a retuning timer for the card inserted.
This flag is set when the host does tuning the first time for the card
and the host's retuning mode is 1. This flag is used afterwards whenever
it is necessary to decide if the host is currently using a retuning timer.
This flag is cleared when the card is removed in sdhci_reinit.
The set/clear of the flag and the start/stop of the retuning timer is
associated with the card's init/remove time, so there is no need to
touch it when the host is to be removed as at that time the card should
have already been removed.
Signed-off-by: Aaron Lu <aaron.lu@amd.com>
Reviewed-by: Girish K S <girish.shivananjappa@linaro.org>
Reviewed-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
|
|
Currently only the capability_0 register can be set if
SDHCI_QUIRK_MISSING_CAPS is defined. This is a problem when
the capability_1 register also needs changing. Use the quirk
SDHCI_QUIRK_MISSING_CAPS to allow both registers to be set.
Redefining caps[1] is useful when the board design does not
support 1.8v vccq so UHS modes are not available. The code that
calls sdhci_add_host can then detect this condition and adjust
the caps so the UHS mode will not be attempted on UHS cards.
Signed-off-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar:
"This tree mostly involves various APIC driver cleanups/robustization,
and vSMP motivated platform callback improvements/cleanups"
Fix up trivial conflict due to printk cleanup right next to return value
change.
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()"
x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity
x86/apic/x2apic: Limit the vector reservation to the user specified mask
x86/apic: Optimize cpu traversal in __assign_irq_vector() using domain membership
x86/vsmp: Fix vector_allocation_domain's return value
irq/apic: Use config_enabled(CONFIG_SMP) checks to clean up irq_set_affinity() for UP
x86/vsmp: Fix linker error when CONFIG_PROC_FS is not set
x86/apic/es7000: Make apicid of a cluster (not CPU) from a cpumask
x86/apic/es7000+summit: Always make valid apicid from a cpumask
x86/apic/es7000+summit: Fix compile warning in cpu_mask_to_apicid()
x86/apic: Fix ugly casting and branching in cpu_mask_to_apicid_and()
x86/apic: Eliminate cpu_mask_to_apicid() operation
x86/x2apic/cluster: Vector_allocation_domain() should return a value
x86/apic/irq_remap: Silence a bogus pr_err()
x86/vsmp: Ignore IOAPIC IRQ affinity if possible
x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask
x86/apic: Make cpu_mask_to_apicid() operations return error code
x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()
x86/apic: Try to spread IRQ vectors to different priority levels
x86/apic: Factor out default vector_allocation_domain() operation
...
|
|
I've seen several attempts recently made to do quick failover of sctp transports
by reducing various retransmit timers and counters. While it's possible to
implement a faster failover on multihomed sctp associations, it's not
particularly robust, in that it can lead to unneeded retransmits, as well as
false connection failures due to intermittent latency on a network.
Instead, let's implement the new ietf quick failover draft found here:
http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
This will let the sctp stack identify transports that have had a small number of
errors, and quickly avoid using them until their reliability can be
re-established. I've tested this out on two virt guests connected via multiple
isolated virt networks and believe it's in compliance with the above draft and
works well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
CC: joe@perches.com
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
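
The idea of setting aside a transport after a small number of errors, well before it is declared dead, can be sketched as a three-state machine. This is an illustration under assumptions: the state names, the two thresholds (pf_retrans and path_max_retrans) and the exact comparisons are hypothetical here and may differ from both the draft and the patch.

```c
/* Hypothetical transport states for the sketch. */
enum sketch_state {
    SKETCH_ACTIVE,     /* used normally */
    SKETCH_PF,         /* potentially failed: avoided for new data */
    SKETCH_INACTIVE    /* considered down */
};

struct sketch_transport {
    enum sketch_state state;
    unsigned int error_count;
};

/*
 * Record one retransmission error. A transport becomes potentially
 * failed once its error count exceeds the (small) pf_retrans threshold,
 * and inactive once it exceeds path_max_retrans.
 */
void sketch_on_error(struct sketch_transport *t,
                     unsigned int pf_retrans,
                     unsigned int path_max_retrans)
{
    t->error_count++;

    if (t->error_count > path_max_retrans)
        t->state = SKETCH_INACTIVE;
    else if (t->state == SKETCH_ACTIVE && t->error_count > pf_retrans)
        t->state = SKETCH_PF;
}

/* A successful acknowledgement re-establishes the transport. */
void sketch_on_ack(struct sketch_transport *t)
{
    t->error_count = 0;
    t->state = SKETCH_ACTIVE;
}
```

Keeping pf_retrans well below path_max_retrans is what gives the quick failover without shortening the timers that decide when a path is really down.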
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer core changes from Ingo Molnar:
"Continued cleanups of the core time and NTP code, plus more nohz work
preparing for tick-less userspace execution."
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Rework timekeeping functions to take timekeeper ptr as argument
time: Move xtime_nsec adjustment underflow handling timekeeping_adjust
time: Move arch_gettimeoffset() usage into timekeeping_get_ns()
time: Refactor accumulation of nsecs to secs
time: Condense timekeeper.xtime into xtime_sec
time: Explicitly use u32 instead of int for shift values
time: Whitespace cleanups per Ingo's requests
nohz: Move next idle expiry time record into idle logic area
nohz: Move ts->idle_calls incrementation into strict idle logic
nohz: Rename ts->idle_tick to ts->last_tick
nohz: Make nohz API agnostic against idle ticks cputime accounting
nohz: Separate idle sleeping time accounting from nohz logic
timers: Improve get_next_timer_interrupt()
timers: Add accounting of non deferrable timers
timers: Consolidate base->next_timer update
timers: Create detach_if_pending() and use it
|
|
|
|
regulator-next
|
|
Conflicts (trivial context stuff):
drivers/base/regmap/regmap.c
include/linux/regmap.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp/hotplug changes from Ingo Molnar:
"Various cleanups to the SMP hotplug code - a continuing effort of
Thomas et al"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smpboot: Remove leftover declaration
smp: Remove num_booting_cpus()
smp: Remove ipi_call_lock[_irq]()/ipi_call_unlock[_irq]()
POWERPC: Smp: remove call to ipi_call_lock()/ipi_call_unlock()
SPARC: SMP: Remove call to ipi_call_lock_irq()/ipi_call_unlock_irq()
ia64: SMP: Remove call to ipi_call_lock_irq()/ipi_call_unlock_irq()
x86-smp-remove-call-to-ipi_call_lock-ipi_call_unlock
tile: SMP: Remove call to ipi_call_lock()/ipi_call_unlock()
S390: Smp: remove call to ipi_call_lock()/ipi_call_unlock()
parisc: Smp: remove call to ipi_call_lock()/ipi_call_unlock()
mn10300: SMP: Remove call to ipi_call_lock()/ipi_call_unlock()
hexagon: SMP: Remove call to ipi_call_lock()/ipi_call_unlock()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events changes from Ingo Molnar:
"- kernel side:
- Intel uncore PMU support for Nehalem and Sandy Bridge CPUs, we
support both the events available via the MSR and via the PCI
access space.
- various uprobes cleanups and restructurings
- PMU driver quirks by microcode version and required x86 microcode
loader cleanups/robustization
- various tracing robustness updates
- static keys: remove obsolete static_branch()
- tooling side:
- GTK browser improvements
- perf report browser: support screenshots to file
- more automated tests
- perf kvm improvements
- perf bench refinements
- build environment improvements
- pipe mode improvements
- libtraceevent updates, we have now hopefully merged most bits with
the out of tree forked code base
... and many other goodies."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (138 commits)
tracing: Check for allocation failure in __tracing_open()
perf/x86: Fix intel_perfmon_event_mapformatting
jump label: Remove static_branch()
tracepoint: Use static_key_false(), since static_branch() is deprecated
perf/x86: Uncore filter support for SandyBridge-EP
perf/x86: Detect number of instances of uncore CBox
perf/x86: Fix event constraint for SandyBridge-EP C-Box
perf/x86: Use 0xff as pseudo code for fixed uncore event
perf/x86: Save a few bytes in 'struct x86_pmu'
perf/x86: Add a microcode revision check for SNB-PEBS
perf/x86: Improve debug output in check_hw_exists()
perf/x86/amd: Unify AMD's generic and family 15h pmus
perf/x86: Move Intel specific code to intel_pmu_init()
perf/x86: Rename Intel specific macros
perf/x86: Fix USER/KERNEL tagging of samples
perf tools: Split event symbols arrays to hw and sw parts
perf tools: Split out PE_VALUE_SYM parsing token to SW and HW tokens
perf tools: Add empty rule for new line in event syntax parsing
perf test: Use ARRAY_SIZE in parse events tests
tools lib traceevent: Cleanup realloc use
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
"Quoting from Paul, the major features of this series are:
1. Preventing latency spikes of more than 200 microseconds for
kernels built with NR_CPUS=4096, which is reportedly becoming the
default for some distros. This is a first step, as it does not
help with systems that actually -have- 4096 CPUs (work on this case
is in progress, but is not yet ready for mainline).
This category also includes improving concurrency of rcu_barrier(),
placed here due to conflicts. Posted to LKML at:
https://lkml.org/lkml/2012/6/22/381
Note that patches 18-22 of that series have been deferred to 3.7, as
they have not yet proven themselves to be mainline-ready (and yes,
these are the ones intended to get rid of RCU's latency spikes for
systems that actually have 4096 CPUs).
2. Updates to documentation and rcutorture fixes, the latter category
including improvements to rcu_barrier() testing. Posted to LKML at
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/04094.html.
3. Miscellaneous fixes posted to LKML at:
https://lkml.org/lkml/2012/6/22/500
with the exception of the last commit, which was posted here:
http://www.gossamer-threads.com/lists/linux/kernel/1561830
4. RCU_FAST_NO_HZ fixes and improvements. Posted to LKML at:
http://lkml.indiana.edu/hypermail/linux/kernel/1206.1/00006.html
http://www.gossamer-threads.com/lists/linux/kernel/1561833
The first four patches of the first series went into 3.5 to fix a
regression.
5. Code-style fixes. These were posted to LKML at
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01180.html
http://lkml.indiana.edu/hypermail/linux/kernel/1205.2/01181.html"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
rcu: Fix broken strings in RCU's source code.
rcu: Fix code-style issues involving "else"
rcu: Introduce check for callback list/count mismatch
rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
rcu: Fix qlen_lazy breakage
rcu: Round FAST_NO_HZ lazy timeout to nearest second
rcu: The rcu_needs_cpu() function is not a quiescent state
rcu: Dump only the current CPU's buffers for idle-entry/exit warnings
rcu: Add check for CPUs going offline with callbacks queued
rcu: Disable preemption in rcu_blocking_is_gp()
rcu: Prevent uninitialized string in RCU CPU stall info
rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU
rcu: Split RCU core processing out of __call_rcu()
rcu: Prevent __call_rcu() from invoking RCU core on offline CPUs
rcu: Make __call_rcu() handle invocation from idle
rcu: Remove function versions of __kfree_rcu and __is_kfree_rcu_offset
rcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations
rcu: Remove return value from rcu_assign_pointer()
key: Remove extraneous parentheses from rcu_assign_keypointer()
rcu: Remove return value from RCU_INIT_POINTER()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/iommu changes from Ingo Molnar.
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
iommu/dmar: Use pr_format() instead of PREFIX to tidy up pr_*() calls
iommu/dmar: Reserve mmio space used by the IOMMU, if the BIOS forgets to
iommu/dmar: Replace printks with appropriate pr_*()
|
|
Linux 3.5-rc7
Prerequisite for samsung/board3 branch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The SYSTEM_SUSPEND_DISK system state is never used, so drop it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Merge emailed kgdb dmesg fixups patches from Anton Vorontsov:
"The dmesg command appears to be broken after the printk rework. The
old logic in the kdb code makes no sense in terms of current
printk/logging storage format, and KDB simply hangs forever upon
entering 'dmesg' command.
The first patch revives the command by switching to kmsg_dumper
iterator. As a side-effect, the code is now much simpler.
A few changes were needed in the printk.c: we needed unlocked variant
of the kmsg_dumper iterator, but these can surely wait for 3.6.
It's probably too late even for the first patch to go to 3.5, but I'll
try to convince otherwise. :-) Here we go:
- The current code is broken for sure, and has no hope to work at
all. It is a regression;
- The new code works for me, and probably works for everyone else;
- If it compiles (and I urge everyone to compile-test it on your
setup), it hardly can make things worse."
* Merge emailed patches from Anton Vorontsov: (4 commits)
kdb: Switch to nolock variants of kmsg_dump functions
printk: Implement some unlocked kmsg_dump functions
printk: Remove kdb_syslog_data
kdb: Revive dmesg command
|