Age | Commit message (Collapse) | Author | Files | Lines |
|
the VIOS server
Older versions of VIOS servers do not send the firmware level in the VPD
buffer for the ibmvnic driver. Thus, not only the current message is mis-
leading but the firmware version in the ethtool will be NULL. Therefore,
this patch fixes the firmware string and its warning.
Fixes: 4e6759be28e4 ("ibmvnic: Feature implementation of VPD for the ibmvnic driver")
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer rq is being initialized but this value is never read, it
is being updated inside a for-loop. Remove the initialization and
move it into the scope of the for-loop.
Cleans up clang warning:
drivers/net/vmxnet3/vmxnet3_drv.c:2763:27: warning: Value stored
to 'rq' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer phydev is initialized and this value is never read, phydev
is immediately updated to a new value, hence this initialization
is redundant and can be removed
Cleans up clang warning:
drivers/net/usb/lan78xx.c:2009:21: warning: Value stored to 'phydev'
during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pointer rxdesc is assigned a value that is never read, it is overwritten
by a new assignment inside a while loop hence the initial assignment
is redundant and can be removed.
Cleans up clang warning:
drivers/net/ethernet/jme.c:1074:17: warning: Value stored to 'rxdesc'
during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
"A pretty big batch of Kconfig updates.
I have to mention the lexer and parser of Kconfig are now built from
real .l and .y sources. So, flex and bison are the requirement for
building the kernel. Both of them (unlike gperf) have been stable for
a long time. This change has been tested several weeks in linux-next,
and I did not receive any problem report about this.
Summary:
- add checks for mistakes, like the choice default is not in choice,
help is doubled
- document data structure and complex code
- fix various memory leaks
- change Makefile to build lexer and parser instead of using
pre-generated C files
- drop 'boolean' keyword, which is equivalent to 'bool'
- use default 'yy' prefix and remove unneeded Make variables
- fix gettext() check for xconfig
- announce that oldnoconfig will be finally removed
- make 'Selected by:' and 'Implied by' readable in help and search
result
- hide silentoldconfig from 'make help' to stop confusing people
- fix misc things and cleanups"
* tag 'kconfig-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (37 commits)
kconfig: Remove silentoldconfig from help and docs; fix kconfig/conf's help
kconfig: make "Selected by:" and "Implied by:" readable
kconfig: announce removal of oldnoconfig if used
kconfig: fix make xconfig when gettext is missing
kconfig: Clarify menu and 'if' dependency propagation
kconfig: Document 'if' flattening logic
kconfig: Clarify choice dependency propagation
kconfig: Document SYMBOL_OPTIONAL logic
kbuild: remove unnecessary LEX_PREFIX and YACC_PREFIX
kconfig: use default 'yy' prefix for lexer and parser
kconfig: make conf_unsaved a local variable of conf_read()
kconfig: make xfgets() really static
kconfig: make input_mode static
kconfig: Warn if there is more than one help text
kconfig: drop 'boolean' keyword
kconfig: use bool instead of boolean for type definition attributes, again
kconfig: Remove menu_end_entry()
kconfig: Document important expression functions
kconfig: Document automatic submenu creation code
kconfig: Fix choice symbol expression leak
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild misc updates from Masahiro Yamada:
- add snap-pkg target to create Linux kernel snap package
- make out-of-tree creation of source packages fail correctly
- improve and fix several semantic patches
* tag 'kbuild-misc-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
Coccinelle: coccicheck: fix typo
Coccinelle: memdup: drop spurious line
Coccinelle: kzalloc-simple: Rename kzalloc-simple to zalloc-simple
Coccinelle: ifnullfree: Trim the warning reported in report mode
Coccinelle: alloc_cast: Add more memory allocating functions to the list
Coccinelle: array_size: report even if include is missing
Coccinelle: kzalloc-simple: Add all zero allocating functions
kbuild: pkg: make out-of-tree rpm/deb-pkg build immediately fail
scripts/package: snap-pkg target
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix OOM that syskaller triggers with ipt_replace.size = -1 and
IPT_SO_SET_REPLACE socket option, from Dmitry Vyukov.
2) Check for too long extension name in xt_request_find_{match|target}
that result in out-of-bound reads, from Eric Dumazet.
3) Fix memory exhaustion bug in ipset hash:*net* types when adding ranges
that look like x.x.x.x-255.255.255.255, from Jozsef Kadlecsik.
4) Fix pointer leaks to userspace in x_tables, from Dmitry Vyukov.
5) Insufficient sanity checks in clusterip_tg_check(), also from Dmitry.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- terminate the build correctly in case of fixdep errors
- clean up fixdep
- suppress packed-not-aligned warnings from GCC-8
- fix W= handling for extra DTC warnings
* tag 'kbuild-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix W= option checks for extra DTC warnings
Kbuild: suppress packed-not-aligned warning for default setting only
fixdep: use existing helper to check modular CONFIG options
fixdep: refactor parse_dep_file()
fixdep: move global variables to local variables of main()
fixdep: remove unneeded memcpy() in parse_dep_file()
fixdep: factor out common code for reading files
fixdep: use malloc() and read() to load dep_file to buffer
fixdep: remove unnecessary <arpa/inet.h> inclusion
fixdep: exit with error code in error branches of do_config_file()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
- Convert to use memblock_virt_alloc in DT code which supports
bootmem arches. With this we can remove the arch specific
early_init_dt_alloc_memory_arch() functions.
- Enable running the DT unittests on UML
- Use SPDX license tags on DT files
- Fix early FDT kconfig ifdef logic
- Clean-up unittest Makefile
- Fix function comment for of_irq_parse_raw
- Add missing documentation for linux,initrd-{start,end} properties
- Clean-up of binding examples using uppercase hex
- Add trivial devices W83773G and Infineon TLV493D-A1B6
- Add missing STM32 SoC bindings
- Various small binding doc fixes
* tag 'devicetree-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (23 commits)
xtensa: remove arch specific early DT functions
x86: remove arch specific early_init_dt_alloc_memory_arch
nios2: remove arch specific early_init_dt_alloc_memory_arch
mips: remove arch specific early_init_dt_alloc_memory_arch
metag: remove arch specific early DT functions
cris: remove arch specific early DT functions
libfdt: remove unnecessary include directive from <linux/libfdt.h>
of: unittest: refactor Makefile
of/fdt: use memblock_virt_alloc for early alloc
of: Use SPDX license tag for DT files
of/fdt: Fix #ifdef dependency of early flattree declarations
dt-bindings: h8300 clocksource: correct spelling of pulse
dt-bindings: imx6q-pcie: Add required property for i.MX6SX
mmc: Don't reference Linux-specific OF_GPIO_ACTIVE_LOW flag in DT binding
dt-bindings: Use lower case hex in unit-addresses
dt-bindings: display: panel: Fix compatible string for Toshiba LT089AC29000
dt-bindings: Add Infineon TLV493D-A1B6
dt-bindings: mailbox: ti,message-manager: Fix interrupt name error
dt-bindings: chosen: Document linux,initrd-{start,end}
dt-bindings: arm: document supported STM32 SoC family
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input layer updates from Dmitry Torokhov:
- evdev interface has been adjusted to extend the life of timestamps on
32 bit systems to the year of 2108
- Synaptics RMI4 driver's PS/2 guest handling ha beed updated to
improve chances of detecting trackpoints on the pass-through port
- mms114 touchcsreen controller driver has been updated to support
generic device properties and work with mms152 cntrollers
- Goodix driver now supports generic touchscreen properties
- couple of drivers for AVR32 architecture are gone as the architecture
support has been removed from the kernel
- gpio-tilt driver has been removed as there are no mainline users and
the driver itself is using legacy APIs and relies on platform data
- MODULE_LINECSE/MODULE_VERSION cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (45 commits)
Input: goodix - use generic touchscreen_properties
Input: mms114 - fix typo in definition
Input: mms114 - use BIT() macro instead of explicit shifting
Input: mms114 - replace mdelay with msleep
Input: mms114 - add support for mms152
Input: mms114 - drop platform data and use generic APIs
Input: mms114 - mark as direct input device
Input: mms114 - do not clobber interrupt trigger
Input: edt-ft5x06 - fix error handling for factory mode on non-M06
Input: stmfts - set IRQ_NOAUTOEN to the irq flag
Input: auo-pixcir-ts - delete an unnecessary return statement
Input: auo-pixcir-ts - remove custom log for a failed memory allocation
Input: da9052_tsi - remove unused mutex
Input: docs - use PROPERTY_ENTRY_U32() directly
Input: synaptics-rmi4 - log when we create a guest serio port
Input: synaptics-rmi4 - unmask F03 interrupts when port is opened
Input: synaptics-rmi4 - do not delete interrupt memory too early
Input: ad7877 - use managed resource allocations
Input: stmfts,s6sy671 - add SPDX identifier
Input: remove atmel-wm97xx touchscreen driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android
binder fixes, the usual handful of hyper-v updates, and lots of other
smaller driver updates.
All of these have been in linux-next for a long time, with no reported
issues"
* tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
char: lp: use true or false for boolean values
android: binder: use VM_ALLOC to get vm area
android: binder: Use true and false for boolean values
lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
EISA: Delete error message for a failed memory allocation in eisa_probe()
EISA: Whitespace cleanup
misc: remove AVR32 dependencies
virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
soundwire: Fix a signedness bug
uio_hv_generic: fix new type mismatch warnings
uio_hv_generic: fix type mismatch warnings
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
uio_hv_generic: add rescind support
uio_hv_generic: check that host supports monitor page
uio_hv_generic: create send and receive buffers
uio: document uio_hv_generic regions
doc: fix documentation about uio_hv_generic
vmbus: add monitor_id and subchannel_id to sysfs per channel
vmbus: fix ABI documentation
uio_hv_generic: use ISR callback method
...
|
|
Restore an optimization removed in commit 7f19449553 "Fix debugfs glocks
dump": keep the glock hash table iterator active while the glock dump
file is held open. This avoids having to rescan the hash table from the
start for each read, with quadratically rising runtime.
In addition, use rhastable_walk_peek for resuming a glock dump at the
current position: when a glock doesn't fit in the provided buffer
anymore, the next read must revisit the same glock.
Finally, also restart the dump from the first entry when we notice that
the hash table has been resized in gfs2_glock_seq_start.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Depend on LIBCRC32C which uses the crypto API to select the appropriate
crc32c implementation. With the CRYPTO and CRYPTO_CRC32C dependencies,
gfs2 would still need to use the crypto API directly like ext4 and btrfs
do, which isn't necessary.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with
reworks to try to attempt to make the code easier to handle in the
long run, but no functional change. There's also some tree-wide sysfs
attribute fixups with lots of acks from the various subsystem
maintainers, as well as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits)
device property: Define type of PROPERTY_ENRTY_*() macros
device property: Reuse property_entry_free_data()
device property: Move property_entry_free_data() upper
firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
USB: serial: keyspan: Drop firmware Kconfig options
sysfs: remove DEBUG defines
sysfs: use SPDX identifiers
drivers: base: add coredump driver ops
sysfs: add attribute specification for /sysfs/devices/.../coredump
test_firmware: fix missing unlock on error in config_num_requests_store()
test_firmware: make local symbol test_fw_config static
sysfs: turn WARN() into pr_warn()
firmware: Fix a typo in fallback-mechanisms.rst
treewide: Use DEVICE_ATTR_WO
treewide: Use DEVICE_ATTR_RO
treewide: Use DEVICE_ATTR_RW
sysfs.h: Use octal permissions
component: add debugfs support
bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO updates from Greg KH:
"Here is the big Staging and IIO driver patches for 4.16-rc1.
There is the normal amount of new IIO drivers added, like all
releases.
The networking IPX and the ncpfs filesystem are moved into the staging
tree, as they are on their way out of the kernel due to lack of use
anymore.
The visorbus subsystem finall has started moving out of the staging
tree to the "real" part of the kernel, and the most and fsl-mc
codebases are almost ready to move out, that will probably happen for
4.17-rc1 if all goes well.
Other than that, there is a bunch of license header cleanups in the
tree, along with the normal amount of coding style churn that we all
know and love for this codebase. I also got frustrated at the
Meltdown/Spectre mess and took it out on the dgnc tty driver, deleting
huge chunks of it that were never even being used.
Full details of everything is in the shortlog.
All of these patches have been in linux-next for a while with no
reported issues"
* tag 'staging-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (627 commits)
staging: rtlwifi: remove redundant initialization of 'cfg_cmd'
staging: rtl8723bs: remove a couple of redundant initializations
staging: comedi: reformat lines to 80 chars or less
staging: lustre: separate a connection destroy from free struct kib_conn
Staging: rtl8723bs: Use !x instead of NULL comparison
Staging: rtl8723bs: Remove dead code
Staging: rtl8723bs: Change names to conform to the kernel code
staging: ccree: Fix missing blank line after declaration
staging: rtl8188eu: remove redundant initialization of 'pwrcfgcmd'
staging: rtlwifi: remove unused RTLHALMAC_ST and RTLPHYDM_ST
staging: fbtft: remove unused FB_TFT_SSD1325 kconfig
staging: comedi: dt2811: remove redundant initialization of 'ns'
staging: wilc1000: fix alignments to match open parenthesis
staging: wilc1000: removed unnecessary defined enums typedef
staging: wilc1000: remove unnecessary use of parentheses
staging: rtl8192u: remove redundant initialization of 'timeout'
staging: sm750fb: fix CamelCase for dispSet var
staging: lustre: lnet/selftest: fix compile error on UP build
staging: rtl8723bs: hal_com_phycfg: Remove unneeded semicolons
staging: rts5208: Fix "seg_no" calculation in reset_ms_card()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/staging driver updates from Greg KH:
"Here is the big tty/serial driver update for 4.16-rc1.
The usual number of various serial driver fixes and updates to try to
get them to work with crazy hardware configurations (seriously, how
many different ways are hardware engineers going to come up with to
hook up a simple UART?)
There is also some serdev bugfixes and updates, as well as a
smattering of other small fixes in here.
All have been in the linux-next tree for a while, with no reported
issues"
* tag 'tty-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (65 commits)
tty: serial: exar: Relocate sleep wake-up handling
tty: fix data race between tty_init_dev and flush of buf
serial: imx: fix endless loop during suspend
serial: core: mark port as initialized after successful IRQ change
serdev: only match serdev devices
serdev: do not generate modaliases for controllers
serial: mxs-auart: don't use GPIOF_* with gpiod_get_direction
serial: 8250_dw: Revert "Improve clock rate setting"
MAINTAINERS: Add myself as designated reviewer for 8250_dw
gpio: serial: max310x: Support open-drain configuration for GPIOs
serdev: Fix serdev_uevent failure on ACPI enumerated serdev-controllers
serial: 8250_ingenic: Parse earlycon options
serial: 8250_ingenic: Add support for the JZ4770 SoC
serial: core: Make uart_parse_options take const char* argument
serial: 8250_of: fix return code when probe function fails to get reset
serial: imx: Only wakeup via RTSDEN bit if the system has RTS/CTS
serial: 8250_uniphier: fix error return code in uniphier_uart_probe()
tty: n_gsm: Allow ADM response in addition to UA for control dlci
tty: omap-serial: Fix initial on-boot RTS GPIO level
tty: serial: jsm: Add one check against NULL pointer dereference
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB/PHY updates from Greg KH:
"Here is the big USB and PHY driver update for 4.16-rc1.
Along with the normally expected XHCI, MUSB, and Gadget driver
patches, there are some PHY driver fixes, license cleanups, sysfs
attribute cleanups, usbip changes, and a raft of other smaller fixes
and additions.
Full details are in the shortlog.
All of these have been in the linux-next tree for a long time with no
reported issues"
* tag 'usb-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (137 commits)
USB: serial: pl2303: new device id for Chilitag
USB: misc: fix up some remaining DEVICE_ATTR() usages
USB: musb: fix up one odd DEVICE_ATTR() usage
USB: atm: fix up some remaining DEVICE_ATTR() usage
USB: move many drivers to use DEVICE_ATTR_WO
USB: move many drivers to use DEVICE_ATTR_RO
USB: move many drivers to use DEVICE_ATTR_RW
USB: misc: chaoskey: Use true and false for boolean values
USB: storage: remove old wording about how to submit a change
USB: storage: remove invalid URL from drivers
usb: ehci-omap: don't complain on -EPROBE_DEFER when no PHY found
usbip: list: don't list devices attached to vhci_hcd
usbip: prevent bind loops on devices attached to vhci_hcd
USB: serial: remove redundant initializations of 'mos_parport'
usb/gadget: Fix "high bandwidth" check in usb_gadget_ep_match_desc()
usb: gadget: compress return logic into one line
usbip: vhci_hcd: update 'status' file header and format
USB: serial: simple: add Motorola Tetra driver
CDC-ACM: apply quirk for card reader
usb: option: Add support for FS040U modem
...
|
|
Pull small IDE cleanup from David Miller.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: remove duplicated assignment to 'cursg'
|
|
Pull sparc updates from David Miller:
"Of note is the addition of a driver for the Data Analytics
Accelerator, and some small cleanups"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
oradax: Fix return value check in dax_attach()
sparc: vDSO: remove an extra tab
sparc64: drop unneeded compat include
sparc64: Oracle DAX driver
sparc64: Oracle DAX infrastructure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
"Bug fixes, small improvements and one notable change: the system call
table and the unistd.h header are now generated automatically with a
shell script from a text file"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/decompressor: discard __ksymtab and .eh_frame sections
s390: fix handling of -1 in set{,fs}[gu]id16 syscalls
s390/tools: generate header files in arch/s390/include/generated/
s390/syscalls: use generated syscall_table.h and unistd.h header files
s390/syscalls: add Makefile to generate system call header files
s390/syscalls: add syscalltbl script
s390/syscalls: add system call table
s390/decompressor: swap .text and .rodata.compressed sections
s390/sclp: fix .data section specification
s390/ipl: avoid usage of __section(.data)
s390/head: replace hard coded values with constants
s390/disassembler: add generated gen_opcode_table tool to .gitignore
s390: remove bogus system call table entries
s390/kprobes: remove duplicate includes
s390/dasd: Remove dead return code checks
s390/dasd: Simplify code
s390/vdso: revise CFI annotations of vDSO functions
s390/kernel: emit CFI data in .debug_frame and discard .eh_frame sections
|
|
gcc versions prior to 4.6 require an extra level of braces when using a
designated initializer for a member in an anonymous struct or union.
This caused a compile error with the 'struct qstr' initialization in
__fscrypt_encrypt_symlink().
Fix it by using QSTR_INIT().
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Fixes: 76e81d6d5048 ("fscrypt: new helper functions for ->symlink()")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Correct spelling of "coccinelle".
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
RTM_NEWLINK supports the IFLA_IF_NETNSID property since
5bb8ed075428b71492734af66230aa0c07fcc515 so we should not error out
when it is passed.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, rocker user may experience following null pointer
derefence bug:
[ 3.062141] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[ 3.065163] IP: rocker_router_fib_event_work+0x36/0x110 [rocker]
The problem is uninitialized rocker->wops pointer that is initialized
only with the first initialized port. So move the port initialization
before registering the fib events.
Fixes: 936bd486564a ("rocker: use FIB notifications instead of switchdev calls")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With gcc-4.1.2:
net/ipv4/inet_hashtables.c: In function ‘inet_unhash’:
net/ipv4/inet_hashtables.c:628: warning: ‘ilb’ may be used uninitialized in this function
While this is a false positive, it can easily be avoided by using the
pointer itself as the canary variable.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With gcc-4.1.2.:
net/bridge/br_fdb.c: In function ‘br_fdb_sync_static’:
net/bridge/br_fdb.c:996: warning: ‘err’ may be used uninitialized in this function
Indeed, if the list is empty, err will be uninitialized, and will be
propagated up as the function return value.
Fix this by preinitializing err to zero.
Fixes: eb7935830d00b9e0 ("net: bridge: use rhashtable for fdbs")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IPv4 and IPv6 packets may arrive with lower-layer padding that is not
included in the L3 length. For example, a short IPv4 packet may have
up to 6 bytes of padding following the IP payload when received on an
Ethernet device with a minimum packet length of 64 bytes.
Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
and help() in nf_conntrack_ftp) assume skb->len reflects the length of
the L3 header and payload, rather than referring back to
ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
lower-layer padding.
In the normal IPv4 receive path, ip_rcv() trims the packet to
ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
in the br_netfilter receive path, br_validate_ipv4() and
br_validate_ipv6() trim the packet to the L3 length before invoking
netfilter hooks.
Currently in the OVS conntrack receive path, ovs_ct_execute() pulls
the skb to the L3 header but does not trim it to the L3 length before
calling nf_conntrack_in(NF_INET_PRE_ROUTING). When
nf_conntrack_proto_tcp encounters a packet with lower-layer padding,
nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log
message. While extra zero bytes don't affect the checksum, the length
in the IP pseudoheader does. That length is based on skb->len, and
without trimming, it doesn't match the length the sender used when
computing the checksum.
In ovs_ct_execute(), trim the skb to the L3 length before higher-layer
processing.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when
using lt_bw and returning from the PROBE_RTT state to PROBE_BW.
Previously, when using lt_bw, upon exiting PROBE_RTT and entering
PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly
end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase()
set a pacing gain above 1.0. In such cases this would result in a
pacing rate that is 1.25x higher than intended, potentially resulting
in a high loss rate for a little while until we stop using the lt_bw a
bit later.
This commit is a stable candidate for kernels back as far as 4.9.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Reported-by: Beyers Cronje <bcronje@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variable head is initialized to a value that is never read and is
being updated to a new value a few lines later, hence this
initialization is redundant and can be safely removed as well
as the now unused pointer txq.
Cleans up clang warning:
drivers/net/ethernet/emulex/benet/be_main.c:996:6: warning: Value
stored to 'head' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Axtens says:
====================
bnx2x: disable GSO on too-large packets
We observed a case where a packet received on an ibmveth device had a
GSO size of around 10kB. This was forwarded by Open vSwitch to a bnx2x
device, where it caused a firmware assert. This is described in detail
at [0].
Ultimately we want a fix in the core, but that is very tricky to
backport. So for now, just stop the bnx2x driver from crashing.
When net-next re-opens I will send the fix to the core and a revert
for this.
v4 changes:
- fix compilation error with EXPORTs (patch 1)
- only do slow test if gso_size is greater than 9000 bytes (patch 2)
Thanks,
Daniel
[0]: https://patchwork.ozlabs.org/patch/859410/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a bnx2x card is passed a GSO packet with a gso_size larger than
~9700 bytes, it will cause a firmware error that will bring the card
down:
bnx2x: [bnx2x_attn_int_deasserted3:4323(enP24p1s0f0)]MC assert!
bnx2x: [bnx2x_mc_assert:720(enP24p1s0f0)]XSTORM_ASSERT_LIST_INDEX 0x2
bnx2x: [bnx2x_mc_assert:736(enP24p1s0f0)]XSTORM_ASSERT_INDEX 0x0 = 0x00000000 0x25e43e47 0x00463e01 0x00010052
bnx2x: [bnx2x_mc_assert:750(enP24p1s0f0)]Chip Revision: everest3, FW Version: 7_13_1
... (dump of values continues) ...
Detect when the mac length of a GSO packet is greater than the maximum
packet size (9700 bytes) and disable GSO.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If you take a GSO skb, and split it into packets, will the MAC
length (L2 + L3 + L4 headers + payload) of those packets be small
enough to fit within a given length?
Move skb_gso_mac_seglen() to skbuff.h with other related functions
like skb_gso_network_seglen() so we can use it, and then create
skb_gso_validate_mac_len to do the full calculation.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Prepare input updates for 4.16 merge window.
|
|
Pull documentation updates from Jonathan Corbet:
"Documentation updates for 4.16.
New stuff includes refcount_t documentation, errseq documentation,
kernel-doc support for nested structure definitions, the removal of
lots of crufty kernel-doc support for unused formats, SPDX tag
documentation, the beginnings of a manual for subsystem maintainers,
and lots of fixes and updates.
As usual, some of the changesets reach outside of Documentation/ to
effect kerneldoc comment fixes. It also adds the new LICENSES
directory, of which Thomas promises I do not need to be the
maintainer"
* tag 'docs-4.16' of git://git.lwn.net/linux: (65 commits)
linux-next: docs-rst: Fix typos in kfigure.py
linux-next: DOC: HWPOISON: Fix path to debugfs in hwpoison.txt
Documentation: Fix misconversion of #if
docs: add index entry for networking/msg_zerocopy
Documentation: security/credentials.rst: explain need to sort group_list
LICENSES: Add MPL-1.1 license
LICENSES: Add the GPL 1.0 license
LICENSES: Add Linux syscall note exception
LICENSES: Add the MIT license
LICENSES: Add the BSD-3-clause "Clear" license
LICENSES: Add the BSD 3-clause "New" or "Revised" License
LICENSES: Add the BSD 2-clause "Simplified" license
LICENSES: Add the LGPL-2.1 license
LICENSES: Add the LGPL 2.0 license
LICENSES: Add the GPL 2.0 license
Documentation: Add license-rules.rst to describe how to properly identify file licenses
scripts: kernel_doc: better handle show warnings logic
fs/*/Kconfig: drop links to 404-compliant http://acl.bestbits.at
doc: md: Fix a file name to md-fault.c in fault-injection.txt
errseq: Add to documentation tree
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vmci iov_iter updates from Al Viro:
"Get rid of "is it an iovec or an entire array?" flags in vmxi - just
use iov_iter. Simplifies the living hell out of that code..."
* 'work.vmci' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vmci: the same on the send side...
vmci: simplify qp_dequeue_locked()
vmci: get rid of qp_memcpy_from_queue()
vmci: fix buf_size in case of iovec-based accesses
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull asm/uaccess.h whack-a-mole from Al Viro:
"It's linux/uaccess.h, damnit... Oh, well - eventually they'll stop
cropping up..."
* 'work.whack-a-mole' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
asm-prototypes.h: use linux/uaccess.h, not asm/uaccess.h
riscv: use linux/uaccess.h, not asm/uaccess.h...
ppc: for put_user() pull linux/uaccess.h, not asm/uaccess.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache updates from Al Viro:
"Neil Brown's d_move()/d_path() race fix"
* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: close race between getcwd() and d_move()
|
|
Merge updates from Andrew Morton:
- misc fixes
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
mm: remove PG_highmem description
tools, vm: new option to specify kpageflags file
mm/swap.c: make functions and their kernel-doc agree
mm, memory_hotplug: fix memmap initialization
mm: correct comments regarding do_fault_around()
mm: numa: do not trap faults on shared data section pages.
hugetlb, mbind: fall back to default policy if vma is NULL
hugetlb, mempolicy: fix the mbind hugetlb migration
mm, hugetlb: further simplify hugetlb allocation API
mm, hugetlb: get rid of surplus page accounting tricks
mm, hugetlb: do not rely on overcommit limit during migration
mm, hugetlb: integrate giga hugetlb more naturally to the allocation path
mm, hugetlb: unify core page allocation accounting and initialization
mm/memcontrol.c: try harder to decrease [memory,memsw].limit_in_bytes
mm/memcontrol.c: make local symbol static
mm/hmm: fix uninitialized use of 'entry' in hmm_vma_walk_pmd()
include/linux/mmzone.h: fix explanation of lower bits in the SPARSEMEM mem_map pointer
mm/compaction.c: fix comment for try_to_compact_pages()
mm/page_ext.c: make page_ext_init a noop when CONFIG_PAGE_EXTENSION but nothing uses it
zsmalloc: use U suffix for negative literals being shifted
...
|
|
In the past the ast driver relied upon the fbdev emulation helpers to
call ->load_lut at boot-up. But since
commit b8e2b0199cc377617dc238f5106352c06dcd3fa2
Author: Peter Rosin <peda@axentia.se>
Date: Tue Jul 4 12:36:57 2017 +0200
drm/fb-helper: factor out pseudo-palette
that's cleaned up and drivers are expected to boot into a consistent
lut state. This patch fixes that.
Fixes: b8e2b0199cc3 ("drm/fb-helper: factor out pseudo-palette")
Cc: Peter Rosin <peda@axenita.se>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.14+
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198123
Cc: Bill Fraser <bill.fraser@gmail.com>
Reported-and-Tested-by: Bill Fraser <bill.fraser@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
This contains a fix to restrict what lessee can do with masters and
another one when waiting for timeouts on reservation objects.
* tag 'drm-misc-next-fixes-2018-01-31' of git://anongit.freedesktop.org/drm/drm-misc:
drm: Check for lessee in DROP_MASTER ioctl
dma-buf: fix reservation_object_wait_timeout_rcu once more v2
|
|
Commit cbe37d093707 ("[PATCH] mm: remove PG_highmem") removed PG_highmem
to save a page flag. So the description of PG_highmem is no longer
needed.
Link: http://lkml.kernel.org/r/1517391212-2950-1-git-send-email-miles.chen@mediatek.com
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
page-types currently hardcodes /proc/kpageflags as the file to parse.
This works when using the tool to examine the state of pageflags on the
same system, but does not allow storing a snapshot of pageflags at a
given time to debug issues nor on a different system.
This allows the user to specify a saved version of kpageflags with a new
page-types -F option.
[akpm@linux-foundation.org: add "filename" to fix usage() string]
[rientjes@google.com: fix layout]
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1801301840050.140969@chino.kir.corp.google.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1801301458180.153857@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix some basic kernel-doc notation in mm/swap.c:
- for function lru_cache_add_anon(), make its kernel-doc function name
match its function name and change colon to hyphen following the
function name
- for function pagevec_lookup_entries(), change the function parameter
name from nr_pages to nr_entries since that is more descriptive of
what the parameter actually is and then it matches the kernel-doc
comments also
Fix function kernel-doc to match the change in commit 67fd707f4681:
- drop the kernel-doc notation for @nr_pages from
pagevec_lookup_range() and correct the function description for that
change
Link: http://lkml.kernel.org/r/3b42ee3e-04a9-a6ca-6be4-f00752a114fe@infradead.org
Fixes: 67fd707f4681 ("mm: remove nr_pages argument from pagevec_lookup_{,range}_tag()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Bharata has noticed that onlining a newly added memory doesn't increase
the total memory, pointing to commit f7f99100d8d9 ("mm: stop zeroing
memory during allocation in vmemmap") as a culprit. This commit has
changed the way how the memory for memmaps is initialized and moves it
from the allocation time to the initialization time. This works
properly for the early memmap init path.
It doesn't work for the memory hotplug though because we need to mark
page as reserved when the sparsemem section is created and later
initialize it completely during onlining. memmap_init_zone is called in
the early stage of onlining. With the current code it calls
__init_single_page and as such it clears up the whole stage and
therefore online_pages_range skips those pages.
Fix this by skipping mm_zero_struct_page in __init_single_page for
memory hotplug path. This is quite uggly but unifying both early init
and memory hotplug init paths is a large project. Make sure we plug the
regression at least.
Link: http://lkml.kernel.org/r/20180130101141.GW21609@dhcp22.suse.cz
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There are multiple comments surrounding do_fault_around that memtion
fault_around_pages() and fault_around_mask(), two routines that do not
exist. These comments should be reworded to reference
fault_around_bytes, the value which is used to determine how much
do_fault_around() will attempt to read when processing a fault.
These comments should have been updated when fault_around_pages() and
fault_around_mask() were removed in commit aecd6f44266c ("mm: close race
between do_fault_around() and fault_around_bytes_set()").
Fixes: aecd6f44266c1 ("mm: close race between do_fault_around() and fault_around_bytes_set()")
Link: http://lkml.kernel.org/r/302D0B14-C7E9-44C6-8BED-033F9ACBD030@oracle.com
Signed-off-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Larry Bassel <larry.bassel@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Workloads consisting of a large number of processes running the same
program with a very large shared data segment may experience performance
problems when numa balancing attempts to migrate the shared cow pages.
This manifests itself with many processes or tasks in
TASK_UNINTERRUPTIBLE state waiting for the shared pages to be migrated.
The program listed below simulates the conditions with these results
when run with 288 processes on a 144 core/8 socket machine.
Average throughput Average throughput Average throughput
with numa_balancing=0 with numa_balancing=1 with numa_balancing=1
without the patch with the patch
--------------------- --------------------- ---------------------
2118782 2021534 2107979
Complex production environments show less variability and fewer poorly
performing outliers accompanied with a smaller number of processes
waiting on NUMA page migration with this patch applied. In some cases,
%iowait drops from 16%-26% to 0.
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2017 Oracle and/or its affiliates. All rights reserved.
*/
#include <sys/time.h>
#include <stdio.h>
#include <wait.h>
#include <sys/mman.h>
int a[1000000] = {13};
int main(int argc, const char **argv)
{
int n = 0;
int i;
pid_t pid;
int stat;
int *count_array;
int cpu_count = 288;
long total = 0;
struct timeval t1, t2 = {(argc > 1 ? atoi(argv[1]) : 10), 0};
if (argc > 2)
cpu_count = atoi(argv[2]);
count_array = mmap(NULL, cpu_count * sizeof(int),
(PROT_READ|PROT_WRITE),
(MAP_SHARED|MAP_ANONYMOUS), 0, 0);
if (count_array == MAP_FAILED) {
perror("mmap:");
return 0;
}
for (i = 0; i < cpu_count; ++i) {
pid = fork();
if (pid <= 0)
break;
if ((i & 0xf) == 0)
usleep(2);
}
if (pid != 0) {
if (i == 0) {
perror("fork:");
return 0;
}
for (;;) {
pid = wait(&stat);
if (pid < 0)
break;
}
for (i = 0; i < cpu_count; ++i)
total += count_array[i];
printf("Total %ld\n", total);
munmap(count_array, cpu_count * sizeof(int));
return 0;
}
gettimeofday(&t1, 0);
timeradd(&t1, &t2, &t1);
while (timercmp(&t2, &t1, <)) {
int b = 0;
int j;
for (j = 0; j < 1000000; j++)
b += a[j];
gettimeofday(&t2, 0);
n++;
}
count_array[i] = n;
return 0;
}
This patch changes change_pte_range() to skip shared copy-on-write pages
when called from change_prot_numa().
NOTE: change_prot_numa() is nominally called from task_numa_work() and
queue_pages_test_walk(). task_numa_work() is the auto NUMA balancing
path, and queue_pages_test_walk() is part of explicit NUMA policy
management. However, queue_pages_test_walk() only calls
change_prot_numa() when MPOL_MF_LAZY is specified and currently that is
not allowed, so change_prot_numa() is only called from auto NUMA
balancing.
In the case of explicit NUMA policy management, shared pages are not
migrated unless MPOL_MF_MOVE_ALL is specified, and MPOL_MF_MOVE_ALL
depends on CAP_SYS_NICE. Currently, there is no way to pass information
about MPOL_MF_MOVE_ALL to change_pte_range. This will have to be fixed
if MPOL_MF_LAZY is enabled and MPOL_MF_MOVE_ALL is to be honored in lazy
migration mode.
task_numa_work() skips the read-only VMAs of programs and shared
libraries.
Link: http://lkml.kernel.org/r/1516751617-7369-1-git-send-email-henry.willard@oracle.com
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Dan Carpenter has noticed that mbind migration callback (new_page) can
get a NULL vma pointer and choke on it inside alloc_huge_page_vma which
relies on the VMA to get the hstate. We used to BUG_ON this case but
the BUG_+ON has been removed recently by "hugetlb, mempolicy: fix the
mbind hugetlb migration".
The proper way to handle this is to get the hstate from the migrated
page and rely on huge_node (resp. get_vma_policy) do the right thing
with null VMA. We are currently falling back to the default mempolicy
in that case which is in line what THP path is doing here.
Link: http://lkml.kernel.org/r/20180110104712.GR1732@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
do_mbind migration code relies on alloc_huge_page_noerr for hugetlb
pages. alloc_huge_page_noerr uses alloc_huge_page which is a highlevel
allocation function which has to take care of reserves, overcommit or
hugetlb cgroup accounting. None of that is really required for the page
migration because the new page is only temporal and either will replace
the original page or it will be dropped. This is essentially as for
other migration call paths and there shouldn't be any reason to handle
mbind in a special way.
The current implementation is even suboptimal because the migration
might fail just because the hugetlb cgroup limit is reached, or the
overcommit is saturated.
Fix this by making mbind like other hugetlb migration paths. Add a new
migration helper alloc_huge_page_vma as a wrapper around
alloc_huge_page_nodemask with additional mempolicy handling.
alloc_huge_page_noerr has no more users and it can go.
Link: http://lkml.kernel.org/r/20180103093213.26329-7-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Reale <ar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Hugetlb allocator has several layer of allocation functions depending
and the purpose of the allocation. There are two allocators depending
on whether the page can be allocated from the page allocator or we need
a contiguous allocator. This is currently opencoded in
alloc_fresh_huge_page which is the only path that might allocate giga
pages which require the later allocator. Create alloc_fresh_huge_page
which hides this implementation detail and use it in all callers which
hardcoded the buddy allocator path (__hugetlb_alloc_buddy_huge_page).
This shouldn't introduce any funtional change because both migration and
surplus allocators exlude giga pages explicitly.
While we are at it let's do some renaming. The current scheme is not
consistent and overly painfull to read and understand. Get rid of
prefix underscores from most functions. There is no real reason to make
names longer.
* alloc_fresh_huge_page is the new layer to abstract underlying
allocator
* __hugetlb_alloc_buddy_huge_page becomes shorter and neater
alloc_buddy_huge_page.
* Former alloc_fresh_huge_page becomes alloc_pool_huge_page because we put
the new page directly to the pool
* alloc_surplus_huge_page can drop the opencoded prep_new_huge_page code
as it uses alloc_fresh_huge_page now
* others lose their excessive prefix underscores to make names shorter
[dan.carpenter@oracle.com: fix double unlock bug in alloc_surplus_huge_page()]
Link: http://lkml.kernel.org/r/20180109200559.g3iz5kvbdrz7yydp@mwanda
Link: http://lkml.kernel.org/r/20180103093213.26329-6-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Reale <ar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
alloc_surplus_huge_page increases the pool size and the number of
surplus pages opportunistically to prevent from races with the pool size
change. See commit d1c3fb1f8f29 ("hugetlb: introduce
nr_overcommit_hugepages sysctl") for more details.
The resulting code is unnecessarily hairy, cause code duplication and
doesn't allow to share the allocation paths. Moreover pool size changes
tend to be very seldom so optimizing for them is not really reasonable.
Simplify the code and allow to allocate a fresh surplus page as long as
we are under the overcommit limit and then recheck the condition after
the allocation and drop the new page if the situation has changed. This
should provide a reasonable guarantee that an abrupt allocation requests
will not go way off the limit.
If we consider races with the pool shrinking and enlarging then we
should be reasonably safe as well. In the first case we are off by one
in the worst case and the second case should work OK because the page is
not yet visible. We can waste CPU cycles for the allocation but that
should be acceptable for a relatively rare condition.
Link: http://lkml.kernel.org/r/20180103093213.26329-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Reale <ar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|