Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Doug and I are at a conference next week so if another PR is sent I
expect it to only be bug fixes. Parav noted yesterday that there are
some fringe case behavior changes in his work that he would like to
fix, and I see that Intel has a number of rc looking patches for HFI1
they posted yesterday.
Parav is again the biggest contributor by patch count with his ongoing
work to enable container support in the RDMA stack, followed by Leon
doing syzkaller inspired cleanups, though most of the actual fixing
went to RC.
There is one uncomfortable series here fixing the user ABI to actually
work as intended in 32 bit mode. There are lots of notes in the commit
messages, but the basic summary is we don't think there is an actual
32 bit kernel user of drivers/infiniband for several good reasons.
However we are seeing people want to use a 32 bit user space with 64
bit kernel, which didn't completely work today. So in fixing it we
required a 32 bit rxe user to upgrade their userspace. rxe users are
still already quite rare and we think a 32 bit one is non-existing.
- Fix RDMA uapi headers to actually compile in userspace and be more
complete
- Three shared with netdev pull requests from Mellanox:
* 7 patches, mostly to net with 1 IB related one at the back).
This series addresses an IRQ performance issue (patch 1),
cleanups related to the fix for the IRQ performance problem
(patches 2-6), and then extends the fragmented completion queue
support that already exists in the net side of the driver to the
ib side of the driver (patch 7).
* Mostly IB, with 5 patches to net that are needed to support the
remaining 10 patches to the IB subsystem. This series extends
the current 'representor' framework when the mlx5 driver is in
switchdev mode from being a netdev only construct to being a
netdev/IB dev construct. The IB dev is limited to raw Eth queue
pairs only, but by having an IB dev of this type attached to the
representor for a switchdev port, it enables DPDK to work on the
switchdev device.
* All net related, but needed as infrastructure for the rdma
driver
- Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers
- SRP performance updates
- IB uverbs write path cleanup patch series from Leon
- Add RDMA_CM support to ib_srpt. This is disabled by default. Users
need to set the port for ib_srpt to listen on in configfs in order
for it to be enabled
(/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port)
- TSO and Scatter FCS support in mlx4
- Refactor of modify_qp routine to resolve problems seen while
working on new code that is forthcoming
- More refactoring and updates of RDMA CM for containers support from
Parav
- mlx5 'fine grained packet pacing', 'ipsec offload' and 'device
memory' user API features
- Infrastructure updates for the new IOCTL interface, based on
increased usage
- ABI compatibility bug fixes to fully support 32 bit userspace on 64
bit kernel as was originally intended. See the commit messages for
extensive details
- Syzkaller bugs and code cleanups motivated by them"
* tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits)
IB/rxe: Fix for oops in rxe_register_device on ppc64le arch
IB/mlx5: Device memory mr registration support
net/mlx5: Mkey creation command adjustments
IB/mlx5: Device memory support in mlx5_ib
net/mlx5: Query device memory capabilities
IB/uverbs: Add device memory registration ioctl support
IB/uverbs: Add alloc/free dm uverbs ioctl support
IB/uverbs: Add device memory capabilities reporting
IB/uverbs: Expose device memory capabilities to user
RDMA/qedr: Fix wmb usage in qedr
IB/rxe: Removed GID add/del dummy routines
RDMA/qedr: Zero stack memory before copying to user space
IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR
IB/mlx5: Add information for querying IPsec capabilities
IB/mlx5: Add IPsec support for egress and ingress
{net,IB}/mlx5: Add ipsec helper
IB/mlx5: Add modify_flow_action_esp verb
IB/mlx5: Add implementation for create and destroy action_xfrm
IB/uverbs: Introduce ESP steering match filter
IB/uverbs: Add modify ESP flow_action
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"A bigger than usual pull request for SELinux, 13 patches (lucky!)
along with a scary looking diffstat.
Although if you look a bit closer, excluding the usual minor
tweaks/fixes, there are really only two significant changes in this
pull request: the addition of proper SELinux access controls for SCTP
and the encapsulation of a lot of internal SELinux state.
The SCTP changes are the result of a multi-month effort (maybe even a
year or longer?) between the SELinux folks and the SCTP folks to add
proper SELinux controls. A special thanks go to Richard for seeing
this through and keeping the effort moving forward.
The state encapsulation work is a bit of janitorial work that came out
of some early work on SELinux namespacing. The question of namespacing
is still an open one, but I believe there is some real value in the
encapsulation work so we've split that out and are now sending that up
to you"
* tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: wrap AVC state
selinux: wrap selinuxfs state
selinux: fix handling of uninitialized selinux state in get_bools/classes
selinux: Update SELinux SCTP documentation
selinux: Fix ltp test connect-syscall failure
selinux: rename the {is,set}_enforcing() functions
selinux: wrap global selinux state
selinux: fix typo in selinux_netlbl_sctp_sk_clone declaration
selinux: Add SCTP support
sctp: Add LSM hooks
sctp: Add ip option support
security: Add support for SCTP security hooks
netlabel: If PF_INET6, check sk_buff ip header version
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"We didn't have anything to send for v4.16, but we're back with a
little more than usual for v4.17.
Eleven patches in total, most fall into the small fix category, but
there are three non-trivial changes worth calling out:
- the audit entry filter is being removed after deprecating it for
quite a while (years of no one really using it because it turns out
to be not very practical)
- created our own version of "__mutex_owner()" because the locking
folks were upset we were using theirs
- improved our handling of kernel command line parameters to make
them more forgiving
- we fixed auditing of symlink operations
Everything passes the audit-testsuite and as of a few minutes ago it
merges well with your tree"
* tag 'audit-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: add refused symlink to audit_names
audit: remove path param from link denied function
audit: link denied should not directly generate PATH record
audit: make ANOM_LINK obey audit_enabled and audit_dummy_context
audit: do not panic on invalid boot parameter
audit: track the owner of the command mutex ourselves
audit: return on memory error to avoid null pointer dereference
audit: bail before bug check if audit disabled
audit: deprecate the AUDIT_FILTER_ENTRY filter
audit: session ID should not set arch quick field pointer
audit: update bugtracker and source URIs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
"This cycle was almost entirely improvements to the pstore compression
options, noted below:
- Add lz4hc and 842 to pstore compression options (Geliang Tang)
- Refactor to use crypto compression API (Geliang Tang)
- Fix up Kconfig dependencies for compression (Arnd Bergmann)
- Allow for run-time compression selection
- Remove stack VLA usage"
* tag 'pstore-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: fix crypto dependencies
pstore: Use crypto compress API
pstore/ram: Do not use stack VLA for parity workspace
pstore: Select compression at runtime
pstore: Avoid size casts for 842 compression
pstore: Add lz4hc and 842 compression support
|
|
Merge updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- the v9fs maintainers have been missing for a long time. I've taken
over v9fs patch slinging.
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits)
mm,oom_reaper: check for MMF_OOM_SKIP before complaining
mm/ksm: fix interaction with THP
mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t
headers: untangle kmemleak.h from mm.h
include/linux/mmdebug.h: make VM_WARN* non-rvals
mm/page_isolation.c: make start_isolate_page_range() fail if already isolated
mm: change return type to vm_fault_t
mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
kernel/fork.c: detect early free of a live mm
mm: make counting of list_lru_one::nr_items lockless
mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
block_invalidatepage(): only release page if the full page was invalidated
mm: kernel-doc: add missing parameter descriptions
mm/swap.c: remove @cold parameter description for release_pages()
mm/nommu: remove description of alloc_vm_area
zram: drop max_zpage_size and use zs_huge_class_size()
zsmalloc: introduce zs_huge_class_size()
mm: fix races between swapoff and flush dcache
fs/direct-io.c: minor cleanups in do_blockdev_direct_IO
...
|
|
Pull MTD updates from Boris Brezillon:
"MTD Core:
- Remove support for asynchronous erase (not implemented by any of
the existing drivers anyway)
- Remove Cyrille from the list of SPI NOR and MTD maintainers
- Fix kernel doc headers
- Allow users to define the partitions parsers they want to test
through a DT property (compatible of the partitions subnode)
- Remove the bfin-async-flash driver (the only architecture using it
has been removed)
- Fix pagetest test
- Add extra checks in mtd_erase()
- Simplify the MTD partition creation logic and get rid of
mtd_add_device_partitions()
MTD Drivers:
- Add endianness information to the physmap DT binding
- Add Eon EN29LV400A IDs to JEDEC probe logic
- Use %*ph where appropriate
SPI NOR Drivers:
- Make fsl-quaspi assign different names to MTD devices connected to
the same QSPI controller
- Remove an unneeded driver.bus assigned in the fsl-qspi driver
NAND Core:
- Prepare arrival of the SPI NAND subsystem by implementing a generic
(interface-agnostic) layer to ease manipulation of NAND devices
- Move onenand code base to the drivers/mtd/nand/ dir
- Rework timing mode selection
- Provide a generic way for NAND chip drivers to flag a specific
GET/SET FEATURE operation as supported/unsupported
- Stop embedding ONFI/JEDEC param page in nand_chip
NAND Drivers:
- Rework/cleanup of the mxc driver
- Various cleanups in the vf610 driver
- Migrate the fsmc and vf610 to ->exec_op()
- Get rid of the pxa driver (replaced by marvell_nand)
- Support ->setup_data_interface() in the GPMI driver
- Fix probe error path in several drivers
- Remove support for unused hw_syndrome mode in sunxi_nand
- Various minor improvements"
* tag 'mtd/for-4.17' of git://git.infradead.org/linux-mtd: (89 commits)
dt-bindings: fsl-quadspi: Add the example of two SPI NOR
mtd: fsl-quadspi: Distinguish the mtd device names
mtd: nand: Fix some function description mismatches in core.c
mtd: fsl-quadspi: Remove unneeded driver.bus assignment
mtd: rawnand: marvell: Rename ->ecc_clk into ->core_clk
mtd: rawnand: s3c2410: enhance the probe function error path
mtd: rawnand: tango: fix probe function error path
mtd: rawnand: sh_flctl: fix the probe function error path
mtd: rawnand: omap2: fix the probe function error path
mtd: rawnand: mxc: fix probe function error path
mtd: rawnand: denali: fix probe function error path
mtd: rawnand: davinci: fix probe function error path
mtd: rawnand: cafe: fix probe function error path
mtd: rawnand: brcmnand: fix probe function error path
mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode
mtd: rawnand: marvell: Fix clock resource by adding a register clock
mtd: ftl: Use DIV_ROUND_UP()
mtd: Fix some function description mismatches in mtdcore.c
mtd: physmap_of: update struct map_info's swap as per map requirement
dt-bindings: mtd-physmap: Add endianness supports
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- DM core passthrough ioctl fix to retain reference to DM table, and
that table's block devices, while issuing the ioctl to one of those
block devices.
- DM core passthrough ioctl fix to _not_ override the fmode_t used to
issue the ioctl. Overriding by using the fmode_t that the block
device was originally open with during DM table load is a liability.
- Add DM core support for secure erase forwarding and update the DM
linear and DM striped targets to support them.
- A DM core 4.16 stable fix to allow abnormal IO (e.g. discard, write
same, write zeroes) for targets that make use of the non-splitting IO
variant (as is done for multipath or thinp when layered directly on
NVMe).
- Allow DM targets to return a payload in response to a DM message that
they are sent. This is useful for DM targets that would like to
provide statistics data in response to DM messages.
- Update DM bufio to support non-power-of-2 block sizes. Numerous other
related changes prepare the DM bufio code for this support.
- Fix DM crypt to use a bounded amount of memory across the entire
system. This is to avoid OOM that can otherwise occur in response to
certain pathological IO workloads (e.g. discarding a large DM crypt
device).
- Add a 'check_at_most_once' feature to the DM verity target to allow
verity to be used on mobile devices that have very limited resources.
- Fix the DM integrity target to fail early if a keyed algorithm (e.g.
HMAC) is to be used but the key isn't set.
- Add non-power-of-2 support to the DM unstripe target.
- Eliminate the use of a Variable Length Array in the DM stripe target.
- Update the DM log-writes target to record metadata (REQ_META flag).
- DM raid fixes for its nosync status and some variable range issues.
* tag 'for-4.17/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits)
dm: remove fmode_t argument from .prepare_ioctl hook
dm: hold DM table for duration of ioctl rather than use blkdev_get
dm raid: fix parse_raid_params() variable range issue
dm verity: make verity_for_io_block static
dm verity: add 'check_at_most_once' option to only validate hashes once
dm bufio: don't embed a bio in the dm_buffer structure
dm bufio: support non-power-of-two block sizes
dm bufio: use slab cache for dm_buffer structure allocations
dm bufio: reorder fields in dm_buffer structure
dm bufio: relax alignment constraint on slab cache
dm bufio: remove code that merges slab caches
dm bufio: get rid of slab cache name allocations
dm bufio: move dm-bufio.h to include/linux/
dm bufio: delete outdated comment
dm: add support for secure erase forwarding
dm: backfill abnormal IO support to non-splitting IO submission
dm raid: fix nosync status
dm mpath: use DM_MAPIO_SUBMITTED instead of magic number 0 in process_queued_bios()
dm stripe: get rid of a Variable Length Array (VLA)
dm log writes: record metadata flag for better flags record
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted stuff, including Christoph's I_DIRTY patches"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: move I_DIRTY_INODE to fs.h
ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) calls
fs: fold open_check_o_direct into do_dentry_open
vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents
vfs: make sure struct filename->iname is word-aligned
get rid of pointless includes of fs_struct.h
[poll] annotate SAA6588_CMD_POLL users
|
|
Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
reason. It looks like it's only a convenience, so remove kmemleak.h
from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
don't already #include it. Also remove <linux/kmemleak.h> from source
files that do not use it.
This is tested on i386 allmodconfig and x86_64 allmodconfig. It would
be good to run it through the 0day bot for other $ARCHes. I have
neither the horsepower nor the storage space for the other $ARCHes.
Update: This patch has been extensively build-tested by both the 0day
bot & kisskb/ozlabs build farms. Both of them reported 2 build failures
for which patches are included here (in v2).
[ slab.h is the second most used header file after module.h; kernel.h is
right there with slab.h. There could be some minor error in the
counting due to some #includes having comments after them and I didn't
combine all of those. ]
[akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr]
Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au> [2 build failures]
Reported-by: Fengguang Wu <fengguang.wu@intel.com> [2 build failures]
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
At present the construct
if (VM_WARN(...))
will compile OK with CONFIG_DEBUG_VM=y and will fail with
CONFIG_DEBUG_VM=n. The reason is that VM_{WARN,BUG}* have always been
special wrt. {WARN/BUG}* and never generate any code when DEBUG_VM is
disabled. So we cannot really use it in conditionals.
We considered changing things so that this construct works in both cases
but that might cause unwanted code generation with CONFIG_DEBUG_VM=n.
It is safer and simpler to make the build fail in both cases.
[akpm@linux-foundation.org: changelog]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The plan for these patches is to introduce the typedef, initially just
as documentation ("These functions should return a VM_FAULT_ status").
We'll trickle the patches to individual drivers/filesystems in through
the maintainers, as far as possible. Then we'll change the typedef to
an unsigned int and break the compilation of any unconverted
drivers/filesystems.
vmf_insert_page(), vmf_insert_mixed() and vmf_insert_pfn() are three
newly added functions. The various drivers/filesystems where return
value of fault(), huge_fault(), page_mkwrite() and pfn_mkwrite() get
converted, will need them. These functions will return correct
VM_FAULT_ code based on err value.
We've had bugs before where drivers returned -EFOO. And we have this
silly inefficiency where vm_insert_xxx() return an errno which (afaict)
every driver then converts into a VM_FAULT code. In many cases drivers
failed to return correct VM_FAULT code value despite of vm_insert_xxx()
fails. We have indentified and clean up all those existing bugs and
silly inefficiencies in driver/filesystems by adding these three new
inline wrappers. As mentioned above, we will trickle those patches to
individual drivers/filesystems in through maintainers after these three
wrapper functions are merged.
Eventually we can convert vm_insert_xxx() into vmf_insert_xxx() and
remove these inline wrappers, but these are a good intermediate step.
Link: http://lkml.kernel.org/r/20180310162351.GA7422@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Kswapd will not wakeup if per-zone watermarks are not failing or if too
many previous attempts at background reclaim have failed.
This can be true if there is a lot of free memory available. For high-
order allocations, kswapd is responsible for waking up kcompactd for
background compaction. If the zone is not below its watermarks or
reclaim has recently failed (lots of free memory, nothing left to
reclaim), kcompactd does not get woken up.
When __GFP_DIRECT_RECLAIM is not allowed, allow kcompactd to still be
woken up even if kswapd will not reclaim. This allows high-order
allocations, such as thp, to still trigger background compaction even
when the zone has an abundance of free memory.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803111659420.209721@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During the reclaiming slab of a memcg, shrink_slab iterates over all
registered shrinkers in the system, and tries to count and consume
objects related to the cgroup. In case of memory pressure, this behaves
bad: I observe high system time and time spent in list_lru_count_one()
for many processes on RHEL7 kernel.
This patch makes list_lru_node::memcg_lrus rcu protected, that allows to
skip taking spinlock in list_lru_count_one().
Shakeel Butt with the patch observes significant perf graph change. He
says:
========================================================================
Setup: running a fork-bomb in a memcg of 200MiB on a 8GiB and 4 vcpu
VM and recording the trace with 'perf record -g -a'.
The trace without the patch:
+ 34.19% fb.sh [kernel.kallsyms] [k] queued_spin_lock_slowpath
+ 30.77% fb.sh [kernel.kallsyms] [k] _raw_spin_lock
+ 3.53% fb.sh [kernel.kallsyms] [k] list_lru_count_one
+ 2.26% fb.sh [kernel.kallsyms] [k] super_cache_count
+ 1.68% fb.sh [kernel.kallsyms] [k] shrink_slab
+ 0.59% fb.sh [kernel.kallsyms] [k] down_read_trylock
+ 0.48% fb.sh [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
+ 0.38% fb.sh [kernel.kallsyms] [k] shrink_node_memcg
+ 0.32% fb.sh [kernel.kallsyms] [k] queue_work_on
+ 0.26% fb.sh [kernel.kallsyms] [k] count_shadow_nodes
With the patch:
+ 0.16% swapper [kernel.kallsyms] [k] default_idle
+ 0.13% oom_reaper [kernel.kallsyms] [k] mutex_spin_on_owner
+ 0.05% perf [kernel.kallsyms] [k] copy_user_generic_string
+ 0.05% init.real [kernel.kallsyms] [k] wait_consider_task
+ 0.05% kworker/0:0 [kernel.kallsyms] [k] finish_task_switch
+ 0.04% kworker/2:1 [kernel.kallsyms] [k] finish_task_switch
+ 0.04% kworker/3:1 [kernel.kallsyms] [k] finish_task_switch
+ 0.04% kworker/1:0 [kernel.kallsyms] [k] finish_task_switch
+ 0.03% binary [kernel.kallsyms] [k] copy_page
========================================================================
Thanks Shakeel for the testing.
[ktkhai@virtuozzo.com: v2]
Link: http://lkml.kernel.org/r/151203869520.3915.2587549826865799173.stgit@localhost.localdomain
Link: http://lkml.kernel.org/r/150583358557.26700.8490036563698102569.stgit@localhost.localdomain
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Tested-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Patch series "zsmalloc/zram: drop zram's max_zpage_size", v3.
ZRAM's max_zpage_size is a bad thing. It forces zsmalloc to store
normal objects as huge ones, which results in bigger zsmalloc memory
usage. Drop it and use actual zsmalloc huge-class value when decide if
the object is huge or not.
This patch (of 2):
Not every object can be share its zspage with other objects, e.g. when
the object is as big as zspage or nearly as big a zspage. For such
objects zsmalloc has a so called huge class - every object which belongs
to huge class consumes the entire zspage (which consists of a physical
page). On x86_64, PAGE_SHIFT 12 box, the first non-huge class size is
3264, so starting down from size 3264, objects can share page(-s) and
thus minimize memory wastage.
ZRAM, however, has its own statically defined watermark for huge
objects, namely "3 * PAGE_SIZE / 4 = 3072", and forcibly stores every
object larger than this watermark (3072) as a PAGE_SIZE object, in other
words, to a huge class, while zsmalloc can keep some of those objects in
non-huge classes. This results in increased memory consumption.
zsmalloc knows better if the object is huge or not. Introduce
zs_huge_class_size() function which tells if the given object can be
stored in one of non-huge classes or not. This will let us to drop
ZRAM's huge object watermark and fully rely on zsmalloc when we decide
if the object is huge.
[sergey.senozhatsky.work@gmail.com: add pool param to zs_huge_class_size()]
Link: http://lkml.kernel.org/r/20180314081833.1096-2-sergey.senozhatsky@gmail.com
Link: http://lkml.kernel.org/r/20180306070639.7389-2-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Thanks to commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB
trunks"), after swapoff the address_space associated with the swap
device will be freed. So page_mapping() users which may touch the
address_space need some kind of mechanism to prevent the address_space
from being freed during accessing.
The dcache flushing functions (flush_dcache_page(), etc) in architecture
specific code may access the address_space of swap device for anonymous
pages in swap cache via page_mapping() function. But in some cases
there are no mechanisms to prevent the swap device from being swapoff,
for example,
CPU1 CPU2
__get_user_pages() swapoff()
flush_dcache_page()
mapping = page_mapping()
... exit_swap_address_space()
... kvfree(spaces)
mapping_mapped(mapping)
The address space may be accessed after being freed.
But from cachetlb.txt and Russell King, flush_dcache_page() only care
about file cache pages, for anonymous pages, flush_anon_page() should be
used. The implementation of flush_dcache_page() in all architectures
follows this too. They will check whether page_mapping() is NULL and
whether mapping_mapped() is true to determine whether to flush the
dcache immediately. And they will use interval tree (mapping->i_mmap)
to find all user space mappings. While mapping_mapped() and
mapping->i_mmap isn't used by anonymous pages in swap cache at all.
So, to fix the race between swapoff and flush dcache, __page_mapping()
is add to return the address_space for file cache pages and NULL
otherwise. All page_mapping() invoking in flush dcache functions are
replaced with page_mapping_file().
[akpm@linux-foundation.org: simplify page_mapping_file(), per Mike]
Link: http://lkml.kernel.org/r/20180305083634.15174-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
clang reports the following compile warning.
In file included from mm/vmscan.c:56:
./include/linux/swapops.h:327:22: warning:
section attribute is specified on redeclared variable [-Wsection]
extern atomic_long_t num_poisoned_pages __read_mostly;
^
./include/linux/mm.h:2585:22: note: previous declaration is here
extern atomic_long_t num_poisoned_pages;
^
Let's use __read_mostly everywhere.
Link: http://lkml.kernel.org/r/1519686565-8224-1-git-send-email-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When device-dax is operating in huge-page mode we want it to behave like
hugetlbfs and report the MMU page mapping size that is being enforced by
the vma.
Similar to commit 31383c6865a5 "mm, hugetlbfs: introduce ->split() to
vm_operations_struct" it would be messy to teach vma_mmu_pagesize()
about device-dax page mapping sizes in the same (hstate) way that
hugetlbfs communicates this attribute. Instead, these patches introduce
a new ->pagesize() vm operation.
Link: http://lkml.kernel.org/r/151996254734.27922.15813097401404359642.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
should_failslab() is a convenient function to hook into for directed
error injection into kmalloc(). However, it is only available if a
config flag is set.
The following BCC script, for example, fails kmalloc() calls after a
btrfs umount:
from bcc import BPF
prog = r"""
BPF_HASH(flag);
#include <linux/mm.h>
int kprobe__btrfs_close_devices(void *ctx) {
u64 key = 1;
flag.update(&key, &key);
return 0;
}
int kprobe__should_failslab(struct pt_regs *ctx) {
u64 key = 1;
u64 *res;
res = flag.lookup(&key);
if (res != 0) {
bpf_override_return(ctx, -ENOMEM);
}
return 0;
}
"""
b = BPF(text=prog)
while 1:
b.kprobe_poll()
This patch refactors the should_failslab implementation so that the
function is always available for error injection, independent of flags.
This change would be similar in nature to commit f5490d3ec921 ("block:
Add should_fail_bio() for bpf error injection").
Link: http://lkml.kernel.org/r/20180222020320.6944-1-hmclauchlan@fb.com
Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Johannes Weiner <jweiner@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch makes do_swap_page() not need to be aware of two different
swap readahead algorithms. Just unify cluster-based and vma-based
readahead function call.
Link: http://lkml.kernel.org/r/1509520520-32367-3-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/20180220085249.151400-3-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When I see recent change of swap readahead, I am very unhappy about
current code structure which diverges two swap readahead algorithm in
do_swap_page. This patch is to clean it up.
Main motivation is that fault handler doesn't need to be aware of
readahead algorithms but just should call swapin_readahead.
As first step, this patch cleans up a little bit but not perfect (I just
separate for review easier) so next patch will make the goal complete.
[minchan@kernel.org: do not check readahead flag with THP anon]
Link: http://lkml.kernel.org/r/874lm83zho.fsf@yhuang-dev.intel.com
Link: http://lkml.kernel.org/r/20180227232611.169883-1-minchan@kernel.org
Link: http://lkml.kernel.org/r/1509520520-32367-2-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/20180220085249.151400-2-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
page_ref_unfreeze() has exactly that semantic. No functional changes:
just minus one barrier and proper handling of PPro errata.
Link: http://lkml.kernel.org/r/151844393004.210639.4672319312617954272.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Recently the following BUG was reported:
Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
Memory failure: 0x3c0000: recovery action for huge page: Recovered
BUG: unable to handle kernel paging request at ffff8dfcc0003000
IP: gup_pgd_range+0x1f0/0xc20
PGD 17ae72067 P4D 17ae72067 PUD 0
Oops: 0000 [#1] SMP PTI
...
CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
a 1GB hugepage. This happens because get_user_pages_fast() is not aware
of a migration entry on pud that was created in the 1st madvise() event.
I think that conversion to pud-aligned migration entry is working, but
other MM code walking over page table isn't prepared for it. We need
some time and effort to make all this work properly, so this patch
avoids the reported bug by just disabling error handling for 1GB
hugepage.
[n-horiguchi@ah.jp.nec.com: v2]
Link: http://lkml.kernel.org/r/1517284444-18149-1-git-send-email-n-horiguchi@ah.jp.nec.com
Link: http://lkml.kernel.org/r/1517207283-15769-1-git-send-email-n-horiguchi@ah.jp.nec.com
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During memory hotplugging we traverse struct pages three times:
1. memset(0) in sparse_add_one_section()
2. loop in __add_section() to set do: set_page_node(page, nid); and
SetPageReserved(page);
3. loop in memmap_init_zone() to call __init_single_pfn()
This patch removes the first two loops, and leaves only loop 3. All
struct pages are initialized in one place, the same as it is done during
boot.
The benefits:
- We improve memory hotplug performance because we are not evicting the
cache several times and also reduce loop branching overhead.
- Remove condition from hotpath in __init_single_pfn(), that was added
in order to fix the problem that was reported by Bharata in the above
email thread, thus also improve performance during normal boot.
- Make memory hotplug more similar to the boot memory initialization
path because we zero and initialize struct pages only in one
function.
- Simplifies memory hotplug struct page initialization code, and thus
enables future improvements, such as multi-threading the
initialization of struct pages in order to improve hotplug
performance even further on larger machines.
[pasha.tatashin@oracle.com: v5]
Link: http://lkml.kernel.org/r/20180228030308.1116-7-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-7-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During memory hotplugging the probe routine will leave struct pages
uninitialized, the same as it is currently done during boot. Therefore,
we do not want to access the inside of struct pages before
__init_single_page() is called during onlining.
Because during hotplug we know that pages in one memory block belong to
the same numa node, we can skip the checking. We should keep checking
for the boot case.
[pasha.tatashin@oracle.com: s/register_new_memory()/hotplug_memory_register()]
Link: http://lkml.kernel.org/r/20180228030308.1116-6-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180215165920.8570-6-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During boot we poison struct page memory in order to ensure that no one
is accessing this memory until the struct pages are initialized in
__init_single_page().
This patch adds more scrutiny to this checking by making sure that flags
do not equal the poison pattern when they are accessed. The pattern is
all ones.
Since node id is also stored in struct page, and may be accessed quite
early, we add this enforcement into page_to_nid() function as well.
Note, this is applicable only when NODE_NOT_IN_PAGE_FLAGS=n
[pasha.tatashin@oracle.com: v4]
Link: http://lkml.kernel.org/r/20180215165920.8570-4-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180213193159.14606-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Deferred page initialization allows the boot cpu to initialize a small
subset of the system's pages early in boot, with other cpus doing the
rest later on.
It is, however, problematic to know how many pages the kernel needs
during boot. Different modules and kernel parameters may change the
requirement, so the boot cpu either initializes too many pages or runs
out of memory.
To fix that, initialize early pages on demand. This ensures the kernel
does the minimum amount of work to initialize pages during boot and
leaves the rest to be divided in the multithreaded initialization path
(deferred_init_memmap).
The on-demand code is permanently disabled using static branching once
deferred pages are initialized. After the static branch is changed to
false, the overhead is up-to two branch-always instructions if the zone
watermark check fails or if rmqueue fails.
Sergey Senozhatsky noticed that while deferred pages currently make
sense only on NUMA machines (we start one thread per latency node),
CONFIG_NUMA is not a requirement for CONFIG_DEFERRED_STRUCT_PAGE_INIT,
so that is also must be addressed in the patch.
[akpm@linux-foundation.org: fix typo in comment, make deferred_pages static]
[pasha.tatashin@oracle.com: fix min() type mismatch warning]
Link: http://lkml.kernel.org/r/20180212164543.26592-1-pasha.tatashin@oracle.com
[pasha.tatashin@oracle.com: use zone_to_nid() in deferred_grow_zone()]
Link: http://lkml.kernel.org/r/20180214163343.21234-2-pasha.tatashin@oracle.com
[pasha.tatashin@oracle.com: might_sleep warning]
Link: http://lkml.kernel.org/r/20180306192022.28289-1-pasha.tatashin@oracle.com
[akpm@linux-foundation.org: s/spin_lock/spin_lock_irq/ in page_alloc_init_late()]
[pasha.tatashin@oracle.com: v5]
Link: http://lkml.kernel.org/r/20180309220807.24961-3-pasha.tatashin@oracle.com
[akpm@linux-foundation.org: tweak comments]
[pasha.tatashin@oracle.com: v6]
Link: http://lkml.kernel.org/r/20180313182355.17669-3-pasha.tatashin@oracle.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20180209192216.20509-2-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Vlastimil Babka reported about a window issue during which when deferred
pages are initialized, and the current version of on-demand
initialization is finished, allocations may fail. While this is highly
unlikely scenario, since this kind of allocation request must be large,
and must come from interrupt handler, we still want to cover it.
We solve this by initializing deferred pages with interrupts disabled,
and holding node_size_lock spin lock while pages in the node are being
initialized. The on-demand deferred page initialization that comes
later will use the same lock, and thus synchronize with
deferred_init_memmap().
It is unlikely for threads that initialize deferred pages to be
interrupted. They run soon after smp_init(), but before modules are
initialized, and long before user space programs. This is why there is
no adverse effect of having these threads running with interrupts
disabled.
[pasha.tatashin@oracle.com: v6]
Link: http://lkml.kernel.org/r/20180313182355.17669-2-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180309220807.24961-2-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Gioh Kim <gi-oh.kim@profitbricks.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
alloc_contig_range() initiates compaction and eventual migration for the
purpose of either CMA or HugeTLB allocations. At present, the reason
code remains the same MR_CMA for either of these cases. Let's make it
MR_CONTIG_RANGE which will appropriately reflect the reason code in both
these cases.
Link: http://lkml.kernel.org/r/20180202091518.18798-1-khandual@linux.vnet.ibm.com
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
struct kmem_cache_order_objects is for mixing order and number of
objects, and orders aren't big enough to warrant 64-bit width.
Propagate unsignedness down so that everything fits.
!!! Patch assumes that "PAGE_SIZE << order" doesn't overflow. !!!
Link: http://lkml.kernel.org/r/20180305200730.15812-23-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If kmem case sizes are 32-bit, then usecopy region should be too.
Link: http://lkml.kernel.org/r/20180305200730.15812-21-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If SLAB doesn't support 4GB+ kmem caches (it never did), KASAN should
not do it as well.
Link: http://lkml.kernel.org/r/20180305200730.15812-20-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Linux doesn't support negative length objects (including meta data).
Link: http://lkml.kernel.org/r/20180305200730.15812-18-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Linux doesn't support negative length objects.
Link: http://lkml.kernel.org/r/20180305200730.15812-17-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
->offset is free pointer offset from the start of the object, can't be
negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-16-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
/*
* cpu_partial determined the maximum number of objects
* kept in the per cpu partial lists of a processor.
*/
Can't be negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-15-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
->inuse is "the number of bytes in actual use by the object",
can't be negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-14-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Kmem cache alignment can't be negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-13-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
->reserved is either 0 or sizeof(struct rcu_head), can't be negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-12-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Padding length can't be negative.
Link: http://lkml.kernel.org/r/20180305200730.15812-11-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
->max_attr_size is maximum length of every SLAB memcg attribute
ever written. VFS limits those to INT_MAX.
Link: http://lkml.kernel.org/r/20180305200730.15812-10-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
->remote_node_defrag_ratio is in range 0..1000.
This also adds a check and modifies the behavior to return an error
code. Before this patch invalid values were ignored.
Link: http://lkml.kernel.org/r/20180305200730.15812-9-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
struct kmem_cache::size and ::align were always 32-bit.
Out of curiosity I created 4GB kmem_cache, it oopsed with division by 0.
kmem_cache_create(1UL<<32+1) created 1-byte cache as expected.
size_t doesn't work and never did.
Link: http://lkml.kernel.org/r/20180305200730.15812-6-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
kmalloc_size() derives size of kmalloc cache from internal index, which
can't be negative.
Propagate unsignedness a bit.
Link: http://lkml.kernel.org/r/20180305200730.15812-3-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
kmalloc_index() return index into an array of kmalloc kmem caches,
therefore should be unsigned.
Space savings with SLUB on trimmed down .config:
add/remove: 0/1 grow/shrink: 6/56 up/down: 85/-557 (-472)
Function old new delta
calculate_sizes 924 983 +59
on_freelist 589 604 +15
init_cache_random_seq 122 127 +5
ext4_mb_init 1206 1210 +4
slab_pad_check.part 270 271 +1
cpu_partial_store 112 113 +1
usersize_show 28 27 -1
...
new_slab 1871 1837 -34
slab_order 204 - -204
This patch start a series of converting SLUB (mostly) to "unsigned int".
1) Most integers in the code are in fact unsigned entities: array
indexes, lengths, buffer sizes, allocation orders. It is therefore
better to use unsigned variables
2) Some integers in the code are either "size_t" or "unsigned long" for
no reason.
size_t usually comes from people trying to maintain type correctness
and figuring out that "sizeof" operator returns size_t or
memset/memcpy takes size_t so should everything passed to it.
However the number of 4GB+ objects in the kernel is very small. Most,
if not all, dynamically allocated objects with kmalloc() or
kmem_cache_create() aren't actually big. Maintaining wide types
doesn't do anything.
64-bit ops are bigger than 32-bit on our beloved x86_64,
so try to not use 64-bit where it isn't necessary
(read: everywhere where integers are integers not pointers)
3) in case of SLAB allocators, there are additional limitations
*) page->inuse, page->objects are only 16-/15-bit,
*) cache size was always 32-bit
*) slab orders are small, order 20 is needed to go 64-bit on x86_64
(PAGE_SIZE << order)
Basically everything is 32-bit except kmalloc(1ULL<<32) which gets
shortcut through page allocator.
Christoph said:
:
: That changes with large base page size on power and ARM64 f.e. but then
: we do not want to encourage larger allocations through slab anyways.
Link: http://lkml.kernel.org/r/20180305200730.15812-2-adobriyan@gmail.com
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
"The main addition this time around is the new ARM "SCMI" framework,
which is the latest in a series of standards coming from ARM to do
power management in a platform independent way.
This has been through many review cycles, and it relies on a rather
interesting way of using the mailbox subsystem, but in the end I
agreed that Sudeep's version was the best we could do after all.
Other changes include:
- the ARM CCN driver is moved out of drivers/bus into drivers/perf,
which makes more sense. Similarly, the performance monitoring
portion of the CCI driver are moved the same way and cleaned up a
little more.
- a series of updates to the SCPI framework
- support for the Mediatek mt7623a SoC in drivers/soc
- support for additional NVIDIA Tegra hardware in drivers/soc
- a new reset driver for Socionext Uniphier
- lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and
drivers/firmware and drivers/reset across platforms"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits)
reset: uniphier: add ethernet reset control support for PXs3
reset: stm32mp1: Enable stm32mp1 reset driver
dt-bindings: reset: add STM32MP1 resets
reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control
reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency
reset: modify the way reset lookup works for board files
reset: add support for non-DT systems
clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove
firmware: arm_scmi: prevent accessing rate_discrete uninitialized
hwmon: (scmi) return -EINVAL when sensor information is unavailable
amlogic: meson-gx-socinfo: Update soc ids
soc/tegra: pmc: Use the new reset APIs to manage reset controllers
soc: mediatek: update power domain data of MT2712
dt-bindings: soc: update MT2712 power dt-bindings
cpufreq: scmi: add thermal dependency
soc: mediatek: fix the mistaken pointer accessed when subdomains are added
soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC
soc: mediatek: avoid hardcoded value with bus_prot_mask
dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding
dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Arnd Bergmann:
"This release brings up a new platform based on the old ARM9 core: the
Nuvoton NPCM is used as a baseboard management controller, competing
with the better known ASpeed AST2xx series.
Another important change is the addition of ARMv7-A based chips in
mach-stm32. The older parts in this platform are ARMv7-M based
microcontrollers, now they are expanding to general-purpose workloads.
The other changes are the usual defconfig updates to enable additional
drivers, lesser bugfixes. The largest updates as often are the ongoing
OMAP cleanups, but we also have a number of changes for the older PXA
and davinci platforms this time.
For the Renesas shmobile/r-car platform, some new infrastructure is
needed to make the watchdog work correctly.
Supporting Multiprocessing on Allwinner A80 required a significant
amount of new code, but is not doing anything unexpected"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (179 commits)
arm: npcm: modify configuration for the NPCM7xx BMC.
MAINTAINERS: update entry for ARM/berlin
ARM: omap2: fix am43xx build without L2X0
ARM: davinci: da8xx: simplify CFGCHIP regmap_config
ARM: davinci: da8xx: fix oops in USB PHY driver due to stack allocated platform_data
ARM: multi_v7_defconfig: add NXP FlexCAN IP support
ARM: multi_v7_defconfig: enable thermal driver for i.MX devices
ARM: multi_v7_defconfig: add RN5T618 PMIC family support
ARM: multi_v7_defconfig: add NXP graphics drivers
ARM: multi_v7_defconfig: add GPMI NAND controller support
ARM: multi_v7_defconfig: add OCOTP driver for NXP SoCs
ARM: multi_v7_defconfig: configure I2C driver built-in
arm64: defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE
ARM: imx: fix imx6sll-only build
ARM: imx: select ARM_CPU_SUSPEND for CPU_IDLE as well
ARM: mxs_defconfig: Re-sync defconfig
ARM: imx_v4_v5_defconfig: Use the generic fsl-asoc-card driver
ARM: imx_v4_v5_defconfig: Re-sync defconfig
arm64: defconfig: enable stmmac ethernet to defconfig
ARM: EXYNOS: Simplify code in coupled CPU idle hot path
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC device tree updates from Arnd Bergmann:
"This is the usual set of changes for device trees, with over 700
non-merged changesets. There is an ongoing set of dtc warning fixes
and the usual bugfixes, cleanups and added device support.
The most interesting bit as usual is support for new machines listed
below:
- The Allwinner H6 makes its debut with the Pine-H64 board, and we
get two new machines based on its older siblings: the H5 based
OrangePi Zero+ and the A64 based Teres-I Laptop from Olimex. On the
32-bit side, we add The Olimex som204 based on Allwinner A20, and
the Banana Pi M2 Zero development board (based on H2).
- NVIDIA adds support for Tegra194 aka "Xavier", plus their p2972
development board and p2888 CPU module.
- The Nuvoton npcm750 is a BMC that was newly added, for now we only
support running on the evaluation board.
- STmicroelectronics stm32 gains support for the stm32mp157c and two
evaluation boards.
- The Toradex Colibri board family grows a few members based on the
i.MX6ULL variant.
- The Advantec DMS-BA16 is a Qseven module using the NXP i.MX6 family
of chips.
- The Phytec phyBOARD Mira is a family of industrial boards based on
i.MX6. For now, four models get added.
- TI am335x based PDU-001 is an industrial embedded machine used for
traffic monitoring
- The Aspeed platform now supports running on the BMC on the Qualcomm
Centriq 2400 server
- Samsung Exynos4 based Galaxy S3 is a family of mobile phones
Qualcomm msm8974 based Galaxy S5 is a rather different phone made
by the same company.
- The Xilinx Zynq and ZynqMP platforms now gained a lot of dts file
for the various boards made by Xilinx themselves, as well as the
Digilent Zybo Z7.
- The ARM Versatile family now supports the "IB2" interface board.
- The Renesas H2 based "Stout" and the H3 based Salvator-X are more
evaluation boards named after a kind of beer, as most of them are.
The r8a77980 (V3H) based "Condor" apparently doesn't follow that
tradition. ;-)
- ROC-RK3328-CC is a simple developement board from the Libre
Computer Project, based on the Rockchips RK3328 SoC
- Haiku is another development board plus Qseven module based on
Rockchips RK3368 and made by Theobroma Systems"
* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (701 commits)
arm: dts: modify Nuvoton NPCM7xx device tree structure
arm: dts: modify Makefile NPCM750 configuration name
arm: dts: modify clock binding in NPCM750 device tree
arm: dts: modify timer register size in NPCM750 device tree
arm: dts: modify UART compatible name in NPCM750 device tree
arm: dts: add watchdog device to NPCM750 device tree
arm64: dts: uniphier: add ethernet node for PXs3
ARM: dts: uniphier: add pinctrl groups of ethernet for second instance
arm: dts: kirkwood*.dts: use SPDX-License-Identifier for board using GPL-2.0+
arm: dts: kirkwood*.dts: use SPDX-License-Identifier for boards using GPL-2.0+/MIT
arm: dts: kirkwood*.dts: use SPDX-License-Identifier for boards using GPL-2.0
arm: dts: armada-385-turris-omnia: use SPDX-License-Identifier
arm: dts: armada-385-db-ap: use SPDX-License-Identifier
arm: dts: armada-388-rd: use SPDX-License-Identifier
arm: dts: armada-xp-db-xc3-24g4xg: use SPDX-License-Identifier
arm: dts: armada-xp-db-dxbc2: use SPDX-License-Identifier
arm: dts: armada-370-db: use SPDX-License-Identifier
arm: dts: armada-*.dts: use SPDX-License-Identifier for most of the Armada based board
arm: dts: armada-xp-98dx: use SPDX-License-Identifier for prestara 98d SoCs
arm: dts: armada-*.dtsi: use SPDX-License-Identifier for most of the Armada SoCs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree updates from Rob Herring:
- Sync dtc to upstream version v1.4.6-9-gaadd0b65c987. This adds a
bunch more warnings (hidden behind W=1).
- Build dtc lexer and parser files instead of using shipped versions.
- Rework overlay apply API to take an FDT as input and apply overlays
in a single step.
- Add a phandle lookup cache. This improves boot time by hundreds of
msec on systems with large DT.
- Add trivial mcp4017/18/19 potentiometers bindings.
- Remove VLA stack usage in DT code.
* tag 'devicetree-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (26 commits)
of: unittest: fix an error code in of_unittest_apply_overlay()
of: unittest: move misplaced function declaration
of: unittest: Remove VLA stack usage
of: overlay: Fix forgotten reference to of_overlay_apply()
of: Documentation: Fix forgotten reference to of_overlay_apply()
of: unittest: local return value variable related cleanups
of: unittest: remove unneeded local return value variables
dt-bindings: trivial: add various mcp4017/18/19 potentiometers
of: unittest: fix an error test in of_unittest_overlay_8()
of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
dt-bindings: rockchip-dw-mshc: use consistent clock names
MAINTAINERS: Add linux/of_*.h headers to appropriate subsystems
scripts: turn off some new dtc warnings by default
scripts/dtc: Update to upstream version v1.4.6-9-gaadd0b65c987
scripts/dtc: generate lexer and parser during build instead of shipping
powerpc: boot: add strrchr function
of: overlay: do not include path in full_name of added nodes
of: unittest: clean up changeset test
arm64/efi: Make strrchr() available to the EFI namespace
ARM: boot: add strrchr function
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo updates from Eric Biederman:
"The work on cleaning up and getting the bugs out of siginfo generation
was largely stalled this round. The progress that was made was the
definition of FPE_FLTUNK. Which is usable to fix many of the cases
where siginfo generation is erroneously generating SI_USER by setting
si_code to 0, that has recently been tagged as FPE_FIXME.
You already have the change by way of the arm64 tree as that
definition was pulled into the arm64 tree to allow fixing the problem
there.
What remains is the second round of fixing for what I thought was a
trivial change to the struct siginfo when put the union in _sigfault
where it belongs. Do to historical reasons 32bit m68k only ensures
that pointers are 2 byte aligned. So I have added a m68k test case
made of BUILD_BUG_ONs to verify I have this fix correct and possibly
catch problems, and I have computed the number of bytes of padding
needed for the _addr_bnd and _addr_pkey cases and just use an array of
characters that size.
For pure paranoia I have written the code so if there is an
architecture out there that does not perform any alignment of
structures it should still work.
With the removal of all of the stale arechitectures this cycle future
work on cleaning up struct siginfo should be much easier. Almost all
of the conflicting si_code definitions have been removed with the
removal of (blackfin, tile, and frv). Plus some of the most difficult
to test cases have simply been removed from the tree.
Which means that with a little luck copy_siginfo_to_user can become a
light weight wrapper around copy_to_user in the next cycle"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
m68k: Verify the offsets in struct siginfo never change.
signal: Correct the offset of si_pkey and si_lower in struct siginfo on m68k
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Add info about loaded kdump kernel into the dump stack header
- Move dump-stack related code from printk.c to lib/dump_stack.c
- Write message about suspending consoles in KERN_INFO log level
* 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
printk: change message to pr_info
printk: move dump stack related code to lib/dump_stack.c
print kdump kernel loaded status in stack dump
|