kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2018-05-30	crypto: testmgr - add extra ecb(des) encryption test vectors	Eric Biggers	1	-0/+22
	Two "ecb(des)" decryption test vectors don't exactly match any of the encryption test vectors with input and result swapped. In preparation for removing the decryption test vectors, add these to the encryption test vectors, so we don't lose any test coverage. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2018-05-30	btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist	Su Yue	1	-1/+1
	The error code does not match the reason of failure and may confuse the callers. Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux	Linus Torvalds	2	-3/+6
	Pull s390 fixes from Martin Schwidefsky: - a missing -msoft-float for the compile of the kexec purgatory - a fix for the dasd driver to avoid the double use of a field in the 'struct request' [ That latter one is being discussed, and Christoph asked for something cleaner, but for now it's a fix ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/dasd: use blk_mq_rq_from_pdu for per request data s390/purgatory: Fix endless interrupt loop
2018-05-30	btrfs: raid56: Remove VLA usage	Kees Cook	1	-10/+28
	In the quest to remove all stack VLA usage from the kernel[1], this allocates the working buffers during regular init, instead of using stack space. This refactors the allocation code a bit to make it easier to review. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	xfs: repair superblocks	Darrick J. Wong	6	-1/+99
	If one of the backup superblocks is found to differ seriously from superblock 0, write out a fresh copy from the in-core sb. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30	xfs: add helpers to attach quotas to inodes	Darrick J. Wong	3	-0/+79
	Add a helper routine to attach quota information to inodes that are about to undergo repair. If that fails, we need to schedule a quotacheck for the next mount but allow the corrupted metadata repair to continue. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30	xfs: recover AG btree roots from rmap data	Darrick J. Wong	2	-0/+210
	Add a helper function to help us recover btree roots from the rmap data. Callers pass in a list of rmap owner codes, buffer ops, and magic numbers. We iterate the rmap records looking for owner matches, and then read the matching blocks to see if the magic number & uuid match. If so, we then read-verify the block, and if that passes then we retain a pointer to the block with the highest level, assuming that by the end of the call we will have found the root. This will be used to reset the AGF/AGI btree root fields during their rebuild procedures. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30	xfs: add helpers to dispose of old btree blocks after a repair	Darrick J. Wong	2	-0/+257
	Now that we've plumbed in the ability to construct a list of dead btree blocks following a repair, add more helpers to dispose of them. This is done by examining the rmapbt -- if the btree was the only owner we can free the block, otherwise it's crosslinked and we can only remove the rmapbt record. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30	xfs: add helpers to collect and sift btree block pointers during repair	Darrick J. Wong	2	-0/+245
	Add some helpers to assemble a list of fs block extents. Generally, repair functions will iterate the rmapbt to make a list (1) of all extents owned by the nominal owner of the metadata structure; then they will iterate all other structures with the same rmap owner to make a list (2) of active blocks; and finally we have a subtraction function to subtract all the blocks in (2) from (1), with the result that (1) is now a list of blocks that were owned by the old btree and must be disposed. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
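The list subtraction described above can be sketched in plain C. This is a minimal userspace illustration, not the XFS implementation: `struct extent_sketch`, `block_in_list_sketch()` and `subtract_extents_sketch()` are hypothetical names, and the block-by-block walk is chosen for clarity rather than efficiency.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record: a run of len blocks starting at start. */
struct extent_sketch {
	uint64_t start;
	uint64_t len;
};

/* Return true if block blk falls inside any extent of the list. */
static bool block_in_list_sketch(const struct extent_sketch *list, size_t n,
				 uint64_t blk)
{
	for (size_t i = 0; i < n; i++)
		if (blk >= list[i].start && blk - list[i].start < list[i].len)
			return true;
	return false;
}

/*
 * Subtract list (2) "active" from list (1) "all". The blocks that remain are
 * written to out as merged extents; these are the blocks that would have to
 * be disposed of. Returns the number of extents written (at most out_cap).
 */
static size_t subtract_extents_sketch(const struct extent_sketch *all, size_t n_all,
				      const struct extent_sketch *active, size_t n_active,
				      struct extent_sketch *out, size_t out_cap)
{
	size_t n_out = 0;

	for (size_t i = 0; i < n_all; i++) {
		for (uint64_t b = all[i].start; b < all[i].start + all[i].len; b++) {
			if (block_in_list_sketch(active, n_active, b))
				continue;	/* still in use elsewhere */
			if (n_out > 0 && out[n_out - 1].start + out[n_out - 1].len == b) {
				out[n_out - 1].len++;	/* extend the previous run */
			} else {
				if (n_out == out_cap)
					return n_out;
				out[n_out].start = b;
				out[n_out].len = 1;
				n_out++;
			}
		}
	}
	return n_out;
}
```

A real implementation would operate on sorted extent lists instead of testing every block, but the result has the same shape: list (1) minus list (2).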
2018-05-30	xfs: add helpers to allocate and initialize fresh btree roots	Darrick J. Wong	2	-0/+86
	Add a pair of helper functions to allocate and initialize fresh btree roots. The repair functions will use these as part of recreating corrupted metadata. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2018-05-30	xfs: add helpers to deal with transaction allocation and rolling	Darrick J. Wong	6	-7/+194
	For repairs, we need to reserve at least as many blocks as we think we're going to need to rebuild the data structure, and we're going to need some helpers to roll transactions while maintaining locks on the AG headers so that other threads cannot wander into the middle of a repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
2018-05-30	xfs: grab the per-ag structure whenever relevant	Darrick J. Wong	3	-0/+19
	Grab and hold the per-AG data across a scrub run whenever relevant. This helps us avoid repeated trips through rcu and the radix tree in the repair code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2018-05-30	btrfs: return error value if create_io_em failed in cow_file_range	Su Yue	1	-1/+3
	In cow_file_range(), create_io_em() may fail, but its return value is not recorded. The return value may then be 0 even if it failed, which is wrong. Let cow_file_range() return PTR_ERR(em) if create_io_em() failed. Fixes: 6f9994dbabe5 ("Btrfs: create a helper to create em for IO") CC: stable@vger.kernel.org # 4.11+ Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
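The error-pointer pattern this fix relies on can be sketched in userspace C. The macros below are simplified stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and `create_em_sketch()` and `cow_range_sketch()` are hypothetical functions that only mimic the shape of create_io_em() and cow_file_range().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes are encoded in the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct extent_map_sketch { int unused; };

/* Hypothetical allocator: returns an error pointer when fail is nonzero. */
static struct extent_map_sketch *create_em_sketch(int fail)
{
	static struct extent_map_sketch em;

	return fail ? ERR_PTR(-ENOMEM) : &em;
}

/* Hypothetical caller: propagates the failure instead of returning 0. */
static int cow_range_sketch(int fail)
{
	int ret = 0;
	struct extent_map_sketch *em = create_em_sketch(fail);

	if (IS_ERR(em)) {
		ret = PTR_ERR(em);	/* record the error code */
		goto out;
	}
	/* ... normal processing with a valid em would go here ... */
out:
	return ret;
}
```

Without the `ret = PTR_ERR(em)` assignment the function would fall through to `out` with `ret` still 0, which is the behavior the commit describes as wrong.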
2018-05-30	btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot	Gu JinXiang	1	-1/+0
	Since there is no more use of qgroup_reserved member in struct btrfs_pending_snapshot, remove it. Signed-off-by: Gu JinXiang <gujx@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: drop unused parameter qgroup_reserved	Gu JinXiang	4	-14/+5
	Since commit 7775c8184ec0 ("btrfs: remove unused parameter from btrfs_subvolume_release_metadata") parameter qgroup_reserved is not used by caller of function btrfs_subvolume_reserve_metadata. So remove it. Signed-off-by: Gu JinXiang <gujx@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: balance dirty metadata pages in btrfs_finish_ordered_io	Ethan Lien	1	-0/+3
	[Problem description and how we fix it] We should balance dirty metadata pages at the end of btrfs_finish_ordered_io, since a small, unmergeable random write can potentially produce dirty metadata which is multiple times larger than the data itself. For example, a small, unmergeable 4KiB write may produce: 16KiB dirty leaf (and possibly 16KiB dirty node) in subvolume tree; 16KiB dirty leaf (and possibly 16KiB dirty node) in checksum tree; 16KiB dirty leaf (and possibly 16KiB dirty node) in extent tree. Although we do call balance dirty pages on the write side, in the buffered write path most metadata is dirtied only after we reach the dirty background limit (which so far only counts dirty data pages) and wake up the flusher thread. If there are many small, unmergeable random writes spread in a large btree, we'll find that a burst of dirty pages exceeds the dirty_bytes limit after we wake up the flusher thread, which is not what we expect. On our machine, it caused an out-of-memory problem since a page cannot be dropped if it is marked dirty. One may worry that we may sleep in btrfs_btree_balance_dirty_nodelay, but since we do btrfs_finish_ordered_io in a separate worker, it will not stop the flusher from consuming dirty pages. Also, since we use a different worker for metadata writeback endio, sleeping in btrfs_finish_ordered_io helps us throttle the size of dirty metadata pages. [Reproduce steps] To reproduce the problem, we need to do 4KiB writes randomly spread in a large btree. On our 2GiB RAM machine: 1) Create 4 subvolumes. 2) Run fio on each subvolume: [global] direct=0 rw=randwrite ioengine=libaio bs=4k iodepth=16 numjobs=1 group_reporting size=128G runtime=1800 norandommap time_based randrepeat=0 3) Take snapshot on each subvolume and repeat fio on existing files. 4) Repeat step (3) until we get large btrees. In our case, by observing btrfs_root_item->bytes_used, we have 2GiB of metadata in each subvolume tree and 12GiB of metadata in extent tree. 5) Stop all fio, take snapshot again, and wait until all delayed work is completed. 6) Start all fio. A few seconds later we hit OOM when the flusher starts to work. It can be reproduced even when using nocow write. Signed-off-by: Ethan Lien <ethanlien@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: lift some btrfs_cross_ref_exist checks in nocow path	Ethan Lien	1	-0/+15
	In nocow path, we check if the extent is snapshotted in btrfs_cross_ref_exist(). We can do a similar check earlier and avoid an unnecessary search into the extent tree. A fio test on an Intel D-1531, 16GB RAM, SSD RAID-5 machine as follows: [global] group_reporting time_based thread=1 ioengine=libaio bs=4k iodepth=32 size=64G runtime=180 numjobs=8 rw=randwrite [file1] filename=/mnt/nocow/testfile IOPS result (unpatched / patched): 1 fio round: 46670 / 46958; snapshot; 2 fio round: 51826 / 54498; 3 fio round: 59767 / 61289. After snapshot, the first fio gets about 5% performance gain. As we continually write to the same file, all writes will resume to nocow mode and eventually we have no performance gain. Signed-off-by: Ethan Lien <ethanlien@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update comments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: Remove fs_info argument from btrfs_uuid_tree_rem	Lu Fengqi	4	-9/+7
	This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ rename the function ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: Remove fs_info argument from btrfs_uuid_tree_add	Lu Fengqi	5	-13/+10
	This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: remove unused check of skip_locking	Liu Bo	1	-2/+5
	The check is superfluous since all callers who set search_for_commit also have skip_locking set. ASSERT() is put in place to ensure skip_locking is set by new callers. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: remove always true check in unlock_up	Liu Bo	1	-1/+1
	As unlock_up() is written as for () { if (!path->locks[i]) break; ... if (... && path->locks[i]) { } } Apparently, @path->locks[i] is always true at this 'if'. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: David Sterba <dsterba@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
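The redundancy is easy to see in a reduced form. The following is a minimal sketch, not the btrfs code: `struct path_sketch`, `MAX_LEVEL_SKETCH` and the `lowest_unlock` condition are simplified, hypothetical stand-ins for the real path structure and unlock criteria.

```c
#include <stddef.h>

#define MAX_LEVEL_SKETCH 8

/* Hypothetical, reduced path: one lock flag per tree level. */
struct path_sketch {
	int locks[MAX_LEVEL_SKETCH];
};

/* Unlock every held level at or above lowest_unlock; return how many. */
static int unlock_up_sketch(struct path_sketch *p, int lowest_unlock)
{
	int unlocked = 0;

	for (int i = 0; i < MAX_LEVEL_SKETCH; i++) {
		if (!p->locks[i])
			break;
		/*
		 * p->locks[i] is known to be nonzero here because of the
		 * break above, so adding "&& p->locks[i]" to the condition
		 * below would never change the outcome.
		 */
		if (i >= lowest_unlock) {
			p->locks[i] = 0;
			unlocked++;
		}
	}
	return unlocked;
}
```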
2018-05-30	Btrfs: grab write lock directly if write_lock_level is the max level	Liu Bo	1	-11/+16
	Typically, when acquiring root node's lock, btrfs tries its best to get read lock and trade for write lock if @write_lock_level implies to do so. In case of (cow && (p->keep_locks \|\| p->lowest_level)), write_lock_level is set to BTRFS_MAX_LEVEL, which means we need to acquire root node's write lock directly. In this particular case, the dance of acquiring read lock and then trading for write lock can be saved. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: move get root out of btrfs_search_slot to a helper	Liu Bo	1	-45/+65
	It's good to have a helper instead of having all get-root details open-coded. The new helper locks (if necessary) and sets root node of the path. Also invert the checks to make the code flow easier to read. There is no functional change in this commit. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: use more straightforward extent_buffer_uptodate check	Liu Bo	1	-1/+1
	If parent_transid "0" is passed to btrfs_buffer_uptodate(), btrfs_buffer_uptodate() is equivalent to extent_buffer_uptodate(), but extent_buffer_uptodate() is preferred since we don't have to look into verify_parent_transid(). Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: remove superfluous free_extent_buffer in read_block_for_search	Liu Bo	1	-1/+0
	read_block_for_search() can be simplified as: tmp = find_extent_buffer(); if (tmp) return; ... free_extent_buffer(); read_tree_block(); Apparently, @tmp must be NULL at this point, free_extent_buffer() is not needed. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: drop unused space_info parameter from create_space_info	Lu Fengqi	1	-8/+5
	Since commit dc2d3005d27d ("btrfs: remove dead create_space_info calls"), there is only one caller btrfs_init_space_info. However, it doesn't need create_space_info to return space_info at all. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: add parent_transid parameter to verify_level_key	Liu Bo	1	-6/+7
	As verify_level_key() is checked after verify_parent_transid(), i.e. if (verify_parent_transid()) ret = -EIO; else if (verify_level_key()) ret = -EUCLEAN; if parent_transid is 0, verify_parent_transid() skips verifying parent_transid and considers eb as valid, and if verify_level_key() reports something wrong, we're not going to know if it's caused by corrupted metadata or a non-checked eb (e.g. stale eb). The stale eb can be from an outdated raid1 mirror after a degraded mount, see e.g. "btrfs: fix reading stale metadata blocks after degraded raid1 mounts" (02a3307aa9c20b4f66262) for more details. @parent_transid is able to tell whether the eb's generation has been verified by the caller. Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: qgroup: show more meaningful qgroup_rescan_init error message	Qu Wenruo	1	-15/+18
	Error message from qgroup_rescan_init() mostly looks like: BTRFS info (device nvme0n1p1): qgroup_rescan_init failed with -115 Which is far from meaningful, and sometimes confusing as for above -EINPROGRESS it's mostly (despite the init race) harmless, but sometimes it can also indicate problem if the return value is -EINVAL. Change it to some more meaningful messages like: BTRFS info (device nvme0n1p1): qgroup rescan is already in progress And BTRFS err(device nvme0n1p1): qgroup rescan init failed, qgroup is not enabled Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> [ update the messages and level ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()	Omar Sandoval	1	-2/+4
	If we have invalid flags set, when we error out we must drop our writer counter and free the buffer we allocated for the arguments. This bug is trivially reproduced with the following program on 4.7+: #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <linux/btrfs.h> #include <linux/btrfs_tree.h> int main(int argc, char **argv) { struct btrfs_ioctl_vol_args_v2 vol_args = { .flags = UINT64_MAX, }; int ret; int fd; if (argc != 2) { fprintf(stderr, "usage: %s PATH\n", argv[0]); return EXIT_FAILURE; } fd = open(argv[1], O_WRONLY); if (fd == -1) { perror("open"); return EXIT_FAILURE; } ret = ioctl(fd, BTRFS_IOC_RM_DEV_V2, &vol_args); if (ret == -1) perror("ioctl"); close(fd); return EXIT_SUCCESS; } When unmounting the filesystem, we'll hit the WARN_ON(mnt_get_writers(mnt)) in cleanup_mnt() and also may prevent the filesystem to be remounted read-only as the writer count will stay lifted. Fixes: 6b526ed70cf1 ("btrfs: introduce device delete by devid") CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: lzo: Harden inline lzo compressed extent decompression	Qu Wenruo	1	-1/+10
	For an inlined extent, we only have one segment, thus fewer things to check. Furthermore, an inlined extent always has the csum in its leaf header, so it is less likely to have corrupted data. Anyway, still check header and segment header. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30	btrfs: lzo: Add header length check to avoid potential out-of-bounds access	Qu Wenruo	1	-2/+26
	James Harvey reported that some corrupted compressed extent data can lead to various kernel memory corruption. Such corrupted extent data belongs to an inode with the NODATASUM flag, thus data csum won't help us detect such a bug. If lucky enough, KASAN could catch it like: BUG: KASAN: slab-out-of-bounds in lzo_decompress_bio+0x384/0x7a0 [btrfs] Write of size 4096 at addr ffff8800606cb0f8 by task kworker/u16:0/2338 CPU: 3 PID: 2338 Comm: kworker/u16:0 Tainted: G O 4.17.0-rc5-custom+ #50 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: btrfs-endio btrfs_endio_helper [btrfs] Call Trace: dump_stack+0xc2/0x16b print_address_description+0x6a/0x270 kasan_report+0x260/0x380 memcpy+0x34/0x50 lzo_decompress_bio+0x384/0x7a0 [btrfs] end_compressed_bio_read+0x99f/0x10b0 [btrfs] bio_endio+0x32e/0x640 normal_work_helper+0x15a/0xea0 [btrfs] process_one_work+0x7e3/0x1470 worker_thread+0x1b0/0x1170 kthread+0x2db/0x390 ret_from_fork+0x22/0x40 ... The offending compressed data has the following info: Header: length 32768 (looks completely valid) Segment 0 Header: length 3472882419 (obviously out of bounds) Then when handling segment 0, since it's over the current page, we need to copy the compressed data to a temporary buffer in the workspace, and such a large size would trigger out-of-bounds memory access, screwing up the whole kernel. Fix it by adding extra checks on header and segment headers to ensure we won't access out-of-bounds, and even check that the decompressed data won't be out-of-bounds. Reported-by: James Harvey <jamespharvey20@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> [ updated comments ] Signed-off-by: David Sterba <dsterba@suse.com>
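The kind of header validation described above can be sketched as follows. This is a simplified userspace illustration under stated assumptions, not the btrfs code: it assumes a buffer laid out as one 4-byte little-endian total length followed by segments that each start with a 4-byte little-endian length, it ignores page-boundary handling, and the names `check_lzo_headers_sketch()`, `read_le32_sketch()` and the choice of -EINVAL as error code are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define LEN_FIELD_SKETCH 4	/* size of each length field in this sketch */

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t read_le32_sketch(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Validate the total-length header and every segment header before any
 * segment data would be copied. Returns 0 if all lengths stay in bounds.
 */
static int check_lzo_headers_sketch(const uint8_t *buf, size_t buf_len,
				    size_t max_seg_len)
{
	size_t total, off;

	if (buf_len < LEN_FIELD_SKETCH)
		return -EINVAL;
	total = read_le32_sketch(buf);
	if (total < LEN_FIELD_SKETCH || total > buf_len)
		return -EINVAL;		/* header length out of bounds */

	off = LEN_FIELD_SKETCH;
	while (off < total) {
		uint32_t seg_len;

		if (total - off < LEN_FIELD_SKETCH)
			return -EINVAL;	/* truncated segment header */
		seg_len = read_le32_sketch(buf + off);
		off += LEN_FIELD_SKETCH;
		if (seg_len > max_seg_len || seg_len > total - off)
			return -EINVAL;	/* segment would overrun the data */
		off += seg_len;
	}
	return 0;
}
```

With a check like this, a segment header claiming a length in the gigabyte range is rejected before any copy into a temporary buffer is attempted.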
2018-05-30	perf test: "Session topology" dumps core on s390	Thomas Richter	1	-6/+24
	The "perf test Session topology" entry fails with a core dump on s390. The root cause is a NULL pointer dereference in function check_cpu_topology() line 76 (or line 82 without -v). The session->header.env.cpu variable is NULL because on s390 function process_cpu_topology() returns with error: socket_id number is too big. You may need to upgrade the perf tool. and releases the env.cpu variable via zfree() and sets it to NULL. Here is the gdb output: (gdb) n 76 pr_debug("CPU %d, core %d, socket %d\n", i, (gdb) n Program received signal SIGSEGV, Segmentation fault. 0x00000000010f4d9e in check_cpu_topology (path=0x3ffffffd6c8 "/tmp/perf-test-J6CHMa", map=0x14a1740) at tests/topology.c:76 76 pr_debug("CPU %d, core %d, socket %d\n", i, (gdb) Make sure the env.cpu variable is not used when it is NULL. Test for NULL pointer and return TEST_SKIP if so. Output before: [root@p23lp27 perf]# ./perf test -F 39 39: Session topology :Segmentation fault (core dumped) [root@p23lp27 perf]# Output after: [root@p23lp27 perf]# ./perf test -vF 39 39: Session topology : --- start --- templ file: /tmp/perf-test-Ajx59D socket_id number is too big.You may need to upgrade the perf tool. ---- end ---- Session topology: Skip [root@p23lp27 perf]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180528073657.11743-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30	perf parse-events: Handle uncore event aliases in small groups properly	Kan Liang	4	-9/+137
	Perf stat doesn't count the uncore event aliases from the same uncore block in a group, for example: perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000 # time counts unit events 1.000447342 <not counted> unc_m_cas_count.all 1.000447342 <not counted> unc_m_clockticks 2.000740654 <not counted> unc_m_cas_count.all 2.000740654 <not counted> unc_m_clockticks The output is very misleading. It gives a wrong impression that the uncore event doesn't work. An uncore block could be composed by several PMUs. An uncore event alias is a joint name which means the same event runs on all PMUs of a block. Perf doesn't support mixed events from different PMUs in the same group. It is wrong to put uncore event aliases in a big group. The right way is to split the big group into multiple small groups which only include the events from the same PMU. Only uncore event aliases from the same uncore block should be specially handled here. It doesn't make sense to mix the uncore events with other uncore events from different blocks or even core events in a group. With the patch: # time counts unit events 1.001557653 140,833 unc_m_cas_count.all 1.001557653 1,330,231,332 unc_m_clockticks 2.002709483 85,007 unc_m_cas_count.all 2.002709483 1,429,494,563 unc_m_clockticks Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30	mmc: sunxi: Use ifdef rather than __maybe_unused	Ulf Hansson	1	-2/+4
	To be consistent with code in other mmc host drivers, convert to check the correct PM config #ifdef in favor of using __maybe_unused. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-05-30	mmc: mxmmc: Use ifdef rather than __maybe_unused	Ulf Hansson	1	-2/+4
	To be consistent with code in other mmc host drivers, convert to check the correct PM config #ifdef in favor of using __maybe_unused. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-05-30	PM / Domains: Drop unused parameter in genpd_allocate_dev_data()	Ulf Hansson	1	-2/+1
	The in-parameter struct generic_pm_domain *genpd to genpd_allocate_dev_data() is unused, so let's drop it. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	PM / Domains: Drop genpd as in-param for pm_genpd_remove_device()	Ulf Hansson	3	-8/+7
	There is no need to pass a genpd struct to pm_genpd_remove_device(), as we already have the information about the PM domain (genpd) through the device structure. Additionally, we don't allow removing a PM domain from a device, other than the one it may have assigned to it, so really it does not make sense to have a separate in-param for it. For these reasons, drop it and update the current only call to pm_genpd_remove_device() from amdgpu_acp. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	PM / Domains: Drop __pm_genpd_add_device()	Ulf Hansson	2	-17/+7
	There are still a few non-DT existing users of genpd, however neither of them uses __pm_genpd_add_device(), hence let's drop it. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	PM / Domains: Drop extern declarations of functions in pm_domain.h	Ulf Hansson	1	-28/+23
	Using "extern" to declare a function in a public header file is somewhat pointless, but also doesn't hurt. However, to make all the function declarations in pm_domain.h to be consistent, let's drop the use of "extern". Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	Merge branch 'pm-domains' into pm-opp	Rafael J. Wysocki	1	-3/+3

2018-05-30	PM / domains: Add perf_state attribute to genpd debugfs	Rajendra Nayak	1	-0/+18
	Now that genpd supports performance states, add this additional attribute as part of the power domains debugfs entry, to display the current performance state for the Power domain. Suggested-by: David Collins <collinsd@codeaurora.org> Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	Merge branch 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp	Rafael J. Wysocki	2	-74/+36
	Pull more OPP updates for v4.18 from Viresh Kumar. * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: OPP: Allow same OPP table to be used for multiple genpd PM / OPP: Fix shared OPP table support in dev_pm_opp_register_set_opp_helper() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_regulators() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_prop_name() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_supported_hw()
2018-05-30	dt-bindings: cpufreq: Document operating-points-v2-kryo-cpu	Ilia Lin	1	-0/+680
	The qcom-cpufreq-kryo driver reads the msm-id and efuse value from the SoC to provide the OPP framework with required information. This is used to determine the voltage and frequency value for each OPP of operating-points-v2 table when it is parsed by the OPP framework. This change adds documentation for the DT bindings. The "operating-points-v2-kryo-cpu" DT extends the "operating-points-v2" with the following parameters: - nvmem-cells (NVMEM area containing the speedbin information) - opp-supported-hw: A single 32 bit bitmap value, representing compatible HW: 0: MSM8996 V3, speedbin 0 1: MSM8996 V3, speedbin 1 2: MSM8996 V3, speedbin 2 3: unused 4: MSM8996 SG, speedbin 0 5: MSM8996 SG, speedbin 1 6: MSM8996 SG, speedbin 2 7-31: unused Signed-off-by: Ilia Lin <ilialin@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
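The bit assignment listed for opp-supported-hw follows a regular pattern: each silicon variant owns a group of four bits, and the speedbin selects the bit within the group. A minimal sketch of that mapping is shown below; the enum and function names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical variant numbering matching the bit layout in the binding. */
enum msm8996_variant_sketch {
	MSM8996_V3_SKETCH = 0,	/* bits 0-2 */
	MSM8996_SG_SKETCH = 1,	/* bits 4-6 */
};

/*
 * Return the single-bit mask for a (variant, speedbin) pair, or 0 for a
 * speedbin outside the documented range 0-2.
 */
static uint32_t supported_hw_mask_sketch(enum msm8996_variant_sketch variant,
					 unsigned int speedbin)
{
	if (speedbin > 2)
		return 0;
	return UINT32_C(1) << ((unsigned int)variant * 4 + speedbin);
}
```

For example, MSM8996 SG with speedbin 2 maps to bit 6, so an OPP whose opp-supported-hw value has bit 6 set would be considered compatible with that part.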
2018-05-30	cpufreq: Add Kryo CPU scaling driver	Ilia Lin	5	-0/+234
	In certain QCOM SoCs like apq8096 and msm8996 that have KRYO processors, the CPU frequency subset and voltage value of each OPP vary based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables defines the voltage and frequency value based on the msm-id in SMEM and speedbin blown in the efuse combination. The qcom-cpufreq-kryo driver reads the msm-id and efuse value from the SoC to provide the OPP framework with required information. This is used to determine the voltage and frequency value for each OPP of operating-points-v2 table when it is parsed by the OPP framework. Signed-off-by: Ilia Lin <ilialin@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	OPP: Allow same OPP table to be used for multiple genpd	Viresh Kumar	1	-2/+15
	The OPP binding says: Property: operating-points-v2 ... This can contain more than one phandle for power domain providers that provide multiple power domains. That is, one phandle for each power domain. If only one phandle is available, then the same OPP table will be used for all power domains provided by the power domain provider. But the OPP core isn't allowing the same OPP table to be used for multiple domains. Update dev_pm_opp_of_add_table_indexed() to allow that. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
2018-05-30	s390/zcrypt: Fix CCA and EP11 CPRB processing failure memory leak.	Harald Freudenberger	3	-40/+43
	Tests showed that the zcrypt device driver produces memory leaks when a valid CCA or EP11 CPRB can't get delivered or has a failure during processing within the zcrypt device driver. This happens when an invalid domain or adapter number is used or the lower level software or hardware layers produce any kind of failure during processing of the request. Only CPRBs sent to CCA or EP11 cards can produce this memory leak. The accelerator and the CPRBs processed by this type of crypto card are not affected. The two fields message and private within the ap_message struct are allocated when pulling the function code for the CPRB but only freed when processing of the CPRB succeeds. So for example an invalid domain or adapter field causes the processing to fail, leaving these two memory areas allocated forever. Signed-off-by: Harald Freudenberger <freude@de.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-05-30	s390/archrandom: Rework arch random implementation.	Harald Freudenberger	2	-14/+102
	The arch_get_random_seed_long() invocation done by the random device driver is done in interrupt context and may be invoked very frequently. The existing s390 arch_get_random_seed() implementation uses the PRNO(TRNG) instruction which produces excellent high quality entropy but is relatively slow and thus expensive. This fix reworks the arch_get_random_seed implementation. It introduces a buffer concept to decouple the delivery of random data via arch_get_random_seed*() from the generation of new random bytes. The buffer of random data is filled asynchronously by a workqueue thread. If there are enough bytes in the buffer the s390_arch_random_generate() just delivers these bytes. Otherwise false is returned until the worker thread refills the buffer. The worker fills the rng buffer by pulling fresh entropy from the high quality (but slow) true hardware random generator. This entropy is then spread over the buffer with a pseudo random generator. As the arch_get_random_seed_long() fetches 8 bytes and the calling function add_interrupt_randomness() counts this as 1 bit entropy the distribution needs to make sure there is in fact 1 bit entropy contained in 8 bytes of the buffer. The current values pull 32 bytes of entropy and scatter this into a 2048 byte buffer. So 8 bytes in the buffer will contain 1 bit of entropy. The worker thread is rescheduled based on the charge level of the buffer but at least with 500 ms delay to avoid too much cpu consumption. So the max. amount of rng data delivered via arch_get_random_seed is limited to 4 KiB per second. Signed-off-by: Harald Freudenberger <freude@de.ibm.com> Reviewed-by: Patrick Steuer <patrick.steuer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
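The numbers in this message can be checked with a little arithmetic, and the "deliver or return false" behavior can be sketched with a plain buffer. The code below is a userspace illustration only: the constants come from the commit message, while `struct rng_buf_sketch`, `rng_take_sketch()` and the other names are hypothetical, and locking and the refill worker are omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum {
	SEED_BYTES_SKETCH  = 32,	/* true entropy pulled per refill */
	BUF_BYTES_SKETCH   = 2048,	/* size of the buffer it is spread over */
	CHUNK_BYTES_SKETCH = 8,		/* bytes fetched by one seed_long call */
	REFILL_MS_SKETCH   = 500,	/* minimum delay between refills */
};

/* 256 bits of entropy spread over 256 chunks of 8 bytes: 1 bit per chunk. */
static unsigned int entropy_bits_per_chunk_sketch(void)
{
	unsigned int chunks = BUF_BYTES_SKETCH / CHUNK_BYTES_SKETCH;

	return (SEED_BYTES_SKETCH * 8) / chunks;
}

/* At most one full buffer per refill interval: 2048 * 2 bytes per second. */
static unsigned int max_bytes_per_second_sketch(void)
{
	return BUF_BYTES_SKETCH * (1000 / REFILL_MS_SKETCH);
}

/* Hypothetical buffer: avail counts the unread bytes at the front of data. */
struct rng_buf_sketch {
	unsigned char data[BUF_BYTES_SKETCH];
	size_t avail;
};

/* Deliver n bytes if the buffer holds enough, otherwise report failure. */
static bool rng_take_sketch(struct rng_buf_sketch *b, unsigned char *out, size_t n)
{
	if (b->avail < n)
		return false;	/* caller must wait for the next refill */
	b->avail -= n;
	memcpy(out, b->data + b->avail, n);
	return true;
}
```

The two helper functions evaluate to 1 bit per 8-byte chunk and 4096 bytes per second, which matches the figures stated in the message.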
2018-05-30	s390/net: add pnetid support	Ursula Braun	4	-0/+108
	s390 hardware supports the definition of a so-called Physical NETwork IDentifier (short PNETID) per network device port. These PNETIDs can be used to identify network devices that are attached to the same physical network (broadcast domain). This patch provides the interface to extract the PNETID of a port of a device attached to the ccw-bus or pci-bus. Parts of this patch are based on an initial implementation by Thomas Richter. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-05-30	cpufreq: Use static SRCU initializer	Sebastian Andrzej Siewior	1	-12/+1
	Use the static SRCU initializer for `cpufreq_transition_notifier_list'. This avoids the init_cpufreq_transition_notifier_list() initcall. Its only purpose is to initialize the SRCU notifier once during boot and set another variable which is used as an indicator whether the init was performed before cpufreq_register_notifier() was used. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30	kernel/SRCU: provide a static initializer	Sebastian Andrzej Siewior	3	-11/+35
	There are static initializer macros for three of the four possible notifier types: ATOMIC_NOTIFIER_HEAD(), BLOCKING_NOTIFIER_HEAD() and RAW_NOTIFIER_HEAD(). This patch provides a static initializer for the fourth type to make the set complete. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>