summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-02-01perf/x86/intel: Add perf core PMU support for Sapphire RapidsKan Liang5-14/+443
Add perf core PMU support for the Intel Sapphire Rapids server, which is the successor of the Intel Ice Lake server. The enabling code is based on Ice Lake, but there are several new features introduced. The event encoding is changed and simplified, e.g., the event codes which are below 0x90 are restricted to counters 0-3. The event codes which above 0x90 are likely to have no restrictions. The event constraints, extra_regs(), and hardware cache events table are changed accordingly. A new Precise Distribution (PDist) facility is introduced, which further minimizes the skid when a precise event is programmed on the GP counter 0. Enable the Precise Distribution (PDist) facility with :ppp event. For this facility to work, the period must be initialized with a value larger than 127. Add spr_limit_period() to apply the limit for :ppp event. Two new data source fields, data block & address block, are added in the PEBS Memory Info Record for the load latency event. To enable the feature, - An auxiliary event has to be enabled together with the load latency event on Sapphire Rapids. A new flag PMU_FL_MEM_LOADS_AUX is introduced to indicate the case. A new event, mem-loads-aux, is exposed to sysfs for the user tool. Add a check in hw_config(). If the auxiliary event is not detected, return an unique error -ENODATA. - The union perf_mem_data_src is extended to support the new fields. - Ice Lake and earlier models do not support block information, but the fields may be set by HW on some machines. Add pebs_no_block to explicitly indicate the previous platforms which don't support the new block fields. Accessing the new block fields are ignored on those platforms. A new store Latency facility is introduced, which leverages the PEBS facility where it can provide additional information about sampled stores. The additional information includes the data address, memory auxiliary info (e.g. Data Source, STLB miss) and the latency of the store access. To enable the facility, the new event (0x02cd) has to be programed on the GP counter 0. A new flag PERF_X86_EVENT_PEBS_STLAT is introduced to indicate the event. The store_latency_data() is introduced to parse the memory auxiliary info. The layout of access latency field of PEBS Memory Info Record has been changed. Two latency, instruction latency (bit 15:0) and cache access latency (bit 47:32) are recorded. - The cache access latency is similar to previous memory access latency. For loads, the latency starts by the actual cache access until the data is returned by the memory subsystem. For stores, the latency starts when the demand write accesses the L1 data cache and lasts until the cacheline write is completed in the memory subsystem. The cache access latency is stored in low 32bits of the sample type PERF_SAMPLE_WEIGHT_STRUCT. - The instruction latency starts by the dispatch of the load operation for execution and lasts until completion of the instruction it belongs to. Add a new flag PMU_FL_INSTR_LATENCY to indicate the instruction latency support. The instruction latency is stored in the bit 47:32 of the sample type PERF_SAMPLE_WEIGHT_STRUCT. Extends the PERF_METRICS MSR to feature TMA method level 2 metrics. The lower half of the register is the TMA level 1 metrics (legacy). The upper half is also divided into four 8-bit fields for the new level 2 metrics. Expose all eight Topdown metrics events to user space. The full description for the SPR features can be found at Intel Architecture Instruction Set Extensions and Future Features Programming Reference, 319433-041. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-5-git-send-email-kan.liang@linux.intel.com
2021-02-01perf/x86/intel: Filter unsupported Topdown metrics eventKan Liang3-4/+22
Intel Sapphire Rapids server will introduce 8 metrics events. Intel Ice Lake only supports 4 metrics events. A perf tool user may mistakenly use the unsupported events via RAW format on Ice Lake. The user can still get a value from the unsupported Topdown metrics event once the following Sapphire Rapids enabling patch is applied. To enable the 8 metrics events on Intel Sapphire Rapids, the INTEL_TD_METRIC_MAX has to be updated, which impacts the is_metric_event(). The is_metric_event() is a generic function. On Ice Lake, the newly added SPR metrics events will be mistakenly accepted as metric events on creation. At runtime, the unsupported Topdown metrics events will be updated. Add a variable num_topdown_events in x86_pmu to indicate the available number of the Topdown metrics event on the platform. Apply the number into is_metric_event(). Only the supported Topdown metrics events should be created as metrics events. Apply the num_topdown_events in icl_update_topdown_event() as well. The function can be reused by the following patch. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-4-git-send-email-kan.liang@linux.intel.com
2021-02-01perf/x86/intel: Factor out intel_update_topdown_event()Kan Liang1-7/+13
Similar to Ice Lake, Intel Sapphire Rapids server also supports the topdown performance metrics feature. The difference is that Intel Sapphire Rapids server extends the PERF_METRICS MSR to feature TMA method level two metrics, which will introduce 8 metrics events. Current icl_update_topdown_event() only check 4 level one metrics events. Factor out intel_update_topdown_event() to facilitate the code sharing between Ice Lake and Sapphire Rapids. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-3-git-send-email-kan.liang@linux.intel.com
2021-02-01perf/core: Add PERF_SAMPLE_WEIGHT_STRUCTKan Liang5-17/+59
Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the cost of an action represented by the sample. This allows the profiler to scale the samples to be more informative to the programmer. It could also help to locate a hotspot, e.g., when profiling by memory latencies, the expensive load appear higher up in the histograms. But current PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This could be a problem, if users want two or more factors to contribute to the weight. For example, Golden Cove core PMU can provide both the instruction latency and the cache Latency information as factors for the memory profiling. For current X86 platforms, although meminfo::latency is defined as a u64, only the lower 32 bits include the valid data in practice (No memory access could last than 4G cycles). The higher 32 bits can be used to store new factors. Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new sample weight structure. It shares the same space as the PERF_SAMPLE_WEIGHT sample type. Users can apply either the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but they cannot apply both sample types simultaneously. Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type. - For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT sample type. PowerPC can re-struct the weight field similarly later. - For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now. The following patches will apply the new factors for the PERF_SAMPLE_WEIGHT_STRUCT sample type. The field in the union perf_sample_weight should be shared among different architectures. A generic name is required, but it's hard to abstract a name that applies to all architectures. For example, on X86, the fields are to store all kinds of latency. While on PowerPC, it stores MMCRA[TECX/TECM], which should not be latency. So a general name prefix 'var$NUM' is used here. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
2021-01-27perf/intel: Remove Perfmon-v4 counter_freezing supportPeter Zijlstra3-160/+1
Perfmon-v4 counter freezing is fundamentally broken; remove this default disabled code to make sure nobody uses it. The feature is called Freeze-on-PMI in the SDM, and if it would do that, there wouldn't actually be a problem, *however* it does something subtly different. It globally disables the whole PMU when it raises the PMI, not when the PMI hits. This means there's a window between the PMI getting raised and the PMI actually getting served where we loose events and this violates the perf counter independence. That is, a counting event should not result in a different event count when there is a sampling event co-scheduled. This is known to break existing software (RR). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2021-01-27x86/perf: Use static_call for x86_pmu.guest_get_msrsLike Xu3-25/+21
Clean up that CONFIG_RETPOLINE crud and replace the indirect call x86_pmu.guest_get_msrs with static_call(). Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210125121458.181635-1-like.xu@linux.intel.com
2021-01-14perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA infoSteve Wahl1-28/+65
The registers used to determine which die a pci bus belongs to don't contain enough information to uniquely specify more than 8 dies, so when more than 8 dies are present, use NUMA information instead. Continue to use the previous method for 8 or fewer because it works there, and covers cases of NUMA being disabled. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-3-steve.wahl@hpe.com
2021-01-14perf/x86/intel/uncore: Store the logical die id instead of the physical die id.Steve Wahl4-57/+39
The phys_id isn't really used other than to map to a logical die id. Calculate the logical die id earlier, and store that instead of the phys_id. Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20210108153549.108989-2-steve.wahl@hpe.com
2021-01-04Linux 5.11-rc2v5.11-rc2Linus Torvalds1-1/+1
2021-01-02Merge tag 's390-5.11-3' of ↵Linus Torvalds4-21/+35
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 cleanups from Vasily Gorbik: "Update defconfigs and sort config select list" * tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/Kconfig: sort config S390 select list once again s390: update defconfigs
2021-01-02Merge tag 'pm-5.11-rc2' of ↵Linus Torvalds4-5/+48
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a crash in intel_pstate during resume from suspend-to-RAM that may occur after recent changes and two resource leaks in error paths in the operating performance points (OPP) framework, add a new C-states table to intel_idle and update the cpuidle MAINTAINERS entry to cover the governors too. Specifics: - Fix recently introduced crash in the intel_pstate driver that occurs if scale-invariance is disabled during resume from suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR values made by the platform firmware (Rafael Wysocki). - Fix a memory leak and add a missing clk_put() in error paths in the OPP framework (Quanyang Wang, Viresh Kumar). - Add new C-states table for SnowRidge processors to the intel_idle driver (Artem Bityutskiy). - Update the MAINTAINERS entry for cpuidle to make it clear that the governors are covered by it too (Lukas Bulwahn)" * tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: intel_idle: add SnowRidge C-state table cpufreq: intel_pstate: Fix fast-switch fallback path opp: Call the missing clk_put() on error opp: fix memory leak in _allocate_opp_table MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
2021-01-02Merge branches 'pm-cpufreq' and 'pm-cpuidle'Rafael J. Wysocki3-3/+41
* pm-cpufreq: cpufreq: intel_pstate: Fix fast-switch fallback path * pm-cpuidle: intel_idle: add SnowRidge C-state table MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
2021-01-01Merge tag 'scsi-fixes' of ↵Linus Torvalds21-86/+208
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi). The big core two fixes are for power management ("block: Do not accept any requests while suspended" and "block: Fix a race in the runtime power management code") which finally sorts out the resume problems we've occasionally been having. To make the resume fix, there are seven necessary precursors which effectively renames REQ_PREEMPT to REQ_PM, so every "special" request in block is automatically a power management exempt one. All of the non-PM preempt cases are removed except for the one in the SCSI Parallel Interface (spi) domain validation which is a genuine case where we have to run requests at high priority to validate the bus so this becomes an autopm get/put protected request" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits) scsi: cxgb4i: Fix TLS dependency scsi: ufs: Un-inline ufshcd_vops_device_reset function scsi: ufs: Re-enable WriteBooster after device reset scsi: ufs-mediatek: Use correct path to fix compile error scsi: mpt3sas: Signedness bug in _base_get_diag_triggers() scsi: block: Do not accept any requests while suspended scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE scsi: scsi_transport_spi: Set RQF_PM for domain validation commands scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT scsi: ide: Do not set the RQF_PREEMPT flag for sense requests scsi: block: Introduce BLK_MQ_REQ_PM scsi: block: Fix a race in the runtime power management code scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff() scsi: ufs-pci: Fix restore from S4 for Intel controllers scsi: ufs-mediatek: Keep VCC always-on for specific devices scsi: ufs: Allow regulators being always-on scsi: ufs: Clear UAC for RPMB after ufshcd resets ...
2021-01-01Merge tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-blockLinus Torvalds2-1/+2
Pull block fixes from Jens Axboe: "Two minor block fixes from this last week that should go into 5.11: - Add missing NOWAIT debugfs definition (Andres) - Fix kerneldoc warning introduced this merge window (Randy)" * tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block: block: add debugfs stanza for QUEUE_FLAG_NOWAIT fs: block_dev.c: fix kernel-doc warnings from struct block_device changes
2021-01-01Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-blockLinus Torvalds3-21/+43
Pull io_uring fixes from Jens Axboe: "A few fixes that should go into 5.11, all marked for stable as well: - Fix issue around identity COW'ing and users that share a ring across processes - Fix a hang associated with unregistering fixed files (Pavel) - Move the 'process is exiting' cancelation a bit earlier, so task_works aren't affected by it (Pavel)" * tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block: kernel/io_uring: cancel io_uring before task works io_uring: fix io_sqe_files_unregister() hangs io_uring: add a helper for setting a ref node io_uring: don't assume mm is constant across submits
2021-01-01depmod: handle the case of /sbin/depmod without /sbin in PATHLinus Torvalds1-0/+2
Commit 436e980e2ed5 ("kbuild: don't hardcode depmod path") stopped hard-coding the path of depmod, but in the process caused trouble for distributions that had that /sbin location, but didn't have it in the PATH (generally because /sbin is limited to the super-user path). Work around it for now by just adding /sbin to the end of PATH in the depmod.sh script. Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-31kernel/io_uring: cancel io_uring before task worksPavel Begunkov2-2/+2
For cancelling io_uring requests it needs either to be able to run currently enqueued task_works or having it shut down by that moment. Otherwise io_uring_cancel_files() may be waiting for requests that won't ever complete. Go with the first way and do cancellations before setting PF_EXITING and so before putting the task_work infrastructure into a transition state where task_work_run() would better not be called. Cc: stable@vger.kernel.org # 5.5+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-31io_uring: fix io_sqe_files_unregister() hangsPavel Begunkov1-2/+22
io_sqe_files_unregister() uninterruptibly waits for enqueued ref nodes, however requests keeping them may never complete, e.g. because of some userspace dependency. Make sure it's interruptible otherwise it would hang forever. Cc: stable@vger.kernel.org # 5.6+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-31io_uring: add a helper for setting a ref nodePavel Begunkov1-10/+12
Setting a new reference node to a file data is not trivial, don't repeat it, add and use a helper. Cc: stable@vger.kernel.org # 5.6+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-30Merge tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-clientLinus Torvalds3-36/+36
Pull ceph fixes from Ilya Dryomov: "A fix for an edge case in MClientRequest encoding and a couple of trivial fixups for the new msgr2 support" * tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client: libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE libceph: align session_key and con_secret to 16 bytes libceph: fix auth_signature buffer allocation in secure mode ceph: reencode gid_list when reconnecting
2020-12-30intel_idle: add SnowRidge C-state tableArtem Bityutskiy1-1/+40
Add C-state table for the SnowRidge SoC which is found on Intel Jacobsville platforms. The following has been changed. 1. C1E latency changed from 10us to 15us. It was measured using the open source "wult" tool (the "nic" method, 15us is the 99.99th percentile). 2. C1E power break even changed from 20us to 25us, which may result in less C1E residency in some workloads. 3. C6 latency changed from 50us to 130us. Measured the same way as C1E. The C6 C-state is supported only by some SnowRidge revisions, so add a C-state table commentary about this. On SnowRidge, C6 support is enumerated via the usual mechanism: "mwait" leaf of the "cpuid" instruction. The 'intel_idle' driver does check this leaf, so even though C6 is present in the table, the driver will only use it if the CPU does support it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-30cpufreq: intel_pstate: Fix fast-switch fallback pathRafael J. Wysocki1-1/+0
When sugov_update_single_perf() falls back to the "frequency" path due to the missing scale-invariance, it will call cpufreq_driver_fast_switch() via sugov_fast_switch() and the driver's ->fast_switch() callback will be invoked, so it must not be NULL. However, after commit a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") intel_pstate sets ->fast_switch() to NULL when it is going to use intel_cpufreq_adjust_perf(), which is a mistake, because on x86 the scale-invariance may be turned off dynamically, so modify it to retain the original ->adjust_perf() callback pointer. Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback") Reported-by: Kenneth R. Crudup <kenny@panix.com> Tested-by: Kenneth R. Crudup <kenny@panix.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-30Merge branch 'opp/linux-next' of ↵Rafael J. Wysocki1-2/+7
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull operating performance points (OPP) framework fixes for 5.11-rc2 from Viresh Kumar: "This contains two patches to fix freeing of resources in error paths." * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: opp: Call the missing clk_put() on error opp: fix memory leak in _allocate_opp_table
2020-12-30s390/Kconfig: sort config S390 select list once againHeiko Carstens1-14/+17
...and add comments at the top and bottom. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-12-30s390: update defconfigsHeiko Carstens3-7/+18
Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-12-30block: add debugfs stanza for QUEUE_FLAG_NOWAITAndres Freund1-0/+1
This was missed in 021a24460dc2. Leads to the numeric value of QUEUE_FLAG_NOWAIT (i.e. 29) showing up in /sys/kernel/debug/block/*/state. Fixes: 021a24460dc28e7412aecfae89f60e1847e685c0 Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andres Freund <andres@anarazel.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-30fs: block_dev.c: fix kernel-doc warnings from struct block_device changesRandy Dunlap1-1/+1
Fix new kernel-doc warnings in fs/block_dev.c: ../fs/block_dev.c:1066: warning: Excess function parameter 'whole' description in 'bd_abort_claiming' ../fs/block_dev.c:1837: warning: Function parameter or member 'dev' not described in 'lookup_bdev' Fixes: 4e7b5671c6a8 ("block: remove i_bdev") Fixes: 37c3fc9abb25 ("block: simplify the block device claiming interface") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-fsdevel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-30Merge branch 'akpm' (patches from Andrew)Linus Torvalds42-91/+101
Merge misc fixes from Andrew Morton: "16 patches Subsystems affected by this patch series: mm (selftests, hugetlb, pagecache, mremap, kasan, and slub), kbuild, checkpatch, misc, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: slub: call account_slab_page() after slab page initialization zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.c lib/zlib: fix inflating zlib streams on s390 lib/genalloc: fix the overflow when size is too big kdev_t: always inline major/minor helper functions sizes.h: add SZ_8G/SZ_16G/SZ_32G macros local64.h: make <asm/local64.h> mandatory kasan: fix null pointer dereference in kasan_record_aux_stack mm: generalise COW SMC TLB flushing race comment mm/mremap.c: fix extent calculation mm: memmap defer init doesn't work as expected mm: add prototype for __add_to_page_cache_locked() checkpatch: prefer strscpy to strlcpy Revert "kbuild: avoid static_assert for genksyms" mm/hugetlb: fix deadlock in hugetlb_cow error path selftests/vm: fix building protection keys test
2020-12-30mm: slub: call account_slab_page() after slab page initializationRoman Gushchin1-3/+2
It's convenient to have page->objects initialized before calling into account_slab_page(). In particular, this information can be used to pre-alloc the obj_cgroup vector. Let's call account_slab_page() a bit later, after the initialization of page->objects. This commit doesn't bring any functional change, but is required for further optimizations. [akpm@linux-foundation.org: undo changes needed by forthcoming mm-memcg-slab-pre-allocate-obj_cgroups-for-slab-caches-with-slab_account.patch] Link: https://lkml.kernel.org/r/20201110195753.530157-1-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30zlib: move EXPORT_SYMBOL() and MODULE_LICENSE() out of dfltcc_syms.cRandy Dunlap4-19/+9
In commit 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules"), I added EXPORT_SYMBOL()s to dfltcc_inflate.c but then Mikhail said that these should probably be in dfltcc_syms.c with the other EXPORT_SYMBOL()s. However, that is contrary to the current kernel style, which places EXPORT_SYMBOL() immediately after the function that it applies to, so move all EXPORT_SYMBOL()s to their respective function locations and drop the dfltcc_syms.c file. Also move MODULE_LICENSE() from the deleted file to dfltcc.c. [rdunlap@infradead.org: remove dfltcc_syms.o from Makefile] Link: https://lkml.kernel.org/r/20201227171837.15492-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/20201219052530.28461-1-rdunlap@infradead.org Fixes: 11fb479ff5d9 ("zlib: export S390 symbols for zlib modules") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Zaslonko Mikhail <zaslonko@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30lib/zlib: fix inflating zlib streams on s390Ilya Leoshkevich1-2/+2
Decompressing zlib streams on s390 fails with "incorrect data check" error. Userspace zlib checks inflate_state.flags in order to byteswap checksums only for zlib streams, and s390 hardware inflate code, which was ported from there, tries to match this behavior. At the same time, kernel zlib does not use inflate_state.flags, so it contains essentially random values. For many use cases either zlib stream is zeroed out or checksum is not used, so this problem is masked, but at least SquashFS is still affected. Fix by always passing a checksum to and from the hardware as is, which matches zlib_inflate()'s expectations. Link: https://lkml.kernel.org/r/20201215155551.894884-1-iii@linux.ibm.com Fixes: 126196100063 ("lib/zlib: add s390 hardware support for kernel zlib_inflate") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30lib/genalloc: fix the overflow when size is too bigHuang Shijie1-12/+13
Some graphic card has very big memory on chip, such as 32G bytes. In the following case, it will cause overflow: pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE); ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE); va = gen_pool_alloc(pool, SZ_4G); The overflow occurs in gen_pool_alloc_algo_owner(): .... size = nbits << order; .... The @nbits is "int" type, so it will overflow. Then the gen_pool_avail() will return the wrong value. This patch converts some "int" to "unsigned long", and changes the compare code in while. Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> Reported-by: Shi Jiasheng <jiasheng.shi@iluvatar.ai> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30kdev_t: always inline major/minor helper functionsJosh Poimboeuf1-11/+11
Silly GCC doesn't always inline these trivial functions. Fixes the following warning: arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested] Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30sizes.h: add SZ_8G/SZ_16G/SZ_32G macrosHuang Shijie1-0/+3
Add these macros, since we can use them in drivers. Link: https://lkml.kernel.org/r/20201229072819.11183-1-sjhuang@iluvatar.ai Signed-off-by: Huang Shijie <sjhuang@iluvatar.ai> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30local64.h: make <asm/local64.h> mandatoryRandy Dunlap22-21/+1
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and remove all arch/*/include/asm/local64.h arch-specific files since they only #include <asm-generic/local64.h>. This fixes build errors on arch/c6x/ and arch/nios2/ for block/blk-iocost.c. Build-tested on 21 of 25 arch-es. (tools problems on the others) Yes, we could even rename <asm-generic/local64.h> to <linux/local64.h> and change all #includes to use <linux/local64.h> instead. Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30kasan: fix null pointer dereference in kasan_record_aux_stackWalter Wu1-0/+2
Syzbot reported the following [1]: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 2d993067 P4D 2d993067 PUD 19a3c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 3852 Comm: kworker/1:2 Not tainted 5.10.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events free_ipc RIP: 0010:kasan_record_aux_stack+0x77/0xb0 Add null checking slab object from kasan_get_alloc_meta() in order to avoid null pointer dereference. [1] https://syzkaller.appspot.com/x/log.txt?x=10a82a50d00000 Link: https://lkml.kernel.org/r/20201228080018.23041-1-walter-zh.wu@mediatek.com Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30mm: generalise COW SMC TLB flushing race commentNicholas Piggin1-3/+5
I'm not sure if I'm completely missing something here, but AFAIKS the reference to the mysterious "COW SMC race" confuses the issue. The original changelog and mailing list thread didn't help me either. This SMC race is where the problem was detected, but isn't the general problem bigger and more obvious: that the new PTE could be picked up at any time by any TLB while entries for the old PTE exist in other TLBs before the TLB flush takes effect? The case where the iTLB and dTLB of a CPU are pointing at different pages is an interesting one but follows from the general problem. The other (minor) thing with the comment I think it makes it a bit clearer to say what the old code was doing (i.e., it avoids the race as opposed to what?). References: 4ce072f1faf29 ("mm: fix a race condition under SMC + COW") Link: https://lkml.kernel.org/r/20201215121119.351650-1-npiggin@gmail.com Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suresh Siddha <sbsiddha@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30mm/mremap.c: fix extent calculationKalesh Singh1-1/+3
When `next < old_addr`, `next - old_addr` arithmetic underflows causing `extent` to be incorrect. Make `extent` the smaller of `next - old_addr` or `old_end - old_addr`. Link: https://lkml.kernel.org/r/20201219170433.2418867-1-kaleshsingh@google.com Fixes: c49dd34018026 ("mm: speedup mremap on 1GB or larger regions") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Helge Deller <deller@gmx.de> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30mm: memmap defer init doesn't work as expectedBaoquan He4-8/+11
VMware observed a performance regression during memmap init on their platform, and bisected to commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") causing it. Before the commit: [0.033176] Normal zone: 1445888 pages used for memmap [0.033176] Normal zone: 89391104 pages, LIFO batch:63 [0.035851] ACPI: PM-Timer IO Port: 0x448 With commit [0.026874] Normal zone: 1445888 pages used for memmap [0.026875] Normal zone: 89391104 pages, LIFO batch:63 [2.028450] ACPI: PM-Timer IO Port: 0x448 The root cause is the current memmap defer init doesn't work as expected. Before, memmap_init_zone() was used to do memmap init of one whole zone, to initialize all low zones of one numa node, but defer memmap init of the last zone in that numa node. However, since commit 73a6e474cb376, function memmap_init() is adapted to iterater over memblock regions inside one zone, then call memmap_init_zone() to do memmap init for each region. E.g, on VMware's system, the memory layout is as below, there are two memory regions in node 2. The current code will mistakenly initialize the whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to iniatialize only one memmory section on the 2nd region [mem 0x10000000000-0x1033fffffff]. In fact, we only expect to see that there's only one memory section's memmap initialized. That's why more time is costed at the time. [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff] [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff] [ 0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff] [ 0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff] [ 0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff] [ 0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff] Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass down the real zone end pfn so that defer_init() can use it to judge whether defer need be taken in zone wide. Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com Fixes: commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") Signed-off-by: Baoquan He <bhe@redhat.com> Reported-by: Rahul Gopakumar <gopakumarr@vmware.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30mm: add prototype for __add_to_page_cache_locked()Souptick Joarder1-0/+7
Otherwise it causes a gcc warning: mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes] A previous attempt to make this function static led to compilation errors when CONFIG_DEBUG_INFO_BTF is enabled because __add_to_page_cache_locked() is referred to by BPF code. Adding a prototype will silence the warning. Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30checkpatch: prefer strscpy to strlcpyJoe Perches1-0/+6
Prefer strscpy over the deprecated strlcpy function. Link: https://lkml.kernel.org/r/19fe91084890e2c16fe56f960de6c570a93fa99b.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Requested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30Revert "kbuild: avoid static_assert for genksyms"Masahiro Yamada1-5/+0
This reverts commit 14dc3983b5dff513a90bd5a8cc90acaf7867c3d0. Macro Elver had sent a fix proper fix earlier, and also pointed out corner cases: "I guess what you propose is simpler, but might still have corner cases where we still get warnings. In particular, if some file (for whatever reason) does not include build_bug.h and uses a raw _Static_assert(), then we still get warnings. E.g. I see 1 user of raw _Static_assert() (drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h )." I believe the raw use of _Static_assert() should be allowed, so this should be fixed in genksyms. Even after commit 14dc3983b5df ("kbuild: avoid static_assert for genksyms"), I confirmed the following test code emits the warning. ---------------->8---------------- #include <linux/export.h> _Static_assert((1 ?: 0), ""); void foo(void) { } EXPORT_SYMBOL(foo); ---------------->8---------------- WARNING: modpost: EXPORT symbol "foo" [vmlinux] version generation failed, symbol will not be versioned. Now that commit 869b91992bce ("genksyms: Ignore module scoped _Static_assert()") fixed this issue properly, the workaround should be reverted. Link: https://lkml.org/lkml/2020/12/10/845 Link: https://lkml.kernel.org/r/20201219183911.181442-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30mm/hugetlb: fix deadlock in hugetlb_cow error pathMike Kravetz1-1/+21
syzbot reported the deadlock here [1]. The issue is in hugetlb cow error handling when there are not enough huge pages for the faulting task which took the original reservation. It is possible that other (child) tasks could have consumed pages associated with the reservation. In this case, we want the task which took the original reservation to succeed. So, we unmap any associated pages in children so that they can be used by the faulting task that owns the reservation. The unmapping code needs to hold i_mmap_rwsem in write mode. However, due to commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") we are already holding i_mmap_rwsem in read mode when hugetlb_cow is called. Technically, i_mmap_rwsem does not need to be held in read mode for COW mappings as they can not share pmd's. Modifying the fault code to not take i_mmap_rwsem in read mode for COW (and other non-sharable) mappings is too involved for a stable fix. Instead, we simply drop the hugetlb_fault_mutex and i_mmap_rwsem before unmapping. This is OK as it is technically not needed. They are reacquired after unmapping as expected by calling code. Since this is done in an uncommon error path, the overhead of dropping and reacquiring mutexes is acceptable. While making changes, remove redundant BUG_ON after unmap_ref_private. [1] https://lkml.kernel.org/r/000000000000b73ccc05b5cf8558@google.com Link: https://lkml.kernel.org/r/4c5781b8-3b00-761e-c0c7-c5edebb6ec1a@oracle.com Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: syzbot+5eee4145df3c15e96625@syzkaller.appspotmail.com Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-30selftests/vm: fix building protection keys testHarish1-5/+5
Commit d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") tried to include a ARCH check for powerpc, however ARCH is not defined in the Makefile before including lib.mk. This makes test building to skip on both x86 and powerpc. Fix the arch check by replacing it using machine type as it is already defined and used in the test. Link: https://lkml.kernel.org/r/20201215100402.257376-1-harish@linux.ibm.com Fixes: d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") Signed-off-by: Harish <harish@linux.ibm.com> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29io_uring: don't assume mm is constant across submitsJens Axboe1-7/+7
If we COW the identity, we assume that ->mm never changes. But this isn't true of multiple processes end up sharing the ring. Hence treat id->mm like like any other process compontent when it comes to the identity mapping. This is pretty trivial, just moving the existing grab into io_grab_identity(), and including a check for the match. Cc: stable@vger.kernel.org # 5.10 Fixes: 1e6fa5216a0e ("io_uring: COW io_identity on mismatch") Reported-by: Christian Brauner <christian.brauner@ubuntu.com>: Tested-by: Christian Brauner <christian.brauner@ubuntu.com>: Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-29Merge tag 'for-5.11/dm-fix' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Revert WQ_SYSFS change that broke reencryption (and all other functionality that requires reloading a dm-crypt DM table)" * tag 'for-5.11/dm-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: Revert "dm crypt: export sysfs of kcryptd workqueue"
2020-12-29Revert "dm crypt: export sysfs of kcryptd workqueue"Mike Snitzer1-4/+3
This reverts commit a2b8b2d975673b1a50ab0bcce5d146b9335edfad. WQ_SYSFS breaks the ability to reload a DM table due to sysfs kobject collision (due to active and inactive table). Given lack of demonstrated need for exposing this workqueue via sysfs: revert exposing it. Reported-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-12-28libceph: add __maybe_unused to DEFINE_MSGR2_FEATUREIlya Dryomov1-2/+2
Avoid -Wunused-const-variable warnings for "make W=1". Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28libceph: align session_key and con_secret to 16 bytesIlya Dryomov1-2/+10
crypto_shash_setkey() and crypto_aead_setkey() will do a (small) GFP_ATOMIC allocation to align the key if it isn't suitably aligned. It's not a big deal, but at the same time easy to avoid. The actual alignment requirement is dynamic, queryable with crypto_shash_alignmask() and crypto_aead_alignmask(), but shouldn't be stricter than 16 bytes for our algorithms. Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28libceph: fix auth_signature buffer allocation in secure modeIlya Dryomov1-1/+2
auth_signature frame is 68 bytes in plain mode and 96 bytes in secure mode but we are requesting 68 bytes in both modes. By luck, this doesn't actually result in any invalid memory accesses because the allocation is satisfied out of kmalloc-96 slab and so exactly 96 bytes are allocated, but KASAN rightfully complains. Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Reported-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>