summaryrefslogtreecommitdiff
path: root/include/trace
AgeCommit message (Collapse)AuthorFilesLines
2020-10-06ASoC: Intel: Remove haswell solutionCezary Rojewski1-385/+0
Newly added catpt solution found in sound/soc/intel/catpt is a direct replacement to sound/soc/intel/haswell. It covers all features supported by it and more - by aligning to recommended flows and requirement list based on Windows driver equivalent. No harm is done to userspace as catpt - similarly to haswell - loads no extenal topology files while sharing the exact same ADSP firmware binary. Given the above, existing haswell code is redundant so remove it. Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Liam Girdwood <liam.r.girdwood@intel.com> Link: https://lore.kernel.org/r/20201006064907.16277-2-cezary.rojewski@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-03sched/debug: Add new tracepoint to track cpu_capacityVincent Donnefort1-0/+4
rq->cpu_capacity is a key element in several scheduler parts, such as EAS task placement and load balancing. Tracking this value enables testing and/or debugging by a toolkit. Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1598605249-72651-1-git-send-email-vincent.donnefort@arm.com
2020-10-03scsi: target: core: Add CONTROL field for trace eventsRoman Bolshakov1-6/+6
trace-cmd report doesn't show events from target subsystem because scsi_command_size() leaks through event format string: [target:target_sequencer_start] function scsi_command_size not defined [target:target_cmd_complete] function scsi_command_size not defined Addition of scsi_command_size() to plugin_scsi.c in trace-cmd doesn't help because an expression is used inside TP_printk(). trace-cmd event parser doesn't understand minus sign inside [ ]: Error: expected ']' but read '-' Rather than duplicating kernel code in plugin_scsi.c, provide a dedicated field for CONTROL byte. Link: https://lore.kernel.org/r/20200929125957.83069-1-r.bolshakov@yadro.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-02bcache: add set_uuid in struct cache_setColy Li1-2/+2
This patch adds a separated set_uuid[16] in struct cache_set, to store the uuid of the cache set. This is the preparation to remove the embedded struct cache_sb from struct cache_set. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-01devlink: Add a tracepoint for trap reportsIdo Schimmel1-0/+37
Add a tracepoint for trap reports so that drop monitor could register its probe on it. Use trace_devlink_trap_report_enabled() to avoid wasting cycles setting the trap metadata if the tracepoint is not enabled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28KVM: x86: Allow deflecting unknown MSR accesses to user spaceAlexander Graf1-1/+1
MSRs are weird. Some of them are normal control registers, such as EFER. Some however are registers that really are model specific, not very interesting to virtualization workloads, and not performance critical. Others again are really just windows into package configuration. Out of these MSRs, only the first category is necessary to implement in kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against certain CPU models and MSRs that contain information on the package level are much better suited for user space to process. However, over time we have accumulated a lot of MSRs that are not the first category, but still handled by in-kernel KVM code. This patch adds a generic interface to handle WRMSR and RDMSR from user space. With this, any future MSR that is part of the latter categories can be handled in user space. Furthermore, it allows us to replace the existing "ignore_msrs" logic with something that applies per-VM rather than on the full system. That way you can run productive VMs in parallel to experimental ones where you don't care about proper MSR handling. Signed-off-by: Alexander Graf <graf@amazon.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20200925143422.21718-3-graf@amazon.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-25iocost: add iocg_forgive_debt tracepointTejun Heo1-0/+41
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24Merge branch 'for-5.10/block' into for-5.10/driversJens Axboe1-17/+9
* for-5.10/block: (140 commits) bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag bdi: invert BDI_CAP_NO_ACCT_WB bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag mm: use SWP_SYNCHRONOUS_IO more intelligently bdi: remove BDI_CAP_SYNCHRONOUS_IO bdi: remove BDI_CAP_CGROUP_WRITEBACK block: lift setting the readahead size into the block layer md: update the optimal I/O size on reshape bdi: initialize ->ra_pages and ->io_pages in bdi_init aoe: set an optimal I/O size bcache: inherit the optimal I/O size drbd: remove dead code in device_to_statistics fs: remove the unused SB_I_MULTIROOT flag block: mark blkdev_get static PM: mm: cleanup swsusp_swap_check mm: split swap_type_of PM: rewrite is_hibernate_resume_dev to not require an inode mm: cleanup claim_swapfile ocfs2: cleanup o2hb_region_dev_store dasd: cleanup dasd_scan_partitions ...
2020-09-21SUNRPC: Remove dprintk call sites in RPC queuing functionsChuck Lever1-0/+1
Remove redundant call sites or call sites that are already covered by tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Clean up RPC scheduler tracepointsChuck Lever1-0/+2
Remove several redundant dprintk call sites, and replace a couple of potentially useful ones with tracepoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace rpcbind dprintk call sites with tracepointsChuck Lever1-0/+86
In many cases, tracepoints already report these errors. In others, the dprintks were mainly useful when this code was less mature. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Hoist trace_xprtrdma_op_setport into generic codeChuck Lever2-1/+29
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Remove rpcb_getport_async dprintk call sitesChuck Lever1-0/+35
In many cases, tracepoints already report these errors. In others, the dprintks were mainly useful when this code was less mature. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Clean up call_bind_status() observabilityChuck Lever1-1/+13
Time to remove dprintk call sites in here. Regarding the rpc_bind_status tracepoint: It's friendlier to administrators if they don't have to look up the error code to figure out what went wrong. Replace trace_rpc_bind_status with a set of tracepoints that report more specifically what the problem was, and what RPC program/version was being queried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Trace call_refresh eventsChuck Lever1-0/+2
Clean up: Replace dprintk call sites. Note that rpc_call_rpcerror() already has a trace point, so perhaps adding trace_rpc_refresh_status() isn't necessary. However, it does report a particular category of error. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Add trace_rpc_timeout_status()Chuck Lever1-0/+1
For a long while we've wanted a tracepoint that fires when a major timeout is reported in the system log. Such a tracepoint can be attached to other actions that can take place when a timeout is detected (eg, server or connection health assessment). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace connect dprintk call sites with a tracepointChuck Lever1-0/+1
This trace event can be used to audit transport connections from the client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace dprintk() call site in xs_nospace()Chuck Lever1-0/+28
"no socket space" is an exceptional and infrequent condition that troubleshooters want to know about. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Replace dprintk() call site in xprt_prepare_transmitChuck Lever1-0/+1
Generate a trace event when an RPC request is queued without being sent immediately. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Update debugging instrumentation in xprt_do_reserve()Chuck Lever1-31/+24
Replace a dprintk() with a tracepoint. The tracepoint marks the point where an RPC request is assigned an XID. Additional clean up: Remove trace_xprt_enq_xmit, which reports much the same thing. That tracepoint was added for debugging commit 918f3c1fe83c ("SUNRPC: Improve latency for interactive tasks"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Remove debugging instrumentation from xprt_releaseChuck Lever1-32/+0
These instruments don't appear to add any substantial value. We already have this at the termination of each RPC: iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58 iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384 iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Hoist trace_xprtrdma_op_allocate into generic codeChuck Lever2-30/+30
Introduce a tracepoint in call_allocate that reports the exact sizes in the RPC buffer allocation request and the status of the result. This helps catch problems with XDR buffer provisioning, and replaces transport-specific debugging instrumentation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21SUNRPC: Remove trace_xprt_complete_rqst()Chuck Lever1-1/+0
Request completion is already recorded by an "rpc_task_wakeup queue=xprt_pending" trace record. A subsequent rpc_xdr_recvfrom trace record shows the number of bytes received. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-18Merge branch 'mlx5_active_speed' into rdma.git for-nextJason Gunthorpe1-5/+22
Leon Romanovsky says: ==================== IBTA declares speed as 16 bits, but kernel stores it in u8. This series fixes in-kernel declaration while keeping external interface intact. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_active_speed': RDMA: Fix link active_speed size RDMA/mlx5: Delete duplicated mlx5_ptys_width enum net/mlx5: Refactor query port speed functions
2020-09-14rxrpc: Fix a missing NULL-pointer check in a traceDavid Howells1-1/+1
Fix the rxrpc_client tracepoint to not dereference conn to get the cid if conn is NULL, as it does for other fields. RIP: 0010:trace_event_raw_event_rxrpc_client+0x7e/0xe0 [rxrpc] Call Trace: rxrpc_activate_channels+0x62/0xb0 [rxrpc] rxrpc_connect_call+0x481/0x650 [rxrpc] ? wake_up_q+0xa0/0xa0 ? rxrpc_kernel_begin_call+0x12a/0x1b0 [rxrpc] rxrpc_new_client_call+0x2a5/0x5e0 [rxrpc] Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com>
2020-09-11f2fs: trace: fix typoChao Yu1-1/+1
Fixes a typo from 'compreesed' to 'compressed'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: support age threshold based garbage collectionChao Yu1-3/+5
There are several issues in current background GC algorithm: - valid blocks is one of key factors during cost overhead calculation, so if segment has less valid block, however even its age is young or it locates hot segment, CB algorithm will still choose the segment as victim, it's not appropriate. - GCed data/node will go to existing logs, no matter in-there datas' update frequency is the same or not, it may mix hot and cold data again. - GC alloctor mainly use LFS type segment, it will cost free segment more quickly. This patch introduces a new algorithm named age threshold based garbage collection to solve above issues, there are three steps mainly: 1. select a source victim: - set an age threshold, and select candidates beased threshold: e.g. 0 means youngest, 100 means oldest, if we set age threshold to 80 then select dirty segments which has age in range of [80, 100] as candiddates; - set candidate_ratio threshold, and select candidates based the ratio, so that we can shrink candidates to those oldest segments; - select target segment with fewest valid blocks in order to migrate blocks with minimum cost; 2. select a target victim: - select candidates beased age threshold; - set candidate_radius threshold, search candidates whose age is around source victims, searching radius should less than the radius threshold. - select target segment with most valid blocks in order to avoid migrating current target segment. 3. merge valid blocks from source victim into target victim with SSR alloctor. Test steps: - create 160 dirty segments: * half of them have 128 valid blocks per segment * left of them have 384 valid blocks per segment - run background GC Benefit: GC count and block movement count both decrease obviously: - Before: - Valid: 86 - Dirty: 1 - Prefree: 11 - Free: 6001 (6001) GC calls: 162 (BG: 220) - data segments : 160 (160) - node segments : 2 (2) Try to move 41454 blocks (BG: 41454) - data blocks : 40960 (40960) - node blocks : 494 (494) IPU: 0 blocks SSR: 0 blocks in 0 segments LFS: 41364 blocks in 81 segments - After: - Valid: 87 - Dirty: 0 - Prefree: 4 - Free: 6008 (6008) GC calls: 75 (BG: 76) - data segments : 74 (74) - node segments : 1 (1) Try to move 12813 blocks (BG: 12813) - data blocks : 12544 (12544) - node blocks : 269 (269) IPU: 0 blocks SSR: 12032 blocks in 77 segments LFS: 855 blocks in 2 segments Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-08rxrpc: Rewrite the client connection managerDavid Howells1-29/+4
Rewrite the rxrpc client connection manager so that it can support multiple connections for a given security key to a peer. The following changes are made: (1) For each open socket, the code currently maintains an rbtree with the connections placed into it, keyed by communications parameters. This is tricky to maintain as connections can be culled from the tree or replaced within it. Connections can require replacement for a number of reasons, e.g. their IDs span too great a range for the IDR data type to represent efficiently, the call ID numbers on that conn would overflow or the conn got aborted. This is changed so that there's now a connection bundle object placed in the tree, keyed on the same parameters. The bundle, however, does not need to be replaced. (2) An rxrpc_bundle object can now manage the available channels for a set of parallel connections. The lock that manages this is moved there from the rxrpc_connection struct (channel_lock). (3) There'a a dummy bundle for all incoming connections to share so that they have a channel_lock too. It might be better to give each incoming connection its own bundle. This bundle is not needed to manage which channels incoming calls are made on because that's the solely at whim of the client. (4) The restrictions on how many client connections are around are removed. Instead, a previous patch limits the number of client calls that can be allocated. Ordinarily, client connections are reaped after 2 minutes on the idle queue, but when more than a certain number of connections are in existence, the reaper starts reaping them after 2s of idleness instead to get the numbers back down. It could also be made such that new call allocations are forced to wait until the number of outstanding connections subsides. Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-04mm: Add PG_arch_2 page flagSteven Price1-1/+8
For arm64 MTE support it is necessary to be able to mark pages that contain user space visible tags that will need to be saved/restored e.g. when swapped out. To support this add a new arch specific flag (PG_arch_2). This flag is only available on 64-bit architectures due to the limited number of spare page flags on the 32-bit ones. Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-5/+22
Pull networking fixes from David Miller: 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi Kivilinna. 2) Fix loss of RTT samples in rxrpc, from David Howells. 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu. 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka. 5) We disable BH for too lokng in sctp_get_port_local(), add a cond_resched() here as well, from Xin Long. 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu. 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from Yonghong Song. 8) Missing of_node_put() in mt7530 DSA driver, from Sumera Priyadarsini. 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan. 10) Fix geneve tunnel checksumming bug in hns3, from Yi Li. 11) Memory leak in rxkad_verify_response, from Dinghao Liu. 12) In tipc, don't use smp_processor_id() in preemptible context. From Tuong Lien. 13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu. 14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter. 15) Fix ABI mismatch between driver and firmware in nfp, from Louis Peens. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits) net/smc: fix sock refcounting in case of termination net/smc: reset sndbuf_desc if freed net/smc: set rx_off for SMCR explicitly net/smc: fix toleration of fake add_link messages tg3: Fix soft lockup when tg3_reset_task() fails. doc: net: dsa: Fix typo in config code sample net: dp83867: Fix WoL SecureOn password nfp: flower: fix ABI mismatch between driver and firmware tipc: fix shutdown() of connectionless socket ipv6: Fix sysctl max for fib_multipath_hash_policy drivers/net/wan/hdlc: Change the default of hard_header_len to 0 net: gemini: Fix another missing clk_disable_unprepare() in probe net: bcmgenet: fix mask check in bcmgenet_validate_flow() amd-xgbe: Add support for new port mode net: usb: dm9601: Add USB ID of Keenetic Plus DSL vhost: fix typo in error message net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() pktgen: fix error message with wrong function name net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode cxgb4: fix thermal zone device registration ...
2020-09-02blk-iocost: restore inuse update tracepointsTejun Heo1-3/+3
Update and restore the inuse update tracepoints. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02blk-iocost: decouple vrate adjustment from surplus transfersTejun Heo1-9/+4
Budget donations are inaccurate and could take multiple periods to converge. To prevent triggering vrate adjustments while surplus transfers were catching up, vrate adjustment was suppressed if donations were increasing, which was indicated by non-zero nr_surpluses. This entangling won't be necessary with the scheduled rewrite of donation mechanism which will make it precise and immediate. Let's decouple the two in preparation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_usTejun Heo1-5/+2
Currently, iocg->usages[] which are used to guide inuse adjustments are calculated from vtime deltas. This, however, assumes that the hierarchical inuse weight at the time of calculation held for the entire period, which often isn't true and can lead to significant errors. Now that we have absolute usage information collected, we can derive iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment decisions are made based on actual absolute usage. The calculated usage is clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal saturation regardless of the current hierarchical inuse weight. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-02locks: Remove extra "0x" in tracepoint format specifierChuck Lever1-4/+4
Clean up: %p adds its own 0x already. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-09-01tracepoint: Optimize using static_call()Steven Rostedt (VMware)1-7/+7
Currently the tracepoint site will iterate a vector and issue indirect calls to however many handlers are registered (ie. the vector is long). Using static_call() it is possible to optimize this for the common case of only having a single handler registered. In this case the static_call() can directly call this handler. Otherwise, if the vector is longer than 1, call a function that iterates the whole vector like the current code. [peterz: updated to new interface] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
2020-08-31Merge tag 'v5.9-rc3' into rdma.git for-nextJason Gunthorpe3-18/+83
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-30Merge tag 'powerpc-5.9-4' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Revert our removal of PROT_SAO, at least one user expressed an interest in using it on Power9. Instead don't allow it to be used in guests unless enabled explicitly at compile time. - A fix for a crash introduced by a recent change to FP handling. - Revert a change to our idle code that left Power10 with no idle support. - One minor fix for the new scv system call path to set PPR. - Fix a crash in our "generic" PMU if branch stack events were enabled. - A fix for the IMC PMU, to correctly identify host kernel samples. - The ADB_PMU powermac code was found to be incompatible with VMAP_STACK, so make them incompatible in Kconfig until the code can be fixed. - A build fix in drivers/video/fbdev/controlfb.c, and a documentation fix. Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy, Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin, Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan Srinivasan. * tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check" powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc powerpc/perf: Fix crashes with generic_compat_pmu & BHRB powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode powerpc/64s: scv entry should set PPR Documentation/powerpc: fix malformed table in syscall64-abi video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n selftests/powerpc: Update PROT_SAO test to skip ISA 3.1 powerpc/64s: Disallow PROT_SAO in LPARs by default Revert "powerpc/64s: Remove PROT_SAO support"
2020-08-28Merge tag 'writeback_for_v5.9-rc3' of ↵Linus Torvalds1-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull writeback fixes from Jan Kara: "Fixes for writeback code occasionally skipping writeback of some inodes or livelocking sync(2)" * tag 'writeback_for_v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: Drop I_DIRTY_TIME_EXPIRE writeback: Fix sync livelock due to b_dirty_time processing writeback: Avoid skipping inode writeback writeback: Protect inode->i_io_list with inode->i_lock
2020-08-27Merge tag 'rxrpc-fixes-20200820' of ↵David S. Miller1-5/+22
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc, afs: Fix probing issues Here are some fixes for rxrpc and afs to fix issues in the RTT measuring in rxrpc and thence the Volume Location server probing in afs: (1) Move the serial number of a received ACK into a local variable to simplify the next patch. (2) Fix the loss of RTT samples due to extra interposed ACKs causing baseline information to be discarded too early. This is a particular problem for afs when it sends a single very short call to probe a server it hasn't talked to recently. (3) Fix rxrpc_kernel_get_srtt() to indicate whether it actually has seen any valid samples or not. (4) Remove a field that's set/woken, but never read/waited on. (5) Expose the RTT and other probe information through procfs to make debugging of this stuff easier. (6) Fix VL rotation in afs to only use summary information from VL probing and not the probe running state (which gets clobbered when next a probe is issued). (7) Fix VL rotation to actually return the error aggregated from the probe errors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25rcu/trace: Print negative GP numbers correctlyJoel Fernandes (Google)1-27/+27
GP numbers start from -300 and gp_seq numbers start of -1200 (for a shift of 2). These negative numbers are printed as unsigned long which not only takes up more text space, but is rather confusing to the reader as they have to constantly expend energy to truncate the number. Just print the negative numbering directly. Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24RDMA/core: Move the rdma_show_ib_cm_event() macroChuck Lever2-1/+41
Refactor: Make it globally available in the utilities header. Link: https://lore.kernel.org/r/159767239131.2968.9520990257041764685.stgit@klimt.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24Revert "powerpc/64s: Remove PROT_SAO support"Shawn Anastasio1-0/+2
This reverts commit 5c9fa16e8abd342ce04dc830c1ebb2a03abf6c05. Since PROT_SAO can still be useful for certain classes of software, reintroduce it. Concerns about guest migration for LPARs using SAO will be addressed next. Signed-off-by: Shawn Anastasio <shawn@anastas.io> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200821185558.35561-2-shawn@anastas.io
2020-08-22selinux: add basic filtering for audit trace eventsPeter Enderborg1-10/+26
This patch adds further attributes to the event. These attributes are helpful to understand the context of the message and can be used to filter the events. There are three common items. Source context, target context and tclass. There are also items from the outcome of operation performed. An event is similar to: <...>-1309 [002] .... 6346.691689: selinux_audited: requested=0x4000000 denied=0x4000000 audited=0x4000000 result=-13 scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:bin_t:s0 tclass=file With systems where many denials are occurring, it is useful to apply a filter. The filtering is a set of logic that is inserted with the filter file. Example: echo "tclass==\"file\" " > events/avc/selinux_audited/filter This adds that we only get tclass=file. The trace can also have extra properties. Adding the user stack can be done with echo 1 > options/userstacktrace Now the output will be runcon-1365 [003] .... 6960.955530: selinux_audited: requested=0x4000000 denied=0x4000000 audited=0x4000000 result=-13 scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:bin_t:s0 tclass=file runcon-1365 [003] .... 6960.955560: <user stack trace> => <00007f325b4ce45b> => <00005607093efa57> Signed-off-by: Peter Enderborg <peter.enderborg@sony.com> Reviewed-by: Thiébaud Weksteen <tweek@google.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-22selinux: add tracepoint on audited eventsThiébaud Weksteen1-0/+37
The audit data currently captures which process and which target is responsible for a denial. There is no data on where exactly in the process that call occurred. Debugging can be made easier by being able to reconstruct the unified kernel and userland stack traces [1]. Add a tracepoint on the SELinux denials which can then be used by userland (i.e. perf). Although this patch could manually be added by each OS developer to trouble shoot a denial, adding it to the kernel streamlines the developers workflow. It is possible to use perf for monitoring the event: # perf record -e avc:selinux_audited -g -a ^C # perf report -g [...] 6.40% 6.40% audited=800000 tclass=4 | __libc_start_main | |--4.60%--__GI___ioctl | entry_SYSCALL_64 | do_syscall_64 | __x64_sys_ioctl | ksys_ioctl | binder_ioctl | binder_set_nice | can_nice | capable | security_capable | cred_has_capability.isra.0 | slow_avc_audit | common_lsm_audit | avc_audit_post_callback | avc_audit_post_callback | It is also possible to use the ftrace interface: # echo 1 > /sys/kernel/debug/tracing/events/avc/selinux_audited/enable # cat /sys/kernel/debug/tracing/trace tracer: nop entries-in-buffer/entries-written: 1/1 #P:8 [...] dmesg-3624 [001] 13072.325358: selinux_denied: audited=800000 tclass=4 The tclass value can be mapped to a class by searching security/selinux/flask.h. The audited value is a bit field of the permissions described in security/selinux/av_permissions.h for the corresponding class. [1] https://source.android.com/devices/tech/debug/native_stack_dump Signed-off-by: Thiébaud Weksteen <tweek@google.com> Suggested-by: Joel Fernandes <joelaf@google.com> Reviewed-by: Peter Enderborg <peter.enderborg@sony.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21Merge tag 'ext4_for_linus' of ↵Linus Torvalds1-10/+75
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Improvements to ext4's block allocator performance for very large file systems, especially when the file system or files which are highly fragmented. There is a new mount option, prefetch_block_bitmaps which will pull in the block bitmaps and set up the in-memory buddy bitmaps when the file system is initially mounted. Beyond that, a lot of bug fixes and cleanups. In particular, a number of changes to make ext4 more robust in the face of write errors or file system corruptions" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits) ext4: limit the length of per-inode prealloc list ext4: reorganize if statement of ext4_mb_release_context() ext4: add mb_debug logging when there are lost chunks ext4: Fix comment typo "the the". jbd2: clean up checksum verification in do_one_pass() ext4: change to use fallthrough macro ext4: remove unused parameter of ext4_generic_delete_entry function mballoc: replace seq_printf with seq_puts ext4: optimize the implementation of ext4_mb_good_group() ext4: delete invalid comments near ext4_mb_check_limits() ext4: fix typos in ext4_mb_regular_allocator() comment ext4: fix checking of directory entry validity for inline directories fs: prevent BUG_ON in submit_bh_wbc() ext4: correctly restore system zone info when remount fails ext4: handle add_system_zone() failure in ext4_setup_system_zone() ext4: fold ext4_data_block_valid_rcu() into the caller ext4: check journal inode extents more carefully ext4: don't allow overlapping system zones ext4: handle error of ext4_setup_system_zone() on remount ext4: delete the invalid BUGON in ext4_mb_load_buddy_gfp() ...
2020-08-20rxrpc: Fix loss of RTT samples due to interposed ACKDavid Howells1-5/+22
The Rx protocol has a mechanism to help generate RTT samples that works by a client transmitting a REQUESTED-type ACK when it receives a DATA packet that has the REQUEST_ACK flag set. The peer, however, may interpose other ACKs before transmitting the REQUESTED-ACK, as can be seen in the following trace excerpt: rxrpc_tx_data: c=00000044 DATA d0b5ece8:00000001 00000001 q=00000001 fl=07 rxrpc_rx_ack: c=00000044 00000001 PNG r=00000000 f=00000002 p=00000000 n=0 rxrpc_rx_ack: c=00000044 00000002 REQ r=00000001 f=00000002 p=00000001 n=0 ... DATA packet 1 (q=xx) has REQUEST_ACK set (bit 1 of fl=xx). The incoming ping (labelled PNG) hard-acks the request DATA packet (f=xx exceeds the sequence number of the DATA packet), causing it to be discarded from the Tx ring. The ACK that was requested (labelled REQ, r=xx references the serial of the DATA packet) comes after the ping, but the sk_buff holding the timestamp has gone and the RTT sample is lost. This is particularly noticeable on RPC calls used to probe the service offered by the peer. A lot of peers end up with an unknown RTT because we only ever sent a single RPC. This confuses the server rotation algorithm. Fix this by caching the information about the outgoing packet in RTT calculations in the rxrpc_call struct rather than looking in the Tx ring. A four-deep buffer is maintained and both REQUEST_ACK-flagged DATA and PING-ACK transmissions are recorded in there. When the appropriate response ACK is received, the buffer is checked for a match and, if found, an RTT sample is recorded. If a received ACK refers to a packet with a later serial number than an entry in the cache, that entry is presumed lost and the entry is made available to record a new transmission. ACKs types other than REQUESTED-type and PING-type cause any matching sample to be cancelled as they don't necessarily represent a useful measurement. If there's no space in the buffer on ping/data transmission, the sample base is discarded. Fixes: 50235c4b5a2f ("rxrpc: Obtain RTT data by requesting ACKs on DATA packets") Signed-off-by: David Howells <dhowells@redhat.com>
2020-08-19ext4: limit the length of per-inode prealloc listbrookxu1-6/+11
In the scenario of writing sparse files, the per-inode prealloc list may be very long, resulting in high overhead for ext4_mb_use_preallocated(). To circumvent this problem, we limit the maximum length of per-inode prealloc list to 512 and allow users to modify it. After patching, we observed that the sys ratio of cpu has dropped, and the system throughput has increased significantly. We created a process to write the sparse file, and the running time of the process on the fixed kernel was significantly reduced, as follows: Running time on unfixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m2.051s user 0m0.008s sys 0m2.026s Running time on fixed kernel: [root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat real 0m0.471s user 0m0.004s sys 0m0.395s Signed-off-by: Chunguang Xu <brookxu@tencent.com> Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-08-15x86/paravirt: Remove set_pte_at() pv-opJuergen Gross1-20/+0
On x86 set_pte_at() is now always falling back to set_pte(). So instead of having this fallback after the paravirt maze just drop the set_pte_at paravirt operation and let set_pte_at() use the set_pte() function directly. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200815100641.26362-6-jgross@suse.com
2020-08-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-0/+17
Pull networking fixes from David Miller: "Some merge window fallout, some longer term fixes: 1) Handle headroom properly in lapbether and x25_asy drivers, from Xie He. 2) Fetch MAC address from correct r8152 device node, from Thierry Reding. 3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg, from Rouven Czerwinski. 4) Correct fdputs in socket layer, from Miaohe Lin. 5) Revert troublesome sockptr_t optimization, from Christoph Hellwig. 6) Fix TCP TFO key reading on big endian, from Jason Baron. 7) Missing CAP_NET_RAW check in nfc, from Qingyu Li. 8) Fix inet fastreuse optimization with tproxy sockets, from Tim Froidcoeur. 9) Fix 64-bit divide in new SFC driver, from Edward Cree. 10) Add a tracepoint for prandom_u32 so that we can more easily perform usage analysis. From Eric Dumazet. 11) Fix rwlock imbalance in AF_PACKET, from John Ogness" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) net: openvswitch: introduce common code for flushing flows af_packet: TPACKET_V3: fix fill status rwlock imbalance random32: add a tracepoint for prandom_u32() Revert "ipv4: tunnel: fix compilation on ARCH=um" net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus net: ethernet: stmmac: Disable hardware multicast filter net: stmmac: dwmac1000: provide multicast filter fallback ipv4: tunnel: fix compilation on ARCH=um vsock: fix potential null pointer dereference in vsock_poll() sfc: fix ef100 design-param checking net: initialize fastreuse on inet_inherit_port net: refactor bind_bucket fastreuse into helper net: phy: marvell10g: fix null pointer dereference net: Fix potential memory leak in proto_register() net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc() net/nfc/rawsock.c: add CAP_NET_RAW check. hinic: fix strncpy output truncated compile warnings drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check net/tls: Fix kmap usage ...
2020-08-14random32: add a tracepoint for prandom_u32()Eric Dumazet1-0/+17
There has been some heat around prandom_u32() lately, and some people were wondering if there was a simple way to determine how often it was used, before considering making it maybe 10 times more expensive. This tracepoint exports the generated pseudo random value. Tested: perf list | grep prandom_u32 random:prandom_u32 [Tracepoint event] perf record -a [-g] [-C1] -e random:prandom_u32 sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 259.748 MB perf.data (924087 samples) ] perf report --nochildren ... 97.67% ksoftirqd/1 [kernel.vmlinux] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--48.86%--tcp_v4_syn_recv_sock | tcp_check_req | tcp_v4_rcv | ... --48.81%--tcp_conn_request tcp_v4_conn_request tcp_rcv_state_process ... perf script Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>