summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2018-04-27cgroup: Factor out and expose cgroup_rstat_*() interface functionsTejun Heo1-2/+9
cgroup_rstat is being generalized so that controllers can use it too. This patch factors out and exposes the following interface functions. * cgroup_rstat_updated(): Renamed from cgroup_rstat_cpu_updated() for consistency. * cgroup_rstat_flush_hold/release(): Factored out from base stat implementation. * cgroup_rstat_flush(): Verbatim expose. While at it, drop assert on cgroup_rstat_mutex in cgroup_base_stat_flush() as it crosses layers and make a minor comment update. v2: Added EXPORT_SYMBOL_GPL(cgroup_rstat_updated) to fix a build bug. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-27cgroup: Distinguish base resource stat implementation from rstatTejun Heo1-13/+16
Base resource stat accounts universial (not specific to any controller) resource consumptions on top of rstat. Currently, its implementation is intermixed with rstat implementation making the code confusing to follow. This patch clarifies the distintion by doing the followings. * Encapsulate base resource stat counters, currently only cputime, in struct cgroup_base_stat. * Move prev_cputime into struct cgroup and initialize it with cgroup. * Rename the related functions so that they start with cgroup_base_stat. * Prefix the related variables and field names with b. This patch doesn't make any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-27cgroup: Rename stat to rstatTejun Heo1-7/+9
stat is too generic a name and ends up causing subtle confusions. It'll be made generic so that controllers can plug into it, which will make the problem worse. Let's rename it to something more specific - cgroup_rstat for cgroup recursive stat. This patch does the following renames. No other changes. * cpu_stat -> rstat_cpu * stat -> rstat * ?cstat -> ?rstatc Note that the renames are selective. The unrenamed are the ones which implement basic resource statistics on top of rstat. This will be further cleaned up in the following patches. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-27cgroup: Limit event generation frequencyTejun Heo1-0/+2
".events" files generate file modified event to notify userland of possible new events. Some of the events can be quite bursty (e.g. memory high event) and generating notification each time is costly and pointless. This patch implements a event rate limit mechanism. If a new notification is requested before 10ms has passed since the previous notification, the new notification is delayed till then. As this only delays from the second notification on in a given close cluster of notifications, userland reactions to notifications shouldn't be delayed at all in most cases while avoiding notification storms. Signed-off-by: Tejun Heo <tj@kernel.org>
2018-04-26Merge tag 'for_v4.17-rc3' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fix from Jan Kara: "A fix of a fsnotify race causing panics / softlockups" * tag 'for_v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: Fix fsnotify_mark_connector race
2018-04-26Merge tag 'scsi-fixes' of ↵Linus Torvalds1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Eight bug fixes, one spelling update and one tracepoint addition. The most serious is probably the mptsas write same fix because it means anyone using these controllers sees errors when modern filesystems try to issue discards" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: fix crash with iscsi target and dvd scsi: sd_zbc: Avoid that resetting a zone fails sporadically scsi: sd: Defer spinning up drive while SANITIZE is in progress scsi: megaraid_sas: Do not log an error if FW successfully initializes. scsi: ufs: add trace event for ufs upiu scsi: core: remove reference to scsi_show_extd_sense() scsi: mptsas: Disable WRITE SAME scsi: fnic: fix spelling mistake in fnic stats "Abord" -> "Abort" scsi: scsi_debug: IMMED related delay adjustments scsi: iscsi: respond to netlink with unicast when appropriate
2018-04-26Merge tag 'for-linus-20180425' of git://git.kernel.dk/linux-blockLinus Torvalds2-0/+4
Pull block updates from Jens Axboe: "I ended up sitting on this about a week longer than I wanted to, since we were hashing out details with a timeout change. I've now killed that patch, so we can flush the existing queue in due time. This contains: - Fix for an old regression, where entering the queue can be disturbed by a signal to the process. This can cause spurious EIO. Fix from Alan Jenkins. - cdrom information leak fix from Dan. - Trivial helper for testing queue FUA from Dave Chinner, part of his O_DIRECT FUA series. - Series of swim fixes from Finn that actually makes it work again. - Loop O_DIRECT corruption fix, which caused data corruption in production for us. From me. - BFQ crash fix from me. - bcache maintainer update. Michael no longer has the time to do it, Coly has stepped up to serve as the new maintainer. - blkcg locking fixes from Jiang Biao. - Revert of a change from this merge window from Ming, that causes an issue on some hardware. - Minor clarification doc addition from Linus Walleij" * tag 'for-linus-20180425' of git://git.kernel.dk/linux-block: (22 commits) Revert "blk-mq: remove code for dealing with remapping queue" block: mq: Add some minor doc for core structs bcache: mark Coly Li as bcache maintainer MAINTAINERS: Remove me as maintainer of bcache blkcg: init root blkcg_gq under lock blkcg: small fix on comment in blkcg_init_queue blkcg: don't hold blkcg lock when deactivating policy block: add blk_queue_fua() helper function cdrom: information leak in cdrom_ioctl_media_changed() bfq-iosched: ensure to clear bic/bfqq pointers when preparing request blk-mq: start request gstate with gen 1 block/swim: Select appropriate drive on device open block/swim: Fix IO error at end of medium block/swim: Check drive type block/swim: Rename macros to avoid inconsistent inverted logic block/swim: Don't log an error message for an invalid ioctl block/swim: Remove extra put_disk() call from error path block/swim: Fix array bounds check m68k/mac: Don't remap SWIM MMIO region loop: handle short DIO reads ...
2018-04-25block: mq: Add some minor doc for core structsLinus Walleij1-0/+3
As it came up in discussion on the mailing list that the semantic meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely obvious to everyone, let's add some minimal kerneldoc for a starter. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2-2/+4
Pull networking fixes from David Miller: 1) Fix rtnl deadlock in ipvs, from Julian Anastasov. 2) s390 qeth fixes from Julian Wiedmann (control IO completion stalls, bad MAC address update sequence, request side races on command IO timeouts). 3) Handle seq_file overflow properly in l2tp, from Guillaume Nault. 4) Fix VLAN priority mappings in cpsw driver, from Ivan Khoronzhuk. 5) Packet scheduler ife action fixes (malformed TLV lengths, etc.) from Alexander Aring. 6) Fix out of bounds access in tcp md5 option parser, from Jann Horn. 7) Missing netlink attribute policies in rtm_ipv6_policy table, from Eric Dumazet. 8) Missing socket address length checks in l2tp and pppoe connect, from Guillaume Nault. 9) Fix netconsole over team and bonding, from Xin Long. 10) Fix race with AF_PACKET socket state bitfields, from Willem de Bruijn. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (51 commits) ice: Fix insufficient memory issue in ice_aq_manage_mac_read sfc: ARFS filter IDs net: ethtool: Add missing kernel doc for FEC parameters packet: fix bitfield update race ice: Do not check INTEVENT bit for OICR interrupts ice: Fix incorrect comment for action type ice: Fix initialization for num_nodes_added igb: Fix the transmission mode of queue 0 for Qav mode ixgbevf: ensure xdp_ring resources are free'd on error exit team: fix netconsole setup over team amd-xgbe: Only use the SFP supported transceiver signals amd-xgbe: Improve KR auto-negotiation and training amd-xgbe: Add pre/post auto-negotiation phy hooks pppoe: check sockaddr length in pppoe_connect() l2tp: check sockaddr length in pppol2tp_connect() net: phy: marvell: clear wol event before setting it ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave tcp: don't read out-of-bounds opsize ibmvnic: Clean actual number of RX or TX pools ...
2018-04-24net: ethtool: Add missing kernel doc for FEC parametersFlorian Fainelli1-0/+2
While adding support for ethtool::get_fecparam and set_fecparam, kernel doc for these functions was missed, add those. Fixes: 1a5f3da20bd9 ("net: ethtool: add support for forward error correction modes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-2/+2
Daniel Borkmann says: ==================== pull-request: bpf 2018-04-21 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a deadlock between mm->mmap_sem and bpf_event_mutex when one task is detaching a BPF prog via perf_event_detach_bpf_prog() and another one dumping through bpf_prog_array_copy_info(). For the latter we move the copy_to_user() out of the bpf_event_mutex lock to fix it, from Yonghong. 2) Fix test_sock and test_sock_addr.sh failures. The former was hitting rlimit issues and the latter required ping to specify the address family, from Yonghong. 3) Remove a dead check in sockmap's sock_map_alloc(), from Jann. 4) Add generated files to BPF kselftests gitignore that were previously missed, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A small set of timer fixes: - Evaluate the -ETIME condition correctly in the imx tpm driver - Fix the evaluation order of a condition in posix cpu timers - Use pr_cont() in the clockevents code to prevent ugly message splitting - Remove __current_kernel_time() which is now unused to prevent that new users show up. - Remove a stale forward declaration" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/imx-tpm: Correct -ETIME return condition check posix-cpu-timers: Ensure set_process_cpu_timer is always evaluated timekeeping: Remove __current_kernel_time() timers: Remove stale struct tvec_base forward declaration clockevents: Fix kernel messages split across multiple lines
2018-04-22Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-12/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A larger set of updates for perf. Kernel: - Handle the SBOX uncore monitoring correctly on Broadwell CPUs which do not have SBOX. - Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE]. The percentage of preempting and non-preempting context switches help understanding the nature of workloads (CPU or IO bound) that are running on a machine. This adds the kernel facility and userspace changes needed to show this information in 'perf script' and 'perf report -D' (Alexey Budankov) - Remove a WARN_ON() in the trace/kprobes code which is pointless because the return error code is already telling the caller what's wrong. - Revert a fugly workaround for clang BPF targets. - Fix sample_max_stack maximum check and do not proceed when an error has been detect, return them to avoid misidentifying errors (Jiri Olsa) - Add SPDX idenitifiers and get rid of GPL boilderplate. Tools: - Synchronize kernel ABI headers, v4.17-rc1 (Ingo Molnar) - Support MAP_FIXED_NOREPLACE, noticed when updating the tools/include/ copies (Arnaldo Carvalho de Melo) - Add '\n' at the end of parse-options error messages (Ravi Bangoria) - Add s390 support for detailed/verbose PMU event description (Thomas Richter) - perf annotate fixes and improvements: * Allow showing offsets in more than just jump targets, use the new 'O' hotkey in the TUI, config ~/.perfconfig annotate.offset_level for it and for --stdio2 (Arnaldo Carvalho de Melo) * Use the resolved variable names from objdump disassembled lines to make them more compact, just like was already done for some instructions, like "mov", this eventually will be done more generally, but lets now add some more to the existing mechanism (Arnaldo Carvalho de Melo) - perf record fixes: * Change warning for missing topology sysfs entry to debug, as not all architectures have those files, s390 being one of those (Thomas Richter) * Remove old error messages about things that unlikely to be the root cause in modern systems (Andi Kleen) - perf sched fixes: * Fix -g/--call-graph documentation (Takuya Yamamoto) - perf stat: * Enable 1ms interval for printing event counters values in (Alexey Budankov) - perf test fixes: * Run dwarf unwind on arm32 (Kim Phillips) * Remove unused ptrace.h include from LLVM test, sidesteping older clang's lack of support for some asm constructs (Arnaldo Carvalho de Melo) * Fixup BPF test using epoll_pwait syscall function probe, to cope with the syscall routines renames performed in this development cycle (Arnaldo Carvalho de Melo) - perf version fixes: * Do not print info about HAVE_LIBAUDIT_SUPPORT in 'perf version --build-options' when HAVE_SYSCALL_TABLE_SUPPORT is true, as libaudit won't be used in that case, print info about syscall_table support instead (Jin Yao) - Build system fixes: * Use HAVE_..._SUPPORT used consistently (Jin Yao) * Restore READ_ONCE() C++ compatibility in tools/include (Mark Rutland) * Give hints about package names needed to build jvmti (Arnaldo Carvalho de Melo)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs perf/x86/intel/uncore: Revert "Remove SBOX support for Broadwell server" coresight: Move to SPDX identifier perf test BPF: Fixup BPF test using epoll_pwait syscall function probe perf tests mmap: Show which tracepoint is failing perf tools: Add '\n' at the end of parse-options error messages perf record: Remove suggestion to enable APIC perf record: Remove misleading error suggestion perf hists browser: Clarify top/report browser help perf mem: Allow all record/report options perf trace: Support MAP_FIXED_NOREPLACE perf: Remove superfluous allocation error check perf: Fix sample_max_stack maximum check perf: Return proper values for user stack errors perf list: Add s390 support for detailed/verbose PMU event description perf script: Extend misc field decoding with switch out event type perf report: Extend raw dump (-D) out with switch out event type perf/core: Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE] tools/headers: Synchronize kernel ABI headers, v4.17-rc1 trace_kprobe: Remove warning message "Could not insert probe at..." ...
2018-04-21kasan: add no_sanitize attribute for clang buildsAndrey Konovalov1-0/+3
KASAN uses the __no_sanitize_address macro to disable instrumentation of particular functions. Right now it's defined only for GCC build, which causes false positives when clang is used. This patch adds a definition for clang. Note, that clang's revision 329612 or higher is required. [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check] Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Paul Lawrence <paullawrence@google.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-21writeback: safer lock nestingGreg Thelen2-14/+21
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if the page's memcg is undergoing move accounting, which occurs when a process leaves its memcg for a new one that has memory.move_charge_at_immigrate set. unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if the given inode is switching writeback domains. Switches occur when enough writes are issued from a new domain. This existing pattern is thus suspicious: lock_page_memcg(page); unlocked_inode_to_wb_begin(inode, &locked); ... unlocked_inode_to_wb_end(inode, locked); unlock_page_memcg(page); If both inode switch and process memcg migration are both in-flight then unlocked_inode_to_wb_end() will unconditionally enable interrupts while still holding the lock_page_memcg() irq spinlock. This suggests the possibility of deadlock if an interrupt occurs before unlock_page_memcg(). truncate __cancel_dirty_page lock_page_memcg unlocked_inode_to_wb_begin unlocked_inode_to_wb_end <interrupts mistakenly enabled> <interrupt> end_page_writeback test_clear_page_writeback lock_page_memcg <deadlock> unlock_page_memcg Due to configuration limitations this deadlock is not currently possible because we don't mix cgroup writeback (a cgroupv2 feature) and memory.move_charge_at_immigrate (a cgroupv1 feature). If the kernel is hacked to always claim inode switching and memcg moving_account, then this script triggers lockup in less than a minute: cd /mnt/cgroup/memory mkdir a b echo 1 > a/memory.move_charge_at_immigrate echo 1 > b/memory.move_charge_at_immigrate ( echo $BASHPID > a/cgroup.procs while true; do dd if=/dev/zero of=/mnt/big bs=1M count=256 done ) & while true; do sync done & sleep 1h & SLEEP=$! while true; do echo $SLEEP > a/cgroup.procs echo $SLEEP > b/cgroup.procs done The deadlock does not seem possible, so it's debatable if there's any reason to modify the kernel. I suggest we should to prevent future surprises. And Wang Long said "this deadlock occurs three times in our environment", so there's more reason to apply this, even to stable. Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch see "[PATCH for-4.4] writeback: safer lock nesting" https://lkml.org/lkml/2018/4/11/146 Wang Long said "this deadlock occurs three times in our environment" [gthelen@google.com: v4] Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com [akpm@linux-foundation.org: comment tweaks, struct initialization simplification] Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613 Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Signed-off-by: Greg Thelen <gthelen@google.com> Reported-by: Wang Long <wanglong19@meituan.com> Acked-by: Wang Long <wanglong19@meituan.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> [v4.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-21fork: unconditionally clear stack on forkKees Cook1-5/+1
One of the classes of kernel stack content leaks[1] is exposing the contents of prior heap or stack contents when a new process stack is allocated. Normally, those stacks are not zeroed, and the old contents remain in place. In the face of stack content exposure flaws, those contents can leak to userspace. Fixing this will make the kernel no longer vulnerable to these flaws, as the stack will be wiped each time a stack is assigned to a new process. There's not a meaningful change in runtime performance; it almost looks like it provides a benefit. Performing back-to-back kernel builds before: Run times: 157.86 157.09 158.90 160.94 160.80 Mean: 159.12 Std Dev: 1.54 and after: Run times: 159.31 157.34 156.71 158.15 160.81 Mean: 158.46 Std Dev: 1.46 Instead of making this a build or runtime config, Andy Lutomirski recommended this just be enabled by default. [1] A noisy search for many kinds of stack content leaks can be seen here: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak I did some more with perf and cycle counts on running 100,000 execs of /bin/true. before: Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841 Mean: 221015379122.60 Std Dev: 4662486552.47 after: Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348 Mean: 217745009865.40 Std Dev: 5935559279.99 It continues to look like it's faster, though the deviation is rather wide, but I'm not sure what I could do that would be less noisy. I'm open to ideas! Link: http://lkml.kernel.org/r/20180221021659.GA37073@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds3-4/+15
Pull networking fixes from David Miller: 1) Unbalanced refcounting in TIPC, from Jon Maloy. 2) Only allow TCP_MD5SIG to be set on sockets in close or listen state. Once the connection is established it makes no sense to change this. From Eric Dumazet. 3) Missing attribute validation in neigh_dump_table(), also from Eric Dumazet. 4) Fix address comparisons in SCTP, from Xin Long. 5) Neigh proxy table clearing can deadlock, from Wolfgang Bumiller. 6) Fix tunnel refcounting in l2tp, from Guillaume Nault. 7) Fix double list insert in team driver, from Paolo Abeni. 8) af_vsock.ko module was accidently made unremovable, from Stefan Hajnoczi. 9) Fix reference to freed llc_sap object in llc stack, from Cong Wang. 10) Don't assume netdevice struct is DMA'able memory in virtio_net driver, from Michael S. Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits) net/smc: fix shutdown in state SMC_LISTEN bnxt_en: Fix memory fault in bnxt_ethtool_init() virtio_net: sparse annotation fix virtio_net: fix adding vids on big-endian virtio_net: split out ctrl buffer net: hns: Avoid action name truncation docs: ip-sysctl.txt: fix name of some ipv6 variables vmxnet3: fix incorrect dereference when rxvlan is disabled llc: hold llc_sap before release_sock() MAINTAINERS: Direct networking documentation changes to netdev atm: iphase: fix spelling mistake: "Tansmit" -> "Transmit" net: qmi_wwan: add Wistron Neweb D19Q1 net: caif: fix spelling mistake "UKNOWN" -> "UNKNOWN" net: stmmac: Disable ACS Feature for GMAC >= 4 net: mvpp2: Fix DMA address mask size net: change the comment of dev_mc_init net: qualcomm: rmnet: Fix warning seen with fill_info tun: fix vlan packet truncation tipc: fix infinite loop when dumping link monitor summary tipc: fix use-after-free in tipc_nametbl_stop ...
2018-04-20Merge branch 'for-linus' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes. Some of that is only a matter with fault injection (broken handling of small allocation failure in various mount-related places), but the last one is a root-triggerable stack overflow, and combined with userns it gets really nasty ;-/" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Don't leak MNT_INTERNAL away from internal mounts mm,vmscan: Allow preallocating memory for register_shrinker(). rpc_pipefs: fix double-dput() orangefs_kill_sb(): deal with allocation failures jffs2_kill_sb(): deal with failed allocations hypfs_kill_super(): deal with failed allocations
2018-04-20Merge tag 'for_v4.17-rc2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs - isofs memory leak fix - two fsnotify fixes of event mask handling - udf fix of UTF-16 handling - couple other smaller cleanups * tag 'for_v4.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix leak of UTF-16 surrogates into encoded strings fs: ext2: Adding new return type vm_fault_t isofs: fix potential memory leak in mount option parsing MAINTAINERS: add an entry for FSNOTIFY infrastructure fsnotify: fix typo in a comment about mark->g_list fsnotify: fix ignore mask logic in send_to_group() isofs compress: Remove VLA usage fs: quota: Replace GFP_ATOMIC with GFP_KERNEL in dquot_init fanotify: fix logic of events on child
2018-04-20Merge branch 'for-linus' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - suspend/resume handling fix for Raydium I2C-connected touchscreen from Aaron Ma - protocol fixup for certain BT-connected Wacoms from Aaron Armstrong Skomra - battery level reporting fix on BT-connected mice from Dmitry Torokhov - hidraw race condition fix from Rodrigo Rivas Costa * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: i2c-hid: fix inverted return value from i2c_hid_command() HID: i2c-hid: Fix resume issue on Raydium touchscreen device HID: wacom: bluetooth: send exit report for recent Bluetooth devices HID: hidraw: Fix crash on HIDIOCGFEATURE with a destroyed device HID: input: fix battery level reporting on BT mice
2018-04-20Merge branch 'for-linus' of ↵Linus Torvalds1-6/+13
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fix from Jiri Kosina: "Shadow variable API list_head initialization fix from Petr Mladek" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: Allow to call a custom callback when freeing shadow variables livepatch: Initialize shadow variables safely by a custom callback
2018-04-19fsnotify: Fix fsnotify_mark_connector raceRobert Kolchmeyer1-3/+1
fsnotify() acquires a reference to a fsnotify_mark_connector through the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it appears that no precautions are taken in fsnotify_put_mark() to ensure that fsnotify() drops its reference to this fsnotify_mark_connector before assigning a value to its 'destroy_next' field. This can result in fsnotify_put_mark() assigning a value to a connector's 'destroy_next' field right before fsnotify() tries to traverse the linked list referenced by the connector's 'list' field. Since these two fields are members of the same union, this behavior results in a kernel panic. This issue is resolved by moving the connector's 'destroy_next' field into the object pointer union. This should work since the object pointer access is protected by both a spinlock and the value of the 'flags' field, and the 'flags' field is cleared while holding the spinlock in fsnotify_put_mark() before 'destroy_next' is updated. It shouldn't be possible for another thread to accidentally read from the object pointer after the 'destroy_next' field is updated. The offending behavior here is extremely unlikely; since fsnotify_put_mark() removes references to a connector (specifically, it ensures that the connector is unreachable from the inode it was formerly attached to) before updating its 'destroy_next' field, a sizeable chunk of code in fsnotify_put_mark() has to execute in the short window between when fsnotify() acquires the connector reference and saves the value of its 'list' field. On the HEAD kernel, I've only been able to reproduce this by inserting a udelay(1) in fsnotify(). However, I've been able to reproduce this issue without inserting a udelay(1) anywhere on older unmodified release kernels, so I believe it's worth fixing at HEAD. References: https://bugzilla.kernel.org/show_bug.cgi?id=199437 Fixes: 08991e83b7286635167bab40927665a90fb00d81 CC: stable@vger.kernel.org Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-19coresight: Move to SPDX identifierMathieu Poirier1-12/+1
Move CoreSight headers to the SPDX identifier. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1524089118-27595-1-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-19scsi: sd_zbc: Avoid that resetting a zone fails sporadicallyBart Van Assche1-0/+5
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets q->nr_zones to zero before restoring it to the actual value, even if no drive characteristics have changed. Avoid that this can happen by making the following changes: - Protect the code that updates zone information with blk_queue_enter() and blk_queue_exit(). - Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that these functions do not modify struct scsi_disk before all zone information has been obtained. Note: since commit 055f6e18e08f ("block: Make q_usage_counter also track legacy requests"; kernel v4.15) the request queue freezing mechanism also affects legacy request queues. Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: stable@vger.kernel.org # v4.16 Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18block: add blk_queue_fua() helper functionDave Chinner1-0/+1
So we can check FUA support status from the iomap direct IO code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-17vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multiToshiaki Makita1-2/+5
Syzkaller spotted an old bug which leads to reading skb beyond tail by 4 bytes on vlan tagged packets. This is caused because skb_vlan_tagged_multi() did not check skb_headlen. BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283 [inline] BUG: KMSAN: uninit-value in skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672 [inline] BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline] BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 eth_type_vlan include/linux/if_vlan.h:283 [inline] skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] vlan_features_check include/linux/if_vlan.h:672 [inline] dflt_features_check net/core/dev.c:2949 [inline] netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084 __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43ffa9 RSP: 002b:00007fff2cff3948 EFLAGS: 00000217 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ffa9 RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004018d0 R13: 0000000000401960 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085 packet_alloc_skb net/packet/af_packet.c:2803 [inline] packet_snd net/packet/af_packet.c:2894 [inline] packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 58e998c6d239 ("offloading: Force software GSO for multiple vlan tags.") Reported-and-tested-by: syzbot+0bbe42c764feafa82c5a@syzkaller.appspotmail.com Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17timekeeping: Remove __current_kernel_time()Baolin Wang1-3/+0
The __current_kernel_time() function based on 'struct timespec' is no longer recommended for new code, and the only user of this function has been replaced by commit 6909e29fdefb ("kdb: use __ktime_get_real_seconds instead of __current_kernel_time"). Remove the obsolete interface. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: sboyd@kernel.org Cc: broonie@kernel.org Cc: john.stultz@linaro.org Link: https://lkml.kernel.org/r/1a9dbea7ee2cda7efe9ed330874075cf17fdbff6.1523596316.git.baolin.wang@linaro.org
2018-04-17timers: Remove stale struct tvec_base forward declarationLiu, Changcheng1-2/+0
struct tvec_base is a leftover of the original timer wheel implementation and not longer used. Remove the forward declaration. Signed-off-by: Liu Changcheng <changcheng.liu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Link: https://lkml.kernel.org/r/20180412075701.GA38952@sofia
2018-04-17livepatch: Allow to call a custom callback when freeing shadow variablesPetr Mladek1-2/+3
We might need to do some actions before the shadow variable is freed. For example, we might need to remove it from a list or free some data that it points to. This is already possible now. The user can get the shadow variable by klp_shadow_get(), do the necessary actions, and then call klp_shadow_free(). This patch allows to do it a more elegant way. The user could implement the needed actions in a callback that is passed to klp_shadow_free() as a parameter. The callback usually does reverse operations to the constructor callback that can be called by klp_shadow_*alloc(). It is especially useful for klp_shadow_free_all(). There we need to do these extra actions for each found shadow variable with the given ID. Note that the memory used by the shadow variable itself is still released later by rcu callback. It is needed to protect internal structures that keep all shadow variables. But the destructor is called immediately. The shadow variable must not be access anyway after klp_shadow_free() is called. The user is responsible to protect this any suitable way. Be aware that the destructor is called under klp_shadow_lock. It is the same as for the contructor in klp_shadow_alloc(). Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-04-17livepatch: Initialize shadow variables safely by a custom callbackPetr Mladek1-4/+10
The existing API allows to pass a sample data to initialize the shadow data. It works well when the data are position independent. But it fails miserably when we need to set a pointer to the shadow structure itself. Unfortunately, we might need to initialize the pointer surprisingly often because of struct list_head. It is even worse because the list might be hidden in other common structures, for example, struct mutex, struct wait_queue_head. For example, this was needed to fix races in ALSA sequencer. It required to add mutex into struct snd_seq_client. See commit b3defb791b26ea06 ("ALSA: seq: Make ioctls race-free") and commit d15d662e89fc667b9 ("ALSA: seq: Fix racy pool initializations") This patch makes the API more safe. A custom constructor function and data are passed to klp_shadow_*alloc() functions instead of the sample data. Note that ctor_data are no longer a template for shadow->data. It might point to any data that might be necessary when the constructor is called. Also note that the constructor is called under klp_shadow_lock. It is an internal spin_lock that synchronizes alloc() vs. get() operations, see klp_shadow_get_or_alloc(). On one hand, this adds a risk of ABBA deadlocks. On the other hand, it allows to do some operations safely. For example, we could add the new structure into an existing list. This must be done only once when the structure is allocated. Reported-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2018-04-17textsearch: fix kernel-doc warnings and add kernel-api sectionRandy Dunlap1-2/+2
Make lib/textsearch.c usable as kernel-doc. Add textsearch() function family to kernel-api documentation. Fix kernel-doc warnings in <linux/textsearch.h>: ../include/linux/textsearch.h:65: warning: Incorrect use of kernel-doc format: * get_next_block - fetch next block of data ../include/linux/textsearch.h:82: warning: Incorrect use of kernel-doc format: * finish - finalize/clean a series of get_next_block() calls Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-16mm,vmscan: Allow preallocating memory for register_shrinker().Tetsuo Handa1-2/+5
syzbot is catching so many bugs triggered by commit 9ee332d99e4d5a97 ("sget(): handle failures of register_shrinker()"). That commit expected that calling kill_sb() from deactivate_locked_super() without successful fill_super() is safe, but the reality was different; some callers assign attributes which are needed for kill_sb() after sget() succeeds. For example, [1] is a report where sb->s_mode (which seems to be either FMODE_READ | FMODE_EXCL | FMODE_WRITE or FMODE_READ | FMODE_EXCL) is not assigned unless sget() succeeds. But it does not worth complicate sget() so that register_shrinker() failure path can safely call kill_block_super() via kill_sb(). Making alloc_super() fail if memory allocation for register_shrinker() failed is much simpler. Let's avoid calling deactivate_locked_super() from sget_userns() by preallocating memory for the shrinker and making register_shrinker() in sget_userns() never fail. [1] https://syzkaller.appspot.com/bug?id=588996a25a2587be2e3a54e8646728fb9cae44e7 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+5a170e19c963a2e0df79@syzkaller.appspotmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-04-16Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2-19/+74
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes and updates for x86: - Address a swiotlb regression which was caused by the recent DMA rework and made driver fail because dma_direct_supported() returned false - Fix a signedness bug in the APIC ID validation which caused invalid APIC IDs to be detected as valid thereby bloating the CPU possible space. - Fix inconsisten config dependcy/select magic for the MFD_CS5535 driver. - Fix a corruption of the physical address space bits when encryption has reduced the address space and late cpuinfo updates overwrite the reduced bit information with the original value. - Dominiks syscall rework which consolidates the architecture specific syscall functions so all syscalls can be wrapped with the same macros. This allows to switch x86/64 to struct pt_regs based syscalls. Extend the clearing of user space controlled registers in the entry patch to the lower registers" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Fix signedness bug in APIC ID validity checks x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption x86/olpc: Fix inconsistent MFD_CS5535 configuration swiotlb: Use dma_direct_supported() for swiotlb_ops syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*() syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention syscalls/core, syscalls/x86: Clean up syscall stub naming convention syscalls/x86: Extend register clearing on syscall entry to lower registers syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64 syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32 syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y x86/syscalls: Don't pointlessly reload the system call number x86/mm: Fix documentation of module mapping range with 4-level paging x86/cpuid: Switch to 'static const' specifier
2018-04-15Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "A few scheduler fixes: - Prevent a bogus warning vs. runqueue clock update flags in do_sched_rt_period_timer() - Simplify the helper functions which handle requests for skipping the runqueue clock updat. - Do not unlock the tunables mutex in the error path of the cpu frequency scheduler utils. Its not held. - Enforce proper alignement for 'struct util_est' in sched_avg to prevent a misalignment fault on IA64" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Force proper alignment of 'struct util_est' sched/core: Simplify helpers for rq clock update skip requests sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning sched/cpufreq/schedutil: Fix error path mutex unlock
2018-04-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-28/+83
Merge yet more updates from Andrew Morton: - various hotfixes - kexec_file updates and feature work * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (27 commits) kernel/kexec_file.c: move purgatories sha256 to common code kernel/kexec_file.c: allow archs to set purgatory load address kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs kernel/kexec_file.c: split up __kexec_load_puragory kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations* kernel/kexec_file.c: search symbols in read-only kexec_purgatory kernel/kexec_file.c: make purgatory_info->ehdr const kernel/kexec_file.c: remove checks in kexec_purgatory_load include/linux/kexec.h: silence compile warnings kexec_file, x86: move re-factored code to generic side x86: kexec_file: clean up prepare_elf64_headers() x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers() x86: kexec_file: purge system-ram walking from prepare_elf64_headers() kexec_file,x86,powerpc: factor out kexec_file_ops functions kexec_file: make use of purgatory optional proc: revalidate misc dentries mm, slab: reschedule cache_reap() on the same CPU ...
2018-04-14kernel/kexec_file.c: move purgatories sha256 to common codePhilipp Rudo1-0/+30
The code to verify the new kernels sha digest is applicable for all architectures. Move it to common code. One problem is the string.c implementation on x86. Currently sha256 includes x86/boot/string.h which defines memcpy and memset to be gcc builtins. By moving the sha256 implementation to common code and changing the include to linux/string.h both functions are no longer defined. Thus definitions have to be provided in x86/purgatory/string.c Link: http://lkml.kernel.org/r/20180321112751.22196-12-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: allow archs to set purgatory load addressPhilipp Rudo1-11/+6
For s390 new kernels are loaded to fixed addresses in memory before they are booted. With the current code this is a problem as it assumes the kernel will be loaded to an 'arbitrary' address. In particular, kexec_locate_mem_hole searches for a large enough memory region and sets the load address (kexec_bufer->mem) to it. Luckily there is a simple workaround for this problem. By returning 1 in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This allows the architecture to set kbuf->mem by hand. While the trick works fine for the kernel it does not for the purgatory as here the architectures don't have access to its kexec_buffer. Give architectures access to the purgatories kexec_buffer by changing kexec_load_purgatory to take a pointer to it. With this change architectures have access to the buffer and can edit it as they need. A nice side effect of this change is that we can get rid of the purgatory_info->purgatory_load_address field. As now the information stored there can directly be accessed from kbuf->mem. Link: http://lkml.kernel.org/r/20180321112751.22196-11-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*Philipp Rudo1-4/+9
When the relocations are applied to the purgatory only the section the relocations are applied to is writable. The other sections, i.e. the symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this by marking the corresponding variables as 'const'. While at it also change the signatures of arch_kexec_apply_relocations* to take section pointers instead of just the index of the relocation section. This removes the second lookup and sanity check of the sections in arch code. Link: http://lkml.kernel.org/r/20180321112751.22196-6-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kernel/kexec_file.c: make purgatory_info->ehdr constPhilipp Rudo1-6/+11
The kexec_purgatory buffer is read-only. Thus all pointers into kexec_purgatory are read-only, too. Point this out by explicitly marking purgatory_info->ehdr as 'const' and update the comments in purgatory_info. Link: http://lkml.kernel.org/r/20180321112751.22196-4-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14include/linux/kexec.h: silence compile warningsPhilipp Rudo1-0/+2
Patch series "kexec_file: Clean up purgatory load", v2. Following the discussion with Dave and AKASHI, here are the common code patches extracted from my recent patch set (Add kexec_file_load support to s390) [1]. The patches were extracted to allow upstream integration together with AKASHI's common code patches before the arch code gets adjusted to the new base. The reason for this series is to prepare common code for adding kexec_file_load to s390 as well as cleaning up the mis-use of the sh_offset field during purgatory load. In detail this series contains: Patch #1&2: Minor cleanups/fixes. Patch #3-9: Clean up the purgatory load/relocation code. Especially remove the mis-use of the purgatory_info->sechdrs->sh_offset field, currently holding a pointer into either kexec_purgatory (ro) or purgatory_buf (rw) depending on the section. With these patches the section address will be calculated verbosely and sh_offset will contain the offset of the section in the stripped purgatory binary (purgatory_buf). Patch #10: Allows architectures to set the purgatory load address. This patch is important for s390 as the kernel and purgatory have to be loaded to fixed addresses. In current code this is impossible as the purgatory load is opaque to the architecture. Patch #11: Moves x86 purgatories sha implementation to common lib/ directory to allow reuse in other architectures. This patch (of 11) When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a compile warning multiple times. In file included from <path>/linux/init/initramfs.c:526:0: <path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default] unsigned long cmdline_len); ^ This is because the typedefs for kexec_file_load uses struct kimage before it is declared. Fix this by simply forward declaring struct kimage. Link: http://lkml.kernel.org/r/20180321112751.22196-2-prudo@linux.vnet.ibm.com Signed-off-by: Philipp Rudo <prudo@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec_file, x86: move re-factored code to generic sideAKASHI Takahiro1-0/+19
In the previous patches, commonly-used routines, exclude_mem_range() and prepare_elf64_headers(), were carved out. Now place them in kexec common code. A prefix "crash_" is given to each of their names to avoid possible name collisions. Link: http://lkml.kernel.org/r/20180306102303.9063-8-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14kexec_file,x86,powerpc: factor out kexec_file_ops functionsAKASHI Takahiro1-7/+6
As arch_kexec_kernel_image_{probe,load}(), arch_kimage_file_post_load_cleanup() and arch_kexec_kernel_verify_sig() are almost duplicated among architectures, they can be commonalized with an architecture-defined kexec_file_ops array. So let's factor them out. Link: http://lkml.kernel.org/r/20180306102303.9063-3-takahiro.akashi@linaro.org Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-14Merge branch 'next' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management update from Zhang Rui: - Fix race condition in imx_thermal_probe() (Mikhail Lappo) - Add cooling device's statistics in sysfs (Viresh Kumar) * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: Add cooling device's statistics in sysfs thermal: imx: Fix race condition in imx_thermal_probe()
2018-04-14Merge branch 'dmi-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging Pull dmi updates from Jean Delvare. * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: firmware: dmi_scan: Use lowercase letters for UUID firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matches firmware: dmi_scan: Fix UUID length safety check
2018-04-14Merge tag 'chrome-platform-for-linus-4.17' of ↵Linus Torvalds3-31/+5
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: - a series from Dmitry to remove platform data from chromeos_laptop.c, which was the only user of platform data for the atmel_mxt_ts driver. - a series to clean up sysfs and debugfs for cros_ec - other misc cleanups * tag 'chrome-platform-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: (22 commits) platform/chrome: mfd/cros_ec_dev: Add sysfs entry to set keyboard wake lid angle platform/chrome: cros_ec_debugfs: Add PD port info to debugfs platform/chrome: cros_ec_debugfs: Use octal permissions '0444' platform/chrome: cros_ec_sysfs: use permission-specific DEVICE_ATTR variants platform/chrome: cros_ec_sysfs: introduce to_cros_ec_dev define. platform/chrome: cros_ec_sysfs: Modify error handling platform/chrome: cros_ec_lpc: Add support for Google devices using custom coreboot firmware platform/chrome: cros_ec_lpc: wake up from s2idle on Chrome EC Input: atmel_mxt_ts - remove platform data support platform/chrome: chromeos_laptop - discard data for unneeded boards platform/chrome: chromeos_laptop - use device properties for Pixel platform/chrome: chromeos_laptop - rely on I2C to set up interrupt trigger platform/chrome: chromeos_laptop - use I2C notifier to create devices platform/chrome: chromeos_laptop - parse DMI IRQ data once platform/chrome: chromeos_laptop - rework i2c peripherals initialization platform/chrome: chromeos_laptop - factor out getting IRQ from DMI platform/chrome: chromeos_laptop - introduce pr_fmt() platform/chrome: chromeos_laptop - stop setting suspend mode for Atmel devices platform/chrome: chromeos_laptop - add SPDX identifier Input: atmel_mxt_ts - switch ChromeOS ACPI devices to generic props ...
2018-04-14Merge tag 'clk-for-linus' of ↵Linus Torvalds6-9/+75
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The large diff this time around is from the addition of a new clk driver for the TI Davinci family of SoCs. So far those clks have been supported with a custom implementation of the clk API in the arch port instead of in the CCF. With this driver merged we're one step closer to having a single clk API implementation. The other large diff is from the Amlogic clk driver that underwent some major surgery to use regmap. Beyond that, the biggest hitter is Samsung which needed some reworks to properly handle clk provider power domains and a bunch of PLL rate updates. The core framework was fairly quiet this round, just getting some cleanups and small fixes for some of the more esoteric features. And the usual set of driver non-critical fixes, cleanups, and minor additions are here as well. Core: - Rejig clk_ops::init() to be a little earlier for phase/accuracy ops - debugfs ops macroized to shave some lines of boilerplate code - Always calculate the phase instead of caching it in clk_get_phase() - More __must_check on bulk clk APIs New Drivers: - TI's Davinci family of SoCs - Intel's Stratix10 SoC - stm32mp157 SoC - Allwinner H6 CCU - Silicon Labs SI544 clock generator chip - Renesas R-Car M3-N and V3H SoCs - i.MX6SLL SoCs Removed Drivers: - ST-Ericsson AB8540/9540 Updates: - Mediatek MT2701 and MT7622 audsys support and MT2712 updates - STM32F469 DSI and STM32F769 sdmmc2 support - GPIO clks can sleep now - Spreadtrum SC9860 RTC clks - Nvidia Tegra MBIST workarounds and various minor fixes - Rockchip phase handling fixes and a memory leak plugged - Renesas drivers switch to readl/writel from clk_readl/clk_writel - Renesas gained CPU (Z/Z2) and watchdog support - Rockchip rk3328 display clks and rk3399 1.6GHz PLL support - Qualcomm PM8921 PMIC XO buffers - Amlogic migrates to regmap APIs - TI Keystone clk latching support - Allwinner H3 and H5 video clk fixes - Broadcom BCM2835 PLLs needed another bit to enable - i.MX6SX CKO mux fix and i.MX7D Video PLL divider fix - i.MX6UL/ULL epdc_podf support - Hi3798CV200 COMBPHY0 and USB2_OTG_UTMI and phase support for eMMC" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (233 commits) clk: davinci: add a reset lookup table for psc0 clk: imx: add clock driver for imx6sll dt-bindings: imx: update clock doc for imx6sll clk: imx: add new gate/gate2 wrapper funtion clk: imx: Add CLK_IS_CRITICAL flag for busy divider and busy mux clk: cs2000: set pm_ops in hibernate-compatible way clk: bcm2835: De-assert/assert PLL reset signal when appropriate clk: imx7d: Move clks_init_on before any clock operations clk: imx7d: Correct ahb clk parent select clk: imx7d: Correct dram pll type clk: imx7d: Add USB clock information clk: socfpga: stratix10: add clock driver for Stratix10 platform dt-bindings: documentation: add clock bindings information for Stratix10 clk: ti: fix flag space conflict with clkctrl clocks clk: uniphier: add additional ethernet clock lines for Pro4 clk: uniphier: add SATA clock control support clk: uniphier: add PCIe clock control support clk: Add driver for the si544 clock generator chip clk: davinci: Remove redundant dev_err calls clk: uniphier: add ethernet clock control support for PXs3 ...
2018-04-14Merge tag 'for-linus-20180413' of git://git.kernel.dk/linux-blockLinus Torvalds2-2/+1
Pull block fixes from Jens Axboe: "Followup fixes for this merge window. This contains: - Series from Ming, fixing corner cases in our CPU <-> queue mapping. This triggered repeated warnings on especially s390, but I also hit it in cpu hot plug/unplug testing while doing IO on NVMe on x86-64. - Another fix from Ming, ensuring that we always order budget and driver tag identically, avoiding a deadlock on QD=1 devices. - Loop locking regression fix from this merge window, from Omar. - Another loop locking fix, this time missing an unlock, from Tetsuo Handa. - Fix for racing IO submission with device removal from Bart. - sr reference fix from me, fixing a case where disk change or getevents can race with device removal. - Set of nvme fixes by way of Keith, from various contributors" * tag 'for-linus-20180413' of git://git.kernel.dk/linux-block: (28 commits) nvme: expand nvmf_check_if_ready checks nvme: Use admin command effects for admin commands nvmet: fix space padding in serial number nvme: check return value of init_srcu_struct function nvmet: Fix nvmet_execute_write_zeroes sector count nvme-pci: Separate IO and admin queue IRQ vectors nvme-pci: Remove unused queue parameter nvme-pci: Skip queue deletion if there are no queues nvme: target: fix buffer overflow nvme: don't send keep-alives to the discovery controller nvme: unexport nvme_start_keep_alive nvme-loop: fix kernel oops in case of unhandled command nvme: enforce 64bit offset for nvme_get_log_ext fn sr: get/drop reference to device in revalidate and check_events blk-mq: Revert "blk-mq: reimplement blk_mq_hw_queue_mapped" blk-mq: Avoid that submitting a bio concurrently with device removal triggers a crash backing: silence compiler warning using __printf blk-mq: remove code for dealing with remapping queue blk-mq: reimplement blk_mq_hw_queue_mapped blk-mq: don't check queue mapped in __blk_mq_delay_run_hw_queue() ...
2018-04-13fsnotify: fix typo in a comment about mark->g_listAmir Goldstein1-1/+1
Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-13firmware: dmi_scan: Add DMI_OEM_STRING support to dmi_matchesAlex Hung1-0/+1
OEM strings are defined by each OEM and they contain customized and useful OEM information. Supporting it provides more flexible uses of the dmi_matches function. Signed-off-by: Alex Hung <alex.hung@canonical.com> Signed-off-by: Jean Delvare <jdelvare@suse.de>
2018-04-13lan78xx: PHY DSP registers initialization to address EEE link drop issues ↵Raghuram Chary J1-0/+8
with long cables The patch is to configure DSP registers of PHY device to handle Gbe-EEE failures with >40m cable length. Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver") Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>