summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2019-06-29net/mlx5e: reduce stack usage in mlx5_eswitch_termtbl_createArnd Bergmann1-1/+1
Putting an empty 'mlx5_flow_spec' structure on the stack is a bit wasteful and causes a warning on 32-bit architectures when building with clang -fsanitize-coverage: drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c: In function 'mlx5_eswitch_termtbl_create': drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c:90:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Since the structure is never written to, we can statically allocate it to avoid the stack usage. To be on the safe side, mark all subsequent function arguments that we pass it into as 'const' as well. Fixes: 10caabdaad5a ("net/mlx5e: Use termination table for VLAN push actions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-29Merge branch 'mlx5-next' of ↵Saeed Mahameed6-49/+105
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Misc updates from mlx5-next branch: 1) E-Switch vport metadata support for source vport matching 2) Convert mkey_table to XArray 3) Shared IRQs and to use single IRQ for all async EQs Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-29net: phylink: further documentation clarificationsRussell King1-2/+9
Clarify the validate() behaviour in a few cases which weren't mentioned in the documentation, but which are necessary for users to get the correct behaviour. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'for-mingo' of ↵Ingo Molnar8-46/+55
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull rcu/next + tools/memory-model changes from Paul E. McKenney: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-28power: supply: add input power and voltage limit propertiesEnric Balletbo i Serra1-0/+2
For thermal management strategy you might be interested on limit the input power for a power supply. We already have current limit but basically what we probably want is to limit power. So, introduce the input_power_limit property. Although the common use case is limit the input power, in some specific cases it is the voltage that is problematic (i.e some regulators have different efficiencies at higher voltage resulting in more heat). So introduce also the input_voltage_limit property. This happens in one Chromebook and is used on the Pixel C's thermal management strategy to effectively limit the input power to 5V 3A when the screen is on. When the screen is on, the display, the CPU, and the GPU all contribute more heat to the system than while the screen is off, and we made a tradeoff to throttle the charger in order to give more of the thermal budget to those other components. So there's nothing fundamentally broken about the hardware that would cause the Pixel C to malfunction if we were charging at 9V or 12V instead of 5V when the screen is on, i.e. if userspace doesn't change this. What would happen is that you wouldn't meet Google's skin temperature targets on the system if the charger was allowed to run at 9V or 12V with the screen on. For folks hacking on Pixel Cs (which is now outside of Google's official support window for Android) and customizing their own kernel and userspace this would be acceptable, but we wanted to expose this feature in the power supply properties because the feature does exist in the Emedded Controller firmware of the Pixel C and all of Google's Chromebooks with USB-C made since 2015 in case someone running an up to date kernel wanted to limit the charging power for thermal or other reasons. This patch exposes a new property, similar to input current limit, to re-configure the maximum voltage from the external supply at runtime based on system-level knowledge or user input. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Reviewed-by: Guenter Roeck <groeck@chromium.org> Acked-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Reviewed-by: Benson Leung <bleung@chromium.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2019-06-28pid: add pidfd_open()Christian Brauner1-0/+1
This adds the pidfd_open() syscall. It allows a caller to retrieve pollable pidfds for a process which did not get created via CLONE_PIDFD, i.e. for a process that is created via traditional fork()/clone() calls that is only referenced by a PID: int pidfd = pidfd_open(1234, 0); ret = pidfd_send_signal(pidfd, SIGSTOP, NULL, 0); With the introduction of pidfds through CLONE_PIDFD it is possible to created pidfds at process creation time. However, a lot of processes get created with traditional PID-based calls such as fork() or clone() (without CLONE_PIDFD). For these processes a caller can currently not create a pollable pidfd. This is a problem for Android's low memory killer (LMK) and service managers such as systemd. Both are examples of tools that want to make use of pidfds to get reliable notification of process exit for non-parents (pidfd polling) and race-free signal sending (pidfd_send_signal()). They intend to switch to this API for process supervision/management as soon as possible. Having no way to get pollable pidfds from PID-only processes is one of the biggest blockers for them in adopting this api. With pidfd_open() making it possible to retrieve pidfds for PID-based processes we enable them to adopt this api. In line with Arnd's recent changes to consolidate syscall numbers across architectures, I have added the pidfd_open() syscall to all architectures at the same time. Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-api@vger.kernel.org
2019-06-28pidfd: add polling supportJoel Fernandes (Google)1-0/+3
This patch adds polling support to pidfd. Android low memory killer (LMK) needs to know when a process dies once it is sent the kill signal. It does so by checking for the existence of /proc/pid which is both racy and slow. For example, if a PID is reused between when LMK sends a kill signal and checks for existence of the PID, since the wrong PID is now possibly checked for existence. Using the polling support, LMK will be able to get notified when a process exists in race-free and fast way, and allows the LMK to do other things (such as by polling on other fds) while awaiting the process being killed to die. For notification to polling processes, we follow the same existing mechanism in the kernel used when the parent of the task group is to be notified of a child's death (do_notify_parent). This is precisely when the tasks waiting on a poll of pidfd are also awakened in this patch. We have decided to include the waitqueue in struct pid for the following reasons: 1. The wait queue has to survive for the lifetime of the poll. Including it in task_struct would not be option in this case because the task can be reaped and destroyed before the poll returns. 2. By including the struct pid for the waitqueue means that during de_thread(), the new thread group leader automatically gets the new waitqueue/pid even though its task_struct is different. Appropriate test cases are added in the second patch to provide coverage of all the cases the patch is handling. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Daniel Colascione <dancol@google.com> Cc: Jann Horn <jannh@google.com> Cc: Tim Murray <timmurray@google.com> Cc: Jonathan Kowalski <bl0pbl33p@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: kernel-team@android.com Reviewed-by: Oleg Nesterov <oleg@redhat.com> Co-developed-by: Daniel Colascione <dancol@google.com> Signed-off-by: Daniel Colascione <dancol@google.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Christian Brauner <christian@brauner.io>
2019-06-28lib/genalloc.c: Add algorithm, align and zeroed family of DMA allocatorsFredrik Noring1-1/+9
Provide the algorithm option to DMA allocators as well, along with convenience variants for zeroed and aligned memory. The following four functions are added: - gen_pool_dma_alloc_algo() - gen_pool_dma_alloc_align() - gen_pool_dma_zalloc_algo() - gen_pool_dma_zalloc_align() Signed-off-by: Fredrik Noring <noring@nocrew.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-28x86/kdump/64: Restrict kdump kernel reservation to <64TBBaoquan He1-0/+1
Restrict kdump to only reserve crashkernel below 64TB. The reaons is that the kdump may jump from a 5-level paging mode to a 4-level paging mode kernel. If a 4-level paging mode kdump kernel is put above 64TB, then the kdump kernel cannot start. The 1st kernel reserves the kdump kernel region during bootup. At that point it is not known whether the kdump kernel has 5-level or 4-level paging support. To support both restrict the kdump kernel reservation to the lower 64TB address space to ensure that a 4-level paging mode kdump kernel can be loaded and successfully started. [ tglx: Massaged changelog ] Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Dave Young <dyoung@redhat.com> Cc: bp@alien8.de Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/20190524073810.24298-4-bhe@redhat.com
2019-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-0/+4
The new route handling in ip_mc_finish_output() from 'net' overlapped with the new support for returning congestion notifications from BPF programs. In order to handle this I had to take the dev_loopback_xmit() calls out of the switch statement. The aquantia driver conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-28Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fix for one corner case in HID++ protocol with respect to handling very long reports, from Hans de Goede - power management fix in Intel-ISH driver, from Hyungwoo Yang - use-after-free fix in Intel-ISH driver, from Dan Carpenter - a couple of new device IDs/quirks from Kai-Heng Feng, Kyle Godbey and Oleksandr Natalenko * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: intel-ish-hid: fix wrong driver_data usage HID: multitouch: Add pointstick support for ALPS Touchpad HID: logitech-dj: Fix forwarding of very long HID++ reports HID: uclogic: Add support for Huion HS64 tablet HID: chicony: add another quirk for PixArt mouse HID: intel-ish-hid: Fix a use after free in load_fw_from_host()
2019-06-28iomap: don't mark the inode dirty in iomap_write_endAndreas Gruenbacher1-0/+1
Marking the inode dirty for each page copied into the page cache can be very inefficient for file systems that use the VFS dirty inode tracking, and is completely pointless for those that don't use the VFS dirty inode tracking. So instead, only set an iomap flag when changing the in-core inode size, and open code the rest of __generic_write_end. Partially based on code from Christoph Hellwig. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-28bpf: implement getsockopt and setsockopt hooksStanislav Fomichev4-0/+58
Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks. BPF_CGROUP_SETSOCKOPT can modify user setsockopt arguments before passing them down to the kernel or bypass kernel completely. BPF_CGROUP_GETSOCKOPT can can inspect/modify getsockopt arguments that kernel returns. Both hooks reuse existing PTR_TO_PACKET{,_END} infrastructure. The buffer memory is pre-allocated (because I don't think there is a precedent for working with __user memory from bpf). This might be slow to do for each {s,g}etsockopt call, that's why I've added __cgroup_bpf_prog_array_is_empty that exits early if there is nothing attached to a cgroup. Note, however, that there is a race between __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup program layout might have changed; this should not be a problem because in general there is a race between multiple calls to {s,g}etsocktop and user adding/removing bpf progs from a cgroup. The return code of the BPF program is handled as follows: * 0: EPERM * 1: success, continue with next BPF program in the cgroup chain v9: * allow overwriting setsockopt arguments (Alexei Starovoitov): * use set_fs (same as kernel_setsockopt) * buffer is always kzalloc'd (no small on-stack buffer) v8: * use s32 for optlen (Andrii Nakryiko) v7: * return only 0 or 1 (Alexei Starovoitov) * always run all progs (Alexei Starovoitov) * use optval=0 as kernel bypass in setsockopt (Alexei Starovoitov) (decided to use optval=-1 instead, optval=0 might be a valid input) * call getsockopt hook after kernel handlers (Alexei Starovoitov) v6: * rework cgroup chaining; stop as soon as bpf program returns 0 or 2; see patch with the documentation for the details * drop Andrii's and Martin's Acked-by (not sure they are comfortable with the new state of things) v5: * skip copy_to_user() and put_user() when ret == 0 (Martin Lau) v4: * don't export bpf_sk_fullsock helper (Martin Lau) * size != sizeof(__u64) for uapi pointers (Martin Lau) * offsetof instead of bpf_ctx_range when checking ctx access (Martin Lau) v3: * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko) * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii Nakryiko) * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau) * use BPF_FIELD_SIZEOF() for consistency (Martin Lau) * new CG_SOCKOPT_ACCESS macro to wrap repeated parts v2: * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau) * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau) * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau) * added [0,2] return code check to verifier (Martin Lau) * dropped unused buf[64] from the stack (Martin Lau) * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau) * dropped bpf_target_off from ctx rewrites (Martin Lau) * use return code for kernel bypass (Martin Lau & Andrii Nakryiko) Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-06-28keys: Replace uid/gid/perm permissions checking with an ACLDavid Howells1-55/+66
Replace the uid/gid/perm permissions checking on a key with an ACL to allow the SETATTR and SEARCH permissions to be split. This will also allow a greater range of subjects to represented. ============ WHY DO THIS? ============ The problem is that SETATTR and SEARCH cover a slew of actions, not all of which should be grouped together. For SETATTR, this includes actions that are about controlling access to a key: (1) Changing a key's ownership. (2) Changing a key's security information. (3) Setting a keyring's restriction. And actions that are about managing a key's lifetime: (4) Setting an expiry time. (5) Revoking a key. and (proposed) managing a key as part of a cache: (6) Invalidating a key. Managing a key's lifetime doesn't really have anything to do with controlling access to that key. Expiry time is awkward since it's more about the lifetime of the content and so, in some ways goes better with WRITE permission. It can, however, be set unconditionally by a process with an appropriate authorisation token for instantiating a key, and can also be set by the key type driver when a key is instantiated, so lumping it with the access-controlling actions is probably okay. As for SEARCH permission, that currently covers: (1) Finding keys in a keyring tree during a search. (2) Permitting keyrings to be joined. (3) Invalidation. But these don't really belong together either, since these actions really need to be controlled separately. Finally, there are number of special cases to do with granting the administrator special rights to invalidate or clear keys that I would like to handle with the ACL rather than key flags and special checks. =============== WHAT IS CHANGED =============== The SETATTR permission is split to create two new permissions: (1) SET_SECURITY - which allows the key's owner, group and ACL to be changed and a restriction to be placed on a keyring. (2) REVOKE - which allows a key to be revoked. The SEARCH permission is split to create: (1) SEARCH - which allows a keyring to be search and a key to be found. (2) JOIN - which allows a keyring to be joined as a session keyring. (3) INVAL - which allows a key to be invalidated. The WRITE permission is also split to create: (1) WRITE - which allows a key's content to be altered and links to be added, removed and replaced in a keyring. (2) CLEAR - which allows a keyring to be cleared completely. This is split out to make it possible to give just this to an administrator. (3) REVOKE - see above. Keys acquire ACLs which consist of a series of ACEs, and all that apply are unioned together. An ACE specifies a subject, such as: (*) Possessor - permitted to anyone who 'possesses' a key (*) Owner - permitted to the key owner (*) Group - permitted to the key group (*) Everyone - permitted to everyone Note that 'Other' has been replaced with 'Everyone' on the assumption that you wouldn't grant a permit to 'Other' that you wouldn't also grant to everyone else. Further subjects may be made available by later patches. The ACE also specifies a permissions mask. The set of permissions is now: VIEW Can view the key metadata READ Can read the key content WRITE Can update/modify the key content SEARCH Can find the key by searching/requesting LINK Can make a link to the key SET_SECURITY Can change owner, ACL, expiry INVAL Can invalidate REVOKE Can revoke JOIN Can join this keyring CLEAR Can clear this keyring The KEYCTL_SETPERM function is then deprecated. The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set, or if the caller has a valid instantiation auth token. The KEYCTL_INVALIDATE function then requires INVAL. The KEYCTL_REVOKE function then requires REVOKE. The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an existing keyring. The JOIN permission is enabled by default for session keyrings and manually created keyrings only. ====================== BACKWARD COMPATIBILITY ====================== To maintain backward compatibility, KEYCTL_SETPERM will translate the permissions mask it is given into a new ACL for a key - unless KEYCTL_SET_ACL has been called on that key, in which case an error will be returned. It will convert possessor, owner, group and other permissions into separate ACEs, if each portion of the mask is non-zero. SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY. WRITE permission turns on WRITE, REVOKE and, if a keyring, CLEAR. JOIN is turned on if a keyring is being altered. The KEYCTL_DESCRIBE function translates the ACL back into a permissions mask to return depending on possessor, owner, group and everyone ACEs. It will make the following mappings: (1) INVAL, JOIN -> SEARCH (2) SET_SECURITY -> SETATTR (3) REVOKE -> WRITE if SETATTR isn't already set (4) CLEAR -> WRITE Note that the value subsequently returned by KEYCTL_DESCRIBE may not match the value set with KEYCTL_SETATTR. ======= TESTING ======= This passes the keyutils testsuite for all but a couple of tests: (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed if the type doesn't have ->read(). You still can't actually read the key. (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't work as Other has been replaced with Everyone in the ACL. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-28keys: Pass the network namespace into request_key mechanismDavid Howells2-5/+45
Create a request_key_net() function and use it to pass the network namespace domain tag into DNS revolver keys and rxrpc/AFS keys so that keys for different domains can coexist in the same keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-28workqueue: Make alloc/apply/free_workqueue_attrs() staticThomas Gleixner1-4/+0
None of those functions have any users outside of workqueue.c. Confine them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-06-27Merge tag 'blk-dim-v2' of ↵David S. Miller2-418/+366
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mamameed says: ==================== Generic DIM From: Tal Gilboa and Yamin Fridman Implement net DIM over a generic DIM library, add RDMA DIM dim.h lib exposes an implementation of the DIM algorithm for dynamically-tuned interrupt moderation for networking interfaces. We want a similar functionality for other protocols, which might need to optimize interrupts differently. Main motivation here is DIM for NVMf storage protocol. Current DIM implementation prioritizes reducing interrupt overhead over latency. Also, in order to reduce DIM's own overhead, the algorithm might take some time to identify it needs to change profiles. While this is acceptable for networking, it might not work well on other scenarios. Here we propose a new structure to DIM. The idea is to allow a slightly modified functionality without the risk of breaking Net DIM behavior for netdev. We verified there are no degradations in current DIM behavior with the modified solution. Suggested solution: - Common logic is implemented in lib/dim/dim.c - Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses the common logic in dim.c - Any new DIM logic will be implemented in "lib/dim/new_dim.c". This new implementation will expose modified versions of profiles, dim_step() and dim_decision(). - DIM API is declared in include/linux/dim.h for all implementations. Pros for this solution are: - Zero impact on existing net_dim implementation and usage - Relatively more code reuse (compared to two separate solutions) - Increased extensibility ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27mtd: spinand: Add initial support for Paragon PN26G0xAJeff Kletsky1-0/+1
Add initial support for Paragon Technology PN26G01Axxxxx and PN26G02Axxxxx SPI NAND Datasheets available at http://www.xtxtech.com/upfile/2016082517274590.pdf http://www.xtxtech.com/upfile/2016082517282329.pdf Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: Add flag to indicate panic_writeKamal Dasu1-0/+6
Added a flag to indicate a panic_write so that low level drivers can use it to take required action where applicable, to ensure oops data gets written to assigned mtd device. Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: spinand: Add support for two-byte device IDsJeff Kletsky1-2/+2
The GigaDevice GD5F1GQ4UFxxG SPI NAND utilizes two-byte device IDs. http://www.gigadevice.com/datasheet/gd5f1gq4xfxxg/ Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: spinand: Define macros for page-read ops with three-byte addressesJeff Kletsky1-0/+30
The GigaDevice GD5F1GQ4UFxxG SPI NAND utilizes three-byte addresses for its page-read ops. http://www.gigadevice.com/datasheet/gd5f1gq4xfxxg/ Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: rawnand: gpmi: Implement exec_opSascha Hauer1-0/+1
The gpmi driver performance suffers from NAND operations being split in multiple small DMA transfers. This has been forced by the NAND layer in the former days, but now with exec_op we can use the controller as intended. With this patch gpmi_nfc_exec_op becomes the main entry point to NAND operations. Here all instructions are collected and chained as separate DMA transfers. In the end whole chain is fired and waited to be finished. gpmi_nfc_exec_op only does the hardware operations, bad block marker swapping and buffer scrambling is done by the callers. It's worth noting that the nand_*_op functions always take the buffer lengths for the data that the NAND chip actually transfers. When doing BCH we have to calculate the net data size from the raw data size in some places. This patch has been tested with 2048/64 and 2048/128 byte NAND on i.MX6q. mtd_oobtest, mtd_subpagetest and mtd_speedtest run without errors. nandbiterrs, nandpagetest and nandsubpagetest userspace tests from mtdutils run without errors and UBIFS can successfully be mounted. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27dmaengine: mxs: rename custom flagSascha Hauer1-0/+2
The mxs dma driver uses the flags parameter in dmaengine_prep_slave_sg() for custom flags, but still uses the dmaengine specific names of the flags. Do a little bit better and at least give the flag a custom name. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27dmaengine: mxs: Add header file to be shared with gpmi nand driverSascha Hauer1-0/+21
The mxs dma driver can do PIO transfers. A pointer to the PIO words to transfer is passed in the struct scatterlist * argument of dmaengine_prep_slave_sg(). It's quite ugly and non obvious to cast u32 * to struct scatterlist * each time when calling dmaengine_prep_slave_sg(), so add a static inline wrapper function to be called by the user along with a description what is going on. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: rawnand: export NAND operation tracerSascha Hauer1-0/+36
The NAND core has a NAND operation tracing function, but it can only be used by drivers using the generic option parser from the NAND core. Export the tracing function as a static inline function in rawnand.h so that drivers implementing exec_op directly do not have to write their own operation tracing. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: onenand: Add support for 8Gb datasize onenandJonathan Bakker1-0/+1
Used in several S5PV210-based Galaxy S devices, among them SGH-T959V, SGH-T959P, SGH-T839, and SPH-D700. Signed-off-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27Merge branch '40GbE' of ↵David S. Miller1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-26 This series contains updates to ixgbe and i40e only. Mauro S. M. Rodrigues update the ixgbe driver to handle transceivers who comply with SFF-8472 but do not implement the Digital Diagnostic Monitoring (DOM) interface. Update the driver to check the necessary bits to see if DOM is implemented before trying to read the additional 256 bytes in the EEPROM for DOM data. Young Xiao fixes a potential divide by zero issue in ixgbe driver. Aleksandr fixes i40e to recognize 2.5 and 5.0 GbE link speeds so that it is not reported as "Unknown bps". Fixes the driver to read the firmware LLDP agent status during DCB initialization, and to properly log the LLDP agent status to help with debugging when DCB fails to initialize. Martyna fixes i40e for the missing supported and advertised link modes information in ethtool. Jake fixes a function header comment that was incorrect for a PTP function in i40e. Maciej fixes an issue for i40e when a XDP program is loaded the descriptor count gets reset to the default value, resolve the issue by making the current descriptor count persistent across resets. Alice corrects a copyright date which she found to be incorrect. Piotr adds a log entry when the traffic class 0 is added or deleted, which was not being logged previously. Gustavo A. R. Silva updates i40e to use struct_size() where possible. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27mtd: Add support for HyperBus memory devicesVignesh Raghavendra1-0/+84
Cypress' HyperBus is Low Signal Count, High Performance Double Data Rate Bus interface between a host system master and one or more slave interfaces. HyperBus is used to connect microprocessor, microcontroller, or ASIC devices with random access NOR flash memory (called HyperFlash) or self refresh DRAM (called HyperRAM). Its a 8-bit data bus (DQ[7:0]) with Read-Write Data Strobe (RWDS) signal and either Single-ended clock(3.0V parts) or Differential clock (1.8V parts). It uses ChipSelect lines to select b/w multiple slaves. At bus level, it follows a separate protocol described in HyperBus specification[1]. HyperFlash follows CFI AMD/Fujitsu Extended Command Set (0x0002) similar to that of existing parallel NORs. Since HyperBus is x8 DDR bus, its equivalent to x16 parallel NOR flash with respect to bits per clock cycle. But HyperBus operates at >166MHz frequencies. HyperRAM provides direct random read/write access to flash memory array. But, HyperBus memory controllers seem to abstract implementation details and expose a simple MMIO interface to access connected flash. Add support for registering HyperFlash devices with MTD framework. MTD maps framework along with CFI chip support framework are used to support communicating with flash. Framework is modelled along the lines of spi-nor framework. HyperBus memory controller (HBMC) drivers calls hyperbus_register_device() to register a single HyperFlash device. HyperFlash core parses MMIO access information from DT, sets up the map_info struct, probes CFI flash and registers it with MTD framework. Some HBMC masters need calibration/training sequence[3] to be carried out, in order for DLL inside the controller to lock, by reading a known string/pattern. This is done by repeatedly reading CFI Query Identification String. Calibration needs to be done before trying to detect flash as part of CFI flash probe. HyperRAM is not supported at the moment. HyperBus specification can be found at[1] HyperFlash datasheet can be found at[2] [1] https://www.cypress.com/file/213356/download [2] https://www.cypress.com/file/213346/download [3] http://www.ti.com/lit/ug/spruid7b/spruid7b.pdf Table 12-5741. HyperFlash Access Sequence Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: cfi_cmdset_0002: Add support for polling status registerVignesh Raghavendra1-0/+7
HyperFlash devices are compliant with CFI AMD/Fujitsu Extended Command Set (0x0002) for flash operations, therefore drivers/mtd/chips/cfi_cmdset_0002.c can be used as is. But these devices do not support DQ polling method of determining chip ready/good status. These flashes provide Status Register whose bits can be polled to know status of flash operation. Cypress HyperFlash datasheet here[1], talks about CFI Amd/Fujitsu Extended Query version 1.5. Bit 0 of "Software Features supported" field of CFI Primary Vendor-Specific Extended Query table indicates presence/absence of status register and Bit 1 indicates whether or not DQ polling is supported. Using these bits, its possible to determine whether flash supports DQ polling or need to use Status Register. Add support for polling Status Register to know device ready/status of erase/write operations when DQ polling is not supported. Print error messages on erase/program failure by looking at related Status Register bits. [1] https://www.cypress.com/file/213346/download Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Reviewed-by: Tokunori Ikegami <ikegami.t@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mm/hmm: Fix error flows in hmm_invalidate_range_startJason Gunthorpe1-1/+1
If the trylock on the hmm->mirrors_sem fails the function will return without decrementing the notifiers that were previously incremented. Since the caller will not call invalidate_range_end() on EAGAIN this will result in notifiers becoming permanently incremented and deadlock. If the sync_cpu_device_pagetables() required blocking the function will not return EAGAIN even though the device continues to touch the pages. This is a violation of the mmu notifier contract. Switch, and rename, the ranges_lock to a spin lock so we can reliably obtain it without blocking during error unwind. The error unwind is necessary since the notifiers count must be held incremented across the call to sync_cpu_device_pagetables() as we cannot allow the range to become marked valid by a parallel invalidate_start/end() pair while doing sync_cpu_device_pagetables(). Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-27arm_pmu: acpi: spe: Add initial MADT/SPE probingJeremy Linton1-0/+2
ACPI 6.3 adds additional fields to the MADT GICC structure to describe SPE PPI's. We pick these out of the cached reference to the madt_gicc structure similarly to the core PMU code. We then create a platform device referring to the IRQ and let the user/module loader decide whether to load the SPE driver. Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27ACPI/PPTT: Add function to return ACPI 6.3 Identical tokensJeremy Linton1-0/+5
ACPI 6.3 adds a flag to indicate that child nodes are all identical cores. This is useful to authoritatively determine if a set of (possibly offline) cores are identical or not. Since the flag doesn't give us a unique id we can generate one and use it to create bitmaps of sibling nodes, or simply in a loop to determine if a subset of cores are identical. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27siox: Add helper macro to simplify driver registrationEnrico Weigelt1-0/+10
Add more helper macros for trivial driver init cases, similar to the already existing module_platform_driver() or module_i2c_driver(). This helps to reduce driver init boilerplate. Signed-off-by: Enrico Weigelt <info@metux.net> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-27gpio: Add comments on #if/#else/#endifEnrico Weigelt3-11/+11
Improve readability a bit by commenting #if/#else/#endif statements with the checked preprocessor symbols. Signed-off-by: Enrico Weigelt <info@metux.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-27regulator: rk808: Add RK809 and RK817 support.Heiko Stuebner1-0/+3
Add support for the rk809 and rk817 regulator driver. Their specifications are as follows: 1. The RK809 and RK809 consist of 5 DCDCs, 9 LDOs and have the same registers for these components except dcdc5. 2. The dcdc5 is a boost dcdc for RK817 and is a buck for RK809. 3. The RK817 has one switch but The Rk809 has two. The output voltages are configurable and are meant to supply power to the main processor and other components. Signed-off-by: Tony Xie <tony.xie@rock-chips.com> Acked-by: Mark Brown <broonie@kernel.org> [rebased on top of 5.2-rc1] Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27mfd: rk808: Add RK817 and RK809 supportTony Xie1-0/+172
The RK809 and RK817 are a Power Management IC (PMIC) for multimedia and handheld devices. They contains the following components: - Regulators - RTC - Clocking Both RK809 and RK817 chips are using a similar register map, so we can reuse the RTC and Clocking functionality. Most of regulators have a some implementation also. Signed-off-by: Tony Xie <tony.xie@rock-chips.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27block: Remove unused codeDamien Le Moal1-11/+0
bio_flush_dcache_pages() is unused. Remove it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-27docs: thermal: convert to ReSTMauro Carvalho Chehab1-2/+2
Rename the thermal documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27mfd: bd70528: Support ROHM bd70528 PMIC coreMatti Vaittinen1-0/+408
ROHM BD70528MWV is an ultra-low quiescent current general purpose single-chip power management IC for battery-powered portable devices. Add MFD core which enables chip access for following subdevices: - regulators/LED drivers - battery-charger - gpios - 32.768kHz clk - RTC - watchdog Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27mfd: regulator: clk: Split rohm-bd718x7.hMatti Vaittinen2-14/+28
Split the bd718x7.h to ROHM common and bd718x7 specific parts so that we do not need to add same things in every new ROHM PMIC header. Please note that this change requires changes also in bd718x7 sub-device drivers for regulators and clk. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27PCI: PM: Avoid skipping bus-level PM on platforms without ACPIRafael J. Wysocki1-2/+24
There are platforms that do not call pm_set_suspend_via_firmware(), so pm_suspend_via_firmware() returns 'false' on them, but the power states of PCI devices (PCIe ports in particular) are changed as a result of powering down core platform components during system-wide suspend. Thus the pm_suspend_via_firmware() checks in pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to- idle") are not sufficient to determine that devices left in D0 during suspend will remain in D0 during resume and so the bus-level power management can be skipped for them. For this reason, introduce a new global suspend flag, PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only and replace the pm_suspend_via_firmware() checks mentioned above with checks against this flag. Fixes: 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle") Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-26Allow 0.0.0.0/8 as a valid address rangeDave Taht1-1/+1
The longstanding prohibition against using 0.0.0.0/8 dates back to two issues with the early internet. There was an interoperability problem with BSD 4.2 in 1984, fixed in BSD 4.3 in 1986. BSD 4.2 has long since been retired. Secondly, addresses of the form 0.x.y.z were initially defined only as a source address in an ICMP datagram, indicating "node number x.y.z on this IPv4 network", by nodes that know their address on their local network, but do not yet know their network prefix, in RFC0792 (page 19). This usage of 0.x.y.z was later repealed in RFC1122 (section 3.2.2.7), because the original ICMP-based mechanism for learning the network prefix was unworkable on many networks such as Ethernet (which have longer addresses that would not fit into the 24 "node number" bits). Modern networks use reverse ARP (RFC0903) or BOOTP (RFC0951) or DHCP (RFC2131) to find their full 32-bit address and CIDR netmask (and other parameters such as default gateways). 0.x.y.z has had 16,777,215 addresses in 0.0.0.0/8 space left unused and reserved for future use, since 1989. This patch allows for these 16m new IPv4 addresses to appear within a box or on the wire. Layer 2 switches don't care. 0.0.0.0/32 is still prohibited, of course. Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: John Gilmore <gnu@toad.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26keys: Network namespace domain tagDavid Howells1-0/+3
Create key domain tags for network namespaces and make it possible to automatically tag keys that are used by networked services (e.g. AF_RXRPC, AFS, DNS) with the default network namespace if not set by the caller. This allows keys with the same description but in different namespaces to coexist within a keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-26keys: Garbage collect keys for which the domain has been removedDavid Howells1-0/+2
If a key operation domain (such as a network namespace) has been removed then attempt to garbage collect all the keys that use it. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Include target namespace in match criteriaDavid Howells1-0/+10
Currently a key has a standard matching criteria of { type, description } and this is used to only allow keys with unique criteria in a keyring. This means, however, that you cannot have keys with the same type and description but a different target namespace in the same keyring. This is a potential problem for a containerised environment where, say, a container is made up of some parts of its mount space involving netfs superblocks from two different network namespaces. This is also a problem for shared system management keyrings such as the DNS records keyring or the NFS idmapper keyring that might contain keys from different network namespaces. Fix this by including a namespace component in a key's matching criteria. Keyring types are marked to indicate which, if any, namespace is relevant to keys of that type, and that namespace is set when the key is created from the current task's namespace set. The capability bit KEYCTL_CAPS1_NS_KEY_TAG is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Move the user and user-session keyrings to the user_namespaceDavid Howells2-16/+7
Move the user and user-session keyrings to the user_namespace struct rather than pinning them from the user_struct struct. This prevents these keyrings from propagating across user-namespaces boundaries with regard to the KEY_SPEC_* flags, thereby making them more useful in a containerised environment. The issue is that a single user_struct may be represent UIDs in several different namespaces. The way the patch does this is by attaching a 'register keyring' in each user_namespace and then sticking the user and user-session keyrings into that. It can then be searched to retrieve them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jann Horn <jannh@google.com>
2019-06-26keys: Namespace keyring namesDavid Howells2-0/+7
Keyring names are held in a single global list that any process can pick from by means of keyctl_join_session_keyring (provided the keyring grants Search permission). This isn't very container friendly, however. Make the following changes: (1) Make default session, process and thread keyring names begin with a '.' instead of '_'. (2) Keyrings whose names begin with a '.' aren't added to the list. Such keyrings are system specials. (3) Replace the global list with per-user_namespace lists. A keyring adds its name to the list for the user_namespace that it is currently in. (4) When a user_namespace is deleted, it just removes itself from the keyring name list. The global keyring_name_lock is retained for accessing the name lists. This allows (4) to work. This can be tested by: # keyctl newring foo @s 995906392 # unshare -U $ keyctl show ... 995906392 --alswrv 65534 65534 \_ keyring: foo ... $ keyctl session foo Joined session keyring: 935622349 As can be seen, a new session keyring was created. The capability bit KEYCTL_CAPS1_NS_KEYRING_NAME is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric W. Biederman <ebiederm@xmission.com>
2019-06-26keys: Add a 'recurse' flag for keyring searchesDavid Howells1-1/+2
Add a 'recurse' flag for keyring searches so that the flag can be omitted and recursion disabled, thereby allowing just the nominated keyring to be searched and none of the children. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Cache the hash value to avoid lots of recalculationDavid Howells1-0/+3
Cache the hash of the key's type and description in the index key so that we're not recalculating it every time we look at a key during a search. The hash function does a bunch of multiplications, so evading those is probably worthwhile - especially as this is done for every key examined during a search. This also allows the methods used by assoc_array to get chunks of index-key to be simplified. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Simplify key description managementDavid Howells1-1/+13
Simplify key description management by cramming the word containing the length with the first few chars of the description also. This simplifies the code that generates the index-key used by assoc_array. It should speed up key searching a bit too. Signed-off-by: David Howells <dhowells@redhat.com>