summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2016-09-15blk-mq: provide a default queue mapping for PCI deviceChristoph Hellwig1-0/+9
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15blk-mq: allow the driver to pass in a queue mappingChristoph Hellwig1-0/+3
This allows drivers specify their own queue mapping by overriding the setup-time function that builds the mq_map. This can be used for example to build the map based on the MSI-X vector mapping provided by the core interrupt layer for PCI devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15blk-mq: remove ->map_queueChristoph Hellwig1-7/+0
All drivers use the default, so provide an inline version of it. If we ever need other queue mapping we can add an optional method back, although supporting will also require major changes to the queue setup code. This provides better code generation, and better debugability as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15blk-mq: only allocate a single mq_map per tag_setChristoph Hellwig1-0/+1
The mapping is identical for all queues in a tag_set, so stop wasting memory for building multiple. Note that for now I've kept the mq_map pointer in the request_queue, but we'll need to investigate if we can remove it without suffering too much from the additional pointer chasing. The same would apply to the mq_ops pointer as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15Merge branch 'irq/for-block' of ↵Jens Axboe20-36/+226
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-4.9/msi-irq
2016-09-14pci/msi: Retrieve affinity for a vectorThomas Gleixner1-0/+6
Add a helper to get the affinity mask for a given PCI irq vector. For MSI or MSI-X vectors these are stored by the IRQ core, while for legacy interrupts we will always return cpu_possible_map. [hch: updated to follow the style of pci_irq_vector()] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: axboe@fb.com Cc: keith.busch@intel.com Cc: agordeev@redhat.com Cc: linux-block@vger.kernel.org Link: http://lkml.kernel.org/r/1473862739-15032-6-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14genirq/affinity: Remove old irq spread infrastructureThomas Gleixner1-7/+0
No more users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: axboe@fb.com Cc: keith.busch@intel.com Cc: agordeev@redhat.com Cc: linux-block@vger.kernel.org Link: http://lkml.kernel.org/r/1473862739-15032-5-git-send-email-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-14genirq/affinity: Provide smarter irq spreading infrastructureThomas Gleixner1-0/+15
The current irq spreading infrastructure is just looking at a cpumask and tries to spread the interrupts over the mask. Thats suboptimal as it does not take numa nodes into account. Change the logic so the interrupts are spread across numa nodes and inside the nodes. If there are more cpus than vectors per node, then we set the affinity to several cpus. If HT siblings are available we take that into account and try to set all siblings to a single vector. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: axboe@fb.com Cc: keith.busch@intel.com Cc: agordeev@redhat.com Cc: linux-block@vger.kernel.org Link: http://lkml.kernel.org/r/1473862739-15032-3-git-send-email-hch@lst.de
2016-09-14genirq/msi: Add cpumask allocation to alloc_msi_entryThomas Gleixner1-2/+3
For irq spreading want to store affinity masks in the msi_entry. Add the infrastructure for it. We allocate an array of cpumasks with an array size of the number of used vectors in the entry, so we can hand in the information per linux interrupt later. As we hand in the number of used vectors, we assign them right away. Convert all the call sites. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: axboe@fb.com Cc: keith.busch@intel.com Cc: agordeev@redhat.com Cc: linux-block@vger.kernel.org Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/1473862739-15032-2-git-send-email-hch@lst.de
2016-09-14blk-mq: introduce blk_mq_delay_kick_requeue_list()Mike Snitzer2-1/+2
blk_mq_delay_kick_requeue_list() provides the ability to kick the q->requeue_list after a specified time. To do this the request_queue's 'requeue_work' member was changed to a delayed_work. blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued requests while it doesn't make sense to immediately requeue them (e.g. when all paths in a DM multipath have failed). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: remove IOPRIO_BITSChristoph Hellwig1-1/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14bio.h: remove a very outdated commentChristoph Hellwig1-2/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: remove bio_destructor_tChristoph Hellwig1-1/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: Improve bio_set_op_attrs() robustnessBart Van Assche1-5/+12
Since REQ_OP_BITS == 3 and __REQ_NR_BITS == 30 it is not that hard to pass an op_flags argument to bio_set_op_attrs() that is larger than the number of bits reserved for the op_flags argument. Complain if this happens. Additionally, ensure that negative arguments trigger a complaint (1 << ... is signed while 1U << ... is unsigned; adding 0U to an integer expression causes it to be promoted to an unsigned type). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block, dm-crypt, btrfs: Introduce bio_flags()Bart Van Assche1-1/+2
Introduce the bio_flags() macro. Ensure that the second argument of bio_set_op_attrs() only contains flags and no operation. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Chris Mason <clm@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Josef Bacik <jbacik@fb.com> (maintainer:BTRFS FILE SYSTEM) Cc: Mike Snitzer <snitzer@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: Document that bio_op() uses the data type of bio.bi_opfBart Van Assche1-1/+1
Make it clear that the sizeof(unsigned int) expression in BIO_OP_SHIFT refers to the bi_opf member of struct bio. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@hgst.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: remove blk_mq_alloc_single_hw_queue() prototypeLinus Walleij1-1/+0
The blk_mq_alloc_single_hw_queue() is a prototype artifact that should have been removed with commit cdef54dd85ad66e77262ea57796a3e81683dd5d6 "blk-mq: remove alloc_hctx and free_hctx methods" where the last users of it were deleted. Fixes: cdef54dd85ad ("blk-mq: remove alloc_hctx and free_hctx methods") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-14block: add poll_considered statisticStephen Bates1-0/+1
In order to help determine the effectiveness of polling in a running system it is usful to determine the ratio of how often the poll function is called vs how often the completion is checked. For this reason we add a poll_considered variable and add it to the sysfs entry for io_poll. Signed-off-by: Stephen Bates <sbates@raithlin.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-10Merge tag 'for_linus_stable' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull fscrypto fixes fromTed Ts'o: "Fix some brown-paper-bag bugs for fscrypto, including one one which allows a malicious user to set an encryption policy on an empty directory which they do not own" * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fscrypto: require write access to mount to set encryption policy fscrypto: only allow setting encryption policy on directories fscrypto: add authorization check for setting encryption policy
2016-09-10fscrypto: require write access to mount to set encryption policyEric Biggers1-3/+2
Since setting an encryption policy requires writing metadata to the filesystem, it should be guarded by mnt_want_write/mnt_drop_write. Otherwise, a user could cause a write to a frozen or readonly filesystem. This was handled correctly by f2fs but not by ext4. Make fscrypt_process_policy() handle it rather than relying on the filesystem to get it right. Signed-off-by: Eric Biggers <ebiggers@google.com> Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs} Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-07usercopy: force check_object_size() inlineKees Cook1-2/+2
Just for good measure, make sure that check_object_size() is always inlined too, as already done for copy_*_user() and __copy_*_user(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-06usercopy: fold builtin_const check into inline functionKees Cook1-1/+2
Instead of having each caller of check_object_size() need to remember to check for a const size parameter, move the check into check_object_size() itself. This actually matches the original implementation in PaX, though this commit cleans up the now-redundant builtin_const() calls in the various architectures. Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-03Merge tag 'staging-4.8-rc5' of ↵Linus Torvalds3-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO driver fixes from Greg KH: "Here are a number of small fixes for staging and IIO drivers that resolve reported problems. Full details are in the shortlog. All of these have been in linux-next with no reported issues" * tag 'staging-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (35 commits) arm: dts: rockchip: add reset node for the exist saradc SoCs arm64: dts: rockchip: add reset saradc node for rk3368 SoCs iio: adc: rockchip_saradc: reset saradc controller before programming it iio: accel: kxsd9: Fix raw read return iio: adc: ti_am335x_adc: Increase timeout value waiting for ADC sample iio: adc: ti_am335x_adc: Protect FIFO1 from concurrent access include/linux: fix excess fence.h kernel-doc notation staging: wilc1000: correctly check if associatedsta has not been found staging: wilc1000: NULL dereference on error staging: wilc1000: txq_event: Fix coding error MAINTAINERS: Add file patterns for ion device tree bindings MAINTAINERS: Update maintainer entry for wilc1000 iio: chemical: atlas-ph-sensor: fix typo in val assignment iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING" staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility staging: comedi: dt2811: fix a precedence bug staging: comedi: adv_pci1760: Do not return EINVAL for CMDF_ROUND_DOWN. staging: comedi: ni_mio_common: fix wrong insn_write handler staging: comedi: comedi_test: fix timer race conditions staging: comedi: daqboard2000: bug fix board type matching code ...
2016-09-03Merge tag 'tty-4.8-rc5' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull serial driver fixes from Greg KH: "Here are some small serial driver fixes for 4.8-rc5. One fixes an oft-reported build issue with the fintek driver, another reverts a patch that was causing problems, one fixes a crash, and some new device ids were added. All of these have been in linux-next for a while" * tag 'tty-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250: added acces i/o products quad and octal serial cards serial: 8250_mid: fix divide error bug if baud rate is 0 Revert "tty/serial/8250: use mctrl_gpio helpers" 8250/fintek: rename IRQ_MODE macro
2016-09-03Merge tag 'usb-4.8-rc5' of ↵Linus Torvalds1-0/+153
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY fixes from Greg KH: "Here are some USB and PHY driver fixes for 4.8-rc5 Nothing major, lots of little fixes for reported bugs, and a build fix for a missing .h file that the phy drivers needed. All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits) usb: musb: Fix locking errors for host only mode usb: dwc3: gadget: always decrement by 1 usb: dwc3: debug: fix ep name on trace output usb: gadget: udc: core: don't starve DMA resources USB: serial: option: add WeTelecom 0x6802 and 0x6803 products USB: avoid left shift by -1 USB: fix typo in wMaxPacketSize validation usb: gadget: Add the gserial port checking in gs_start_tx() usb: dwc3: gadget: don't rely on jiffies while holding spinlock usb: gadget: fsl_qe_udc: signedness bug in qe_get_frame() usb: gadget: function: f_rndis: socket buffer may be NULL usb: gadget: function: f_eem: socket buffer may be NULL usb: renesas_usbhs: gadget: fix return value check in usbhs_mod_gadget_probe() usb: dwc2: Add reset control to dwc2 usb: dwc3: core: allow device to runtime_suspend several times usb: dwc3: pci: runtime_resume child device USB: serial: option: add WeTelecom WM-D200 usb: chipidea: udc: don't touch DP when controller is in host mode USB: serial: mos7840: fix non-atomic allocation in write path USB: serial: mos7720: fix non-atomic allocation in write path ...
2016-09-03Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull block fixes from Jens Axboe: "A collection of fixes for the nvme over fabrics code" * 'for-linus' of git://git.kernel.dk/linux-block: nvme-rdma: Get rid of redundant defines nvme-rdma: Get rid of duplicate variable nvme: fabrics drivers don't need the nvme-pci driver nvme-fabrics: get a reference when reusing a nvme_host structure nvme-fabrics: change NQN UUID to big-endian format nvme-loop: set sqsize to 0-based value, per spec nvme-rdma: fix sqsize/hsqsize per spec fabrics: define admin sqsize min default, per spec nvmet-rdma: +1 to *queue_size from hsqsize/hrqsize nvmet-rdma: Fix use after free nvme-rdma: initialize ret to zero to avoid returning garbage
2016-09-03Merge tag 'acpi-4.8-rc5' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes ffrom Rafael Wysocki: "Two stable-candidate fixes for the ACPI early device probing code added during the 4.4 cycle, one fixing a typo in a stub macro used when CONFIG_ACPI is unset and one that prevents sleeping functions from being called under a spinlock (Lorenzo Pieralisi)" * tag 'acpi-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / drivers: replace acpi_probe_lock spinlock with mutex ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macro
2016-09-02ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macroLorenzo Pieralisi1-1/+1
When the ACPI_DECLARE_PROBE_ENTRY macro was added in commit e647b532275b ("ACPI: Add early device probing infrastructure"), a stub macro adding an unused entry was added for the !CONFIG_ACPI Kconfig option case to make sure kernel code making use of the macro did not require to be guarded within CONFIG_ACPI in order to be compiled. The stub macro was never used since all kernel code that defines ACPI_DECLARE_PROBE_ENTRY entries is currently guarded within CONFIG_ACPI; it contains a typo that should be nonetheless fixed. Fix the typo in the stub (ie !CONFIG_ACPI) ACPI_DECLARE_PROBE_ENTRY() macro so that it can actually be used if needed. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Fixes: e647b532275b (ACPI: Add early device probing infrastructure) Cc: 4.4+ <stable@vger.kernel.org> # 4.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-02Merge branch 'overlayfs-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Most of this is regression fixes for posix acl behavior introduced in 4.8-rc1 (these were caught by the pjd-fstest suite). The are also miscellaneous fixes marked as stable material and cleanups. Other than overlayfs code, it touches <linux/fs.h> to add a constant with which to disable posix acl caching. No changes needed to the actual caching code, it automatically does the right thing, although later we may want to optimize this case. I'm now testing overlayfs with the following test suites to catch regressions: - unionmount-testsuite - xfstests - pjd-fstest" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: update doc ovl: listxattr: use strnlen() ovl: Switch to generic_getxattr ovl: copyattr after setting POSIX ACL ovl: Switch to generic_removexattr ovl: Get rid of ovl_xattr_noacl_handlers array ovl: Fix OVL_XATTR_PREFIX ovl: fix spelling mistake: "directries" -> "directories" ovl: don't cache acl on overlay layer ovl: use cached acl on underlying layer ovl: proper cleanup of workdir ovl: remove posix_acl_default from workdir ovl: handle umask and posix_acl_default correctly on creation ovl: don't copy up opaqueness
2016-09-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-11/+18
Merge fixes from Andrew Morton: "14 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: rapidio/tsi721: fix incorrect detection of address translation condition rapidio/documentation/mport_cdev: add missing parameter description kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd MAINTAINERS: Vladimir has moved mm, mempolicy: task->mempolicy must be NULL before dropping final reference printk/nmi: avoid direct printk()-s from __printk_nmi_flush() treewide: remove references to the now unnecessary DEFINE_PCI_DEVICE_TABLE drivers/scsi/wd719x.c: remove last declaration using DEFINE_PCI_DEVICE_TABLE mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator lib/test_hash.c: fix warning in preprocessor symbol evaluation lib/test_hash.c: fix warning in two-dimensional array init kconfig: tinyconfig: provide whole choice blocks to avoid warnings kexec: fix double-free when failing to relocate the purgatory mm, oom: prevent premature OOM killer invocation for high order request
2016-09-02mm, mempolicy: task->mempolicy must be NULL before dropping final referenceDavid Rientjes1-0/+4
KASAN allocates memory from the page allocator as part of kmem_cache_free(), and that can reference current->mempolicy through any number of allocation functions. It needs to be NULL'd out before the final reference is dropped to prevent a use-after-free bug: BUG: KASAN: use-after-free in alloc_pages_current+0x363/0x370 at addr ffff88010b48102c CPU: 0 PID: 15425 Comm: trinity-c2 Not tainted 4.8.0-rc2+ #140 ... Call Trace: dump_stack kasan_object_err kasan_report_error __asan_report_load2_noabort alloc_pages_current <-- use after free depot_save_stack save_stack kasan_slab_free kmem_cache_free __mpol_put <-- free do_exit This patch sets current->mempolicy to NULL before dropping the final reference. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1608301442180.63329@chino.kir.corp.google.com Fixes: cd11016e5f52 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-02treewide: remove references to the now unnecessary DEFINE_PCI_DEVICE_TABLEJoe Perches1-9/+0
It's been eliminated from the sources, remove it from everywhere else. Link: http://lkml.kernel.org/r/076eff466fd7edb550c25c8b25d76924ca0eba62.1472660229.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-02mm, vmscan: only allocate and reclaim from zones with pages managed by the ↵Mel Gorman1-2/+14
buddy allocator Firmware Assisted Dump (FA_DUMP) on ppc64 reserves substantial amounts of memory when booting a secondary kernel. Srikar Dronamraju reported that multiple nodes may have no memory managed by the buddy allocator but still return true for populated_zone(). Commit 1d82de618ddd ("mm, vmscan: make kswapd reclaim in terms of nodes") was reported to cause kswapd to spin at 100% CPU usage when fadump was enabled. The old code happened to deal with the situation of a populated node with zero free pages by co-incidence but the current code tries to reclaim populated zones without realising that is impossible. We cannot just convert populated_zone() as many existing users really need to check for present_pages. This patch introduces a managed_zone() helper and uses it in the few cases where it is critical that the check is made for managed pages -- zonelist construction and page reclaim. Link: http://lkml.kernel.org/r/20160831195104.GB8119@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-02Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/auditLinus Torvalds1-0/+1
Pull audit fixes from Paul Moore: "Two small patches to fix some bugs with the audit-by-executable functionality we introduced back in v4.3 (both patches are marked for the stable folks)" * 'stable-4.8' of git://git.infradead.org/users/pcmoore/audit: audit: fix exe_file access in audit_exe_compare mm: introduce get_task_exe_file
2016-09-02Merge tag 'xfs-iomap-for-linus-4.8-rc5' of ↵Linus Torvalds1-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs Pull xfs and iomap fixes from Dave Chinner: "Most of these changes are small regression fixes that address problems introduced in the 4.8-rc1 window. The two fixes that aren't (IO completion fix and superblock inprogress check) are fixes for problems introduced some time ago and need to be pushed back to stable kernels. Changes in this update: - iomap FIEMAP_EXTENT_MERGED usage fix - additional mount-time feature restrictions - rmap btree query fixes - freeze/unmount io completion workqueue fix - memory corruption fix for deferred operations handling" * tag 'xfs-iomap-for-linus-4.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: track log done items directly in the deferred pending work item iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystems xfs: prevent dropping ioend completions during buftarg wait xfs: fix superblock inprogress check xfs: simple btree query range should look right if LE lookup fails xfs: fix some key handling problems in _btree_simple_query_range xfs: don't log the entire end of the AGF xfs: disallow mounting of realtime + rmap filesystems xfs: don't perform lookups on zero-height btrees
2016-09-01ovl: don't cache acl on overlay layerMiklos Szeredi1-0/+1
Some operations (setxattr/chmod) can make the cached acl stale. We either need to clear overlay's acl cache for the affected inode or prevent acl caching on the overlay altogether. Preventing caching has the following advantages: - no double caching, less memory used - overlay cache doesn't go stale when fs clears it's own cache Possible disadvantage is performance loss. If that becomes a problem get_acl() can be optimized for overlayfs. This patch disables caching by pre setting i_*acl to a value that - has bit 0 set, so is_uncached_acl() will return true - is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it The constant -3 was chosen for this purpose. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-08-31mm: introduce get_task_exe_fileMateusz Guzik1-0/+1
For more convenient access if one has a pointer to the task. As a minor nit take advantage of the fact that only task lock + rcu are needed to safely grab ->exe_file. This saves mm refcount dance. Use the helper in proc_exe_link. Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Richard Guy Briggs <rgb@redhat.com> Cc: <stable@vger.kernel.org> # 4.3.x Signed-off-by: Paul Moore <paul@paul-moore.com>
2016-08-31Revert "tty/serial/8250: use mctrl_gpio helpers"Andy Shevchenko1-1/+0
Serial console is broken in v4.8-rcX. Mika and I independently bisected down to commit 4ef03d328769 ("tty/serial/8250: use mctrl_gpio helpers"). Since neither author nor anyone else didn't propose a solution we better revert it for now. This reverts commit 4ef03d328769eddbfeca1f1c958fdb181a69c341. Link: https://lkml.kernel.org/r/20160809130229.GN1729@lahna.fi.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-30mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKSJosh Poimboeuf1-1/+1
There are three usercopy warnings which are currently being silenced for gcc 4.6 and newer: 1) "copy_from_user() buffer size is too small" compile warning/error This is a static warning which happens when object size and copy size are both const, and copy size > object size. I didn't see any false positives for this one. So the function warning attribute seems to be working fine here. Note this scenario is always a bug and so I think it should be changed to *always* be an error, regardless of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS. 2) "copy_from_user() buffer size is not provably correct" compile warning This is another static warning which happens when I enable __compiletime_object_size() for new compilers (and CONFIG_DEBUG_STRICT_USER_COPY_CHECKS). It happens when object size is const, but copy size is *not*. In this case there's no way to compare the two at build time, so it gives the warning. (Note the warning is a byproduct of the fact that gcc has no way of knowing whether the overflow function will be called, so the call isn't dead code and the warning attribute is activated.) So this warning seems to only indicate "this is an unusual pattern, maybe you should check it out" rather than "this is a bug". I get 102(!) of these warnings with allyesconfig and the __compiletime_object_size() gcc check removed. I don't know if there are any real bugs hiding in there, but from looking at a small sample, I didn't see any. According to Kees, it does sometimes find real bugs. But the false positive rate seems high. 3) "Buffer overflow detected" runtime warning This is a runtime warning where object size is const, and copy size > object size. All three warnings (both static and runtime) were completely disabled for gcc 4.6 with the following commit: 2fb0815c9ee6 ("gcc4: disable __compiletime_object_size for GCC 4.6+") That commit mistakenly assumed that the false positives were caused by a gcc bug in __compiletime_object_size(). But in fact, __compiletime_object_size() seems to be working fine. The false positives were instead triggered by #2 above. (Though I don't have an explanation for why the warnings supposedly only started showing up in gcc 4.6.) So remove warning #2 to get rid of all the false positives, and re-enable warnings #1 and #3 by reverting the above commit. Furthermore, since #1 is a real bug which is detected at compile time, upgrade it to always be an error. Having done all that, CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is no longer needed. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Nilay Vaish <nilayvaish@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-30Merge tag 'phy-for-4.8-rc' of ↵Greg Kroah-Hartman1-0/+153
git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy into usb-linus Kishon writes: phy: for 4.8 -rc *) Fix to get host-only mode working in sun4i *) Fix a compilation error because of missing header file *) Other minor fixes Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2016-08-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2-2/+12
Pull networking fixes from David Miller: 1) Segregate namespaces properly in conntrack dumps, from Liping Zhang. 2) tcp listener refcount fix in netfilter tproxy, from Eric Dumazet. 3) Fix timeouts in qed driver due to xmit_more, from Yuval Mintz. 4) Fix use-after-free in tcp_xmit_retransmit_queue(). 5) Userspace header fixups (use of __u32, missing includes, etc.) from Mikko Rapeli. 6) Further refinements to fragmentation wrt gso and tunnels, from Shmulik Ladkani. 7) Trigger poll correctly for zero length UDP packets, from Eric Dumazet. 8) TCP window scaling fix, also from Eric Dumazet. 9) SLAB_DESTROY_BY_RCU is not relevant any more for UDP sockets. 10) Module refcount leak in qdisc_create_dflt(), from Eric Dumazet. 11) Fix deadlock in cp_rx_poll() of 8139cp driver, from Gao Feng. 12) Memory leak in rhashtable's alloc_bucket_locks(), from Eric Dumazet. 13) Add new device ID to alx driver, from Owen Lin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits) Add Killer E2500 device ID in alx driver. net: smc91x: fix SMC accesses Documentation: networking: dsa: Remove platform device TODO net/mlx5: Increase number of ethtool steering priorities net/mlx5: Add error prints when validate ETS failed net/mlx5e: Fix memory leak if refreshing TIRs fails net/mlx5e: Add ethtool counter for TX xmit_more net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ net/mlx5e: Don't wait for SQ completions on close net/mlx5e: Don't post fragmented MPWQE when RQ is disabled net/mlx5e: Don't wait for RQ completions on close net/mlx5e: Limit UMR length to the device's limitation rhashtable: fix a memory leak in alloc_bucket_locks() sfc: fix potential stack corruption from running past stat bitmask team: loadbalance: push lacpdus to exact delivery net: hns: dereference ppe_cb->ppe_common_cb if it is non-null 8139cp: Fix one possible deadloop in cp_rx_poll i40e: Change some init flow for the client Revert "phy: IRQ cannot be shared" net: dsa: bcm_sf2: Fix race condition while unmasking interrupts ...
2016-08-29Merge branch 'nvmf-4.8-rc' of git://git.infradead.org/nvme-fabrics into ↵Jens Axboe1-1/+1
for-linus Sagi writes: Mostly stability fixes and cleanups: - NQN endianess fix from Daniel - possible use-after-free fix from Vincent - nvme-rdma connect semantics fixes from Jay - Remove redundant variables in rdma driver - Kbuild fix from Christoph - nvmf_host referencing fix from Christoph - uninit variable fix from Colin
2016-08-29blk-mq: improve layout of blk_mq_hw_ctxJens Axboe1-4/+5
Various cache line optimizations: - Move delay_work towards the end. It's huge, and we don't use it a lot (only SCSI). - Move the atomic state into the same cacheline as the the dispatch list and lock. - Rearrange a few members to pack it better. - Shrink the max-order for dispatch accounting from 10 to 7. This means that ->dispatched[] and ->run now take up their own cacheline. This shrinks struct blk_mq_hw_ctx down to 8 cachelines. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-29blk-mq: turn hctx->run_work into a regular work structJens Axboe1-1/+1
We don't need the larger delayed work struct, since we always run it immediately. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-29block: add kblockd_schedule_work_on()Jens Axboe1-1/+1
Add a helper to schedule a regular struct work on a particular CPU. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-29workqueue: add cancel_work()Jens Axboe1-0/+1
Like cancel_delayed_work(), but for regular work. Signed-off-by: Jens Axboe <axboe@fb.com> Mehed-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org>
2016-08-29net: smc91x: fix SMC accessesRussell King1-0/+10
Commit b70661c70830 ("net: smc91x: use run-time configuration on all ARM machines") broke some ARM platforms through several mistakes. Firstly, the access size must correspond to the following rule: (a) at least one of 16-bit or 8-bit access size must be supported (b) 32-bit accesses are optional, and may be enabled in addition to the above. Secondly, it provides no emulation of 16-bit accesses, instead blindly making 16-bit accesses even when the platform specifies that only 8-bit is supported. Reorganise smc91x.h so we can make use of the existing 16-bit access emulation already provided - if 16-bit accesses are supported, use 16-bit accesses directly, otherwise if 8-bit accesses are supported, use the provided 16-bit access emulation. If neither, BUG(). This exactly reflects the driver behaviour prior to the commit being fixed. Since the conversion incorrectly cut down the available access sizes on several platforms, we also need to go through every platform and fix up the overly-restrictive access size: Arnd assumed that if a platform can perform 32-bit, 16-bit and 8-bit accesses, then only a 32-bit access size needed to be specified - not so, all available access sizes must be specified. This likely fixes some performance regressions in doing this: if a platform does not support 8-bit accesses, 8-bit accesses have been emulated by performing a 16-bit read-modify-write access. Tested on the Intel Assabet/Neponset platform, which supports only 8-bit accesses, which was broken by the original commit. Fixes: b70661c70830 ("net: smc91x: use run-time configuration on all ARM machines") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-29iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystemsChristoph Hellwig1-1/+7
Filesystems like XFS that use extents should not set the FIEMAP_EXTENT_MERGED flag in the fiemap extent structures. To allow for both behaviors for the upcoming gfs2 usage split the iomap type field into type and flags, and only set FIEMAP_EXTENT_MERGED if the IOMAP_F_MERGED flag is set. The flags field will also come in handy for future features such as shared extents on reflink-enabled file systems. Reported-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-08-29Merge tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-0/+2
Pull drm fixes from Dave Airlie: "A bunch of fixes covering i915, amdgpu, one tegra and some core DRM ones. Nothing too strange at this point" * tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linux: (21 commits) drm/atomic: Don't potentially reset color_mgmt_changed on successive property updates. drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION drm/amdgpu: skip TV/CV in display parsing drm/amdgpu: avoid a possible array overflow drm/amdgpu: fix lru size grouping v2 drm/tegra: dsi: Enhance runtime power management drm/i915: Fix botched merge that downgrades CSR versions. drm/i915/skl: Ensure pipes with changed wms get added to the state drm/i915/gen9: Only copy WM results for changed pipes to skl_hw drm/i915/skl: Add support for the SAGV, fix underrun hangs drm/i915/gen6+: Interpret mailbox error flags drm/i915: Reattach comment, complete type specification drm/i915: Unconditionally flush any chipset buffers before execbuf drm/i915/gen9: Drop invalid WARN() during data rate calculation drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2) drm: Reject page_flip for !DRIVER_MODESET drm/amdgpu: fix timeout value check in amd_sched_job_recovery drm/amdgpu: fix sdma_v2_4_ring_test_ib drm/amdgpu: fix amdgpu_move_blit on 32bit systems drm/radeon: fix radeon_move_blit on 32bit systems ...
2016-08-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+1
Pull KVM fixes from Paolo Bonzini: "ARM: - fixes for ITS init issues, error handling, IRQ leakage, race conditions - an erratum workaround for timers - some removal of misleading use of errors and comments - a fix for GICv3 on 32-bit guests MIPS: - fix for where the guest could wrongly map the first page of physical memory x86: - nested virtualization fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MIPS: KVM: Check for pfn noslot case kvm: nVMX: fix nested tsc scaling KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC arm64: KVM: report configured SRE value to 32-bit world arm64: KVM: remove misleading comment on pmu status KVM: arm/arm64: timer: Workaround misconfigured timer interrupt arm64: Document workaround for Cortex-A72 erratum #853709 KVM: arm/arm64: Change misleading use of is_error_pfn KVM: arm64: ITS: avoid re-mapping LPIs KVM: arm64: check for ITS device on MSI injection KVM: arm64: ITS: move ITS registration into first VCPU run KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic KVM: arm64: vgic-its: Plug race in vgic_put_irq KVM: arm64: vgic-its: Handle errors from vgic_add_lpi KVM: arm64: ITS: return 1 on successful MSI injection