summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)AuthorFilesLines
2016-04-11fnic: Cleanup the I/O pending with fw and has timed out and is used to issue ↵Satish Kharat2-11/+29
LUN reset In case of LUN reset, the device reset command is issued with one of the I/Os that has timed out on that LUN. The change is to also return this I/O with error status set to DID_RESET. In case when the reset is issued using the sg_reset tool (from sg3_utils) it is a new command and new_sc is set to 1. Fnic driver version changed from 1.6.0.19 to 1.6.0.20 [mkp: Fixed checkpatch warning] Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11fnic: Fix to cleanup aborted IO to avoid device being offlined by mid-layerSatish Kharat2-7/+28
If an I/O times out and an abort issued by host, if the abort is successful we need to set scsi status as DID_ABORT. Or else the mid-layer error handler which looks for this error code, will offline the device. Also if the original I/O is not found in fnic firmware, we will consider the abort as successful. The start_time assignment is moved because of the new goto. Fnic driver version changed from 1.6.0.17a to 1.6.0.19, version 1.6.0.18 has been skipped [mkp: Fixed checkpatch warning] Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi-trace: define ZBC_IN and ZBC_OUTHannes Reinecke1-0/+70
Add new trace functions for ZBC_IN and ZBC_OUT. Reviewed-by: Doug Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi-trace: Decode MAINTENANCE_IN and MAINTENANCE_OUT commandsHannes Reinecke1-0/+91
Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi: disable automatic target scanHannes Reinecke8-24/+48
On larger installations it is useful to disable automatic LUN scanning, and only add the required LUNs via udev rules. This can speed up bootup dramatically. This patch introduces a new scan module parameter value 'manual', which works like 'none', but can be overridden by setting the 'rescan' value from scsi_scan_target to 'SCSI_SCAN_MANUAL'. And it updates all relevant callers to set the 'rescan' value to 'SCSI_SCAN_MANUAL' if invoked via the 'scan' option in sysfs. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Call complete_cmd() for disconnected commands on bus resetFinn Thain1-1/+1
I'm told that some targets are liable to disconnect a REQUEST SENSE command. Theoretically this would cause a command undergoing autosense to be moved onto the disconnected list. The bus reset handler must call complete_cmd() for these commands, otherwise the hostdata->sensing pointer will not get cleared. That would cause autosense processing to stall and a timeout or an incorrect scsi_eh_restore_cmnd() would eventually follow. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reported-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11mac_scsi: Fix pseudo DMA implementationFinn Thain2-95/+119
Fix various issues: Comments about bus errors are incorrect. The PDMA asm must return the size of the memory access that faulted so the transfer count can be adjusted accordingly. A phase change may cause a bus error but should not be treated as failure. A bus error does not always imply a phase change and generally the transfer may continue. Scatter/gather doesn't seem to work with PDMA due to overruns. This is a pity because peak throughput seems to double with SG_ALL. Tested on a Mac LC III and a PowerBook 520. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11atari_scsi: Allow can_queue to be increased for FalconFinn Thain1-61/+22
The benefit of limiting can_queue to 1 is that atari_scsi shares the ST DMA chip more fairly with other drivers (e.g. falcon-ide). Unfortunately, this can limit SCSI bus utilization. On systems without IDE, atari_scsi should issue SCSI commands whenever it can arbitrate for the bus. Make that possible by making can_queue configurable. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11atari_scsi: Set a reasonable default for cmd_per_lunFinn Thain1-2/+1
This setting does not need to be conditional on Atari ST or TT. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Update usage documentationFinn Thain1-35/+1
Update kernel parameter documentation for atari_scsi, mac_scsi and g_NCR5380 drivers. Remove duplication. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove DONT_USE_INTR and AUTOPROBE_IRQ macrosFinn Thain8-32/+4
Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove remaining register storage qualifiersFinn Thain1-3/+3
Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Fix register decoding for debuggingFinn Thain1-17/+25
Decode all bits in the chip registers. They are all useful at times. Fix printk severity so that this output can be suppressed along with the other debugging output. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11dmx3191d: Drop max_sectors limitFinn Thain1-1/+0
The dmx3191d driver is not capable of DMA or PDMA so all transfers use PIO. Now that large slow PIO transfers periodically stop and call cond_resched(), the max_sectors limit can go away. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Reduce max_lun limitFinn Thain1-0/+2
The driver has a limit of eight LUs because of the byte-sized bitfield that is used for busy flags. That means the maximum LUN is 7. The default is 8. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove disused atari_NCR5380.c core driverFinn Thain6-2699/+4
Now that atari_scsi and sun3_scsi have been converted to use the NCR5380.c core driver, remove atari_NCR5380.c. Also remove the last vestiges of its Tagged Command Queueing implementation from the wrapper drivers. The TCQ support in atari_NCR5380.c is abandoned by this patch. It is not merged into the remaining core driver because, 1) atari_scsi defines SUPPORT_TAGS but leaves FLAG_TAGGED_QUEUING disabled by default, which indicates that it is mostly undesirable. 2) I'm told that it doesn't work correctly when enabled. 3) The algorithm does not make use of block layer tags which it will have to do because scmd->tag is deprecated. 4) sun3_scsi doesn't define SUPPORT_TAGS at all, yet the the SUPPORT_TAGS macro interacts with the CONFIG_SUN3 macro in 'interesting' ways. 5) Compile-time configuration with macros like SUPPORT_TAGS caused the configuration space to explode, leading to untestable and unmaintainable code that is too hard to reason about. The merge_contiguous_buffers() code is also abandoned. This was unused by sun3_scsi. Only atari_scsi used it and then only on TT, because only TT supports scatter/gather. I suspect that the TT would work fine with ENABLE_CLUSTERING instead. If someone can benchmark the difference then perhaps the merge_contiguous_buffers() code can be be justified. Until then we are better off without the extra complexity. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11sun3_scsi: Adopt NCR5380.c core driverFinn Thain2-15/+124
Add support for the custom Sun 3 DMA logic to the NCR5380.c core driver. This code is copied from atari_NCR5380.c. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11atari_scsi: Adopt NCR5380.c core driverFinn Thain2-3/+35
Add support for the Atari ST DMA chip to the NCR5380.c core driver. This code is copied from atari_NCR5380.c. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Merge DMA implementation from atari_NCR5380 core driverFinn Thain13-40/+152
Adopt the DMA implementation from atari_NCR5380.c. This means that atari_scsi and sun3_scsi can make use of the NCR5380.c core driver and the atari_NCR5380.c driver fork can be made redundant. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Adopt uniform DMA setup conventionFinn Thain4-18/+20
Standardize the DMA setup hooks so that the DMA implementation in atari_NCR5380.c can be reconciled with pseudo DMA implementation in NCR5380.c. Calls to NCR5380_dma_recv_setup() and NCR5380_dma_send_setup() return a negative value on failure, zero on PDMA transfer success and a positive byte count for DMA setup success. This convention is not entirely new, but is now applied consistently. Also remove a pointless Status Register access: the *phase assignment is redundant because after NCR5380_transfer_dma() returns control to NCR5380_information_transfer(), that routine then returns control to NCR5380_main(), which means *phase is dead. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Use DMA hooks for PDMAFinn Thain13-41/+50
Those wrapper drivers which use DMA define the REAL_DMA macro and those which use pseudo DMA define PSEUDO_DMA. These macros need to be removed for a number of reasons, not least of which is to have drivers share more code. Redefine the PDMA send and receive hooks as DMA setup hooks, so that the DMA code can be shared by all 5380 wrapper drivers. This will help to reunify the forked core driver. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove BOARD_REQUIRES_NO_DELAY macroFinn Thain4-10/+14
The io_recovery_delay macro is intended to insert a microsecond delay between the chip register accesses that begin a DMA operation. This is reportedly needed for some ISA boards. Reverse the sense of the macro test so that in the common case, where no delay is required, drivers need not define the macro. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove PSEUDO_DMA macroFinn Thain14-78/+6
For those wrapper drivers which only implement Programmed IO, have NCR5380_dma_xfer_len() evaluate to zero. That allows PDMA to be easily disabled at run-time and so the PSEUDO_DMA macro is no longer needed. Also remove the spin counters used for debugging pseudo DMA drivers. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Disable the DMA errata workaround flag by defaultFinn Thain8-20/+14
The only chip that needs the workarounds enabled is an early NMOS device. That means that the common case is to disable them. Unfortunately the sense of the flag is such that it has to be set for the common case. Rename the flag so that zero can be used to mean "no errata workarounds needed". This simplifies the code. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11atari_NCR5380: Remove DMA_MIN_SIZE macroFinn Thain3-19/+22
Only the atari_scsi and sun3_scsi drivers define DMA_MIN_SIZE. Both drivers also define NCR5380_dma_xfer_len, which means DMA_MIN_SIZE can be removed from the core driver. This removes another discrepancy between the two core drivers. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove REAL_DMA and REAL_DMA_POLL macrosFinn Thain5-415/+22
For the NCR5380.c core driver, these macros are never used. If REAL_DMA were to be defined, compilation would fail. For the atari_NCR5380.c core driver, REAL_DMA is always defined. Hence these macros are pointless. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11ncr5380: Remove FLAG_NO_PSEUDO_DMA where possibleFinn Thain5-6/+23
Drivers that define PSEUDO_DMA also define NCR5380_dma_xfer_len. The core driver must call NCR5380_dma_xfer_len which means FLAG_NO_PSEUDO_DMA can be eradicated from the core driver. dmx3191d doesn't define PSEUDO_DMA and has no use for FLAG_NO_PSEUDO_DMA, so remove it there also. Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Michael Schmitz <schmitzmic@gmail.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11g_ncr5380: Remove CONFIG_SCSI_GENERIC_NCR53C400Finn Thain3-77/+25
This change brings a number of improvements: fewer macros, better test coverage, simpler code and sane Kconfig options. The downside is a small chance of incompatibility (which seems unavoidable). CONFIG_SCSI_GENERIC_NCR53C400 exists to enable or inhibit pseudo DMA transfers when the driver is used with 53C400-compatible cards. Thanks to Ondrej Zary's patches, PDMA now works which means it can be enabled unconditionally. Due to bad design, CONFIG_SCSI_GENERIC_NCR53C400 ties together unrelated functionality as it sets both PSEUDO_DMA and BIOSPARAM macros. This patch effectively enables PSEUDO_DMA and disables BIOSPARAM. The defconfigs and the Kconfig default leave CONFIG_SCSI_GENERIC_NCR53C400 undefined. Red Hat 9 and CentOS 2.1 were the same. This leaves both PSEUDO_DMA and BIOSPARAM disabled. The effect of this patch should be better performance from enabling PSEUDO_DMA. On the other hand, Debian 4 and SLES 10 had CONFIG_SCSI_GENERIC_NCR53C400 enabled, so both PSEUDO_DMA and BIOSPARAM were enabled. This patch might affect configurations like this by disabling BIOSPARAM. My best guess is that this could be a problem only in the vanishingly rare case that 1) the CHS values stored in the boot device partition table are wrong and 2) a 5380 card is in use (because PDMA on 53C400 used to be broken). Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi: reduce CONFIG_SCSI_CONSTANTS=y impact by 8kRasmus Villemoes2-7/+23
On 64 bit, struct error_info has 6 bytes of padding, which amounts to over 4k of wasted space in the additional[] array. We could easily get rid of that by instead using separate arrays for the codes and the pointers. However, we can do even better than that and save an additional 6 bytes per entry: In the table, just store the sizeof() the corresponding string literal. The cumulative sum of these is then the appropriate offset into additional_text, which is built from the concatenation (with '\0's inbetween) of the strings. $ scripts/bloat-o-meter /tmp/vmlinux vmlinux add/remove: 0/0 grow/shrink: 1/1 up/down: 24/-8488 (-8464) function old new delta scsi_extd_sense_format 136 160 +24 additional 11312 2824 -8488 The Kconfig help text used to say that CONFIG_SCSI_CONSTANTS=y costs around 75 KB, but that was a little exaggerated. The actual number was closer to 44K, and 36K with this patch. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Hannes Reinecke <hare@suse.com> Tested-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi: move Additional Sense Codes to separate fileRasmus Villemoes2-827/+829
This is a purely mechanical move of the list of additional sense codes to a separate file, in preparation for reducing the impact of choosing CONFIG_SCSI_CONSTANTS=y by about 8k. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-11scsi: make some Additional Sense strings more grep'ableRasmus Villemoes1-18/+9
There's little point in breaking these strings over multiple lines. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-09Merge tag 'scsi-fixes' of ↵Linus Torvalds9-109/+139
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of eight fixes. Two are trivial gcc-6 updates (brace additions and unused variable removal). There's a couple of cxlflash regressions, a correction for sd being overly chatty on revalidation (causing excess log increases). A VPD issue which could crash USB devices because they seem very intolerant to VPD inquiries, an ALUA deadlock fix and a mpt3sas buffer overrun fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: Do not attach VPD to devices that don't support it sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes scsi_dh_alua: Fix a recently introduced deadlock scsi: Declare local symbols static cxlflash: Move to exponential back-off when cmd_room is not available cxlflash: Fix regression issue with re-ordering patch mpt3sas: Don't overreach ioc->reply_post[] during initialization aacraid: add missing curly braces
2016-04-05Merge branch 'fixes-base' into fixesJames Bottomley21-155/+234
2016-04-05scsi: Do not attach VPD to devices that don't support itHannes Reinecke2-19/+3
The patch "scsi: rescan VPD attributes" introduced a regression in which devices that don't support VPD were being scanned for VPD attributes anyway. This could cause issues for some devices and should be avoided so the check for scsi_level has been moved out of scsi_add_lun and into scsi_attach_vpd so that all callers will not scan VPD for devices that don't support it. [mkp: Merge fix] Fixes: 09e2b0b14690 ("scsi: rescan VPD attributes") Cc: <stable@vger.kernel.org> #v4.5+ Suggested-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov2-3/+3
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytesMartin K. Petersen2-21/+14
During revalidate we check whether device capacity has changed before we decide whether to output disk information or not. The check for old capacity failed to take into account that we scaled sdkp->capacity based on the reported logical block size. And therefore the capacity test would always fail for devices with sectors bigger than 512 bytes and we would print several copies of the same discovery information. Avoid scaling sdkp->capacity and instead adjust the value on the fly when setting the block device capacity and generating fake C/H/S geometry. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Reported-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-30scsi_dh_alua: Fix a recently introduced deadlockBart Van Assche1-2/+2
While retesting the SRP initiator I ran the command "rmmod mlx4_ib" while I/O was in progress. That command triggers SCSI device removal indirectly. Avoid that this action triggers the following deadlock: ================================= [ INFO: inconsistent lock state ] 4.6.0-rc0-dbg+ #2 Tainted: G O --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. multipathd/484 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&pg->lock)->rlock){+.?...}, at: [<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua] {IN-SOFTIRQ-W} state was registered at: [<ffffffff810a64a9>] __lock_acquire+0x7e9/0x1ad0 [<ffffffff810a7fd0>] lock_acquire+0x60/0x80 [<ffffffff8159910e>] _raw_spin_lock_irqsave+0x3e/0x60 [<ffffffffa04f5131>] alua_rtpg_queue+0x41/0x1d0 [scsi_dh_alua] [<ffffffffa04f5531>] alua_check+0xe1/0x220 [scsi_dh_alua] [<ffffffffa04f5709>] alua_check_sense+0x99/0xb0 [scsi_dh_alua] [<ffffffff813f0d01>] scsi_check_sense+0x71/0x3f0 [<ffffffff813f2f8b>] scsi_decide_disposition+0x18b/0x1d0 [<ffffffff813f6e52>] scsi_softirq_done+0x52/0x140 [<ffffffff812a26f2>] blk_done_softirq+0x52/0x90 [<ffffffff8105bc1f>] __do_softirq+0x10f/0x230 [<ffffffff8105bec8>] irq_exit+0xa8/0xb0 [<ffffffff8101a675>] do_IRQ+0x65/0x110 [<ffffffff8159a2c9>] ret_from_intr+0x0/0x19 [<ffffffff811732f1>] kmem_cache_alloc+0x151/0x190 [<ffffffff8118e534>] create_object+0x34/0x2d0 [<ffffffff8158eaa6>] kmemleak_alloc_percpu+0x56/0xd0 [<ffffffff8113ab0d>] pcpu_alloc+0x38d/0x660 [<ffffffff8113aded>] __alloc_percpu_gfp+0xd/0x10 [<ffffffff812e56a5>] __percpu_counter_init+0x55/0xb0 [<ffffffff812b4989>] blkg_alloc+0x79/0x230 [<ffffffff812b6756>] blkcg_init_queue+0x26/0x1d0 [<ffffffff81297eed>] blk_alloc_queue_node+0x27d/0x2e0 [<ffffffffa017766c>] dm_create+0x20c/0x570 [dm_mod] [<ffffffffa017e356>] dev_create+0x56/0x2c0 [dm_mod] [<ffffffffa017dcae>] ctl_ioctl+0x26e/0x520 [dm_mod] [<ffffffffa017df6e>] dm_ctl_ioctl+0xe/0x20 [dm_mod] [<ffffffff811aa8ee>] do_vfs_ioctl+0x8e/0x660 [<ffffffff811aaefc>] SyS_ioctl+0x3c/0x70 [<ffffffff81599929>] entry_SYSCALL_64_fastpath+0x1c/0xac irq event stamp: 4290931 hardirqs last enabled at (4290931): [ 1662.892772] [<ffffffff81599341>] _raw_spin_unlock_irqrestore+0x31/0x50 hardirqs last disabled at (4290930): [<ffffffff815990e7>] _raw_spin_lock_irqsave+0x17/0x60 softirqs last enabled at (4290774): [<ffffffff8105bcdb>] __do_softirq+0x1cb/0x230 softirqs last disabled at (4289831): [<ffffffff8105bec8>] irq_exit+0xa8/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&pg->lock)->rlock); <Interrupt> lock(&(&pg->lock)->rlock); *** DEADLOCK *** 2 locks held by multipathd/484: #0: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff811d1cc3>] __blkdev_put+0x33/0x360 #1: (sd_ref_mutex){+.+...}, at: [<ffffffff81400afc>] scsi_disk_put+0x1c/0x40 stack backtrace: CPU: 6 PID: 484 Comm: multipathd Tainted: G O 4.6.0-rc0-dbg+ #2 Call Trace: [<ffffffff812bd115>] dump_stack+0x67/0x92 [<ffffffff810a5175>] print_usage_bug+0x215/0x240 [<ffffffff810a56ea>] mark_lock+0x54a/0x610 [<ffffffff810a6505>] __lock_acquire+0x845/0x1ad0 [<ffffffff810a7fd0>] lock_acquire+0x60/0x80 [<ffffffff81598f23>] _raw_spin_lock+0x33/0x50 [<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua] [<ffffffff813ff6f7>] scsi_dh_release_device+0x17/0x50 [<ffffffff813fb8da>] scsi_device_dev_release_usercontext+0x2a/0x120 [<ffffffff810701f0>] execute_in_process_context+0x80/0x90 [<ffffffff813fb8a7>] scsi_device_dev_release+0x17/0x20 [<ffffffff813c8cfd>] device_release+0x2d/0x90 [<ffffffff812bfa8a>] kobject_release+0x7a/0x190 [<ffffffff812bf946>] kobject_put+0x26/0x50 [<ffffffff813c8ee2>] put_device+0x12/0x20 [<ffffffff813edc86>] scsi_device_put+0x26/0x30 [<ffffffff81400b0d>] scsi_disk_put+0x2d/0x40 [<ffffffff81400b68>] sd_release+0x48/0xb0 [<ffffffff811d1f2e>] __blkdev_put+0x29e/0x360 [<ffffffff811d24b9>] blkdev_put+0x49/0x170 [<ffffffff811d2600>] blkdev_close+0x20/0x30 [<ffffffff81198f48>] __fput+0xe8/0x1f0 [<ffffffff81199089>] ____fput+0x9/0x10 [<ffffffff81075d9e>] task_work_run+0x6e/0xa0 [<ffffffff81001119>] exit_to_usermode_loop+0xa9/0xb0 [<ffffffff81001590>] syscall_return_slowpath+0xb0/0xc0 [<ffffffff815999b7>] entry_SYSCALL_64_fastpath+0xaa/0xac Fixes: cb0a168cb6b8 (scsi_dh_alua: update 'access_state' field) Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-30scsi: Declare local symbols staticBart Van Assche1-3/+5
Avoid that building with W=1 causes gcc to report warnings about symbols that have not been declared. Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-29cxlflash: Move to exponential back-off when cmd_room is not availableManoj N. Kumar1-4/+4
While profiling the cxlflash_queuecommand() path under a heavy load it was found that number of retries to find cmd_room was fairly high. There are two problems with the current back-off: a) It starts with a udelay of 0 b) It backs-off linearly Tried several approaches (a higher multiple 10*n, 100*n, as well as n^2, 2^n) and found that the exponential back-off(2^n) approach had the least overall cost. Cost as being defined as overall time spent waiting. The fix is to change the linear back-off to an exponential back-off. This solution also takes care of the problem with the initial delay (starts with 1 usec). Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-29cxlflash: Fix regression issue with re-ordering patchManoj N. Kumar2-42/+93
While running 'sg_reset -H' back to back the following exception was seen: [ 735.115695] Faulting instruction address: 0xd0000000098c0864 cpu 0x0: Vector: 300 (Data Access) at [c000000ffffafa80] pc: d0000000098c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash] lr: c00000000013aed0: handle_irq_event_percpu+0xa0/0x310 sp: c000000ffffafd00 msr: 9000000000009033 dar: 2010000 dsisr: 40000000 current = 0xc000000001510880 paca = 0xc00000000fb80000 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/0 Linux version 4.5.0-491-26f710d+ enter ? for help [c000000ffffafe10] c00000000013aed0 handle_irq_event_percpu+0xa0/0x310 [c000000ffffafed0] c00000000013b1a8 handle_irq_event+0x68/0xc0 [c000000ffffaff00] c0000000001404ec handle_fasteoi_irq+0xec/0x2a0 [c000000ffffaff30] c00000000013a084 generic_handle_irq+0x54/0x80 [c000000ffffaff60] c000000000011130 __do_irq+0x80/0x1d0 [c000000ffffaff90] c000000000024d40 call_do_irq+0x14/0x24 [c000000001573a20] c000000000011318 do_IRQ+0x98/0x140 [c000000001573a70] c000000000002594 hardware_interrupt_common+0x114/0x180 This exception is being hit because the async_err interrupt path performs an MMIO to read the interrupt status register. The MMIO region in this case is not available. Commit 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching master context") re-ordered the sequence in which term_mc() and stop_afu() are called. This introduces a window for interrupts to come in with the problem space area unmapped, that did not exist previously. The fix is to separate the disabling of all AFU interrupts to a distinct function, term_intr() so that it is the first thing that is done in the tear down process. To keep the initialization process symmetric, separate the AFU interrupt setup also to a distinct function: init_intr(). Fixes: 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching master context") Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-26Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds20-154/+1171
Pull more SCSI updates from James Bottomley: "The only new stuff which missed the first pull request is an update to the UFS driver. The rest is an assortment of bug fixes and minor tweaks which appeared recently (some are fixes for recent code and some are stuff spotted recently by the checkers or the new gcc-6 compiler [most of Arnd's stuff])" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits) scsi_common: do not clobber fixed sense information scsi: ufs: select CONFIG_NLS scsi: fc: use get/put_unaligned64 for wwn access fnic: move printk()s outside of the critical code section. qla2xxx: avoid maybe_uninitialized warning megaraid_sas: add missing curly braces in ioctl handler lpfc: fix misleading indentation scsi_transport_sas: add 'scsi_target_id' sysfs attribute scsi_dh_alua: uninitialized variable in alua_check_vpd() scsi: ufs-qcom: add printouts of testbus debug registers scsi: ufs-qcom: enable/disable the device ref clock scsi: ufs-qcom: set PA_Local_TX_LCC_Enable before link startup scsi: ufs: add device quirk delay before putting UFS rails in LPM scsi: ufs: fix leakage during link off state scsi: ufs: tune UniPro parameters to optimize hibern8 exit time scsi: ufs: handle non spec compliant bkops behaviour by device scsi: ufs: add retry for query descriptors scsi: ufs: add error recovery after DL NAC error scsi: ufs: make error handling bit faster scsi: ufs: disable vccq if it's not needed by UFS device ...
2016-03-24Merge branch 'for-next-merge' of ↵Linus Torvalds1-17/+0
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull more SCSI target updates from Nicholas Bellinger: "This series contains cxgb4 driver prerequisites for supporting iscsi segmentation offload (ISO), that will be utilized for a number of future v4.7 developments in iscsi-target for supporting generic hw offloads" * 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: cxgb4: update Kconfig and Makefile cxgb4: add iSCSI DDP page pod manager cxgb4, iw_cxgb4: move delayed ack macro definitions cxgb4: move VLAN_NONE macro definition cxgb4: update struct cxgb4_lld_info definition cxgb4: add definitions for iSCSI target ULD cxgb4, cxgb4i: move struct cpl_rx_data_ddp definition cxgb4, iw_cxgb4, cxgb4i: remove duplicate definitions cxgb4, iw_cxgb4: move definitions to common header file cxgb4: large receive offload support cxgb4: allocate resources for CXGB4_ULD_ISCSIT cxgb4: add new ULD type CXGB4_ULD_ISCSIT
2016-03-23mpt3sas: Don't overreach ioc->reply_post[] during initializationCalvin Owens1-17/+16
In _base_make_ioc_operational(), we walk ioc->reply_queue_list and pull a pointer out of successive elements of ioc->reply_post[] for each entry in that list if RDPQ is enabled. Since the code pulls the pointer for the next iteration at the bottom of the loop, it triggers the a KASAN dump on the final iteration: BUG: KASAN: slab-out-of-bounds in _base_make_ioc_operational+0x47b7/0x47e0 [mpt3sas] at addr ffff880754816ab0 Read of size 8 by task modprobe/305 <snip> Call Trace: [<ffffffff81dfc591>] dump_stack+0x4d/0x6c [<ffffffff814c9689>] print_trailer+0xf9/0x150 [<ffffffff814ceda4>] object_err+0x34/0x40 [<ffffffff814d1231>] kasan_report_error+0x221/0x530 [<ffffffff814d1673>] __asan_report_load8_noabort+0x43/0x50 [<ffffffffa0043637>] _base_make_ioc_operational+0x47b7/0x47e0 [mpt3sas] [<ffffffffa0049a51>] mpt3sas_base_attach+0x1991/0x2120 [mpt3sas] [<ffffffffa0053c93>] _scsih_probe+0xeb3/0x16b0 [mpt3sas] [<ffffffff81ebd047>] local_pci_probe+0xc7/0x170 [<ffffffff81ebf2cf>] pci_device_probe+0x20f/0x290 [<ffffffff820d50cd>] really_probe+0x17d/0x600 [<ffffffff820d56a3>] __driver_attach+0x153/0x190 [<ffffffff820cffac>] bus_for_each_dev+0x11c/0x1a0 [<ffffffff820d421d>] driver_attach+0x3d/0x50 [<ffffffff820d378a>] bus_add_driver+0x44a/0x5f0 [<ffffffff820d666c>] driver_register+0x18c/0x3b0 [<ffffffff81ebcb76>] __pci_register_driver+0x156/0x200 [<ffffffffa00c8135>] _mpt3sas_init+0x135/0x1000 [mpt3sas] [<ffffffff81000423>] do_one_initcall+0x113/0x2b0 [<ffffffff813caa5a>] do_init_module+0x1d0/0x4d8 [<ffffffff81273909>] load_module+0x6729/0x8dc0 [<ffffffff81276123>] SYSC_init_module+0x183/0x1a0 [<ffffffff8127625e>] SyS_init_module+0xe/0x10 [<ffffffff828fe7d7>] entry_SYSCALL_64_fastpath+0x12/0x6a Fix this by pulling the value at the beginning of the loop. Signed-off-by: Calvin Owens <calvinowens@fb.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Jens Axboe <axboe@fb.com> Acked-by: Chaitra Basappa <chaitra.basappa@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-22Merge branch 'for-next' of ↵Linus Torvalds5-64/+130
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - Add target_alloc_session() w/ callback helper for doing se_session allocation + tag + se_node_acl lookup. (HCH + nab) - Tree-wide fabric driver conversion to use target_alloc_session() - Convert sbp-target to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab) - Convert usb-gadget to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab) - Convert xen-scsiback to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab) - Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs - Convert ib_srpt to use percpu_ida tag pre-allocation - Add DebugFS node for qla2xxx target sess list (Quinn) - Rework iser-target connection termination (Jenny + Sagi) - Convert iser-target to new CQ API (HCH) - Add pass-through WRITE_SAME support for IBLOCK (Mike Christie) - Introduce data_bitmap for asynchronous access of data area (Sheng Yang + Andy) - Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani) Also, there is a separate PULL request coming for cxgb4 NIC driver prerequisites for supporting hw iscsi segmentation offload (ISO), that will be the base for a number of v4.7 developments involving iscsi-target hw offloads" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits) target: Fix target_release_cmd_kref shutdown comp leak target: Avoid DataIN transfers for non-GOOD SAM status target/user: Report capability of handling out-of-order completions to userspace target/user: Fix size_t format-spec build warning target/user: Don't free expired command when time out target/user: Introduce data_bitmap, replace data_length/data_head/data_tail target/user: Free data ring in unified function target/user: Use iovec[] to describe continuous area target: Remove enum transport_lunflags_table target/iblock: pass WRITE_SAME to device if possible iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc iser-target: Kill struct isert_rdma_wr iser-target: Convert to new CQ API iser-target: Split and properly type the login buffer iser-target: Remove ISER_RECV_DATA_SEG_LEN iser-target: Remove impossible condition from isert_wait_conn iser-target: Remove redundant wait in release_conn iser-target: Rework connection termination iser-target: Separate flows for np listeners and connections cma events iser-target: Add new state ISER_CONN_BOUND to isert_conn ...
2016-03-22Merge remote-tracking branch 'mkp-scsi/4.6/scsi-fixes' into miscJames Bottomley6-19/+31
2016-03-22cxgb4, cxgb4i: move struct cpl_rx_data_ddp definitionVarun Prakash1-12/+0
move struct cpl_rx_data_ddp definition to common header file t4_msg.h. Signed-off-by: Varun Prakash <varun@chelsio.com> Acked-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-22cxgb4, iw_cxgb4, cxgb4i: remove duplicate definitionsVarun Prakash1-5/+0
move struct ulptx_idata definition to common header file t4_msg.h. Signed-off-by: Varun Prakash <varun@chelsio.com> Acked-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-22aacraid: add missing curly bracesArnd Bergmann1-1/+2
gcc-6 warns about obviously wrong indentation for newly added code in aac_slave_configure(): drivers/scsi/aacraid/linit.c: In function 'aac_slave_configure': drivers/scsi/aacraid/linit.c:458:3: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] sdev->tagged_supported = 1; ^~~~ drivers/scsi/aacraid/linit.c:455:4: note: ...this 'else' clause, but it is not gcc is correct, and evidently this was meant to be within the curly braces that should have been there to start with. This patch adds them, which avoids the warning and makes it clear what was intended here. Nothing changes in behavior because in the 'if' block, the sdev->tagged_supported flag is known to be set already. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 6bf3b630d0a7 ("aacraid: SCSI blk tag support") Reviewed-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-03-21Merge branch 'mm-pkeys-for-linus' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 protection key support from Ingo Molnar: "This tree adds support for a new memory protection hardware feature that is available in upcoming Intel CPUs: 'protection keys' (pkeys). There's a background article at LWN.net: https://lwn.net/Articles/643797/ The gist is that protection keys allow the encoding of user-controllable permission masks in the pte. So instead of having a fixed protection mask in the pte (which needs a system call to change and works on a per page basis), the user can map a (handful of) protection mask variants and can change the masks runtime relatively cheaply, without having to change every single page in the affected virtual memory range. This allows the dynamic switching of the protection bits of large amounts of virtual memory, via user-space instructions. It also allows more precise control of MMU permission bits: for example the executable bit is separate from the read bit (see more about that below). This tree adds the MM infrastructure and low level x86 glue needed for that, plus it adds a high level API to make use of protection keys - if a user-space application calls: mmap(..., PROT_EXEC); or mprotect(ptr, sz, PROT_EXEC); (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice this special case, and will set a special protection key on this memory range. It also sets the appropriate bits in the Protection Keys User Rights (PKRU) register so that the memory becomes unreadable and unwritable. So using protection keys the kernel is able to implement 'true' PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies PROT_READ as well. Unreadable executable mappings have security advantages: they cannot be read via information leaks to figure out ASLR details, nor can they be scanned for ROP gadgets - and they cannot be used by exploits for data purposes either. We know about no user-space code that relies on pure PROT_EXEC mappings today, but binary loaders could start making use of this new feature to map binaries and libraries in a more secure fashion. There is other pending pkeys work that offers more high level system call APIs to manage protection keys - but those are not part of this pull request. Right now there's a Kconfig that controls this feature (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled (like most x86 CPU feature enablement code that has no runtime overhead), but it's not user-configurable at the moment. If there's any serious problem with this then we can make it configurable and/or flip the default" * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) x86/mm/pkeys: Fix mismerge of protection keys CPUID bits mm/pkeys: Fix siginfo ABI breakage caused by new u64 field x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA mm/core, x86/mm/pkeys: Add execute-only protection keys support x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags x86/mm/pkeys: Allow kernel to modify user pkey rights register x86/fpu: Allow setting of XSAVE state x86/mm: Factor out LDT init from context init mm/core, x86/mm/pkeys: Add arch_validate_pkey() mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits() x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU x86/mm/pkeys: Add Kconfig prompt to existing config option x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps x86/mm/pkeys: Dump PKRU with other kernel registers mm/core, x86/mm/pkeys: Differentiate instruction fetches x86/mm/pkeys: Optimize fault handling in access_error() mm/core: Do not enforce PKEY permissions on remote mm access um, pkeys: Add UML arch_*_access_permitted() methods mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys x86/mm/gup: Simplify get_user_pages() PTE bit handling ...
2016-03-20Merge tag 'powerpc-4.6-1' of ↵Linus Torvalds2-17/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This was delayed a day or two by some build-breakage on old toolchains which we've now fixed. There's two PCI commits both acked by Bjorn. There's one commit to mm/hugepage.c which is (co)authored by Kirill. Highlights: - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V - Add POWER9 cputable entry from Michael Neuling - FPU/Altivec/VSX save/restore optimisations from Cyril Bur - Add support for new ftrace ABI on ppc64le from Torsten Duwe Various cleanups & minor fixes from: - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh. General: - atomics: Allow architectures to define their own __atomic_op_* helpers from Boqun Feng - Implement atomic{, 64}_*_return_* variants and acquire/release/ relaxed variants for (cmp)xchg from Boqun Feng - Add powernv_defconfig from Jeremy Kerr - Fix BUG_ON() reporting in real mode from Balbir Singh - Add xmon command to dump OPAL msglog from Andrew Donnellan - Add xmon command to dump process/task similar to ps(1) from Douglas Miller - Clean up memory hotplug failure paths from David Gibson pci/eeh: - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei Yang. - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan. - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang - PCI: Add pcibios_bus_add_device() weak function from Wei Yang - MAINTAINERS: Update EEH details and maintainership from Russell Currey cxl: - Support added to the CXL driver for running on both bare-metal and hypervisor systems, from Christophe Lombard and Frederic Barrat. - Ignore probes for virtual afu pci devices from Vaibhav Jain perf: - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu - hv-24x7: Fix usage with chip events, display change in counter values, display domain indices in sysfs, eliminate domain suffix in event names, from Sukadev Bhattiprolu Freescale: - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup" * tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc: Fix unrecoverable SLB miss during restore_math() powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers powerpc/rcpm: Fix build break when SMP=n powerpc/book3e-64: Use hardcoded mttmr opcode powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible powerpc/T104xRDB: add tdm riser card node to device tree powerpc32: PAGE_EXEC required for inittext powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s) powerpc/86xx: Introduce and use common dtsi powerpc/86xx: Update device tree powerpc/86xx: Move dts files to fsl directory powerpc/86xx: Switch to kconfig fragments approach powerpc/86xx: Update defconfigs powerpc/86xx: Consolidate common platform code powerpc32: Remove one insn in mulhdu powerpc32: small optimisation in flush_icache_range() powerpc: Simplify test in __dma_sync() powerpc32: move xxxxx_dcache_range() functions inline powerpc32: Remove clear_pages() and define clear_page() inline ...