summaryrefslogtreecommitdiff
path: root/drivers/ata/libata-scsi.c
AgeCommit message (Collapse)AuthorFilesLines
2020-09-12libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to SandisksTejun Heo1-1/+7
commit 3b5455636fe26ea21b4189d135a424a6da016418 upstream. All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30ata/libata: Fix usage of page address by page_address in ↵Ye Bin1-3/+6
ata_scsi_mode_select_xlat function [ Upstream commit f650ef61e040bcb175dd8762164b00a5d627f20e ] BUG: KASAN: use-after-free in ata_scsi_mode_select_xlat+0x10bd/0x10f0 drivers/ata/libata-scsi.c:4045 Read of size 1 at addr ffff88803b8cd003 by task syz-executor.6/12621 CPU: 1 PID: 12621 Comm: syz-executor.6 Not tainted 4.19.95 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xac/0xee lib/dump_stack.c:118 print_address_description+0x60/0x223 mm/kasan/report.c:253 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report mm/kasan/report.c:409 [inline] kasan_report.cold+0xae/0x2d8 mm/kasan/report.c:393 ata_scsi_mode_select_xlat+0x10bd/0x10f0 drivers/ata/libata-scsi.c:4045 ata_scsi_translate+0x2da/0x680 drivers/ata/libata-scsi.c:2035 __ata_scsi_queuecmd drivers/ata/libata-scsi.c:4360 [inline] ata_scsi_queuecmd+0x2e4/0x790 drivers/ata/libata-scsi.c:4409 scsi_dispatch_cmd+0x2ee/0x6c0 drivers/scsi/scsi_lib.c:1867 scsi_queue_rq+0xfd7/0x1990 drivers/scsi/scsi_lib.c:2170 blk_mq_dispatch_rq_list+0x1e1/0x19a0 block/blk-mq.c:1186 blk_mq_do_dispatch_sched+0x147/0x3d0 block/blk-mq-sched.c:108 blk_mq_sched_dispatch_requests+0x427/0x680 block/blk-mq-sched.c:204 __blk_mq_run_hw_queue+0xbc/0x200 block/blk-mq.c:1308 __blk_mq_delay_run_hw_queue+0x3c0/0x460 block/blk-mq.c:1376 blk_mq_run_hw_queue+0x152/0x310 block/blk-mq.c:1413 blk_mq_sched_insert_request+0x337/0x6c0 block/blk-mq-sched.c:397 blk_execute_rq_nowait+0x124/0x320 block/blk-exec.c:64 blk_execute_rq+0xc5/0x112 block/blk-exec.c:101 sg_scsi_ioctl+0x3b0/0x6a0 block/scsi_ioctl.c:507 sg_ioctl+0xd37/0x23f0 drivers/scsi/sg.c:1106 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:501 [inline] do_vfs_ioctl+0xae6/0x1030 fs/ioctl.c:688 ksys_ioctl+0x76/0xa0 fs/ioctl.c:705 __do_sys_ioctl fs/ioctl.c:712 [inline] __se_sys_ioctl fs/ioctl.c:710 [inline] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:710 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c479 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb0e9602c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007fb0e96036d4 RCX: 000000000045c479 RDX: 0000000020000040 RSI: 0000000000000001 RDI: 0000000000000003 RBP: 000000000076bfc0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 000000000000046d R14: 00000000004c6e1a R15: 000000000076bfcc Allocated by task 12577: set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc mm/kasan/kasan.c:553 [inline] kasan_kmalloc+0xbf/0xe0 mm/kasan/kasan.c:531 __kmalloc+0xf3/0x1e0 mm/slub.c:3749 kmalloc include/linux/slab.h:520 [inline] load_elf_phdrs+0x118/0x1b0 fs/binfmt_elf.c:441 load_elf_binary+0x2de/0x4610 fs/binfmt_elf.c:737 search_binary_handler fs/exec.c:1654 [inline] search_binary_handler+0x15c/0x4e0 fs/exec.c:1632 exec_binprm fs/exec.c:1696 [inline] __do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8a/0xb0 fs/exec.c:1959 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 12577: set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x129/0x170 mm/kasan/kasan.c:521 slab_free_hook mm/slub.c:1370 [inline] slab_free_freelist_hook mm/slub.c:1397 [inline] slab_free mm/slub.c:2952 [inline] kfree+0x8b/0x1a0 mm/slub.c:3904 load_elf_binary+0x1be7/0x4610 fs/binfmt_elf.c:1118 search_binary_handler fs/exec.c:1654 [inline] search_binary_handler+0x15c/0x4e0 fs/exec.c:1632 exec_binprm fs/exec.c:1696 [inline] __do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820 do_execveat_common fs/exec.c:1866 [inline] do_execve fs/exec.c:1883 [inline] __do_sys_execve fs/exec.c:1964 [inline] __se_sys_execve fs/exec.c:1959 [inline] __x64_sys_execve+0x8a/0xb0 fs/exec.c:1959 do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88803b8ccf00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 259 bytes inside of 512-byte region [ffff88803b8ccf00, ffff88803b8cd100) The buggy address belongs to the page: page:ffffea0000ee3300 count:1 mapcount:0 mapping:ffff88806cc03080 index:0xffff88803b8cc780 compound_mapcount: 0 flags: 0x100000000008100(slab|head) raw: 0100000000008100 ffffea0001104080 0000000200000002 ffff88806cc03080 raw: ffff88803b8cc780 00000000800c000b 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88803b8ccf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88803b8ccf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88803b8cd000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88803b8cd080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88803b8cd100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc You can refer to "https://www.lkml.org/lkml/2019/1/17/474" reproduce this error. The exception code is "bd_len = p[3];", "p" value is ffff88803b8cd000 which belongs to the cache kmalloc-512 of size 512. The "page_address(sg_page(scsi_sglist(scmd)))" maybe from sg_scsi_ioctl function "buffer" which allocated by kzalloc, so "buffer" may not page aligned. This also looks completely buggy on highmem systems and really needs to use a kmap_atomic. --Christoph Hellwig To address above bugs, Paolo Bonzini advise to simpler to just make a char array of size CACHE_MPAGE_LEN+8+8+4-2(or just 64 to make it easy), use sg_copy_to_buffer to copy from the sglist into the buffer, and workthere. Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24libata: Remove extra scsi_host_put() in ata_scsi_add_hosts()John Garry1-6/+3
[ Upstream commit 1d72f7aec3595249dbb83291ccac041a2d676c57 ] If the call to scsi_add_host_with_dma() in ata_scsi_add_hosts() fails, then we may get use-after-free KASAN warns: ================================================================== BUG: KASAN: use-after-free in kobject_put+0x24/0x180 Read of size 1 at addr ffff0026b8c80364 by task swapper/0/1 CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-rc3-00004-g5a71b206ea82-dirty #1765 Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V3.B160.01 02/24/2020 Call trace: dump_backtrace+0x0/0x298 show_stack+0x14/0x20 dump_stack+0x118/0x190 print_address_description.isra.9+0x6c/0x3b8 __kasan_report+0x134/0x23c kasan_report+0xc/0x18 __asan_load1+0x5c/0x68 kobject_put+0x24/0x180 put_device+0x10/0x20 scsi_host_put+0x10/0x18 ata_devres_release+0x74/0xb0 release_nodes+0x2d0/0x470 devres_release_all+0x50/0x78 really_probe+0x2d4/0x560 driver_probe_device+0x7c/0x148 device_driver_attach+0x94/0xa0 __driver_attach+0xa8/0x110 bus_for_each_dev+0xe8/0x158 driver_attach+0x30/0x40 bus_add_driver+0x220/0x2e0 driver_register+0xbc/0x1d0 __pci_register_driver+0xbc/0xd0 ahci_pci_driver_init+0x20/0x28 do_one_initcall+0xf0/0x608 kernel_init_freeable+0x31c/0x384 kernel_init+0x10/0x118 ret_from_fork+0x10/0x18 Allocated by task 5: save_stack+0x28/0xc8 __kasan_kmalloc.isra.8+0xbc/0xd8 kasan_kmalloc+0xc/0x18 __kmalloc+0x1a8/0x280 scsi_host_alloc+0x44/0x678 ata_scsi_add_hosts+0x74/0x268 ata_host_register+0x228/0x488 ahci_host_activate+0x1c4/0x2a8 ahci_init_one+0xd18/0x1298 local_pci_probe+0x74/0xf0 work_for_cpu_fn+0x2c/0x48 process_one_work+0x488/0xc08 worker_thread+0x330/0x5d0 kthread+0x1c8/0x1d0 ret_from_fork+0x10/0x18 Freed by task 5: save_stack+0x28/0xc8 __kasan_slab_free+0x118/0x180 kasan_slab_free+0x10/0x18 slab_free_freelist_hook+0xa4/0x1a0 kfree+0xd4/0x3a0 scsi_host_dev_release+0x100/0x148 device_release+0x7c/0xe0 kobject_put+0xb0/0x180 put_device+0x10/0x20 scsi_host_put+0x10/0x18 ata_scsi_add_hosts+0x210/0x268 ata_host_register+0x228/0x488 ahci_host_activate+0x1c4/0x2a8 ahci_init_one+0xd18/0x1298 local_pci_probe+0x74/0xf0 work_for_cpu_fn+0x2c/0x48 process_one_work+0x488/0xc08 worker_thread+0x330/0x5d0 kthread+0x1c8/0x1d0 ret_from_fork+0x10/0x18 There is also refcount issue, as well: WARNING: CPU: 1 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xf8/0x170 The issue is that we make an erroneous extra call to scsi_host_put() for that host: So in ahci_init_one()->ata_host_alloc_pinfo()->ata_host_alloc(), we setup a device release method - ata_devres_release() - which intends to release the SCSI hosts: static void ata_devres_release(struct device *gendev, void *res) { ... for (i = 0; i < host->n_ports; i++) { struct ata_port *ap = host->ports[i]; if (!ap) continue; if (ap->scsi_host) scsi_host_put(ap->scsi_host); } ... } However in the ata_scsi_add_hosts() error path, we also call scsi_host_put() for the SCSI hosts. Fix by removing the the scsi_host_put() calls in ata_scsi_add_hosts() and leave this to ata_devres_release(). Fixes: f31871951b38 ("libata: separate out ata_host_alloc() and ata_host_register()") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-25libata: have ata_scsi_rw_xlat() fail invalid passthrough requestsJens Axboe1-0/+21
commit 2d7271501720038381d45fb3dcbe4831228fc8cc upstream. For passthrough requests, libata-scsi takes what the user passes in as gospel. This can be problematic if the user fills in the CDB incorrectly. One example of that is in request sizes. For read/write commands, the CDB contains fields describing the transfer length of the request. These should match with the SG_IO header fields, but libata-scsi currently does no validation of that. Check that the number of blocks in the CDB for passthrough requests matches what was mapped into the request. If the CDB asks for more data then the validated SG_IO header fields, error it. Reported-by: Krishna Ram Prakash R <krp@gtux.in> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17ata: Fix ZBC_OUT all bit handlingDamien Le Moal1-3/+8
commit 6edf1d4cb0acde3a0a5dac849f33031bd7abb7b1 upstream. If the ALL bit is set in the ZBC_OUT command, the command zone ID field (block) should be ignored. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-17ata: Fix ZBC_OUT command block checkDamien Le Moal1-6/+7
commit b320a0a9f23c98f21631eb27bcbbca91c79b1c6e upstream. The block (LBA) specified must not exceed the last addressable LBA, which is dev->nr_sectors - 1. So fix the correct check is "if (block >= dev->n_sectors)" and not "if (block > dev->n_sectords)". Additionally, the asc/ascq to return for an LBA that is not a zone start LBA should be ILLEGAL REQUEST, regardless if the bad LBA is out of range. Reported-by: David Butterfield <david.butterfield@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30libata: Fix compile warning with ATA_DEBUG enabledDong Bo1-1/+1
[ Upstream commit 0d3e45bc6507bd1f8728bf586ebd16c2d9e40613 ] This fixs the following comile warnings with ATA_DEBUG enabled, which detected by Linaro GCC 5.2-2015.11: drivers/ata/libata-scsi.c: In function 'ata_scsi_dump_cdb': ./include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 6 has type 'u64 {aka long long unsigned int}' [-Wformat=] tj: Patch hand-applied and description trimmed. Signed-off-by: Dong Bo <dongbo4@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28libata: don't try to pass through NCQ commands to non-NCQ devicesEric Biggers1-0/+6
commit 2c1ec6fda2d07044cda922ee25337cf5d4b429b3 upstream. syzkaller hit a WARN() in ata_bmdma_qc_issue() when writing to /dev/sg0. This happened because it issued an ATA pass-through command (ATA_16) where the protocol field indicated that NCQ should be used -- but the device did not support NCQ. We could just remove the WARN() from libata-sff.c, but the real problem seems to be that the SCSI -> ATA translation code passes through NCQ commands without verifying that the device actually supports NCQ. Fix this by adding the appropriate check to ata_scsi_pass_thru(). Here's reproducer that works in QEMU when /dev/sg0 refers to a disk of the default type ("82371SB PIIX3 IDE"): #include <fcntl.h> #include <unistd.h> int main() { char buf[53] = { 0 }; buf[36] = 0x85; /* ATA_16 */ buf[37] = (12 << 1); /* FPDMA */ buf[38] = 0x1; /* Has data */ buf[51] = 0xC8; /* ATA_CMD_READ */ write(open("/dev/sg0", O_RDWR), buf, sizeof(buf)); } Fixes: ee7fb331c3ac ("libata: add support for NCQ commands for SG interface") Reported-by: syzbot+2f69ca28df61bdfc77cd36af2e789850355a221e@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-28libata: fix length validation of ATAPI-relayed SCSI commandsEric Biggers1-1/+3
commit 058f58e235cbe03e923b30ea7c49995a46a8725f upstream. syzkaller reported a crash in ata_bmdma_fill_sg() when writing to /dev/sg1. The immediate cause was that the ATA command's scatterlist was not DMA-mapped, which causes 'pi - 1' to underflow, resulting in a write to 'qc->ap->bmdma_prd[0xffffffff]'. Strangely though, the flag ATA_QCFLAG_DMAMAP was set in qc->flags. The root cause is that when __ata_scsi_queuecmd() is preparing to relay a SCSI command to an ATAPI device, it doesn't correctly validate the CDB length before copying it into the 16-byte buffer 'cdb' in 'struct ata_queued_cmd'. Namely, it validates the fixed CDB length expected based on the SCSI opcode but not the actual CDB length, which can be larger due to the use of the SG_NEXT_CMD_LEN ioctl. Since 'flags' is the next member in ata_queued_cmd, a buffer overflow corrupts it. Fix it by requiring that the actual CDB length be <= 16 (ATAPI_CDB_LEN). [Really it seems the length should be required to be <= dev->cdb_len, but the current behavior seems to have been intentionally introduced by commit 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs") to work around a userspace bug in mplayer. Probably the workaround is no longer needed (mplayer was fixed in 2007), but continuing to allow lengths to up 16 appears harmless for now.] Here's a reproducer that works in QEMU when /dev/sg1 refers to the CD-ROM drive that qemu-system-x86_64 creates by default: #include <fcntl.h> #include <sys/ioctl.h> #include <unistd.h> #define SG_NEXT_CMD_LEN 0x2283 int main() { char buf[53] = { [36] = 0x7e, [52] = 0x02 }; int fd = open("/dev/sg1", O_RDWR); ioctl(fd, SG_NEXT_CMD_LEN, &(int){ 17 }); write(fd, buf, sizeof(buf)); } The crash was: BUG: unable to handle kernel paging request at ffff8cb97db37ffc IP: ata_bmdma_fill_sg drivers/ata/libata-sff.c:2623 [inline] IP: ata_bmdma_qc_prep+0xa4/0xc0 drivers/ata/libata-sff.c:2727 PGD fb6c067 P4D fb6c067 PUD 0 Oops: 0002 [#1] SMP CPU: 1 PID: 150 Comm: syz_ata_bmdma_q Not tainted 4.15.0-next-20180202 #99 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 [...] Call Trace: ata_qc_issue+0x100/0x1d0 drivers/ata/libata-core.c:5421 ata_scsi_translate+0xc9/0x1a0 drivers/ata/libata-scsi.c:2024 __ata_scsi_queuecmd drivers/ata/libata-scsi.c:4326 [inline] ata_scsi_queuecmd+0x8c/0x210 drivers/ata/libata-scsi.c:4375 scsi_dispatch_cmd+0xa2/0xe0 drivers/scsi/scsi_lib.c:1727 scsi_request_fn+0x24c/0x530 drivers/scsi/scsi_lib.c:1865 __blk_run_queue_uncond block/blk-core.c:412 [inline] __blk_run_queue+0x3a/0x60 block/blk-core.c:432 blk_execute_rq_nowait+0x93/0xc0 block/blk-exec.c:78 sg_common_write.isra.7+0x272/0x5a0 drivers/scsi/sg.c:806 sg_write+0x1ef/0x340 drivers/scsi/sg.c:677 __vfs_write+0x31/0x160 fs/read_write.c:480 vfs_write+0xa7/0x160 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0x4d/0xc0 fs/read_write.c:581 do_syscall_64+0x5e/0x110 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x21/0x86 Fixes: 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs") Reported-by: syzbot+1ff6f9fcc3c35f1c72a95e26528c8e7e3276e4da@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> # v2.6.24+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-11libata: array underflow in ata_find_dev()Dan Carpenter1-2/+4
commit 59a5e266c3f5c1567508888dd61a45b86daed0fa upstream. My static checker complains that "devno" can be negative, meaning that we read before the start of the loop. I've looked at the code, and I think the warning is right. This come from /proc so it's root only or it would be quite a quite a serious bug. The call tree looks like this: proc_scsi_write() <- gets id and channel from simple_strtoul() -> scsi_add_single_device() <- calls shost->transportt->user_scan() -> ata_scsi_user_scan() -> ata_find_dev() Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-12-08libata-scsi: disable SCT Write Same for the momentNicolai Stange1-0/+1
SCT Write Same support had been introduced with commit 7b2030942859 ("libata: Add support for SCT Write Same") Some problems, namely excessive userspace segfaults, had been reported at http://lkml.kernel.org/r/20160908192736.GA4356@gmail.com This lead to commit 0ce1b18c42a5 ("libata: Some drives failing on SCT Write Same") which strived to disable SCT Write Same on !ZAC devices. Due to the way this was done and to the logic in sd_config_write_same(), this didn't work for those devices that have ->max_ws_blocks > SD_MAX_WS10_BLOCKS: for these, ->no_write_same and ->max_write_same_sectors would still be non-zero, but ->ws10 == ->ws16 == 0. This would cause sd_setup_write_same_cmnd() to demultiplex REQ_OP_WRITE_SAME requests to WRITE_SAME, and these in turn aren't supported by libata-scsi: EXT4-fs (dm-1): Delayed block allocation failed for inode 2625094 at logical offset 2032 with max blocks 2 with error 121 EXT4-fs (dm-1): This should not happen!! Data will be lost 121 == EREMOTEIO is what scsi_io_completion() asserts in case of invalid opcodes. Back to the original problem of userspace segfaults: this can be tracked down to ata_format_sct_write_same() overwriting the input page. Sometimes, this page is ZERO_PAGE(0) which ceases to be filled with zeros from that point on. Since ZERO_PAGE(0) is used for userspace .bss mappings, code of the following is doomed: static char *a = NULL; /* .bss */ ... if (a) *a = 'a'; This problem is not solved by disabling SCT Write Same for !ZAC devices only. It can certainly be fixed, but the final release is quite close -- so disable SCT Write Same for all ATA devices rather than introducing some SCT key buffer allocation schemes at this point. Fixes: 7b2030942859 ("libata: Add support for SCT Write Same") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-11-01libata-scsi: Fixup ata_gen_passthru_sense()Hannes Reinecke1-1/+1
There's a typo in ata_gen_passthru_sense(), where the first byte would be overwritten incorrectly later on. Reported-by: Charles Machalow <csm10495@gmail.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Tejun Heo <tj@kernel.org>
2016-09-09libata: Some drives failing on SCT Write SameShaun Tancheff1-3/+3
Restrict support SCT Write Same to devices which also support ZAC where support is required. Reported-by: Mike Krinkin <krinkin.m.u@gmail.com> Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-08-25libata: SCT Write Same handle ATA_DFLAG_PIOShaun Tancheff1-0/+2
Use non DMA write log when ATA_DFLAG_PIO is set. Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Tejun Heo <tj@kernel.org>
2016-08-25libata: SCT Write Same / DSM TrimShaun Tancheff1-28/+57
Correct handling of devices with sector_size other that 512 bytes. In the case of a 4Kn device sector_size it is possible to describe a much larger DSM Trim than the current fixed default of 512 bytes. This patch assumes the minimum descriptor is sector_size and fills out the descriptor accordingly. The ACS-2 specification is quite clear that the DSM command payload is sized as number of 512 byte transfers so a 4Kn device will operate correctly without this patch. Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Acked-by: Tejun Heo <tj@kernel.org>
2016-08-25libata: Add support for SCT Write SameShaun Tancheff1-29/+170
SATA drives may support write same via SCT. This is useful for setting the drive contents to a specific pattern (0's). Translate a SCSI WRITE SAME 16 command to be either a DSM TRIM command or an SCT Write Same command. Based on the UNMAP flag: - When set translate to DSM TRIM - When not set translate to SCT Write Same Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Tejun Heo <tj@kernel.org>
2016-08-25libata: Safely overwrite attached page in WRITE SAME xlatShaun Tancheff1-5/+51
Safely overwriting the attached page to ATA format from the SCSI formatted variant. Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Tejun Heo <tj@kernel.org>
2016-08-10libata-scsi: fix MODE SELECT translation for Control mode pageTom Yan1-2/+2
scsi_done() was called repeatedly and apparently because of that, the kernel would call trace when we touch the Control mode page: Call Trace: [<ffffffff812ea0d2>] dump_stack+0x63/0x81 [<ffffffff81079cfb>] __warn+0xcb/0xf0 [<ffffffff81079e2d>] warn_slowpath_null+0x1d/0x20 [<ffffffffa00f51b0>] ata_eh_finish+0xe0/0xf0 [libata] [<ffffffffa00fb830>] sata_pmp_error_handler+0x640/0xa50 [libata] [<ffffffffa00470ed>] ahci_error_handler+0x1d/0x70 [libahci] [<ffffffffa00f55f0>] ata_scsi_port_error_handler+0x430/0x770 [libata] [<ffffffffa00eff8d>] ? ata_scsi_cmd_error_handler+0xdd/0x160 [libata] [<ffffffffa00f59d7>] ata_scsi_error+0xa7/0xf0 [libata] [<ffffffffa00913ba>] scsi_error_handler+0xaa/0x560 [scsi_mod] [<ffffffffa0091310>] ? scsi_eh_get_sense+0x180/0x180 [scsi_mod] [<ffffffff81098eb8>] kthread+0xd8/0xf0 [<ffffffff815d913f>] ret_from_fork+0x1f/0x40 [<ffffffff81098de0>] ? kthread_worker_fn+0x170/0x170 ---[ end trace 8b7501047e928a17 ]--- Removed the unnecessary code and let ata_scsi_translate() do the job. Also, since ata_mselect_control() has no ATA command to send to the device, ata_scsi_mode_select_xlat() should return 1 for it, so that ata_scsi_translate() will finish early to avoid ata_qc_issue(). Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-08-09libata-scsi: use u8 array to store mode page copyTom Yan1-2/+2
ata_mselect_*() would initialize a char array for storing a copy of the current mode page. However, char could be signed char. In that case, bytes larger than 127 would be converted to negative number. For example, 0xff from def_control_mpage[] would become -1. This prevented ata_mselect_control() from working at all, since when it did the read-only bits check, there would always be a mismatch. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-27Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull core block updates from Jens Axboe: - the big change is the cleanup from Mike Christie, cleaning up our uses of command types and modified flags. This is what will throw some merge conflicts - regression fix for the above for btrfs, from Vincent - following up to the above, better packing of struct request from Christoph - a 2038 fix for blktrace from Arnd - a few trivial/spelling fixes from Bart Van Assche - a front merge check fix from Damien, which could cause issues on SMR drives - Atari partition fix from Gabriel - convert cfq to highres timers, since jiffies isn't granular enough for some devices these days. From Jan and Jeff - CFQ priority boost fix idle classes, from me - cleanup series from Ming, improving our bio/bvec iteration - a direct issue fix for blk-mq from Omar - fix for plug merging not involving the IO scheduler, like we do for other types of merges. From Tahsin - expose DAX type internally and through sysfs. From Toshi and Yigal * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits) block: Fix front merge check block: do not merge requests without consulting with io scheduler block: Fix spelling in a source code comment block: expose QUEUE_FLAG_DAX in sysfs block: add QUEUE_FLAG_DAX for devices to advertise their DAX support Btrfs: fix comparison in __btrfs_map_block() block: atari: Return early for unsupported sector size Doc: block: Fix a typo in queue-sysfs.txt cfq-iosched: Charge at least 1 jiffie instead of 1 ns cfq-iosched: Fix regression in bonnie++ rewrite performance cfq-iosched: Convert slice_resid from u64 to s64 block: Convert fifo_time from ulong to u64 blktrace: avoid using timespec block/blk-cgroup.c: Declare local symbols static block/bio-integrity.c: Add #include "blk.h" block/partition-generic.c: Remove a set-but-not-used variable block: bio: kill BIO_MAX_SIZE cfq-iosched: temporarily boost queue priority for idle classes block: drbd: avoid to use BIO_MAX_SIZE block: bio: remove BIO_MAX_SECTORS ...
2016-07-20libata-scsi: better style in ata_msense_*()Tom Yan1-6/+13
`changeable` is the "version" of mode page requested by the user. It will be less confusing/misleading if we do not check it "together" with the setting bits of the drive. Not to mention that we currently have ata_mselect_*() implemented in a way that each of them will serve exclusively a particular bit on each page. The old style will hence make the condition look even more unnecessarily arcane if the ata_msense_*() is reflecting more than one bit. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-15libata-scsi: minor cleanup for ata_scsi_zbc_out_xlatDamien Le Moal1-4/+4
The reset_all variable name is misleading as this bit is also applicable to open, close, and finish actions. So rename that variable to "all" and remove the unnecessary mask operation that's already done earlier. Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com> [hch: split from the previous patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-15libata-scsi: Fix ZBC management out command translationDamien Le Moal1-2/+2
The subcommand for NCQ NON-DATA must be specified in the feature (low byte), not the high-order count byte. Also make sure to properly cast the all bit to a u16 before shiting it by 8 to avoid undefined behavior. Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com> [hch: split the original patch into two, updated changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-15libata-scsi: Fix translation of REPORT ZONES commandDamien Le Moal1-2/+2
Include reporting options when translating REPORT ZONES commmand to ATA NCQ, and make sure we only look at the actually specified bits in the CDB for the options. Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com> [hch: update patch description] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-15ata: Handle ATA NCQ NO-DATA commands correctlyHannes Reinecke1-1/+4
Add a new taskfile protocol ATA_PROT_NCQ_NODATA to handle ATA NCQ NO-DATA commands correctly. And fixup ata_scsi_zbc_out_xlat() to use it. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-14libata: use ata_is_ncq() accessorsHannes Reinecke1-2/+2
Use accessor functions instead of the raw value. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: avoid repeated calculation of number of TRIM rangesTom Yan1-3/+3
Currently libata statically allows only 1-block (512-byte) payload for each TRIM command. Each payload can carry 64 TRIM ranges since each range requires 8 bytes. It is silly to keep doing the calculation (512 / 8) in different places. Hence, define the new ATA_MAX_TRIM_RNUM for the result. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: reject WRITE SAME (16) with n_block that exceeds limitTom Yan1-1/+7
Currently if a WRITE SAME (16) command is issued to the SATL with "number of blocks" that is larger than the "Maximum write same length" (which is the maximum number of blocks per TRIM command allowed in libata, currently 65535 * 512 / 8 blocks), the SATL will accept the command and translate it to a TRIM command with the upper limit. However, according to SBC (as of sbc4r11.pdf), the "device server" should terminate the command with "Invalid field in CDB" in that case. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: rename ata_msense_ctl_mode() to ata_msense_control()Tom Yan1-5/+5
To make it consistent with the recently added ata_mselect_control(). We probably shouldn't have the word "mode" in its name anyway, since that's not the case for other ata_msense_*() / ata_mselect_*() either. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: fix D_SENSE bit relection in control mode pageTom Yan1-1/+1
The bit should always be set to 1 when the requested version of page is "changeable" because we've made it so in ata_mselect_control(). Also, it should always be set to 1 if ATA_DFLAG_D_SENSE is set (when the requested version of page is "current" or "default"). Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: correct SPC version descriptorTom Yan1-2/+2
The comment suggests we should be having an SPC-3 version descriptor but the 0260h is the code for "SPC-2 (no version claimed)". Correct it to 0300h so that it has the "SPC-3 (no version claimed)" descriptor. Note that we are claiming SPC-3 version compatibility in the VERSION field of the standard INQUIRY data. Therefore, I assume the typo was on the code but not on the comment. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: set CmdQue bit in standard INQUIRY data to 1Tom Yan1-1/+4
Avoid performance bottleneck when being SCSI pass-through'd to virtual machines with other OSes (e.g. Windows) via virtio-scsi and scsi-block in qemu. Ref.: https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/63 Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-12libata-scsi: set correct VERSION field for ZAC devicesTom Yan1-1/+3
Commit 856c46639309 ("libata: support device-managed ZAC devices") had the line that "bumps" the VERSION field in standard INQUIRY data removed. Add it back and claim SPC-5 version compatibility, which matches with the current version descriptor "SPC-5 (no version claimed)" that is used for ZAC devices. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-06libata-scsi: correct cbd to CDB in commentTom Yan1-1/+1
It's Command Descriptor Block. Also capitalized it. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-07-06libata-scsi: do not respond with "invalid field" for FORMAT UNITTom Yan1-5/+0
It does not make sense and is confusing to respond with "Invalid field in CDB" while we have no support at all implemented for FORMAT UNIT. It is decent to let it go to the default, which will respond with "Invalid command operation code" instead. Signed-off-by: Tom Yan <tom.ty89@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-06-07block, drivers, cgroup: use op_is_write helper instead of checking for REQ_WRITEMike Christie1-1/+1
We currently set REQ_WRITE/WRITE for all non READ IOs like discard, flush, writesame, etc. In the next patches where we no longer set up the op as a bitmap, we will not be able to detect a operation direction like writesame by testing if REQ_WRITE is set. This patch converts the drivers and cgroup to use the op_is_write helper. This should just cover the simple cases. I did dm, md and bcache in their own patches because they were more involved. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-24Merge branch 'for-4.7-zac' of ↵Linus Torvalds1-136/+616
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata ZAC support from Tejun Heo: "This contains Zone ATA Command support for Shingled Magnetic Recording devices. In addition to sending the new commands down to the device, as ZAC commands depend on getting a lot of responses from the device, piping up responses is beefed up too. However, it doesn't involve changes to libata core mechanism or its interaction with upper layers, so I'm not expecting too many fallouts. Kudos to Hannes for driving SMR support" * 'for-4.7-zac' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (28 commits) libata: support host-aware and host-managed ZAC devices libata: support device-managed ZAC devices libata: NCQ encapsulation for ZAC MANAGEMENT OUT libata: Implement ZBC OUT translation libata: implement ZBC IN translation libata: fixup ZAC device disabling libata-scsi: Generate sense code for disabled devices libata-trace: decode subcommands libata: Check log page directory before accessing pages libata: Add command definitions for NCQ Encapsulation for READ LOG DMA EXT libata: Separate out ata_dev_config_ncq_send_recv() libata/libsas: Define ATA_CMD_NCQ_NON_DATA libsas: enable FPDMA SEND/RECEIVE libata: do not attempt to retrieve sense code twice libata-scsi: Set information sense field for invalid parameter libata-scsi: set bit pointer for sense code information libata-scsi: Set field pointer in sense code scsi: add scsi_set_sense_field_pointer() libata: Implement control mode page to select sense format libata-scsi: generate correct ATA pass-through sense ...
2016-05-09libata: support host-aware and host-managed ZAC devicesHannes Reinecke1-2/+36
Byte 69 bits 0:1 in the IDENTIFY DEVICE data indicate a host-aware ZAC device. Host-managed ZAC devices have their own individual signature, and to not set the bits in the IDENTIFY DEVICE data. And whenever we detect a ZAC-compatible device we should be displaying the zoned block characteristics VPD page. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata: support device-managed ZAC devicesHannes Reinecke1-9/+10
Device-managed ZAC devices just set the zoned capabilities field in INQUIRY byte 69 (cf ACS-4). This corresponds to the 'zoned' field in the block device characteristics VPD page. As this is only defined in SPC-5/SBC-4 we also need to update the supported SCSI version descriptor. Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com> Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata: NCQ encapsulation for ZAC MANAGEMENT OUTHannes Reinecke1-5/+13
Add NCQ encapsulation for ZAC MANAGEMENT OUT and evaluate NCQ Non-Data log pages to figure out if NCQ encapsulation is supported. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata: Implement ZBC OUT translationHannes Reinecke1-0/+67
ZAC drives implement a 'ZAC Management Out' command template, which maps onto the ZBC OUT command. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata: implement ZBC IN translationHannes Reinecke1-0/+157
ZAC drives implement a 'ZAC Management In' command template, which maps onto the ZBC IN command. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata-scsi: Generate sense code for disabled devicesHannes Reinecke1-0/+6
If a device is disabled after error recovery it doesn't make any sense to generate an ATA sense, but we should rather return a generic sense code indicating the device is gone. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-05-09libata-scsi: use %*ph to dump small buffersAndy Shevchenko1-5/+2
Replace custom approach by %*ph specifier to dump small buffers in hex format. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-13treewide: Fix typos in libata.xmlMasanari Iida1-1/+1
This patch fix spelling typos found in Documentation/Docbook/libata.xml. It is because the file was generated from comments in source, I had to fix comments in libata-core.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-04libata-scsi: Set information sense field for invalid parameterHannes Reinecke1-20/+61
Whenever the sense key is set to 'invalid parameter' we should be filling out the sense-key specific information field in the sense buffer. tj: Added description of @fp for ata_mselect_*(). Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-04libata-scsi: set bit pointer for sense code informationHannes Reinecke1-11/+19
When generating a sense code of 'Invalid field in CDB' we should be setting the bit pointer where appropriate. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-04libata-scsi: Set field pointer in sense codeHannes Reinecke1-47/+108
If the sense code is 'Invalid field in CDB' we should be setting the field pointer to the offending byte. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-04libata: Implement control mode page to select sense formatHannes Reinecke1-30/+86
Implement MODE SELECT for the control mode page to allow the OS to switch to descriptor sense. tj: Dropped s/sb/cmd->sense_buffer/ in ata_gen_ata_sense(). Added @dev description to ata_msense_ctl_mode(). Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-04-04libata-scsi: generate correct ATA pass-through senseHannes Reinecke1-34/+59
Generate ATA pass-through sense for both fixed and descriptor format sense. Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>