Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 3b5455636fe26ea21b4189d135a424a6da016418 upstream.
All three generations of Sandisk SSDs lock up hard intermittently.
Experiments showed that disabling NCQ lowered the failure rate significantly
and the kernel has been disabling NCQ for some models of SD7's and 8's,
which is obviously undesirable.
Karthik worked with Sandisk to root cause the hard lockups to trim commands
larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which
limits max trim size to 128M and applies it to all three generations of
Sandisk SSDs.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Karthik Shivaram <karthikgs@fb.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ata_scsi_mode_select_xlat function
[ Upstream commit f650ef61e040bcb175dd8762164b00a5d627f20e ]
BUG: KASAN: use-after-free in ata_scsi_mode_select_xlat+0x10bd/0x10f0
drivers/ata/libata-scsi.c:4045
Read of size 1 at addr ffff88803b8cd003 by task syz-executor.6/12621
CPU: 1 PID: 12621 Comm: syz-executor.6 Not tainted 4.19.95 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.10.2-1ubuntu1 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xac/0xee lib/dump_stack.c:118
print_address_description+0x60/0x223 mm/kasan/report.c:253
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report mm/kasan/report.c:409 [inline]
kasan_report.cold+0xae/0x2d8 mm/kasan/report.c:393
ata_scsi_mode_select_xlat+0x10bd/0x10f0 drivers/ata/libata-scsi.c:4045
ata_scsi_translate+0x2da/0x680 drivers/ata/libata-scsi.c:2035
__ata_scsi_queuecmd drivers/ata/libata-scsi.c:4360 [inline]
ata_scsi_queuecmd+0x2e4/0x790 drivers/ata/libata-scsi.c:4409
scsi_dispatch_cmd+0x2ee/0x6c0 drivers/scsi/scsi_lib.c:1867
scsi_queue_rq+0xfd7/0x1990 drivers/scsi/scsi_lib.c:2170
blk_mq_dispatch_rq_list+0x1e1/0x19a0 block/blk-mq.c:1186
blk_mq_do_dispatch_sched+0x147/0x3d0 block/blk-mq-sched.c:108
blk_mq_sched_dispatch_requests+0x427/0x680 block/blk-mq-sched.c:204
__blk_mq_run_hw_queue+0xbc/0x200 block/blk-mq.c:1308
__blk_mq_delay_run_hw_queue+0x3c0/0x460 block/blk-mq.c:1376
blk_mq_run_hw_queue+0x152/0x310 block/blk-mq.c:1413
blk_mq_sched_insert_request+0x337/0x6c0 block/blk-mq-sched.c:397
blk_execute_rq_nowait+0x124/0x320 block/blk-exec.c:64
blk_execute_rq+0xc5/0x112 block/blk-exec.c:101
sg_scsi_ioctl+0x3b0/0x6a0 block/scsi_ioctl.c:507
sg_ioctl+0xd37/0x23f0 drivers/scsi/sg.c:1106
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:501 [inline]
do_vfs_ioctl+0xae6/0x1030 fs/ioctl.c:688
ksys_ioctl+0x76/0xa0 fs/ioctl.c:705
__do_sys_ioctl fs/ioctl.c:712 [inline]
__se_sys_ioctl fs/ioctl.c:710 [inline]
__x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:710
do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45c479
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89
f7 48
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f
83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb0e9602c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fb0e96036d4 RCX: 000000000045c479
RDX: 0000000020000040 RSI: 0000000000000001 RDI: 0000000000000003
RBP: 000000000076bfc0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 000000000000046d R14: 00000000004c6e1a R15: 000000000076bfcc
Allocated by task 12577:
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc mm/kasan/kasan.c:553 [inline]
kasan_kmalloc+0xbf/0xe0 mm/kasan/kasan.c:531
__kmalloc+0xf3/0x1e0 mm/slub.c:3749
kmalloc include/linux/slab.h:520 [inline]
load_elf_phdrs+0x118/0x1b0 fs/binfmt_elf.c:441
load_elf_binary+0x2de/0x4610 fs/binfmt_elf.c:737
search_binary_handler fs/exec.c:1654 [inline]
search_binary_handler+0x15c/0x4e0 fs/exec.c:1632
exec_binprm fs/exec.c:1696 [inline]
__do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820
do_execveat_common fs/exec.c:1866 [inline]
do_execve fs/exec.c:1883 [inline]
__do_sys_execve fs/exec.c:1964 [inline]
__se_sys_execve fs/exec.c:1959 [inline]
__x64_sys_execve+0x8a/0xb0 fs/exec.c:1959
do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 12577:
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x129/0x170 mm/kasan/kasan.c:521
slab_free_hook mm/slub.c:1370 [inline]
slab_free_freelist_hook mm/slub.c:1397 [inline]
slab_free mm/slub.c:2952 [inline]
kfree+0x8b/0x1a0 mm/slub.c:3904
load_elf_binary+0x1be7/0x4610 fs/binfmt_elf.c:1118
search_binary_handler fs/exec.c:1654 [inline]
search_binary_handler+0x15c/0x4e0 fs/exec.c:1632
exec_binprm fs/exec.c:1696 [inline]
__do_execve_file.isra.0+0xf52/0x1a90 fs/exec.c:1820
do_execveat_common fs/exec.c:1866 [inline]
do_execve fs/exec.c:1883 [inline]
__do_sys_execve fs/exec.c:1964 [inline]
__se_sys_execve fs/exec.c:1959 [inline]
__x64_sys_execve+0x8a/0xb0 fs/exec.c:1959
do_syscall_64+0xa0/0x2e0 arch/x86/entry/common.c:293
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff88803b8ccf00
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 259 bytes inside of
512-byte region [ffff88803b8ccf00, ffff88803b8cd100)
The buggy address belongs to the page:
page:ffffea0000ee3300 count:1 mapcount:0 mapping:ffff88806cc03080
index:0xffff88803b8cc780 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 ffffea0001104080 0000000200000002 ffff88806cc03080
raw: ffff88803b8cc780 00000000800c000b 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88803b8ccf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88803b8ccf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88803b8cd000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88803b8cd080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88803b8cd100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
You can refer to "https://www.lkml.org/lkml/2019/1/17/474" reproduce
this error.
The exception code is "bd_len = p[3];", "p" value is ffff88803b8cd000
which belongs to the cache kmalloc-512 of size 512. The "page_address(sg_page(scsi_sglist(scmd)))"
maybe from sg_scsi_ioctl function "buffer" which allocated by kzalloc, so "buffer"
may not page aligned.
This also looks completely buggy on highmem systems and really needs to use a
kmap_atomic. --Christoph Hellwig
To address above bugs, Paolo Bonzini advise to simpler to just make a char array
of size CACHE_MPAGE_LEN+8+8+4-2(or just 64 to make it easy), use sg_copy_to_buffer
to copy from the sglist into the buffer, and workthere.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1d72f7aec3595249dbb83291ccac041a2d676c57 ]
If the call to scsi_add_host_with_dma() in ata_scsi_add_hosts() fails,
then we may get use-after-free KASAN warns:
==================================================================
BUG: KASAN: use-after-free in kobject_put+0x24/0x180
Read of size 1 at addr ffff0026b8c80364 by task swapper/0/1
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 5.6.0-rc3-00004-g5a71b206ea82-dirty #1765
Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD, BIOS 2280-V2 CS V3.B160.01 02/24/2020
Call trace:
dump_backtrace+0x0/0x298
show_stack+0x14/0x20
dump_stack+0x118/0x190
print_address_description.isra.9+0x6c/0x3b8
__kasan_report+0x134/0x23c
kasan_report+0xc/0x18
__asan_load1+0x5c/0x68
kobject_put+0x24/0x180
put_device+0x10/0x20
scsi_host_put+0x10/0x18
ata_devres_release+0x74/0xb0
release_nodes+0x2d0/0x470
devres_release_all+0x50/0x78
really_probe+0x2d4/0x560
driver_probe_device+0x7c/0x148
device_driver_attach+0x94/0xa0
__driver_attach+0xa8/0x110
bus_for_each_dev+0xe8/0x158
driver_attach+0x30/0x40
bus_add_driver+0x220/0x2e0
driver_register+0xbc/0x1d0
__pci_register_driver+0xbc/0xd0
ahci_pci_driver_init+0x20/0x28
do_one_initcall+0xf0/0x608
kernel_init_freeable+0x31c/0x384
kernel_init+0x10/0x118
ret_from_fork+0x10/0x18
Allocated by task 5:
save_stack+0x28/0xc8
__kasan_kmalloc.isra.8+0xbc/0xd8
kasan_kmalloc+0xc/0x18
__kmalloc+0x1a8/0x280
scsi_host_alloc+0x44/0x678
ata_scsi_add_hosts+0x74/0x268
ata_host_register+0x228/0x488
ahci_host_activate+0x1c4/0x2a8
ahci_init_one+0xd18/0x1298
local_pci_probe+0x74/0xf0
work_for_cpu_fn+0x2c/0x48
process_one_work+0x488/0xc08
worker_thread+0x330/0x5d0
kthread+0x1c8/0x1d0
ret_from_fork+0x10/0x18
Freed by task 5:
save_stack+0x28/0xc8
__kasan_slab_free+0x118/0x180
kasan_slab_free+0x10/0x18
slab_free_freelist_hook+0xa4/0x1a0
kfree+0xd4/0x3a0
scsi_host_dev_release+0x100/0x148
device_release+0x7c/0xe0
kobject_put+0xb0/0x180
put_device+0x10/0x20
scsi_host_put+0x10/0x18
ata_scsi_add_hosts+0x210/0x268
ata_host_register+0x228/0x488
ahci_host_activate+0x1c4/0x2a8
ahci_init_one+0xd18/0x1298
local_pci_probe+0x74/0xf0
work_for_cpu_fn+0x2c/0x48
process_one_work+0x488/0xc08
worker_thread+0x330/0x5d0
kthread+0x1c8/0x1d0
ret_from_fork+0x10/0x18
There is also refcount issue, as well:
WARNING: CPU: 1 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xf8/0x170
The issue is that we make an erroneous extra call to scsi_host_put()
for that host:
So in ahci_init_one()->ata_host_alloc_pinfo()->ata_host_alloc(), we setup
a device release method - ata_devres_release() - which intends to release
the SCSI hosts:
static void ata_devres_release(struct device *gendev, void *res)
{
...
for (i = 0; i < host->n_ports; i++) {
struct ata_port *ap = host->ports[i];
if (!ap)
continue;
if (ap->scsi_host)
scsi_host_put(ap->scsi_host);
}
...
}
However in the ata_scsi_add_hosts() error path, we also call
scsi_host_put() for the SCSI hosts.
Fix by removing the the scsi_host_put() calls in ata_scsi_add_hosts() and
leave this to ata_devres_release().
Fixes: f31871951b38 ("libata: separate out ata_host_alloc() and ata_host_register()")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 2d7271501720038381d45fb3dcbe4831228fc8cc upstream.
For passthrough requests, libata-scsi takes what the user passes in
as gospel. This can be problematic if the user fills in the CDB
incorrectly. One example of that is in request sizes. For read/write
commands, the CDB contains fields describing the transfer length of
the request. These should match with the SG_IO header fields, but
libata-scsi currently does no validation of that.
Check that the number of blocks in the CDB for passthrough requests
matches what was mapped into the request. If the CDB asks for more
data then the validated SG_IO header fields, error it.
Reported-by: Krishna Ram Prakash R <krp@gtux.in>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6edf1d4cb0acde3a0a5dac849f33031bd7abb7b1 upstream.
If the ALL bit is set in the ZBC_OUT command, the command zone ID field
(block) should be ignored.
Reported-by: David Butterfield <david.butterfield@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b320a0a9f23c98f21631eb27bcbbca91c79b1c6e upstream.
The block (LBA) specified must not exceed the last addressable LBA,
which is dev->nr_sectors - 1. So fix the correct check is
"if (block >= dev->n_sectors)" and not "if (block > dev->n_sectords)".
Additionally, the asc/ascq to return for an LBA that is not a zone start
LBA should be ILLEGAL REQUEST, regardless if the bad LBA is out of
range.
Reported-by: David Butterfield <david.butterfield@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0d3e45bc6507bd1f8728bf586ebd16c2d9e40613 ]
This fixs the following comile warnings with ATA_DEBUG enabled,
which detected by Linaro GCC 5.2-2015.11:
drivers/ata/libata-scsi.c: In function 'ata_scsi_dump_cdb':
./include/linux/kern_levels.h:5:18: warning: format '%d' expects
argument of type 'int', but argument 6 has type 'u64 {aka long
long unsigned int}' [-Wformat=]
tj: Patch hand-applied and description trimmed.
Signed-off-by: Dong Bo <dongbo4@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2c1ec6fda2d07044cda922ee25337cf5d4b429b3 upstream.
syzkaller hit a WARN() in ata_bmdma_qc_issue() when writing to /dev/sg0.
This happened because it issued an ATA pass-through command (ATA_16)
where the protocol field indicated that NCQ should be used -- but the
device did not support NCQ.
We could just remove the WARN() from libata-sff.c, but the real problem
seems to be that the SCSI -> ATA translation code passes through NCQ
commands without verifying that the device actually supports NCQ.
Fix this by adding the appropriate check to ata_scsi_pass_thru().
Here's reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):
#include <fcntl.h>
#include <unistd.h>
int main()
{
char buf[53] = { 0 };
buf[36] = 0x85; /* ATA_16 */
buf[37] = (12 << 1); /* FPDMA */
buf[38] = 0x1; /* Has data */
buf[51] = 0xC8; /* ATA_CMD_READ */
write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
}
Fixes: ee7fb331c3ac ("libata: add support for NCQ commands for SG interface")
Reported-by: syzbot+2f69ca28df61bdfc77cd36af2e789850355a221e@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 058f58e235cbe03e923b30ea7c49995a46a8725f upstream.
syzkaller reported a crash in ata_bmdma_fill_sg() when writing to
/dev/sg1. The immediate cause was that the ATA command's scatterlist
was not DMA-mapped, which causes 'pi - 1' to underflow, resulting in a
write to 'qc->ap->bmdma_prd[0xffffffff]'.
Strangely though, the flag ATA_QCFLAG_DMAMAP was set in qc->flags. The
root cause is that when __ata_scsi_queuecmd() is preparing to relay a
SCSI command to an ATAPI device, it doesn't correctly validate the CDB
length before copying it into the 16-byte buffer 'cdb' in 'struct
ata_queued_cmd'. Namely, it validates the fixed CDB length expected
based on the SCSI opcode but not the actual CDB length, which can be
larger due to the use of the SG_NEXT_CMD_LEN ioctl. Since 'flags' is
the next member in ata_queued_cmd, a buffer overflow corrupts it.
Fix it by requiring that the actual CDB length be <= 16 (ATAPI_CDB_LEN).
[Really it seems the length should be required to be <= dev->cdb_len,
but the current behavior seems to have been intentionally introduced by
commit 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands
in 16-byte CDBs") to work around a userspace bug in mplayer. Probably
the workaround is no longer needed (mplayer was fixed in 2007), but
continuing to allow lengths to up 16 appears harmless for now.]
Here's a reproducer that works in QEMU when /dev/sg1 refers to the
CD-ROM drive that qemu-system-x86_64 creates by default:
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#define SG_NEXT_CMD_LEN 0x2283
int main()
{
char buf[53] = { [36] = 0x7e, [52] = 0x02 };
int fd = open("/dev/sg1", O_RDWR);
ioctl(fd, SG_NEXT_CMD_LEN, &(int){ 17 });
write(fd, buf, sizeof(buf));
}
The crash was:
BUG: unable to handle kernel paging request at ffff8cb97db37ffc
IP: ata_bmdma_fill_sg drivers/ata/libata-sff.c:2623 [inline]
IP: ata_bmdma_qc_prep+0xa4/0xc0 drivers/ata/libata-sff.c:2727
PGD fb6c067 P4D fb6c067 PUD 0
Oops: 0002 [#1] SMP
CPU: 1 PID: 150 Comm: syz_ata_bmdma_q Not tainted 4.15.0-next-20180202 #99
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[...]
Call Trace:
ata_qc_issue+0x100/0x1d0 drivers/ata/libata-core.c:5421
ata_scsi_translate+0xc9/0x1a0 drivers/ata/libata-scsi.c:2024
__ata_scsi_queuecmd drivers/ata/libata-scsi.c:4326 [inline]
ata_scsi_queuecmd+0x8c/0x210 drivers/ata/libata-scsi.c:4375
scsi_dispatch_cmd+0xa2/0xe0 drivers/scsi/scsi_lib.c:1727
scsi_request_fn+0x24c/0x530 drivers/scsi/scsi_lib.c:1865
__blk_run_queue_uncond block/blk-core.c:412 [inline]
__blk_run_queue+0x3a/0x60 block/blk-core.c:432
blk_execute_rq_nowait+0x93/0xc0 block/blk-exec.c:78
sg_common_write.isra.7+0x272/0x5a0 drivers/scsi/sg.c:806
sg_write+0x1ef/0x340 drivers/scsi/sg.c:677
__vfs_write+0x31/0x160 fs/read_write.c:480
vfs_write+0xa7/0x160 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0x4d/0xc0 fs/read_write.c:581
do_syscall_64+0x5e/0x110 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x21/0x86
Fixes: 607126c2a21c ("libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs")
Reported-by: syzbot+1ff6f9fcc3c35f1c72a95e26528c8e7e3276e4da@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v2.6.24+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 59a5e266c3f5c1567508888dd61a45b86daed0fa upstream.
My static checker complains that "devno" can be negative, meaning that
we read before the start of the loop. I've looked at the code, and I
think the warning is right. This come from /proc so it's root only or
it would be quite a quite a serious bug. The call tree looks like this:
proc_scsi_write() <- gets id and channel from simple_strtoul()
-> scsi_add_single_device() <- calls shost->transportt->user_scan()
-> ata_scsi_user_scan()
-> ata_find_dev()
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
SCT Write Same support had been introduced with
commit 7b2030942859 ("libata: Add support for SCT Write Same")
Some problems, namely excessive userspace segfaults, had been reported at
http://lkml.kernel.org/r/20160908192736.GA4356@gmail.com
This lead to commit 0ce1b18c42a5 ("libata: Some drives failing on
SCT Write Same") which strived to disable SCT Write Same on !ZAC devices.
Due to the way this was done and to the logic in sd_config_write_same(),
this didn't work for those devices that have
->max_ws_blocks > SD_MAX_WS10_BLOCKS: for these, ->no_write_same and
->max_write_same_sectors would still be non-zero,
but ->ws10 == ->ws16 == 0. This would cause sd_setup_write_same_cmnd() to
demultiplex REQ_OP_WRITE_SAME requests to WRITE_SAME, and these in turn
aren't supported by libata-scsi:
EXT4-fs (dm-1): Delayed block allocation failed for inode 2625094 at
logical offset 2032 with max blocks 2 with error 121
EXT4-fs (dm-1): This should not happen!! Data will be lost
121 == EREMOTEIO is what scsi_io_completion() asserts in case of
invalid opcodes.
Back to the original problem of userspace segfaults: this can be tracked
down to ata_format_sct_write_same() overwriting the input page. Sometimes,
this page is ZERO_PAGE(0) which ceases to be filled with zeros from that
point on. Since ZERO_PAGE(0) is used for userspace .bss mappings, code of
the following is doomed:
static char *a = NULL; /* .bss */
...
if (a)
*a = 'a';
This problem is not solved by disabling SCT Write Same for !ZAC devices
only.
It can certainly be fixed, but the final release is quite close -- so
disable SCT Write Same for all ATA devices rather than introducing some
SCT key buffer allocation schemes at this point.
Fixes: 7b2030942859 ("libata: Add support for SCT Write Same")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
There's a typo in ata_gen_passthru_sense(), where the first byte
would be overwritten incorrectly later on.
Reported-by: Charles Machalow <csm10495@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Fixes: 11093cb1ef56 ("libata-scsi: generate correct ATA pass-through sense")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Restrict support SCT Write Same to devices which also support ZAC where
support is required.
Reported-by: Mike Krinkin <krinkin.m.u@gmail.com>
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Use non DMA write log when ATA_DFLAG_PIO is set.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
Correct handling of devices with sector_size other that 512 bytes.
In the case of a 4Kn device sector_size it is possible to describe a much
larger DSM Trim than the current fixed default of 512 bytes.
This patch assumes the minimum descriptor is sector_size and fills out
the descriptor accordingly.
The ACS-2 specification is quite clear that the DSM command payload is
sized as number of 512 byte transfers so a 4Kn device will operate
correctly without this patch.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
SATA drives may support write same via SCT. This is useful
for setting the drive contents to a specific pattern (0's).
Translate a SCSI WRITE SAME 16 command to be either a DSM TRIM
command or an SCT Write Same command.
Based on the UNMAP flag:
- When set translate to DSM TRIM
- When not set translate to SCT Write Same
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
Safely overwriting the attached page to ATA format from the SCSI formatted
variant.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
|
|
scsi_done() was called repeatedly and apparently because of that,
the kernel would call trace when we touch the Control mode page:
Call Trace:
[<ffffffff812ea0d2>] dump_stack+0x63/0x81
[<ffffffff81079cfb>] __warn+0xcb/0xf0
[<ffffffff81079e2d>] warn_slowpath_null+0x1d/0x20
[<ffffffffa00f51b0>] ata_eh_finish+0xe0/0xf0 [libata]
[<ffffffffa00fb830>] sata_pmp_error_handler+0x640/0xa50 [libata]
[<ffffffffa00470ed>] ahci_error_handler+0x1d/0x70 [libahci]
[<ffffffffa00f55f0>] ata_scsi_port_error_handler+0x430/0x770 [libata]
[<ffffffffa00eff8d>] ? ata_scsi_cmd_error_handler+0xdd/0x160 [libata]
[<ffffffffa00f59d7>] ata_scsi_error+0xa7/0xf0 [libata]
[<ffffffffa00913ba>] scsi_error_handler+0xaa/0x560 [scsi_mod]
[<ffffffffa0091310>] ? scsi_eh_get_sense+0x180/0x180 [scsi_mod]
[<ffffffff81098eb8>] kthread+0xd8/0xf0
[<ffffffff815d913f>] ret_from_fork+0x1f/0x40
[<ffffffff81098de0>] ? kthread_worker_fn+0x170/0x170
---[ end trace 8b7501047e928a17 ]---
Removed the unnecessary code and let ata_scsi_translate() do the job.
Also, since ata_mselect_control() has no ATA command to send to the
device, ata_scsi_mode_select_xlat() should return 1 for it, so that
ata_scsi_translate() will finish early to avoid ata_qc_issue().
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
ata_mselect_*() would initialize a char array for storing a copy of
the current mode page. However, char could be signed char. In that
case, bytes larger than 127 would be converted to negative number.
For example, 0xff from def_control_mpage[] would become -1. This
prevented ata_mselect_control() from working at all, since when it
did the read-only bits check, there would always be a mismatch.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Pull core block updates from Jens Axboe:
- the big change is the cleanup from Mike Christie, cleaning up our
uses of command types and modified flags. This is what will throw
some merge conflicts
- regression fix for the above for btrfs, from Vincent
- following up to the above, better packing of struct request from
Christoph
- a 2038 fix for blktrace from Arnd
- a few trivial/spelling fixes from Bart Van Assche
- a front merge check fix from Damien, which could cause issues on
SMR drives
- Atari partition fix from Gabriel
- convert cfq to highres timers, since jiffies isn't granular enough
for some devices these days. From Jan and Jeff
- CFQ priority boost fix idle classes, from me
- cleanup series from Ming, improving our bio/bvec iteration
- a direct issue fix for blk-mq from Omar
- fix for plug merging not involving the IO scheduler, like we do for
other types of merges. From Tahsin
- expose DAX type internally and through sysfs. From Toshi and Yigal
* 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
block: Fix front merge check
block: do not merge requests without consulting with io scheduler
block: Fix spelling in a source code comment
block: expose QUEUE_FLAG_DAX in sysfs
block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
Btrfs: fix comparison in __btrfs_map_block()
block: atari: Return early for unsupported sector size
Doc: block: Fix a typo in queue-sysfs.txt
cfq-iosched: Charge at least 1 jiffie instead of 1 ns
cfq-iosched: Fix regression in bonnie++ rewrite performance
cfq-iosched: Convert slice_resid from u64 to s64
block: Convert fifo_time from ulong to u64
blktrace: avoid using timespec
block/blk-cgroup.c: Declare local symbols static
block/bio-integrity.c: Add #include "blk.h"
block/partition-generic.c: Remove a set-but-not-used variable
block: bio: kill BIO_MAX_SIZE
cfq-iosched: temporarily boost queue priority for idle classes
block: drbd: avoid to use BIO_MAX_SIZE
block: bio: remove BIO_MAX_SECTORS
...
|
|
`changeable` is the "version" of mode page requested by the user.
It will be less confusing/misleading if we do not check it
"together" with the setting bits of the drive.
Not to mention that we currently have ata_mselect_*() implemented
in a way that each of them will serve exclusively a particular bit
on each page. The old style will hence make the condition look even
more unnecessarily arcane if the ata_msense_*() is reflecting more
than one bit.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The reset_all variable name is misleading as this bit is also applicable to
open, close, and finish actions. So rename that variable to "all" and remove
the unnecessary mask operation that's already done earlier.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
[hch: split from the previous patch]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The subcommand for NCQ NON-DATA must be specified in the feature
(low byte), not the high-order count byte. Also make sure to properly
cast the all bit to a u16 before shiting it by 8 to avoid undefined
behavior.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
[hch: split the original patch into two, updated changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Include reporting options when translating REPORT ZONES commmand to
ATA NCQ, and make sure we only look at the actually specified bits
in the CDB for the options.
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
[hch: update patch description]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add a new taskfile protocol ATA_PROT_NCQ_NODATA to handle
ATA NCQ NO-DATA commands correctly.
And fixup ata_scsi_zbc_out_xlat() to use it.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Use accessor functions instead of the raw value.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently libata statically allows only 1-block (512-byte) payload
for each TRIM command. Each payload can carry 64 TRIM ranges since
each range requires 8 bytes.
It is silly to keep doing the calculation (512 / 8) in different
places. Hence, define the new ATA_MAX_TRIM_RNUM for the result.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently if a WRITE SAME (16) command is issued to the SATL with
"number of blocks" that is larger than the "Maximum write same length"
(which is the maximum number of blocks per TRIM command allowed in
libata, currently 65535 * 512 / 8 blocks), the SATL will accept the
command and translate it to a TRIM command with the upper limit.
However, according to SBC (as of sbc4r11.pdf), the "device server"
should terminate the command with "Invalid field in CDB" in that case.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
To make it consistent with the recently added ata_mselect_control().
We probably shouldn't have the word "mode" in its name anyway, since
that's not the case for other ata_msense_*() / ata_mselect_*() either.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The bit should always be set to 1 when the requested version of
page is "changeable" because we've made it so in ata_mselect_control().
Also, it should always be set to 1 if ATA_DFLAG_D_SENSE is set (when
the requested version of page is "current" or "default").
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The comment suggests we should be having an SPC-3 version descriptor
but the 0260h is the code for "SPC-2 (no version claimed)". Correct
it to 0300h so that it has the "SPC-3 (no version claimed)" descriptor.
Note that we are claiming SPC-3 version compatibility in the VERSION
field of the standard INQUIRY data. Therefore, I assume the typo was
on the code but not on the comment.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Avoid performance bottleneck when being SCSI pass-through'd to
virtual machines with other OSes (e.g. Windows) via virtio-scsi
and scsi-block in qemu.
Ref.: https://github.com/YanVugenfirer/kvm-guest-drivers-windows/issues/63
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Commit 856c46639309 ("libata: support device-managed ZAC devices")
had the line that "bumps" the VERSION field in standard INQUIRY data
removed. Add it back and claim SPC-5 version compatibility, which
matches with the current version descriptor "SPC-5 (no version claimed)"
that is used for ZAC devices.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
It's Command Descriptor Block. Also capitalized it.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
It does not make sense and is confusing to respond with "Invalid
field in CDB" while we have no support at all implemented for
FORMAT UNIT. It is decent to let it go to the default, which
will respond with "Invalid command operation code" instead.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
We currently set REQ_WRITE/WRITE for all non READ IOs
like discard, flush, writesame, etc. In the next patches where we
no longer set up the op as a bitmap, we will not be able to
detect a operation direction like writesame by testing if REQ_WRITE is
set.
This patch converts the drivers and cgroup to use the
op_is_write helper. This should just cover the simple
cases. I did dm, md and bcache in their own patches
because they were more involved.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata ZAC support from Tejun Heo:
"This contains Zone ATA Command support for Shingled Magnetic Recording
devices.
In addition to sending the new commands down to the device, as ZAC
commands depend on getting a lot of responses from the device, piping
up responses is beefed up too. However, it doesn't involve changes to
libata core mechanism or its interaction with upper layers, so I'm not
expecting too many fallouts.
Kudos to Hannes for driving SMR support"
* 'for-4.7-zac' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (28 commits)
libata: support host-aware and host-managed ZAC devices
libata: support device-managed ZAC devices
libata: NCQ encapsulation for ZAC MANAGEMENT OUT
libata: Implement ZBC OUT translation
libata: implement ZBC IN translation
libata: fixup ZAC device disabling
libata-scsi: Generate sense code for disabled devices
libata-trace: decode subcommands
libata: Check log page directory before accessing pages
libata: Add command definitions for NCQ Encapsulation for READ LOG DMA EXT
libata: Separate out ata_dev_config_ncq_send_recv()
libata/libsas: Define ATA_CMD_NCQ_NON_DATA
libsas: enable FPDMA SEND/RECEIVE
libata: do not attempt to retrieve sense code twice
libata-scsi: Set information sense field for invalid parameter
libata-scsi: set bit pointer for sense code information
libata-scsi: Set field pointer in sense code
scsi: add scsi_set_sense_field_pointer()
libata: Implement control mode page to select sense format
libata-scsi: generate correct ATA pass-through sense
...
|
|
Byte 69 bits 0:1 in the IDENTIFY DEVICE data indicate a
host-aware ZAC device.
Host-managed ZAC devices have their own individual signature,
and to not set the bits in the IDENTIFY DEVICE data.
And whenever we detect a ZAC-compatible device we should
be displaying the zoned block characteristics VPD page.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Device-managed ZAC devices just set the zoned capabilities field
in INQUIRY byte 69 (cf ACS-4). This corresponds to the 'zoned'
field in the block device characteristics VPD page.
As this is only defined in SPC-5/SBC-4 we also need to update
the supported SCSI version descriptor.
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Add NCQ encapsulation for ZAC MANAGEMENT OUT and evaluate
NCQ Non-Data log pages to figure out if NCQ encapsulation
is supported.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
ZAC drives implement a 'ZAC Management Out' command template,
which maps onto the ZBC OUT command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
ZAC drives implement a 'ZAC Management In' command template,
which maps onto the ZBC IN command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
If a device is disabled after error recovery it doesn't make
any sense to generate an ATA sense, but we should rather
return a generic sense code indicating the device is gone.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Replace custom approach by %*ph specifier to dump small buffers in hex format.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
This patch fix spelling typos found in Documentation/Docbook/libata.xml.
It is because the file was generated from comments in source,
I had to fix comments in libata-core.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Whenever the sense key is set to 'invalid parameter' we should
be filling out the sense-key specific information field in the
sense buffer.
tj: Added description of @fp for ata_mselect_*().
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
When generating a sense code of 'Invalid field in CDB' we
should be setting the bit pointer where appropriate.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
If the sense code is 'Invalid field in CDB' we should be
setting the field pointer to the offending byte.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Implement MODE SELECT for the control mode page to allow the OS
to switch to descriptor sense.
tj: Dropped s/sb/cmd->sense_buffer/ in ata_gen_ata_sense(). Added
@dev description to ata_msense_ctl_mode().
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Generate ATA pass-through sense for both fixed and descriptor
format sense.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|