<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/zonefs, branch v6.1.168</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.168'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-02-23T08:12:45+00:00</updated>
<entry>
<title>zonefs: Improve error handling</title>
<updated>2024-02-23T08:12:45+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2024-02-08T08:26:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=09fad23a1a3249088e8f840685d10ae1650519a9'/>
<id>urn:sha1:09fad23a1a3249088e8f840685d10ae1650519a9</id>
<content type='text'>
commit 14db5f64a971fce3d8ea35de4dfc7f443a3efb92 upstream.

Write error handling is racy and can sometime lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value  which results in garbage data being readable at the end
of a file. There are 2 problems:

1) zonefs_file_dio_write() updates a zone file write pointer offset
   after issuing a direct IO with iomap_dio_rw(). This update is done
   only if the IO succeed for synchronous direct writes. However, for
   asynchronous direct writes, the update is done without waiting for
   the IO completion so that the next asynchronous IO can be
   immediately issued. However, if an asynchronous IO completes with a
   failure right before the i_truncate_mutex lock protecting the update,
   the update may change the value of the inode write pointer offset
   that was corrected by the error path (zonefs_io_error() function).

2) zonefs_io_error() is called when a read or write error occurs. This
   function executes a report zone operation using the callback function
   zonefs_io_error_cb(), which does all the error recovery handling
   based on the current zone condition, write pointer position and
   according to the mount options being used. However, depending on the
   zoned device being used, a report zone callback may be executed in a
   context that is different from the context of __zonefs_io_error(). As
   a result, zonefs_io_error_cb() may be executed without the inode
   truncate mutex lock held, which can lead to invalid error processing.

Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
  direct write is issued with iomap_dio_rw(). This is safe to do as
  partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
  set) and any failed IO will trigger the execution of zonefs_io_error()
  which will correct the inode write pointer offset to reflect the
  current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
  and call this function directly from __zonefs_io_error() after
  obtaining the zone information using blkdev_report_zones() with a
  simple callback function that copies to a local stack variable the
  struct blk_zone obtained from the device. This ensures that error
  handling is performed holding the inode truncate mutex.
  This change also simplifies error handling for conventional zone files
  by bypassing the execution of report zones entirely. This is safe to
  do because the condition of conventional zones cannot be read-only or
  offline and conventional zone files are always fully mapped with a
  constant file size.

Reported-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Tested-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Himanshu Madhani &lt;himanshu.madhani@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Always invalidate last cached page on append write</title>
<updated>2023-04-06T10:10:52+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2023-03-29T04:16:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d7c67be755ccc488c7c2ae9b98bbd8b6409be4f6'/>
<id>urn:sha1:d7c67be755ccc488c7c2ae9b98bbd8b6409be4f6</id>
<content type='text'>
commit c1976bd8f23016d8706973908f2bb0ac0d852a8f upstream.

When a direct append write is executed, the append offset may correspond
to the last page of a sequential file inode which might have been cached
already by buffered reads, page faults with mmap-read or non-direct
readahead. To ensure that the on-disk and cached data is consistant for
such last cached page, make sure to always invalidate it in
zonefs_file_dio_append(). If the invalidation fails, return -EBUSY to
userspace to differentiate from IO errors.

This invalidation will always be a no-op when the FS block size (device
zone write granularity) is equal to the page size (e.g. 4K).

Reported-by: Hans Holmberg &lt;Hans.Holmberg@wdc.com&gt;
Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Tested-by: Hans Holmberg &lt;hans.holmberg@wdc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space</title>
<updated>2023-04-06T10:10:51+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2023-03-30T00:47:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4a8f1f5122667b1e54327ddbab7cfd096ac7dc00'/>
<id>urn:sha1:4a8f1f5122667b1e54327ddbab7cfd096ac7dc00</id>
<content type='text'>
commit 77af13ba3c7f91d91c377c7e2d122849bbc17128 upstream.

The call to invalidate_inode_pages2_range() in __iomap_dio_rw() may
fail, in which case -ENOTBLK is returned and this error code is
propagated back to user space trhough iomap_dio_rw() -&gt;
zonefs_file_dio_write() return chain. This error code is fairly obscure
and may confuse the user. Avoid this and be consistent with the behavior
of zonefs_file_dio_append() for similar invalidate_inode_pages2_range()
errors by returning -EBUSY to user space when iomap_dio_rw() returns
-ENOTBLK.

Suggested-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Tested-by: Hans Holmberg &lt;hans.holmberg@wdc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Fix error message in zonefs_file_dio_append()</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2023-03-20T13:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fc426026c3a3bd939f1ebae31f880ecf2724cc74'/>
<id>urn:sha1:fc426026c3a3bd939f1ebae31f880ecf2724cc74</id>
<content type='text'>
[ Upstream commit 88b170088ad2c3e27086fe35769aa49f8a512564 ]

Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.

Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Himanshu Madhani &lt;himanshu.madhani@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Separate zone information from inode information</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-16T09:15:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=81cf745f110557673c8c7f0263bd8c30165bd07b'/>
<id>urn:sha1:81cf745f110557673c8c7f0263bd8c30165bd07b</id>
<content type='text'>
[ Upstream commit aa7f243f32e1d18036ee00d71d3ccfad70ae2121 ]

In preparation for adding dynamic inode allocation, separate an inode
zone information from the zonefs inode structure. The new data structure
zonefs_zone is introduced to store in memory information about a zone
that must be kept throughout the lifetime of the device mount.

Linking between a zone file inode and its zone information is done by
setting the inode i_private field to point to a struct zonefs_zone.
Using the i_private pointer avoids the need for adding a pointer in
struct zonefs_inode_info. Beside the vfs inode, this structure is
reduced to a mutex and a write open counter.

One struct zonefs_zone is created per file inode on mount. These
structures are organized in an array using the new struct
zonefs_zone_group data structure to represent zone groups. The
zonefs_zone arrays are indexed per file number (the index of a struct
zonefs_zone in its array directly gives the file number/name for that
zone file inode).

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Reduce struct zonefs_inode_info size</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-24T10:43:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7558b249cb4e77b2cf58725895e05bbe0fd1a80e'/>
<id>urn:sha1:7558b249cb4e77b2cf58725895e05bbe0fd1a80e</id>
<content type='text'>
[ Upstream commit 34422914dc00b291d1c47dbdabe93b154c2f2b25 ]

Instead of using the i_ztype field in struct zonefs_inode_info to
indicate the zone type of an inode, introduce the new inode flag
ZONEFS_ZONE_CNV to be set in the i_flags field of struct
zonefs_inode_info to identify conventional zones. If this flag is not
set, the zone of an inode is considered to be a sequential zone.

The helpers zonefs_zone_is_cnv(), zonefs_zone_is_seq(),
zonefs_inode_is_cnv() and zonefs_inode_is_seq() are introduced to
simplify testing the zone type of a struct zonefs_inode_info and of a
struct inode.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Simplify IO error handling</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-25T02:06:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3741898b169476c17ece7962e5acf415c7041265'/>
<id>urn:sha1:3741898b169476c17ece7962e5acf415c7041265</id>
<content type='text'>
[ Upstream commit 46a9c526eef7fb68a00321e2a9591ce5276ae92b ]

Simplify zonefs_check_zone_condition() by moving the code that changes
an inode access rights to the new function zonefs_inode_update_mode().
Furthermore, since on mount an inode wpoffset is always zero when
zonefs_check_zone_condition() is called during an inode initialization,
the "mount" boolean argument is not necessary for the readonly zone
case. This argument is thus removed.

zonefs_io_error_cb() is also modified to use the inode offline and
zone state flags instead of checking the device zone condition. The
multiple calls to zonefs_check_zone_condition() are reduced to the first
call on entry, which allows removing the "warn" argument.
zonefs_inode_update_mode() is also used to update an inode access rights
as zonefs_io_error_cb() modifies the inode flags depending on the volume
error handling mode (defined with a mount option). Since an inode mode
change differs for read-only zones between mount time and IO error time,
the flag ZONEFS_ZONE_INIT_MODE is used to differentiate both cases.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Reorganize code</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-25T00:39:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a624b4796f38b0859043249f02bcf09f43d35dd4'/>
<id>urn:sha1:a624b4796f38b0859043249f02bcf09f43d35dd4</id>
<content type='text'>
[ Upstream commit 4008e2a0b01aba982356fd15b128a47bf11bd9c7 ]

Move all code related to zone file operations from super.c to the new
file.c file. Inode and zone management code remains in super.c.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Detect append writes at invalid locations</title>
<updated>2023-01-24T06:24:33+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2023-01-06T08:43:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3317baa34e99c10c08a8f3280cda062d7112c3a'/>
<id>urn:sha1:d3317baa34e99c10c08a8f3280cda062d7112c3a</id>
<content type='text'>
commit a608da3bd730d718f2d3ebec1c26f9865f8f17ce upstream.

Using REQ_OP_ZONE_APPEND operations for synchronous writes to sequential
files succeeds regardless of the zone write pointer position, as long as
the target zone is not full. This means that if an external (buggy)
application writes to the zone of a sequential file underneath the file
system, subsequent file write() operation will succeed but the file size
will not be correct and the file will contain invalid data written by
another application.

Modify zonefs_file_dio_append() to check the written sector of an append
write (returned in bio-&gt;bi_iter.bi_sector) and return -EIO if there is a
mismatch with the file zone wp offset field. This change triggers a call
to zonefs_io_error() and a zone check. Modify zonefs_io_error_cb() to
not expose the unexpected data after the current inode size when the
errors=remount-ro mode is used. Other error modes are correctly handled
already.

Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Fix active zone accounting</title>
<updated>2022-11-25T08:01:22+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-21T07:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db58653ce0c7cf4d155727852607106f890005c0'/>
<id>urn:sha1:db58653ce0c7cf4d155727852607106f890005c0</id>
<content type='text'>
If a file zone transitions to the offline or readonly state from an
active state, we must clear the zone active flag and decrement the
active seq file counter. Do so in zonefs_account_active() using the new
zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These
flags are set if necessary in zonefs_check_zone_condition() based on the
result of report zones operation after an IO error.

Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
</content>
</entry>
</feed>
