<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/zonefs/super.c, branch v6.1.174</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.174</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.1.174'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-02-23T08:12:45+00:00</updated>
<entry>
<title>zonefs: Improve error handling</title>
<updated>2024-02-23T08:12:45+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2024-02-08T08:26:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=09fad23a1a3249088e8f840685d10ae1650519a9'/>
<id>urn:sha1:09fad23a1a3249088e8f840685d10ae1650519a9</id>
<content type='text'>
commit 14db5f64a971fce3d8ea35de4dfc7f443a3efb92 upstream.

Write error handling is racy and can sometime lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value  which results in garbage data being readable at the end
of a file. There are 2 problems:

1) zonefs_file_dio_write() updates a zone file write pointer offset
   after issuing a direct IO with iomap_dio_rw(). This update is done
   only if the IO succeed for synchronous direct writes. However, for
   asynchronous direct writes, the update is done without waiting for
   the IO completion so that the next asynchronous IO can be
   immediately issued. However, if an asynchronous IO completes with a
   failure right before the i_truncate_mutex lock protecting the update,
   the update may change the value of the inode write pointer offset
   that was corrected by the error path (zonefs_io_error() function).

2) zonefs_io_error() is called when a read or write error occurs. This
   function executes a report zone operation using the callback function
   zonefs_io_error_cb(), which does all the error recovery handling
   based on the current zone condition, write pointer position and
   according to the mount options being used. However, depending on the
   zoned device being used, a report zone callback may be executed in a
   context that is different from the context of __zonefs_io_error(). As
   a result, zonefs_io_error_cb() may be executed without the inode
   truncate mutex lock held, which can lead to invalid error processing.

Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
  direct write is issued with iomap_dio_rw(). This is safe to do as
  partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
  set) and any failed IO will trigger the execution of zonefs_io_error()
  which will correct the inode write pointer offset to reflect the
  current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
  and call this function directly from __zonefs_io_error() after
  obtaining the zone information using blkdev_report_zones() with a
  simple callback function that copies to a local stack variable the
  struct blk_zone obtained from the device. This ensures that error
  handling is performed holding the inode truncate mutex.
  This change also simplifies error handling for conventional zone files
  by bypassing the execution of report zones entirely. This is safe to
  do because the condition of conventional zones cannot be read-only or
  offline and conventional zone files are always fully mapped with a
  constant file size.

Reported-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Tested-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Himanshu Madhani &lt;himanshu.madhani@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Separate zone information from inode information</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-16T09:15:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=81cf745f110557673c8c7f0263bd8c30165bd07b'/>
<id>urn:sha1:81cf745f110557673c8c7f0263bd8c30165bd07b</id>
<content type='text'>
[ Upstream commit aa7f243f32e1d18036ee00d71d3ccfad70ae2121 ]

In preparation for adding dynamic inode allocation, separate an inode
zone information from the zonefs inode structure. The new data structure
zonefs_zone is introduced to store in memory information about a zone
that must be kept throughout the lifetime of the device mount.

Linking between a zone file inode and its zone information is done by
setting the inode i_private field to point to a struct zonefs_zone.
Using the i_private pointer avoids the need for adding a pointer in
struct zonefs_inode_info. Beside the vfs inode, this structure is
reduced to a mutex and a write open counter.

One struct zonefs_zone is created per file inode on mount. These
structures are organized in an array using the new struct
zonefs_zone_group data structure to represent zone groups. The
zonefs_zone arrays are indexed per file number (the index of a struct
zonefs_zone in its array directly gives the file number/name for that
zone file inode).

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Reduce struct zonefs_inode_info size</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-24T10:43:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7558b249cb4e77b2cf58725895e05bbe0fd1a80e'/>
<id>urn:sha1:7558b249cb4e77b2cf58725895e05bbe0fd1a80e</id>
<content type='text'>
[ Upstream commit 34422914dc00b291d1c47dbdabe93b154c2f2b25 ]

Instead of using the i_ztype field in struct zonefs_inode_info to
indicate the zone type of an inode, introduce the new inode flag
ZONEFS_ZONE_CNV to be set in the i_flags field of struct
zonefs_inode_info to identify conventional zones. If this flag is not
set, the zone of an inode is considered to be a sequential zone.

The helpers zonefs_zone_is_cnv(), zonefs_zone_is_seq(),
zonefs_inode_is_cnv() and zonefs_inode_is_seq() are introduced to
simplify testing the zone type of a struct zonefs_inode_info and of a
struct inode.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Simplify IO error handling</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-25T02:06:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3741898b169476c17ece7962e5acf415c7041265'/>
<id>urn:sha1:3741898b169476c17ece7962e5acf415c7041265</id>
<content type='text'>
[ Upstream commit 46a9c526eef7fb68a00321e2a9591ce5276ae92b ]

Simplify zonefs_check_zone_condition() by moving the code that changes
an inode access rights to the new function zonefs_inode_update_mode().
Furthermore, since on mount an inode wpoffset is always zero when
zonefs_check_zone_condition() is called during an inode initialization,
the "mount" boolean argument is not necessary for the readonly zone
case. This argument is thus removed.

zonefs_io_error_cb() is also modified to use the inode offline and
zone state flags instead of checking the device zone condition. The
multiple calls to zonefs_check_zone_condition() are reduced to the first
call on entry, which allows removing the "warn" argument.
zonefs_inode_update_mode() is also used to update an inode access rights
as zonefs_io_error_cb() modifies the inode flags depending on the volume
error handling mode (defined with a mount option). Since an inode mode
change differs for read-only zones between mount time and IO error time,
the flag ZONEFS_ZONE_INIT_MODE is used to differentiate both cases.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Reorganize code</title>
<updated>2023-04-06T10:10:34+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-25T00:39:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a624b4796f38b0859043249f02bcf09f43d35dd4'/>
<id>urn:sha1:a624b4796f38b0859043249f02bcf09f43d35dd4</id>
<content type='text'>
[ Upstream commit 4008e2a0b01aba982356fd15b128a47bf11bd9c7 ]

Move all code related to zone file operations from super.c to the new
file.c file. Inode and zone management code remains in super.c.

Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Detect append writes at invalid locations</title>
<updated>2023-01-24T06:24:33+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2023-01-06T08:43:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d3317baa34e99c10c08a8f3280cda062d7112c3a'/>
<id>urn:sha1:d3317baa34e99c10c08a8f3280cda062d7112c3a</id>
<content type='text'>
commit a608da3bd730d718f2d3ebec1c26f9865f8f17ce upstream.

Using REQ_OP_ZONE_APPEND operations for synchronous writes to sequential
files succeeds regardless of the zone write pointer position, as long as
the target zone is not full. This means that if an external (buggy)
application writes to the zone of a sequential file underneath the file
system, subsequent file write() operation will succeed but the file size
will not be correct and the file will contain invalid data written by
another application.

Modify zonefs_file_dio_append() to check the written sector of an append
write (returned in bio-&gt;bi_iter.bi_sector) and return -EIO if there is a
mismatch with the file zone wp offset field. This change triggers a call
to zonefs_io_error() and a zone check. Modify zonefs_io_error_cb() to
not expose the unexpected data after the current inode size when the
errors=remount-ro mode is used. Other error modes are correctly handled
already.

Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>zonefs: Fix active zone accounting</title>
<updated>2022-11-25T08:01:22+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-11-21T07:29:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=db58653ce0c7cf4d155727852607106f890005c0'/>
<id>urn:sha1:db58653ce0c7cf4d155727852607106f890005c0</id>
<content type='text'>
If a file zone transitions to the offline or readonly state from an
active state, we must clear the zone active flag and decrement the
active seq file counter. Do so in zonefs_account_active() using the new
zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These
flags are set if necessary in zonefs_check_zone_condition() based on the
result of report zones operation after an IO error.

Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
</content>
</entry>
<entry>
<title>zonefs: Fix race between modprobe and mount</title>
<updated>2022-11-22T05:18:32+00:00</updated>
<author>
<name>Zhang Xiaoxu</name>
<email>zhangxiaoxu5@huawei.com</email>
</author>
<published>2022-11-20T10:57:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4e45886956a20942800259f326a04417292ae314'/>
<id>urn:sha1:4e45886956a20942800259f326a04417292ae314</id>
<content type='text'>
There is a race between modprobe and mount as below:

 modprobe zonefs                | mount -t zonefs
--------------------------------|-------------------------
 zonefs_init                    |
  register_filesystem       [1] |
                                | zonefs_fill_super    [2]
  zonefs_sysfs_init         [3] |

1. register zonefs suceess, then
2. user can mount the zonefs
3. if sysfs initialize failed, the module initialize failed.

Then the mount process maybe some error happened since the module
initialize failed.

Let's register zonefs after all dependency resource ready. And
reorder the dependency resource release in module exit.

Fixes: 9277a6d4fbd4 ("zonefs: Export open zone resource information through sysfs")
Signed-off-by: Zhang Xiaoxu &lt;zhangxiaoxu5@huawei.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
</content>
</entry>
<entry>
<title>zonefs: fix zone report size in __zonefs_io_error()</title>
<updated>2022-11-16T07:08:28+00:00</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@opensource.wdc.com</email>
</author>
<published>2022-10-25T04:39:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7dd12d65ac646046a3fe0bbf9a4e86f4514207b3'/>
<id>urn:sha1:7dd12d65ac646046a3fe0bbf9a4e86f4514207b3</id>
<content type='text'>
When an IO error occurs, the function __zonefs_io_error() is used to
issue a zone report to obtain the latest zone information from the
device. This function gets a zone report for all zones used as storage
for a file, which is always 1 zone except for files representing
aggregated conventional zones.

The number of zones of a zone report for a file is calculated in
__zonefs_io_error() by doing a bit-shift of the inode i_zone_size field,
which is equal to or larger than the device zone size. However, this
calculation does not take into account that the last zone of a zoned
device may be smaller than the zone size reported by bdev_zone_sectors()
(which is used to set the bit shift size). As a result, if an error
occurs for an IO targetting such last smaller zone, the zone report will
ask for 0 zones, leading to an invalid zone report.

Fix this by using the fact that all files require a 1 zone report,
except if the inode i_zone_size field indicates a zone size larger than
the device zone size. This exception case corresponds to a mount with
aggregated conventional zones.

A check for this exception is added to the file inode initialization
during mount. If an invalid setup is detected, emit an error and fail
the mount (check contributed by Johannes Thumshirn).

Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@opensource.wdc.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux</title>
<updated>2022-08-11T20:11:49+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-11T20:11:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8745889a7fd04d14f461f6536c45f70cbaf3ee02'/>
<id>urn:sha1:8745889a7fd04d14f461f6536c45f70cbaf3ee02</id>
<content type='text'>
Pull more iomap updates from Darrick Wong:
 "In the past 10 days or so I've not heard any ZOMG STOP style
  complaints about removing -&gt;writepage support from gfs2 or zonefs, so
  here's the pull request removing them (and the underlying fs iomap
  support) from the kernel:

   - Remove iomap_writepage and all callers, since the mm apparently
     never called the zonefs or gfs2 writepage functions"

* tag 'iomap-6.0-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: remove iomap_writepage
  zonefs: remove -&gt;writepage
  gfs2: remove -&gt;writepage
  gfs2: stop using generic_writepages in gfs2_ail1_start_one
</content>
</entry>
</feed>
