| author | Takashi Iwai <tiwai@suse.de> | 2022-04-19 18:26:01 +0300 | 
|---|---|---|
| committer | Takashi Iwai <tiwai@suse.de> | 2022-04-19 18:26:01 +0300 | 
| commit | 0aea30a07ec6b50de0fc5f5b2ec34a68ead86b61 (patch) | |
| tree | ee7d7d116570f39e47399c8f691a5a7565077eeb /Documentation/filesystems | |
| parent | 4ddef9c4d70aae0c9029bdec7c3f7f1c1c51ff8c (diff) | |
| parent | 5b933c7262c5b0ea11ea3c3b3ea81add04895954 (diff) | |
| download | linux-0aea30a07ec6b50de0fc5f5b2ec34a68ead86b61.tar.xz | |
Merge tag 'asoc-fix-v5.18-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.18
A collection of fixes that came in since the merge window, plus one new
device ID for an x86 laptop.  Nothing that really stands out with
particularly big impact outside of the affected device.
Diffstat (limited to 'Documentation/filesystems')
| -rw-r--r-- | Documentation/filesystems/caching/netfs-api.rst | 7 |
| -rw-r--r-- | Documentation/filesystems/cifs/ksmbd.rst | 4 |
| -rw-r--r-- | Documentation/filesystems/dax.rst | 6 |
| -rw-r--r-- | Documentation/filesystems/erofs.rst | 2 |
| -rw-r--r-- | Documentation/filesystems/ext4/blocks.rst | 2 |
| -rw-r--r-- | Documentation/filesystems/fscrypt.rst | 25 |
| -rw-r--r-- | Documentation/filesystems/fsverity.rst | 6 |
| -rw-r--r-- | Documentation/filesystems/locking.rst | 54 |
| -rw-r--r-- | Documentation/filesystems/netfs_library.rst | 156 |
| -rw-r--r-- | Documentation/filesystems/porting.rst | 6 |
| -rw-r--r-- | Documentation/filesystems/vfs.rst | 73 |
11 files changed, 208 insertions, 133 deletions
| diff --git a/Documentation/filesystems/caching/netfs-api.rst b/Documentation/filesystems/caching/netfs-api.rst index f84e9ffdf0b4..5066113acad5 100644 --- a/Documentation/filesystems/caching/netfs-api.rst +++ b/Documentation/filesystems/caching/netfs-api.rst @@ -345,8 +345,9 @@ The following facilities are provided to manage this:  To support this, the following functions are provided:: -	int fscache_set_page_dirty(struct page *page, -				   struct fscache_cookie *cookie); +	bool fscache_dirty_folio(struct address_space *mapping, +				 struct folio *folio, +				 struct fscache_cookie *cookie);  	void fscache_unpin_writeback(struct writeback_control *wbc,  				     struct fscache_cookie *cookie);  	void fscache_clear_inode_writeback(struct fscache_cookie *cookie, @@ -354,7 +355,7 @@ To support this, the following functions are provided::  					   const void *aux);  The *set* function is intended to be called from the filesystem's -``set_page_dirty`` address space operation.  If ``I_PINNING_FSCACHE_WB`` is not +``dirty_folio`` address space operation.  If ``I_PINNING_FSCACHE_WB`` is not  set, it sets that flag and increments the use count on the cookie (the caller  must already have called ``fscache_use_cookie()``). diff --git a/Documentation/filesystems/cifs/ksmbd.rst b/Documentation/filesystems/cifs/ksmbd.rst index b0d354fd8066..1af600db2e70 100644 --- a/Documentation/filesystems/cifs/ksmbd.rst +++ b/Documentation/filesystems/cifs/ksmbd.rst @@ -82,10 +82,10 @@ Signing Update                 Supported.  Pre-authentication integrity   Supported.  SMB3 encryption(CCM, GCM)      Supported. (CCM and GCM128 supported, GCM256 in                                 progress) -SMB direct(RDMA)               Partially Supported. SMB3 Multi-channel is -                               required to connect to Windows client. +SMB direct(RDMA)               Supported.  SMB3 Multi-channel             Partially Supported. 
Planned to implement                                 replay/retry mechanisms for future. +Receive Side Scaling mode      Supported.  SMB3.1.1 POSIX extension       Supported.  ACLs                           Partially Supported. only DACLs available, SACLs                                 (auditing) is planned for the future. For diff --git a/Documentation/filesystems/dax.rst b/Documentation/filesystems/dax.rst index e3b30429d703..c04609d8ee24 100644 --- a/Documentation/filesystems/dax.rst +++ b/Documentation/filesystems/dax.rst @@ -23,11 +23,11 @@ on it as usual.  The `DAX` code currently only supports files with a block  size equal to your kernel's `PAGE_SIZE`, so you may need to specify a block  size when creating the filesystem. -Currently 4 filesystems support `DAX`: ext2, ext4, xfs and virtiofs. +Currently 5 filesystems support `DAX`: ext2, ext4, xfs, virtiofs and erofs.  Enabling `DAX` on them is different. -Enabling DAX on ext2 --------------------- +Enabling DAX on ext2 and erofs +------------------------------  When mounting the filesystem, use the ``-o dax`` option on the command line or  add 'dax' to the options in ``/etc/fstab``.  This works to enable `DAX` on all files diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst index 7119aa213be7..bef6d3040ce4 100644 --- a/Documentation/filesystems/erofs.rst +++ b/Documentation/filesystems/erofs.rst @@ -40,7 +40,7 @@ Here is the main features of EROFS:     Inode metadata size    32 bytes      64 bytes     Max file size          4 GB          16 EB (also limited by max. 
vol size)     Max uids/gids          65536         4294967296 -   File change time       no            yes (64 + 32-bit timestamp) +   Per-inode timestamp    no            yes (64 + 32-bit timestamp)     Max hardlinks          65536         4294967296     Metadata reserved      4 bytes       14 bytes     =====================  ============  ===================================== diff --git a/Documentation/filesystems/ext4/blocks.rst b/Documentation/filesystems/ext4/blocks.rst index bd722ecd92d6..b0f80ea87c90 100644 --- a/Documentation/filesystems/ext4/blocks.rst +++ b/Documentation/filesystems/ext4/blocks.rst @@ -39,7 +39,7 @@ For 32-bit filesystems, limits are as follows:       - 4TiB       - 8TiB       - 16TiB -     - 256PiB +     - 256TiB     * - Blocks Per Block Group       - 8,192       - 16,384 diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst index 4d5d50dca65c..6ccd5efb25b7 100644 --- a/Documentation/filesystems/fscrypt.rst +++ b/Documentation/filesystems/fscrypt.rst @@ -1047,8 +1047,8 @@ astute users may notice some differences in behavior:    may be used to overwrite the source files but isn't guaranteed to be    effective on all filesystems and storage devices. -- Direct I/O is not supported on encrypted files.  Attempts to use -  direct I/O on such files will fall back to buffered I/O. +- Direct I/O is supported on encrypted files only under some +  circumstances.  For details, see `Direct I/O support`_.  - The fallocate operations FALLOC_FL_COLLAPSE_RANGE and    FALLOC_FL_INSERT_RANGE are not supported on encrypted files and will @@ -1179,6 +1179,27 @@ Inline encryption doesn't affect the ciphertext or other aspects of  the on-disk format, so users may freely switch back and forth between  using "inlinecrypt" and not using "inlinecrypt". 
+Direct I/O support +================== + +For direct I/O on an encrypted file to work, the following conditions +must be met (in addition to the conditions for direct I/O on an +unencrypted file): + +* The file must be using inline encryption.  Usually this means that +  the filesystem must be mounted with ``-o inlinecrypt`` and inline +  encryption hardware must be present.  However, a software fallback +  is also available.  For details, see `Inline encryption support`_. + +* The I/O request must be fully aligned to the filesystem block size. +  This means that the file position the I/O is targeting, the lengths +  of all I/O segments, and the memory addresses of all I/O buffers +  must be multiples of this value.  Note that the filesystem block +  size may be greater than the logical block size of the block device. + +If either of the above conditions is not met, then direct I/O on the +encrypted file will fall back to buffered I/O. +  Implementation details  ====================== diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst index 1d831e3cbcb3..8cc536d08f51 100644 --- a/Documentation/filesystems/fsverity.rst +++ b/Documentation/filesystems/fsverity.rst @@ -549,7 +549,7 @@ Pagecache  ~~~~~~~~~  For filesystems using Linux's pagecache, the ``->readpage()`` and -``->readpages()`` methods must be modified to verify pages before they +``->readahead()`` methods must be modified to verify pages before they  are marked Uptodate.  Merely hooking ``->read_iter()`` would be  insufficient, since ``->read_iter()`` is not used for memory maps. @@ -611,7 +611,7 @@ workqueue, and then the workqueue work does the decryption or  verification.  Finally, pages where no decryption or verity error  occurred are marked Uptodate, and the pages are unlocked. -Files on ext4 and f2fs may contain holes.  Normally, ``->readpages()`` +Files on ext4 and f2fs may contain holes.  
Normally, ``->readahead()``  simply zeroes holes and sets the corresponding pages Uptodate; no bios  are issued.  To prevent this case from bypassing fs-verity, these  filesystems use fsverity_verify_page() to verify hole pages. @@ -778,7 +778,7 @@ weren't already directly answered in other parts of this document.      - To prevent bypassing verification, pages must not be marked        Uptodate until they've been verified.  Currently, each        filesystem is responsible for marking pages Uptodate via -      ``->readpages()``.  Therefore, currently it's not possible for +      ``->readahead()``.  Therefore, currently it's not possible for        the VFS to do the verification on its own.  Changing this would        require significant changes to the VFS and all filesystems. diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst index 3f9b1497ebb8..c26d854275a0 100644 --- a/Documentation/filesystems/locking.rst +++ b/Documentation/filesystems/locking.rst @@ -239,10 +239,8 @@ prototypes::  	int (*writepage)(struct page *page, struct writeback_control *wbc);  	int (*readpage)(struct file *, struct page *);  	int (*writepages)(struct address_space *, struct writeback_control *); -	int (*set_page_dirty)(struct page *page); +	bool (*dirty_folio)(struct address_space *, struct folio *folio);  	void (*readahead)(struct readahead_control *); -	int (*readpages)(struct file *filp, struct address_space *mapping, -			struct list_head *pages, unsigned nr_pages);  	int (*write_begin)(struct file *, struct address_space *mapping,  				loff_t pos, unsigned len, unsigned flags,  				struct page **pagep, void **fsdata); @@ -250,21 +248,21 @@ prototypes::  				loff_t pos, unsigned len, unsigned copied,  				struct page *page, void *fsdata);  	sector_t (*bmap)(struct address_space *, sector_t); -	void (*invalidatepage) (struct page *, unsigned int, unsigned int); +	void (*invalidate_folio) (struct folio *, size_t start, size_t len);  	int 
(*releasepage) (struct page *, int);  	void (*freepage)(struct page *);  	int (*direct_IO)(struct kiocb *, struct iov_iter *iter);  	bool (*isolate_page) (struct page *, isolate_mode_t);  	int (*migratepage)(struct address_space *, struct page *, struct page *);  	void (*putback_page) (struct page *); -	int (*launder_page)(struct page *); -	int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long); +	int (*launder_folio)(struct folio *); +	bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);  	int (*error_remove_page)(struct address_space *, struct page *);  	int (*swap_activate)(struct file *);  	int (*swap_deactivate)(struct file *);  locking rules: -	All except set_page_dirty and freepage may block +	All except dirty_folio and freepage may block  ======================	======================== =========	===============  ops			PageLocked(page)	 i_rwsem	invalidate_lock @@ -272,20 +270,19 @@ ops			PageLocked(page)	 i_rwsem	invalidate_lock  writepage:		yes, unlocks (see below)  readpage:		yes, unlocks				shared  writepages: -set_page_dirty		no +dirty_folio		maybe  readahead:		yes, unlocks				shared -readpages:		no					shared  write_begin:		locks the page		 exclusive  write_end:		yes, unlocks		 exclusive  bmap: -invalidatepage:		yes					exclusive +invalidate_folio:	yes					exclusive  releasepage:		yes  freepage:		yes  direct_IO:  isolate_page:		yes  migratepage:		yes (both)  putback_page:		yes -launder_page:		yes +launder_folio:		yes  is_partially_uptodate:	yes  error_remove_page:	yes  swap_activate:		no @@ -300,9 +297,6 @@ completion.  ->readahead() unlocks the pages that I/O is attempted on like ->readpage(). -->readpages() populates the pagecache with the passed pages and starts -I/O against them.  They come unlocked upon I/O completion. -  ->writepage() is used for two purposes: for "memory cleansing" and for  "sync".  These are quite different operations and the behaviour may differ  depending upon the mode. 
@@ -361,22 +355,22 @@ If nr_to_write is NULL, all dirty pages must be written.  writepages should _only_ write pages which are present on  mapping->io_pages. -->set_page_dirty() is called from various places in the kernel -when the target page is marked as needing writeback.  It may be called -under spinlock (it cannot block) and is sometimes called with the page -not locked. +->dirty_folio() is called from various places in the kernel when +the target folio is marked as needing writeback.  The folio cannot be +truncated because either the caller holds the folio lock, or the caller +has found the folio while holding the page table lock which will block +truncation.  ->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some  filesystems and by the swapper. The latter will eventually go away.  Please,  keep it that way and don't breed new callers. -->invalidatepage() is called when the filesystem must attempt to drop +->invalidate_folio() is called when the filesystem must attempt to drop  some or all of the buffers from the page when it is being truncated. It -returns zero on success. If ->invalidatepage is zero, the kernel uses -block_invalidatepage() instead. The filesystem must exclusively acquire -invalidate_lock before invalidating page cache in truncate / hole punch path -(and thus calling into ->invalidatepage) to block races between page cache -invalidation and page cache filling functions (fault, read, ...). +returns zero on success.  The filesystem must exclusively acquire +invalidate_lock before invalidating page cache in truncate / hole punch +path (and thus calling into ->invalidate_folio) to block races between page +cache invalidation and page cache filling functions (fault, read, ...).  ->releasepage() is called when the kernel is about to try to drop the  buffers from the page in preparation for freeing it.  It returns zero to @@ -386,9 +380,9 @@ the kernel assumes that the fs has no private interest in the buffers.  
->freepage() is called when the kernel is done dropping the page  from the page cache. -->launder_page() may be called prior to releasing a page if -it is still found to be dirty. It returns zero if the page was successfully -cleaned, or an error value if not. Note that in order to prevent the page +->launder_folio() may be called prior to releasing a folio if +it is still found to be dirty. It returns zero if the folio was successfully +cleaned, or an error value if not. Note that in order to prevent the folio  getting mapped back in and redirtied, it needs to be kept locked  across the entire operation. @@ -438,13 +432,13 @@ prototypes::  locking rules:  ======================	=============	=================	========= -ops			inode->i_lock	blocked_lock_lock	may block +ops			   flc_lock  	blocked_lock_lock	may block  ======================	=============	=================	========= -lm_notify:		yes		yes			no +lm_notify:		no      	yes			no  lm_grant:		no		no			no  lm_break:		yes		no			no  lm_change		yes		no			no -lm_breaker_owns_lease:	no		no			no +lm_breaker_owns_lease:	yes     	no			no  ======================	=============	=================	=========  buffer_head diff --git a/Documentation/filesystems/netfs_library.rst b/Documentation/filesystems/netfs_library.rst index 136f8da3d0e2..69f00179fdfe 100644 --- a/Documentation/filesystems/netfs_library.rst +++ b/Documentation/filesystems/netfs_library.rst @@ -7,6 +7,8 @@ Network Filesystem Helper Library  .. Contents:   - Overview. + - Per-inode context. +   - Inode context helper functions.   - Buffered read helpers.     - Read helper functions.     - Read helper structures. @@ -28,6 +30,69 @@ Note that the library module doesn't link against local caching directly, so  access must be provided by the netfs. +Per-Inode Context +================= + +The network filesystem helper library needs a place to store a bit of state for +its use on each netfs inode it is helping to manage.  
To this end, a context +structure is defined:: + +	struct netfs_i_context { +		const struct netfs_request_ops *ops; +		struct fscache_cookie	*cache; +	}; + +A network filesystem that wants to use netfs lib must place one of these +directly after the VFS ``struct inode`` it allocates, usually as part of its +own struct.  This can be done in a way similar to the following:: + +	struct my_inode { +		struct { +			/* These must be contiguous */ +			struct inode		vfs_inode; +			struct netfs_i_context  netfs_ctx; +		}; +		... +	}; + +This allows netfslib to find its state by simple offset from the inode pointer, +thereby allowing the netfslib helper functions to be pointed to directly by the +VFS/VM operation tables. + +The structure contains the following fields: + + * ``ops`` + +   The set of operations provided by the network filesystem to netfslib. + + * ``cache`` + +   Local caching cookie, or NULL if no caching is enabled.  This field does not +   exist if fscache is disabled. + + +Inode Context Helper Functions +------------------------------ + +To help deal with the per-inode context, a number helper functions are +provided.  
Firstly, a function to perform basic initialisation on a context and +set the operations table pointer:: + +	void netfs_i_context_init(struct inode *inode, +				  const struct netfs_request_ops *ops); + +then two functions to cast between the VFS inode structure and the netfs +context:: + +	struct netfs_i_context *netfs_i_context(struct inode *inode); +	struct inode *netfs_inode(struct netfs_i_context *ctx); + +and finally, a function to get the cache cookie pointer from the context +attached to an inode (or NULL if fscache is disabled):: + +	struct fscache_cookie *netfs_i_cookie(struct inode *inode); + +  Buffered Read Helpers  ===================== @@ -70,38 +135,22 @@ Read Helper Functions  Three read helpers are provided:: -	void netfs_readahead(struct readahead_control *ractl, -			     const struct netfs_read_request_ops *ops, -			     void *netfs_priv); +	void netfs_readahead(struct readahead_control *ractl);  	int netfs_readpage(struct file *file, -			   struct folio *folio, -			   const struct netfs_read_request_ops *ops, -			   void *netfs_priv); +			   struct page *page);  	int netfs_write_begin(struct file *file,  			      struct address_space *mapping,  			      loff_t pos,  			      unsigned int len,  			      unsigned int flags,  			      struct folio **_folio, -			      void **_fsdata, -			      const struct netfs_read_request_ops *ops, -			      void *netfs_priv); - -Each corresponds to a VM operation, with the addition of a couple of parameters -for the use of the read helpers: - - * ``ops`` - -   A table of operations through which the helpers can talk to the filesystem. +			      void **_fsdata); - * ``netfs_priv`` - -   Filesystem private data (can be NULL). - -Both of these values will be stored into the read request structure. +Each corresponds to a VM address space operation.  These operations use the +state in the per-inode context. 
-For ->readahead() and ->readpage(), the network filesystem should just jump -into the corresponding read helper; whereas for ->write_begin(), it may be a +For ->readahead() and ->readpage(), the network filesystem just point directly +at the corresponding read helper; whereas for ->write_begin(), it may be a  little more complicated as the network filesystem might want to flush  conflicting writes or track dirty data and needs to put the acquired folio if  an error occurs after calling the helper. @@ -116,7 +165,7 @@ occurs, the request will get partially completed if sufficient data is read.  Additionally, there is:: -  * void netfs_subreq_terminated(struct netfs_read_subrequest *subreq, +  * void netfs_subreq_terminated(struct netfs_io_subrequest *subreq,  				 ssize_t transferred_or_error,  				 bool was_async); @@ -132,7 +181,7 @@ Read Helper Structures  The read helpers make use of a couple of structures to maintain the state of  the read.  The first is a structure that manages a read request as a whole:: -	struct netfs_read_request { +	struct netfs_io_request {  		struct inode		*inode;  		struct address_space	*mapping;  		struct netfs_cache_resources cache_resources; @@ -140,7 +189,7 @@ the read.  The first is a structure that manages a read request as a whole::  		loff_t			start;  		size_t			len;  		loff_t			i_size; -		const struct netfs_read_request_ops *netfs_ops; +		const struct netfs_request_ops *netfs_ops;  		unsigned int		debug_id;  		...  	}; @@ -187,8 +236,8 @@ The above fields are the ones the netfs can use.  
They are:  The second structure is used to manage individual slices of the overall read  request:: -	struct netfs_read_subrequest { -		struct netfs_read_request *rreq; +	struct netfs_io_subrequest { +		struct netfs_io_request *rreq;  		loff_t			start;  		size_t			len;  		size_t			transferred; @@ -244,32 +293,26 @@ Read Helper Operations  The network filesystem must provide the read helpers with a table of operations  through which it can issue requests and negotiate:: -	struct netfs_read_request_ops { -		void (*init_rreq)(struct netfs_read_request *rreq, struct file *file); -		bool (*is_cache_enabled)(struct inode *inode); -		int (*begin_cache_operation)(struct netfs_read_request *rreq); -		void (*expand_readahead)(struct netfs_read_request *rreq); -		bool (*clamp_length)(struct netfs_read_subrequest *subreq); -		void (*issue_op)(struct netfs_read_subrequest *subreq); -		bool (*is_still_valid)(struct netfs_read_request *rreq); +	struct netfs_request_ops { +		void (*init_request)(struct netfs_io_request *rreq, struct file *file); +		int (*begin_cache_operation)(struct netfs_io_request *rreq); +		void (*expand_readahead)(struct netfs_io_request *rreq); +		bool (*clamp_length)(struct netfs_io_subrequest *subreq); +		void (*issue_read)(struct netfs_io_subrequest *subreq); +		bool (*is_still_valid)(struct netfs_io_request *rreq);  		int (*check_write_begin)(struct file *file, loff_t pos, unsigned len,  					 struct folio *folio, void **_fsdata); -		void (*done)(struct netfs_read_request *rreq); +		void (*done)(struct netfs_io_request *rreq);  		void (*cleanup)(struct address_space *mapping, void *netfs_priv);  	};  The operations are as follows: - * ``init_rreq()`` + * ``init_request()``     [Optional] This is called to initialise the request structure.  It is given     the file for reference and can modify the ->netfs_priv value. - * ``is_cache_enabled()`` - -   [Required] This is called by netfs_write_begin() to ask if the file is being -   cached.  
It should return true if it is being cached and false otherwise. -   * ``begin_cache_operation()``     [Optional] This is called to ask the network filesystem to call into the @@ -305,7 +348,7 @@ The operations are as follows:     This should return 0 on success and an error code on error. - * ``issue_op()`` + * ``issue_read()``     [Required] The helpers use this to dispatch a subrequest to the server for     reading.  In the subrequest, ->start, ->len and ->transferred indicate what @@ -420,12 +463,12 @@ The network filesystem's ->begin_cache_operation() method is called to set up a  cache and this must call into the cache to do the work.  If using fscache, for  example, the cache would call:: -	int fscache_begin_read_operation(struct netfs_read_request *rreq, +	int fscache_begin_read_operation(struct netfs_io_request *rreq,  					 struct fscache_cookie *cookie);  passing in the request pointer and the cookie corresponding to the file. -The netfs_read_request object contains a place for the cache to hang its +The netfs_io_request object contains a place for the cache to hang its  state::  	struct netfs_cache_resources { @@ -443,7 +486,7 @@ operation table looks like the following::  		void (*expand_readahead)(struct netfs_cache_resources *cres,  					 loff_t *_start, size_t *_len, loff_t i_size); -		enum netfs_read_source (*prepare_read)(struct netfs_read_subrequest *subreq, +		enum netfs_io_source (*prepare_read)(struct netfs_io_subrequest *subreq,  						       loff_t i_size);  		int (*read)(struct netfs_cache_resources *cres, @@ -462,6 +505,10 @@ operation table looks like the following::  			     struct iov_iter *iter,  			     netfs_io_terminated_t term_func,  			     void *term_func_priv); + +		int (*query_occupancy)(struct netfs_cache_resources *cres, +				       loff_t start, size_t len, size_t granularity, +				       loff_t *_data_start, size_t *_data_len);  	};  With a termination handler function pointer:: @@ -536,6 +583,18 @@ The methods defined in 
the table are:     indicating whether the termination is definitely happening in the caller's     context. + * ``query_occupancy()`` + +   [Required] Called to find out where the next piece of data is within a +   particular region of the cache.  The start and length of the region to be +   queried are passed in, along with the granularity to which the answer needs +   to be aligned.  The function passes back the start and length of the data, +   if any, available within that region.  Note that there may be a hole at the +   front. + +   It returns 0 if some data was found, -ENODATA if there was no usable data +   within the region or -ENOBUFS if there is no caching on this file. +  Note that these methods are passed a pointer to the cache resource structure,  not the read request structure as they could be used in other situations where  there isn't a read request structure as well, such as writing dirty data to the @@ -546,4 +605,5 @@ API Function Reference  ======================  .. kernel-doc:: include/linux/netfs.h -.. kernel-doc:: fs/netfs/read_helper.c +.. kernel-doc:: fs/netfs/buffered_read.c +.. kernel-doc:: fs/netfs/io.c diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index bf19fd6b86e7..7c1583dbeb59 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -45,6 +45,12 @@ typically between calling iget_locked() and unlocking the inode.  At some point that will become mandatory. +**mandatory** + +The foo_inode_info should always be allocated through alloc_inode_sb() rather +than kmem_cache_alloc() or kmalloc() related to set up the inode reclaim context +correctly. +  ---  **mandatory** diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index bf5c48066fac..794bd1a66bfb 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -658,7 +658,7 @@ pages, however the address_space has finer control of write sizes.  
The read process essentially only requires 'readpage'.  The write  process is more complicated and uses write_begin/write_end or -set_page_dirty to write data into the address_space, and writepage and +dirty_folio to write data into the address_space, and writepage and  writepages to writeback data to storage.  Adding and removing pages to/from an address_space is protected by the @@ -724,10 +724,8 @@ cache in your filesystem.  The following members are defined:  		int (*writepage)(struct page *page, struct writeback_control *wbc);  		int (*readpage)(struct file *, struct page *);  		int (*writepages)(struct address_space *, struct writeback_control *); -		int (*set_page_dirty)(struct page *page); +		bool (*dirty_folio)(struct address_space *, struct folio *);  		void (*readahead)(struct readahead_control *); -		int (*readpages)(struct file *filp, struct address_space *mapping, -				 struct list_head *pages, unsigned nr_pages);  		int (*write_begin)(struct file *, struct address_space *mapping,  				   loff_t pos, unsigned len, unsigned flags,  				struct page **pagep, void **fsdata); @@ -735,7 +733,7 @@ cache in your filesystem.  The following members are defined:  				 loff_t pos, unsigned len, unsigned copied,  				 struct page *page, void *fsdata);  		sector_t (*bmap)(struct address_space *, sector_t); -		void (*invalidatepage) (struct page *, unsigned int, unsigned int); +		void (*invalidate_folio) (struct folio *, size_t start, size_t len);  		int (*releasepage) (struct page *, int);  		void (*freepage)(struct page *);  		ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter); @@ -745,10 +743,10 @@ cache in your filesystem.  
The following members are defined:  		int (*migratepage) (struct page *, struct page *);  		/* put migration-failed page back to right list */  		void (*putback_page) (struct page *); -		int (*launder_page) (struct page *); +		int (*launder_folio) (struct folio *); -		int (*is_partially_uptodate) (struct page *, unsigned long, -					      unsigned long); +		bool (*is_partially_uptodate) (struct folio *, size_t from, +					       size_t count);  		void (*is_dirty_writeback) (struct page *, bool *, bool *);  		int (*error_remove_page) (struct mapping *mapping, struct page *page);  		int (*swap_activate)(struct file *); @@ -793,34 +791,29 @@ cache in your filesystem.  The following members are defined:  	This will choose pages from the address space that are tagged as  	DIRTY and will pass them to ->writepage. -``set_page_dirty`` -	called by the VM to set a page dirty.  This is particularly -	needed if an address space attaches private data to a page, and -	that data needs to be updated when a page is dirtied.  This is +``dirty_folio`` +	called by the VM to mark a folio as dirty.  This is particularly +	needed if an address space attaches private data to a folio, and +	that data needs to be updated when a folio is dirtied.  This is  	called, for example, when a memory mapped page gets modified. -	If defined, it should set the PageDirty flag, and the -	PAGECACHE_TAG_DIRTY tag in the radix tree. +	If defined, it should set the folio dirty flag, and the +	PAGECACHE_TAG_DIRTY search mark in i_pages.  ``readahead``  	Called by the VM to read pages associated with the address_space  	object.  The pages are consecutive in the page cache and are  	locked.  The implementation should decrement the page refcount  	after starting I/O on each page.  Usually the page will be -	unlocked by the I/O completion handler.  If the filesystem decides -	to stop attempting I/O before reaching the end of the readahead -	window, it can simply return.  
The caller will decrement the page -	refcount and unlock the remaining pages for you.  Set PageUptodate -	if the I/O completes successfully.  Setting PageError on any page -	will be ignored; simply unlock the page if an I/O error occurs. - -``readpages`` -	called by the VM to read pages associated with the address_space -	object.  This is essentially just a vector version of readpage. -	Instead of just one page, several pages are requested. -	readpages is only used for read-ahead, so read errors are -	ignored.  If anything goes wrong, feel free to give up. -	This interface is deprecated and will be removed by the end of -	2020; implement readahead instead. +	unlocked by the I/O completion handler.  The set of pages are +	divided into some sync pages followed by some async pages, +	rac->ra->async_size gives the number of async pages.  The +	filesystem should attempt to read all sync pages but may decide +	to stop once it reaches the async pages.  If it does decide to +	stop attempting I/O, it can simply return.  The caller will +	remove the remaining pages from the address space, unlock them +	and decrement the page refcount.  Set PageUptodate if the I/O +	completes successfully.  Setting PageError on any page will be +	ignored; simply unlock the page if an I/O error occurs.  ``write_begin``  	Called by the generic buffered write code to ask the filesystem @@ -868,15 +861,15 @@ cache in your filesystem.  The following members are defined:  	to find out where the blocks in the file are and uses those  	addresses directly. -``invalidatepage`` -	If a page has PagePrivate set, then invalidatepage will be -	called when part or all of the page is to be removed from the +``invalidate_folio`` +	If a folio has private data, then invalidate_folio will be +	called when part or all of the folio is to be removed from the  	address space.  
This generally corresponds to either a  	truncation, punch hole or a complete invalidation of the address  	space (in the latter case 'offset' will always be 0 and 'length' -	will be PAGE_SIZE).  Any private data associated with the page +	will be folio_size()).  Any private data associated with the page  	should be updated to reflect this truncation.  If offset is 0 -	and length is PAGE_SIZE, then the private data should be +	and length is folio_size(), then the private data should be  	released, because the page must be able to be completely  	discarded.  This may be done by calling the ->releasepage  	function, but in this case the release MUST succeed. @@ -930,16 +923,16 @@ cache in your filesystem.  The following members are defined:  ``putback_page``  	Called by the VM when isolated page's migration fails. -``launder_page`` -	Called before freeing a page - it writes back the dirty page. -	To prevent redirtying the page, it is kept locked during the +``launder_folio`` +	Called before freeing a folio - it writes back the dirty folio. +	To prevent redirtying the folio, it is kept locked during the  	whole operation.  ``is_partially_uptodate``  	Called by the VM when reading a file through the pagecache when -	the underlying blocksize != pagesize.  If the required block is -	up to date then the read can complete without needing the IO to -	bring the whole page up to date. +	the underlying blocksize is smaller than the size of the folio. +	If the required block is up to date then the read can complete +	without needing I/O to bring the whole page up to date.  ``is_dirty_writeback``  	Called by the VM when attempting to reclaim a page.  The VM uses | 
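
The fscrypt.rst hunk above adds an alignment rule for direct I/O on encrypted files: the file position, the length of every I/O segment, and the address of every I/O buffer must be multiples of the filesystem block size. The following is a minimal user-space sketch of that check, not kernel code. The function name `dio_request_is_block_aligned` and its parameters are hypothetical, and the caller is assumed to already know the filesystem block size. It covers only the alignment condition, not the inline-encryption condition, and per the documentation a request that fails the check is not rejected but falls back to buffered I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not a kernel or libc API): returns true when a
 * request meets the alignment rule described in the fscrypt.rst hunk,
 * i.e. the file position, each segment length and each buffer address
 * are all multiples of fs_block_size.
 *
 * The buffers are never dereferenced; only their addresses are inspected.
 */
bool dio_request_is_block_aligned(off_t pos, const struct iovec *iov,
				  int iovcnt, size_t fs_block_size)
{
	/* A zero block size would make the modulo operations undefined. */
	assert(fs_block_size != 0);

	/* File position the I/O is targeting. */
	if (pos < 0 || (uint64_t)pos % fs_block_size != 0)
		return false;

	for (int i = 0; i < iovcnt; i++) {
		/* Length of this I/O segment. */
		if (iov[i].iov_len % fs_block_size != 0)
			return false;
		/* Memory address of this I/O buffer. */
		if ((uintptr_t)iov[i].iov_base % fs_block_size != 0)
			return false;
	}
	return true;
}
```

Note that the block size passed in should be the filesystem block size, which, as the hunk points out, may be greater than the logical block size of the underlying block device.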
