summaryrefslogtreecommitdiff
path: root/fs/nfs/pnfs.h
AgeCommit message (Collapse)AuthorFilesLines
2012-07-16NFSv4.1 mark layout when already returnedAndy Adamson1-0/+19
When the file layout driver is fencing a DS, _pnfs_return_layout can be called mulitple times per inode due to in-flight i/o referencing lsegs on it's plh_segs list. Remember that LAYOUTRETURN has been called, and do not call it again. Allow LAYOUTRETURNs after a subsequent LAYOUTGET. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29NFS: Create an write_pageio_init() functionBryan Schumaker1-3/+3
pNFS needs to select a write function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing writes. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29NFS: Create an read_pageio_init() functionBryan Schumaker1-3/+3
pNFS needs to select a read function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing reads. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-07NFS4: Fix open bug when pnfs module blacklistedFred Isaman1-1/+1
Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25NFSv4.1 cache mdsthreshold values on OPENAndy Adamson1-0/+21
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-20NFSv4.1 resend LAYOUTGET on data server invalid layout errorsAndy Adamson1-0/+1
The "invalid layout" class of errors is handled by destroying the layout and getting a new layout from the server. Currently, the layout must be destroyed before a new layout can be obtained. This means that all references (e.g.lsegs) to the "to be destroyed" layout header must be dropped before it can be destroyed. This in turn means waiting for all in flight RPC's using the old layout as well as draining the data server session slot table wait queue. Set the NFS_LAYOUT_INVALID flag to redirect I/O to the MDS while waiting for the old layout to be destroyed. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-20NFSv4.1: mark deviceid invalid on filelayout DS connection errorsAndy Adamson1-0/+4
This prevents the use of any layout for i/o that references the deviceid. I/O is redirected through the MDS. Redirect the unhandled failed I/O to the MDS without marking either the layout or the deviceid invalid. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-30NFS: pnfs_pageio_init_read() and init_write() need an extra argumentBryan Schumaker1-2/+4
This is only when CONFIG_NFS_V4_1 isn't enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Acked-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27NFS: rewrite directio write to use async coalesce codeFred Isaman1-0/+17
This also has the advantage that it allows directio to use pnfs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27NFS: create struct nfs_commit_infoFred Isaman1-24/+48
It is COMMIT that is handled the most differently between the paged and direct paths. Create a structure that encapsulates everything either path needs to know about the commit state. We could use void to hide some of the layout driver stuff, but Trond suggests pulling it out to ensure type checking, given the huge changes being made, and the fact that it doesn't interfere with other drivers. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27NFS: create completion structure to pass into page_init functionsFred Isaman1-2/+4
Factors out the code that will need to change when directio starts using these code paths. This will allow directio to use the generic pagein and flush routines Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21NFS: Fix more NFS debug related build warningsTrond Myklebust1-1/+7
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20NFS: Use cond_resched_lock() to reduce latencies in the commit scansTrond Myklebust1-4/+4
Ensure that we conditionally drop the inode->i_lock when it is safe to do so in the commit loops. We do so after locking the nfs_page, but before removing it from the commit list. We can then use list_safe_reset_next to recover the loop after the lock is retaken. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-17NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit codeTrond Myklebust1-27/+28
Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
2012-03-11NFS: remove nfs_inode radix treeFred Isaman1-40/+42
The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-27NFSv4.1: Don't call nfs4_deviceid_purge_client() unless we're NFSv4.1Trond Myklebust1-3/+0
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-06pnfs-obj: Must return layout on IO errorBoaz Harrosh1-0/+1
As mandated by the standard. In case of an IO error, a pNFS objects layout driver must return it's layout. This is because all device errors are reported to the server as part of the layout return buffer. This is implemented the same way PNFS_LAYOUTRET_ON_SETATTR is done, through a bit flag on the pnfs_layoutdriver_type->flags member. The flag is set by the layout driver that wants a layout_return preformed at pnfs_ld_{write,read}_done in case of an error. (Though I have not defined a wrapper like pnfs_ld_layoutret_on_setattr because this code is never called outside of pnfs.c and pnfs IO paths) Without this patch 3.[0-2] Kernels leak memory and have an annoying WARN_ON after every IO error utilizing the pnfs-obj driver. [This patch is for 3.2 Kernel. 3.1/0 Kernels need a different patch] CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18pnfs: recoalesce when ld read pagelist failsPeng Tao1-1/+1
For pnfs pagelist read failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18pnfs: recoalesce when ld write pagelist failsPeng Tao1-1/+1
For pnfs pagelist write failure, we need to pg_recoalesce and resend IO to mds. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18pnfs: make _set_lo_fail genericPeng Tao1-0/+1
file layout and block layout both use it to set mark layout io failure bit. So make it generic. Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Cc: stable@kernel.org [3.0] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-01NFS: Re-enable compilation of nfs with !CONFIG_NFS_V4 || !CONFIG_NFS_V4_1Trond Myklebust1-1/+1
Fix two recently introduced compile problems: Fix a typo in fs/nfs/pnfs.h Move the pnfs_blksize declaration outside the CONFIG_NFS_V4 section in struct nfs_server. Reported-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-31pnfsblock: call and parse getdevicelistFred Isaman1-1/+0
Call GETDEVICELIST during mount, then call and parse GETDEVICEINFO for each device returned. [pnfsblock: get rid of deprecated xdr macros] Signed-off-by: Jim Rees <rees@umich.edu> [pnfsblock: fix pnfs_deviceid references] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsblock: fix print format warnings for sector_t and size_t] [pnfs-block: #include <linux/vmalloc.h>] [pnfsblock: no PNFS_NFS_SERVER] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfsblock: fix bug determining size of striped volume] [pnfsblock: fix oops when using multiple devices] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> [pnfsblock: get rid of vmap and deviceid->area structure] Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: cleanup_layoutcommitAndy Adamson1-0/+3
This gives layout driver a chance to cleanup structures they put in at encode_layoutcommit. Signed-off-by: Andy Adamson <andros@netapp.com> [fixup layout header pointer for layoutcommit] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()] Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: add set-clear layoutdriver interfaceBenny Halevy1-2/+6
To allow layout driver to issue getdevicelist at mount time, and clean up at umount time. [fixup non NFS_V4_1 set_pnfs_layoutdriver definition] [pnfs: pass mntfh down the init_pnfs path] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: GETDEVICELISTAndy Adamson1-0/+12
The block driver uses GETDEVICELIST Signed-off-by: Andy Adamson <andros@netapp.com> [pass struct nfs_server * to getdevicelist] [get machince creds for getdevicelist] [fix getdevicelist decode sizing] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: let layoutcommit handle a list of lsegPeng Tao1-0/+2
There can be multiple lseg per file, so layoutcommit should be able to handle it. [Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: save layoutcommit cred at layout header initPeng Tao1-1/+1
No need to save it for every lseg. No need to save it at every pnfs_set_layoutcommit. [Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31pnfs: save layoutcommit lwb at layout headerPeng Tao1-1/+1
No need to save it for every lseg. [Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-15NFS: Move the pnfs write code into pnfs.cTrond Myklebust1-9/+1
...and ensure that we recoalese to take into account differences in differences in block sizes when falling back to write through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-15NFS: Move the pnfs read code into pnfs.cTrond Myklebust1-9/+1
...and ensure that we recoalese to take into account differences in block sizes when falling back to read through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12NFSv4.1: do not use deviceids after MDS clientid invalidationAndy Adamson1-0/+8
Mark all deviceids established under an expired MDS clientid as invalid. Stop all new i/o through DS and send through the MDS. Don't use any new LAYOUTGETs that use the invalid deviceid. Purge all layouts established under the expired MDS clientid. Remove the MDS clientid deviceid and data servers reference Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12NFSv4.1: File layout only supports whole file layoutsAndy Adamson1-0/+6
Ask for whole file layouts. Until support for layout segments is fully supported in the file layout code, discard non-whole file layouts. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12NFSv4.1: Clean ups for the device id cacheTrond Myklebust1-1/+0
The fact that the global device id cache holds a reference to the nfs4_deviceid_node until it is invisible to rcu lookups implies that we can always assume that the reference count is non-zero in _find_get_deviceid. Also clean up nfs4_put_deviceid_node and the removal of the device id from the cache. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12NFSv4.1: Add an initialisation callback for pNFSTrond Myklebust1-12/+2
Ensure that we always get a layout before setting up the i/o request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12NFS: Cleanup of the nfs_pageio code in preparation for a pnfs bugfixTrond Myklebust1-12/+13
We need to ensure that the layouts are set up before we can decide to coalesce requests. To do so, we want to further split up the struct nfs_pageio_descriptor operations into an initialisation callback, a coalescing test callback, and a 'do i/o' callback. This patch cleans up the existing callback methods before adding the 'initialisation' callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15NFS: fix umount of pnfs filesystemsWeston Andros Adamson1-0/+1
Unmounting a pnfs filesystem hangs using filelayout and possibly others. This fixes the use of the rcu protected node by making use of a new 'tmpnode' for the temporary purge list. Also, the spinlock shouldn't be held when calling synchronize_rcu(). Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-29NFSv4.1: use pnfs_generic_pg_test directly by layout driverBenny Halevy1-2/+4
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29NFSv4.1: change pg_test return type to boolBenny Halevy1-2/+2
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29NFSv4.1: unify pnfs_pageio_init functionsBenny Halevy1-11/+10
Use common code for pnfs_pageio_init_{read,write} and use a common generic pg_test function. Note that this function always assumes the the layout driver's pg_test method is implemented. [Fix BUG] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: encode_layoutcommitBenny Halevy1-0/+4
Add a layout driver method to encode the layout type specific opaque part of layout commit in-line in the xdr stream. Currently, the pnfs-objects layout driver uses it to encode metadata hints to the MDS and the blocks layout driver to commit provisionally allocated extents to the file. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: encode_layoutreturnAndy Adamson1-0/+4
Add a layout driver method to encode the layout type specific opaque part of layout return in-line in the xdr stream. Currently the pnfs-objects layout driver uses it to encode i/o error information on LAYOUTRETURN. Signed-off-by: Andy Adamson <andros@netapp.com> [fixup layout header pointer for encode_layoutreturn] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: layoutret_on_setattrBenny Halevy1-0/+22
With the objects layout security model, we have object capabilities that are associated with the layout and we anticipate that the server will issue a cb_layoutrecall for any setattr that changes security related attributes (user/group/mode/acl) or truncates the file. Therefore, the layout is returned before issuing the setattr to avoid the anticipated cb_layoutrecall. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: layoutreturnBenny Halevy1-0/+18
NFSv4.1 LAYOUTRETURN implementation Currently, does not support layout-type payload encoding. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> [call pnfs_return_layout right before pnfs_destroy_layout] [remove assert_spin_locked from pnfs_clear_lseg_list] [remove wait parameter from the layoutreturn path.] [remove return_type field from nfs4_layoutreturn_args] [remove range from nfs4_layoutreturn_args] [no need to send layoutcommit from _pnfs_return_layout] [don't wait on sync layoutreturn] [fix layout stateid in layoutreturn args] [fixed NULL deref in _pnfs_return_layout] [removed recaim member of nfs4_layoutreturn_args] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: support for non-rpc layout driversBenny Halevy1-0/+2
Non-rpc layout driver such as for objects and blocks implement their own I/O path and error handling logic. Therefore bypass NFS-based error handling for these layout drivers. [fix lseg ref-count bugs, and null de-refs] [Fall out from: non-rpc layout drivers] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [get rid of PNFS_USE_RPC_CODE] [get rid of __nfs4_write_done_cb] [revert useless change in nfs4_write_done_cb] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: alloc and free layout_hdr layoutdriver methodsBenny Halevy1-0/+4
[gfp_flags] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: Use byte-range for cb_layoutrecallBenny Halevy1-1/+1
Use recalled range to invalidate particular layout segments in the layout cache. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: Use byte-range for layoutgetBenny Halevy1-2/+4
Add offset and count parameters to pnfs_update_layout and use them to get the layout in the pageio path. Order cache layout segments in the following order: * offset (ascending) * length (descending) * iomode (RW before READ) Test byte range against the layout segment in use in pnfs_{read,write}_pg_test so not to coalesce pages not using the same layout segment. [fix lseg ordering] [clean up pnfs_find_lseg lseg arg] [remove unnecessary FIXME] [fix ordering in pnfs_insert_layout] [clean up pnfs_insert_layout] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29NFSv4.1: use layout driver in global device cacheBenny Halevy1-3/+3
pnfs deviceids are unique per server, per layout type. struct nfs_client is currently used to distinguish deviceids from different nfs servers, yet these may clash between different layout types on the same server. Therefore, use the layout driver associated with each deviceid at insertion time to look it up, unhash, or delete it. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29pnfs: CB_NOTIFY_DEVICEIDMarc Eshel1-0/+1
Note: This functionlaity is incomplete as all layout segments referring to the 'to be removed device id' need to be reaped, and all in flight I/O drained. [use be32 res in nfs4_callback_devicenotify] [use nfs_client to qualify deviceid for cb_notify_deviceid] [use global deviceid cache for CB_NOTIFY_DEVICEID] [refactor device cache _lookup_deviceid] [refactor device cache _find_get_deviceid] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Bug in new global-device-cache code] [layout_driver MUST set free_deviceid_node if using dev-cache] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29NFSv4.1: purge deviceid cache on nfs_free_clientBenny Halevy1-0/+11
Use the pnfs_layoutdriver_type both as a qualifier for the deviceid, distinguishing deviceid from different layout types on the server, and for freeing the layout-driver allocated structure containing the nfs4_deviceid_node. [BUG in _deviceid_purge_client] [layout_driver MUST set free_deviceid_node if using dev-cache] [let ver < 4.1 compile] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [removed EXPORT_SYMBOL_GPL(nfs4_deviceid_purge_client)] Signed-off-by: Benny Halevy <bhalevy@panasas.com>