Age | Commit message | Author | Files | Lines
2016-12-02 | NFSv4: enhance nfs4_copy_lock_stateid to use a flock stateid if there is one | NeilBrown | 1 | -5/+9
A process can have two possible lock owners for a given open file: a per-process Posix lock owner and a per-open-file flock owner. Use both of these when searching for a suitable stateid to use. With this patch, READ/WRITE requests will use the correct stateid if a flock lock is active. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
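The two-owner search described above can be sketched roughly as follows. This is a simplified, self-contained model and not the kernel's code: `struct lock_state`, `find_lock_state()`, and the integer `stateid` are hypothetical stand-ins, and preferring the Posix owner over the flock owner when both hold a lock state is an assumption made for the illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for per-owner lock state. */
struct lock_state {
    const void *owner;         /* opaque lock-owner identity */
    unsigned int stateid;      /* stand-in for an NFSv4 lock stateid */
    struct lock_state *next;   /* next lock state on the same open state */
};

/*
 * Return the lock state belonging to posix_owner if one exists; otherwise
 * the first one belonging to flock_owner; otherwise NULL, in which case a
 * caller would fall back to some other stateid.
 * Assumption: the Posix owner is preferred when both match.
 */
static const struct lock_state *
find_lock_state(const struct lock_state *head,
                const void *posix_owner, const void *flock_owner)
{
    const struct lock_state *fallback = NULL;

    for (const struct lock_state *pos = head; pos != NULL; pos = pos->next) {
        if (pos->owner == posix_owner)
            return pos;
        if (pos->owner == flock_owner && fallback == NULL)
            fallback = pos;
    }
    return fallback;
}
```

Before this change only one owner was consulted, so in terms of this model a lock state held by the flock owner would not have been found for READ/WRITE.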
2016-12-02 | NFSv4: change nfs4_select_rw_stateid to take a lock_context inplace of lock_owner | NeilBrown | 3 | -16/+12
The only time that a lock_context is not immediately available is in setattr, and now that it has an open_context, it can easily find one with nfs_get_lock_context. This removes the need for the on-stack nfs_lockowner. This change is preparation for correctly supporting flock stateids. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: change nfs4_do_setattr to take an open_context instead of a nfs4_state. | NeilBrown | 1 | -15/+13
The open_context can always lead directly to the state, and is always easily available, so this is a straightforward change. Doing this makes more information available to _nfs4_do_setattr() for use in the next patch. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: add flock_owner to open context | NeilBrown | 5 | -8/+12
An open file description (struct file) in a given process can be associated with two different lock owners. It can have a Posix lock owner which will be different in each process that has a fd on the file. It can have a Flock owner which will be the same in all processes. When searching for a lock stateid to use, we need to consider both of these owners. So add a new "flock_owner" to the "nfs_open_context" (of which there is one for each open file description). This flock_owner does not need to be reference-counted as there is a 1-1 relation between 'struct file' and nfs open contexts, and it will never be part of a list of contexts. So there is no need for a 'flock_context' - just the owner is enough. The io_count included in the (Posix) lock_context provides no guarantee that all read-aheads that could use the state have completed, so not supporting it for flock locks is not a serious problem. Synchronization between flock and read-ahead can be added later if needed. When creating an open_context for a non-opening create call, we don't have a 'struct file' to pass in, so the lock context gets initialized with a NULL owner, but this will never be used. The flock_owner is not used at all in this patch, that will come later. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFS: remove l_pid field from nfs_lockowner | NeilBrown | 5 | -9/+2
This field is not used in any important way and probably should have been removed by Commit: 8003d3c4aaa5 ("nfs4: treat lock owners as opaque values") which removed the pid argument from nfs4_get_lock_state. Except in unusual and uninteresting cases, two threads with the same ->tgid will have the same ->files pointer, so keeping them both for comparison brings no benefit. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFS: Remove unused argument from nfs_direct_write_complete() | Anna Schumaker | 1 | -6/+6
This parameter hasn't been used since 2a009ec9 (Linux 3.13-rc3), so let's remove it from this function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFS: Remove unused authflavour parameter from nfs_get_client() | Anna Schumaker | 8 | -43/+23
This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so let's remove it from this function and callers. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | nfs: fix false positives in nfs40_walk_client_list() | J. Bruce Fields | 1 | -1/+21
It's possible that two different servers can return the same (clientid, verifier) pair purely by coincidence. Both are 64-bit values, but depending on the server implementation, they can be highly predictable and collisions may be quite likely, especially when there are lots of servers. So, check for this case. If the clientid and verifier both match, then we actually know they *can't* be the same server, since a new SETCLIENTID to an already-known server should have changed the verifier. This helps fix a bug that could cause the client to mount a filesystem from the wrong server. Reviewed-by: Jeff Layton <jlayton@redhat.com> Tested-by: Yongcheng Yang <yoyang@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
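The reasoning in this message can be modelled in a few lines. The sketch below is illustrative only and does not reproduce nfs40_walk_client_list(): `struct client_ident` and `could_be_same_server()` are hypothetical names, and the real code performs further confirmation that is omitted here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VERIFIER_SIZE 8   /* the verifier is a 64-bit opaque value */

/* Hypothetical record of what a server returned from SETCLIENTID. */
struct client_ident {
    uint64_t clientid;
    unsigned char confirm[VERIFIER_SIZE];
};

/*
 * Decide whether a freshly obtained identity might belong to a server we
 * already know. A differing clientid rules the match out. An identical
 * (clientid, verifier) pair also rules it out: a new SETCLIENTID sent to an
 * already-known server should have changed the verifier, so a full match
 * can only be a coincidence between two different servers.
 */
static bool could_be_same_server(const struct client_ident *known,
                                 const struct client_ident *fresh)
{
    if (known->clientid != fresh->clientid)
        return false;
    if (memcmp(known->confirm, fresh->confirm, VERIFIER_SIZE) == 0)
        return false;
    return true;
}
```

A `true` result only marks the known client as a candidate; it would still need to be confirmed before the two are treated as the same server.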
2016-12-02 | sunrpc: Don't engage exponential backoff when connection attempt is rejected. | NeilBrown | 2 | -1/+4
xs_connect() contains an exponential backoff mechanism so the repeated connection attempts are delayed by longer and longer amounts. This is appropriate when the connection failed due to a timeout, but is not appropriate when a definitive "no" answer is received. In such cases, call_connect_status() imposes a minimum 3-second back-off, so not having the exponential back-off will never result in immediate retries. The current situation is a problem when the NFS server tries to register with rpcbind but rpcbind isn't running. All connection attempts are made on the same "xprt" and as the connection is never "closed", the exponential back-off delays successive attempts to register, or de-register, different protocols. This results in a multi-minute delay with no benefit. So, when call_connect_status() receives a definitive "no", use xprt_conditional_disconnect() to cancel the previous connection attempt. This will set XPRT_CLOSE_WAIT so that xprt->ops->close() calls xs_close() which resets the reestablish_timeout. To ensure xprt_conditional_disconnect() does the right thing, we ensure that rq_connect_cookie is set before a connection attempt, and allow xprt_conditional_disconnect() to complete even when the transport is not fully connected. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
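The behavioural difference between a timeout and a definitive rejection can be illustrated with a small model. Everything in this sketch is hypothetical: the `struct transport` type, the function names, and the constants (in seconds) are chosen for illustration and are not taken from the sunrpc code, which achieves the reset indirectly through xprt_conditional_disconnect() and xs_close().

```c
/* Illustrative constants, in seconds; assumptions, not kernel values. */
#define INIT_REEST_TIMEOUT 3U
#define MAX_REEST_TIMEOUT  300U

enum connect_result { CONNECT_OK, CONNECT_TIMED_OUT, CONNECT_REFUSED };

/* Hypothetical transport holding only the reconnect delay. */
struct transport {
    unsigned int reestablish_timeout;
};

static void transport_init(struct transport *xprt)
{
    xprt->reestablish_timeout = INIT_REEST_TIMEOUT;
}

/*
 * Record the outcome of a connection attempt and return the delay to use
 * before the next one. A timeout doubles the delay up to a cap; a
 * definitive refusal (or a success) resets it to the initial value, so a
 * refused attempt never inherits a long delay from earlier failures.
 */
static unsigned int next_connect_delay(struct transport *xprt,
                                       enum connect_result result)
{
    switch (result) {
    case CONNECT_TIMED_OUT:
        xprt->reestablish_timeout *= 2;
        if (xprt->reestablish_timeout > MAX_REEST_TIMEOUT)
            xprt->reestablish_timeout = MAX_REEST_TIMEOUT;
        break;
    case CONNECT_REFUSED:
    case CONNECT_OK:
        xprt->reestablish_timeout = INIT_REEST_TIMEOUT;
        break;
    }
    return xprt->reestablish_timeout;
}
```

In this model the reset value doubles as the minimum delay, which mirrors the point in the message that dropping the exponential back-off never leads to immediate retries.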
2016-12-02 | pNFS: Skip invalid stateids when doing a bulk destroy | Trond Myklebust | 1 | -0/+2
If the layout stateid is already invalid, we have no work to do. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Wait on outstanding layoutreturns to complete in pnfs_roc() | Trond Myklebust | 1 | -0/+9
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Don't mark the layout as freed if the last lseg is marked for return | Trond Myklebust | 1 | -0/+2
Address another memory leak. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Sync the layout state bits in pnfs_cache_lseg_for_layoutreturn | Trond Myklebust | 1 | -14/+15
Ensure that the layout state bits are synced when we cache a layout segment for layoutreturn using an appropriate call to pnfs_set_plh_return_info. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Fix bugs in _pnfs_return_layout | Trond Myklebust | 1 | -3/+10
We need to honour the NFS_LAYOUT_RETURN_REQUESTED bit regardless of whether or not there are layout segments pending. Furthermore, we should ensure that we leave the plh_return_segs list empty. This patch fixes a memory leak of the layout segments on plh_return_segs. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Clear all layout segment state in pnfs_mark_layout_stateid_invalid | Trond Myklebust | 1 | -1/+18
When the layout state is invalidated, then so is the layout segment state, and hence we do need to clean up the state bits. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Prevent unnecessary layoutreturns after delegreturn | Trond Myklebust | 1 | -2/+4
If we cannot grab the inode or superblock, then we cannot pin the layout header, and so we cannot send a layoutreturn as part of an async delegreturn call. In this case, we currently end up sending an extra layoutreturn after the delegreturn. Since the layout was implicitly returned by the delegreturn, that just gets a BAD_STATEID. The fix is to simply complete the return-on-close immediately. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Enable layoutreturn operation for return-on-close | Trond Myklebust | 3 | -118/+96
Amend the pnfs return on close helper functions to enable sending the layoutreturn op in CLOSE/DELEGRETURN. This closes a potential race between CLOSE/DELEGRETURN and parallel OPEN calls to the same file, and allows the client and the server to agree on whether or not there is an outstanding layout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Clean up - add a helper to initialise struct layoutreturn_args | Trond Myklebust | 1 | -7/+18
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Add encode/decode of the layoutreturn op in DELEGRETURN | Trond Myklebust | 3 | -9/+54
Add XDR encoding for the layoutreturn op, and storage for the layoutreturn arguments to the DELEGRETURN compound. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Add encode/decode of the layoutreturn op in CLOSE | Trond Myklebust | 3 | -9/+69
Add XDR encoding for the layoutreturn op, and storage for the layoutreturn arguments to the CLOSE compound. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Fix missing operation accounting in NFS4_dec_delegreturn_sz | Trond Myklebust | 1 | -0/+1
We need to account for the reply to the PUTFH operation in the DELEGRETURN compound. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Don't mark layout segments invalid on layoutreturn in pnfs_roc | Trond Myklebust | 1 | -7/+13
The layoutreturn call will take care of invalidating the layout segments once the call is successful. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn callback | Trond Myklebust | 5 | -11/+9
The parameter is already present in the "args" structure. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Skip checking for return-on-close if the layout is invalid | Trond Myklebust | 1 | -1/+2
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Remove spurious wake up in pnfs_layout_remove_lseg() | Trond Myklebust | 1 | -3/+0
There is no change to the value of NFS_LAYOUT_RETURN, so we should not be waking up the RPC call. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Ignore LAYOUTRETURN result if the layout doesn't match or is invalid | Trond Myklebust | 3 | -4/+9
Fix a potential race with CB_LAYOUTRECALL in which the server recalls the remaining layout segments while our LAYOUTRETURN is still in transit. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Do not free layout segments that are marked for return | Trond Myklebust | 3 | -22/+73
We may want to process and transmit layout stat information for the layout segments that are being returned, so we should defer freeing them until after the layoutreturn has completed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers | Trond Myklebust | 1 | -32/+67
Instead of grabbing the layout, we want to get the inode so that we can reduce races between layoutget and layoutrecall when the server does not support call referring. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: consolidate the different range intersection tests | Trond Myklebust | 3 | -57/+43
Both pnfs.c and the flexfiles code have their own versions of the range intersection testing, and the "end_offset" helper. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
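For readers unfamiliar with this kind of helper, a range intersection test built on an "end_offset" helper can look roughly like the sketch below. The names `end_offset()`, `ranges_intersect()`, and `RANGE_MAX` are hypothetical, and treating the maximum 64-bit value as "extends to end of file" is an assumption of this example rather than something stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define RANGE_MAX UINT64_MAX   /* assumed marker for "to end of file" */

/*
 * Exclusive end of the range [start, start + len). If the sum wraps around,
 * the range is treated as open-ended.
 */
static uint64_t end_offset(uint64_t start, uint64_t len)
{
    uint64_t end = start + len;   /* unsigned wraparound is well defined */

    return end >= start ? end : RANGE_MAX;
}

/*
 * Two half-open ranges intersect when each one starts before the other
 * ends; an open-ended range never ends before anything.
 */
static bool ranges_intersect(uint64_t start1, uint64_t len1,
                             uint64_t start2, uint64_t len2)
{
    uint64_t end1 = end_offset(start1, len1);
    uint64_t end2 = end_offset(start2, len2);

    return (end1 == RANGE_MAX || start2 < end1) &&
           (end2 == RANGE_MAX || start1 < end2);
}
```

Keeping one copy of such helpers avoids the two implementations drifting apart in how they handle the wraparound case.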
2016-12-02 | pNFS: Fix race in pnfs_wait_on_layoutreturn | Trond Myklebust | 1 | -5/+3
We must put the task to sleep while holding the inode->i_lock in order to ensure atomicity with the test for NFS_LAYOUT_RETURN. Fixes: 500d701f336b ("NFS41: make close wait for layoutreturn") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: On error, do not send LAYOUTGET until the LAYOUTRETURN has completed | Trond Myklebust | 2 | -1/+6
If there is an I/O error, we should not call LAYOUTGET until the LAYOUTRETURN that reports the error is complete. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.8+
2016-12-02 | pNFS: Force a retry of LAYOUTGET if the stateid doesn't match our cache | Trond Myklebust | 1 | -5/+6
If the server sends us a completely new stateid, and the client thinks it already holds a layout, then force a retry of the LAYOUTGET after invalidating the existing layout in order to avoid corruption due to races. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS: Clear NFS_LAYOUT_RETURN_REQUESTED when invalidating the layout stateid | Trond Myklebust | 1 | -8/+9
We must ensure that we don't schedule a layoutreturn if the layout stateid has been marked as invalid. Fixes: 2a59a0411671e ("pNFS: Fix pnfs_set_layout_stateid() to clear...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.8+
2016-12-02 | pNFS: Don't clear the layout stateid if a layout return is outstanding | Trond Myklebust | 1 | -1/+3
If we no longer hold any layout segments, we're normally expected to consider the layout stateid to be invalid. However we cannot assume this if we're about to, or in the process of sending a layoutreturn. Fixes: 334a8f37115b ("pNFS: Don't forget the layout stateid if...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.8+
2016-12-02 | pNFS: Fix a deadlock between read resends and layoutreturn | Trond Myklebust | 2 | -0/+8
We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor while holding a reference to a layout segment, as that can deadlock pnfs_update_layout(). Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.0+
2016-12-02 | NFSv4.1: Fix regression in callback retry handling | Fred Isaman | 1 | -1/+1
When initializing a freshly created slot for the callback channel, the seq_nr needs to be 0, not 1. Otherwise validate_seqid and nfs4_slot_wait_on_seqid get confused and believe that the empty slot corresponds to a previously sent reply. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Optimise away forced revalidation when we know the attributes are OK | Trond Myklebust | 2 | -5/+1
The NFS_INO_REVAL_FORCED flag needs to be set if we just got a delegation, and we see that there might still be some ambiguity as to whether or not our attribute or data cache are valid. In practice, this means that a call to nfs_check_inode_attributes() will have noticed a discrepancy between cached attributes and measured ones, so let's move the setting of NFS_INO_REVAL_FORCED to there. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Don't request close-to-open attribute when holding a delegation | Trond Myklebust | 2 | -3/+10
If holding a delegation, we do not need to ask the server to return close-to-open cache consistency attributes as part of the CLOSE compound. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Don't request a GETATTR on open_downgrade. | Trond Myklebust | 1 | -8/+2
If we're not closing the file completely, there is no need to request close-to-open attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Don't ask for the change attribute when reclaiming state | Trond Myklebust | 1 | -1/+0
We don't need to ask for the change attribute when returning a delegation or recovering from a server reboot, and it could actually cause us to obtain an incorrect value if we're using a pNFS flavour that requires LAYOUTCOMMIT. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | NFSv4: Don't check file access when reclaiming state | Trond Myklebust | 1 | -3/+11
If we're reclaiming state after a reboot, or as part of returning a delegation, we don't need to check access modes again. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-11-28 | Linux 4.9-rc7 | Linus Torvalds | 1 | -1/+1
2016-11-27 | Merge git://git.infradead.org/intel-iommu | Linus Torvalds | 4 | -12/+34
Pull IOMMU fixes from David Woodhouse:
"Two minor fixes. The first fixes the assignment of SR-IOV virtual functions to the correct IOMMU unit, and the second fixes the excessively large (and physically contiguous) PASID tables used with SVM"
* git://git.infradead.org/intel-iommu:
  iommu/vt-d: Fix PASID table allocation
  iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions
2016-11-27 | Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus | Linus Torvalds | 5 | -9/+29
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.9:
 - Fix unreadable output in __do_page_fault due to the KERN_CONT patchset
 - Correctly handle MIPS R6 fixes to the c0_wired register"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: mm: Fix output of __do_page_fault
  MIPS: Mask out limit field when calculating wired entry count
2016-11-27 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 1 | -1/+2
Pull vfs splice fix from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix default_file_splice_read()
2016-11-27 | fix default_file_splice_read() | Al Viro | 1 | -1/+2
Botched calculation of number of pages. As a result, we were dropping pieces when doing splice to pipe from e.g. 9p. Reported-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
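The message does not say what the faulty calculation was, so the following is only a generic illustration of how a page count can come out too small, not a reproduction of the actual fix. The function name `pages_spanned()` and the 4096-byte page size are assumptions made for the example.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096UL   /* assumed page size for the example */

/*
 * Number of pages touched by len bytes of data that begin offset_in_page
 * bytes into the first page. The sum is rounded up, because a partially
 * filled last page still counts as a page. Plain truncating division
 * (len / PAGE_SIZE_BYTES) would ignore both the starting offset and the
 * partial tail, which is how data can end up being dropped.
 */
static size_t pages_spanned(size_t offset_in_page, size_t len)
{
    if (len == 0)
        return 0;
    return (offset_in_page + len + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}
```

For instance, 4097 bytes starting at offset 0 touch two pages, while truncating division would report one.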
2016-11-27 | Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux | Linus Torvalds | 1 | -39/+25
Pull i2c fixes from Wolfram Sang:
"Here is a revert and two bugfixes for the I2C designware driver. Please note that we are still hunting down a regression for the i2c-octeon driver. While there is a fix pending, we have unclear feedback from the testers currently. An rc8 would be quite helpful for this case"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: designware: do not disable adapter after transfer"
  i2c: designware: fix rx fifo depth tracking
  i2c: designware: report short transfers
2016-11-27 | Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm | Linus Torvalds | 47 | -131/+208
Pull ARM fix from Russell King:
"This resolves the ksyms issues by reverting the commit which introduced the breakage"
There was what I consider to be a better fix, but it's late in the rc game, so I'll take the revert.
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  Revert "arm: move exports to definitions"
2016-11-27 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 35 | -60/+101
Pull networking fixes from David Miller:
 1) Fix leak in fsl/fman driver, from Dan Carpenter.
 2) Call flow dissector initcall earlier than any networking driver can register and start to use it, from Eric Dumazet.
 3) Some dup header fixes from Geliang Tang.
 4) TIPC link monitoring compat fix from Jon Paul Maloy.
 5) Link changes require EEE re-negotiation in bcm_sf2 driver, from Florian Fainelli.
 6) Fix bogus handle ID passed into tfilter_notify_chain(), from Roman Mashak.
 7) Fix dump size calculation in rtnl_calcit(), from Zhang Shengju.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
  tipc: resolve connection flow control compatibility problem
  mvpp2: use correct size for memset
  net/mlx5: drop duplicate header delay.h
  net: ieee802154: drop duplicate header delay.h
  ibmvnic: drop duplicate header seq_file.h
  fsl/fman: fix a leak in tgec_free()
  net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS
  tipc: improve sanity check for received domain records
  tipc: fix compatibility bug in link monitoring
  net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented
  dwc_eth_qos: drop duplicate headers
  net sched filters: fix filter handle ID in tfilter_notify_chain()
  net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
  bnxt: do not busy-poll when link is down
  udplite: call proper backlog handlers
  ipv6: bump genid when the IFA_F_TENTATIVE flag is clear
  net/mlx4_en: Free netdev resources under state lock
  net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit"
  rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit()
  bnxt_en: Fix a VXLAN vs GENEVE issue
  ...
2016-11-26 | Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm | Linus Torvalds | 2 | -3/+5
Pull libnvdimm fixes from Dan Williams:
 - Fix a crash that occurs at driver initialization if the memory region is already busy (request_mem_region() fails).
 - Fix a vma validation check that mistakenly allows a private device-dax mapping to be established. Device-dax explicitly forbids private mappings so it can guarantee a given fault granularity and backing memory type.
Both of these fixes have soaked in -next and are tagged for -stable.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fail all private mapping attempts
  device-dax: check devm_nsio_enable() return value