summaryrefslogtreecommitdiff
path: root/fs/nfs/flexfilelayout/flexfilelayoutdev.c
AgeCommit message (Collapse)AuthorFilesLines
2017-04-20pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connectTrond Myklebust1-1/+1
The check in nfs4_ff_layout_prepare_ds() seems to be missing. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: a33e4b036d461 ("pNFS: return status from nfs4_pnfs_ds_connect") Cc: Weston Andros Adamson <dros@primarydata.com> Cc: stable@vger.kernel.org # v4.11
2017-04-20nfs: flexfilelayout: remove v3-only data server limitationTigran Mkrtchyan1-1/+7
Flexfilelayout supports data servers which talk NFS v3 and v4.{0,1,2}. However, this code path is disabled and v3 only servers are accepted. This change removes this limitation. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-03-31nfs: flexfiles: fix kernel OOPS if MDS returns unsupported DS typeTigran Mkrtchyan1-0/+4
this fix aims to fix dereferencing of a mirror in an error state when MDS returns unsupported DS type (IOW, not v3), which causes the following oops: [ 220.370709] BUG: unable to handle kernel NULL pointer dereference at 0000000000000065 [ 220.370842] IP: ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles] [ 220.370920] PGD 0 [ 220.370972] Oops: 0000 [#1] SMP [ 220.371013] Modules linked in: nfnetlink_queue nfnetlink_log bluetooth nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security ebtable_filter ebtables ip6table_filter ip6_tables binfmt_misc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel btrfs kvm arc4 snd_hda_codec_hdmi iwldvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate mac80211 xor uvcvideo [ 220.371814] videobuf2_vmalloc videobuf2_memops snd_hda_codec_idt mei_wdt videobuf2_v4l2 snd_hda_codec_generic iTCO_wdt ppdev videobuf2_core iTCO_vendor_support dell_rbtn dell_wmi iwlwifi sparse_keymap dell_laptop dell_smbios snd_hda_intel dcdbas videodev snd_hda_codec dell_smm_hwmon snd_hda_core media cfg80211 intel_uncore snd_hwdep raid6_pq snd_seq intel_rapl_perf snd_seq_device joydev i2c_i801 rfkill lpc_ich snd_pcm parport_pc mei_me parport snd_timer dell_smo8800 mei snd shpchp soundcore tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 nouveau mxm_wmi ttm i2c_algo_bit drm_kms_helper crc32c_intel e1000e drm sdhci_pci firewire_ohci sdhci serio_raw mmc_core firewire_core ptp crc_itu_t pps_core wmi fjes video [ 220.372568] CPU: 7 PID: 4988 Comm: cat Not tainted 4.10.5-200.fc25.x86_64 #1 [ 220.372647] Hardware name: Dell Inc. Latitude E6520/0J4TFW, BIOS A06 07/11/2011 [ 220.372729] task: ffff94791f6ea580 task.stack: ffffb72b88c0c000 [ 220.372802] RIP: 0010:ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles] [ 220.372883] RSP: 0018:ffffb72b88c0f970 EFLAGS: 00010246 [ 220.372945] RAX: 0000000000000000 RBX: ffff9479015ca600 RCX: ffffffffffffffed [ 220.373025] RDX: ffffffffffffffed RSI: ffff9479753dc980 RDI: 0000000000000000 [ 220.373104] RBP: ffffb72b88c0f988 R08: 000000000001c980 R09: ffffffffc0ea6112 [ 220.373184] R10: ffffef17477d9640 R11: ffff9479753dd6c0 R12: ffff9479211c7440 [ 220.373264] R13: ffff9478f45b7790 R14: 0000000000000001 R15: ffff9479015ca600 [ 220.373345] FS: 00007f555fa3e700(0000) GS:ffff9479753c0000(0000) knlGS:0000000000000000 [ 220.373435] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 220.373506] CR2: 0000000000000065 CR3: 0000000196044000 CR4: 00000000000406e0 [ 220.373586] Call Trace: [ 220.373627] nfs4_ff_layout_prepare_ds+0x5e/0x200 [nfs_layout_flexfiles] [ 220.373708] ff_layout_pg_init_read+0x81/0x160 [nfs_layout_flexfiles] [ 220.373806] __nfs_pageio_add_request+0x11f/0x4a0 [nfs] [ 220.373886] ? nfs_create_request.part.14+0x37/0x330 [nfs] [ 220.373967] nfs_pageio_add_request+0xb2/0x260 [nfs] [ 220.374042] readpage_async_filler+0xaf/0x280 [nfs] [ 220.374103] read_cache_pages+0xef/0x1b0 [ 220.374166] ? nfs_read_completion+0x210/0x210 [nfs] [ 220.374239] nfs_readpages+0x129/0x200 [nfs] [ 220.374293] __do_page_cache_readahead+0x1d0/0x2f0 [ 220.374352] ondemand_readahead+0x17d/0x2a0 [ 220.374403] page_cache_sync_readahead+0x2e/0x50 [ 220.374460] generic_file_read_iter+0x6c8/0x950 [ 220.374532] ? nfs_mapping_need_revalidate_inode+0x17/0x40 [nfs] [ 220.374617] nfs_file_read+0x6e/0xc0 [nfs] [ 220.374670] __vfs_read+0xe2/0x150 [ 220.374715] vfs_read+0x96/0x130 [ 220.374758] SyS_read+0x55/0xc0 [ 220.374801] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 220.374856] RIP: 0033:0x7f555f570bd0 [ 220.374900] RSP: 002b:00007ffeb73e1b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 220.374986] RAX: ffffffffffffffda RBX: 00007f555f839ae0 RCX: 00007f555f570bd0 [ 220.375066] RDX: 0000000000020000 RSI: 00007f555fa41000 RDI: 0000000000000003 [ 220.375145] RBP: 0000000000021010 R08: ffffffffffffffff R09: 0000000000000000 [ 220.375226] R10: 00007f555fa40010 R11: 0000000000000246 R12: 0000000000022000 [ 220.375305] R13: 0000000000021010 R14: 0000000000001000 R15: 0000000000002710 [ 220.375386] Code: 66 66 90 55 48 89 e5 41 54 53 49 89 fc 48 83 ec 08 48 85 f6 74 2e 48 8b 4e 30 48 89 f3 48 81 f9 00 f0 ff ff 77 1e 48 85 c9 74 15 <48> 83 79 78 00 b8 01 00 00 00 74 2c 48 83 c4 08 5b 41 5c 5d c3 [ 220.375653] RIP: ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles] RSP: ffffb72b88c0f970 [ 220.375748] CR2: 0000000000000065 [ 220.403538] ---[ end trace bcdca752211b7da9 ]--- Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-03-17pNFS/flexfiles: never nfs4_mark_deviceid_unavailableWeston Andros Adamson1-1/+1
The flexfiles layout should never mark a device unavailable. Move nfs4_mark_deviceid_unavailable out of nfs4_pnfs_ds_connect and call directly from files layout where it's still needed. The flexfiles driver still handles marked devices in error paths, but will now print a rate limited warning. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-03-17pNFS: return status from nfs4_pnfs_ds_connectWeston Andros Adamson1-1/+2
The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it can wait on another context to reach the same failure. This checks that the rpc_create succeeded and returns the error to the caller. When an error is returned, both the files and flexfiles layouts will return NULL from _prepare_ds(). The flexfiles layout will also return the layout with the error NFS4ERR_NXIO. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-12-20pNFS/flexfiles: delete deviceid, don't mark inactiveWeston Andros Adamson1-1/+1
Instead of marking a device inactive, remove it from the cache entirely. Flexfiles has a way to report errors back to the server, so we don't want to stop devices from being tried again for 120 seconds. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09pNFS/flexfiles: Fix a deadlock on LAYOUTGETFred Isaman1-11/+39
We encountered a deadlock where the SEQUENCE that accompanied the LAYOUTGET triggered a session drain, while ff_layout_alloc_lseg triggered a GETDEVICEINFO. The GETDEVICEINFO hung waiting for the session drain, while the LAYOUTGET held the slot waiting for alloc_lseg to finish. Avoid this by moving the call to nfs4_find_get_deviceid out of ff_layout_alloc_lseg and into nfs4_ff_layout_prepare_ds. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> [dros@primarydata.com: pNFS/flexfiles: fix races in ff_layout_mirror_valid] Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07pNFS/flexfiles: Fix ff_layout_add_ds_error_locked()Trond Myklebust1-1/+2
When we're merging an old entry into our new entry, we want to ensure that we add the list entry in the correct place. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03pNFS/flexfiles: Refactor encoding of the layoutreturn payloadTrond Myklebust1-15/+63
Add the layout error payload to the flexfiles layoutreturn private data, and set up the encoding mechanisms. This is a refactoring in preparation for adding the layout iostats payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02NFS: Remove unused authflavour parameter from nfs_get_client()Anna Schumaker1-2/+1
This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so let's remove it from this function and callers. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02pNFS: consolidate the different range intersection testsTrond Myklebust1-25/+8
Both pnfs.c and the flexfiles code have their own versions of the range intersection testing, and the "end_offset" helper. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-29pNFS/flexfiles: Fix an Oopsable condition when connection to the DS failsTrond Myklebust1-9/+10
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or ff_layout_pg_init_write, then we currently end up clearing the layout segment carried by the struct nfs_pageio_descriptor, causing an Oops when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist. The fix is to ensure we return the layout and then retry. Fixes: 446ca2195303 ("pNFS/flexfiles: When initing reads or writes, we...") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-16pNFS/flexfiles: Set reasonable default retrans values for the data channelTrond Myklebust1-2/+2
Prior to this patch, the retrans value was set at 5, meaning that we could see a maximum retransmission timeout value of more than 6 minutes. That's a tad high for NFSv3 where the protocol does allow the server to drop requests at any time. Since this is a data channel, let's just set retrans to 0, and the default timeout to 60s. The user can continue to adjust these defaults using the dataserver_retrans and dataserver_timeo module parameters. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-05-25nfs/flexfiles: Helper function to detect FF_FLAGS_NO_READ_IOTom Haynes1-0/+6
The mds can inform the client not to use the IOMODE_RW layout segment for doing READs. I.e., it is basically a IOMODE_WRITE layout segment. It would do this to not interfere with the WRITEs. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_dsJeff Layton1-1/+17
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTEDJeff Layton1-7/+1
Setting just the NFS_LAYOUT_RETURN_REQUESTED flag doesn't do anything, unless there are lsegs that are also being marked for return. At the point where that happens this flag is also set, so these set_bit calls don't do anything useful. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17pNFS/flexfiles: When checking for available DSes, conditionally check for MDS ioTom Haynes1-0/+6
Whenever we check to see if we have the needed number of DSes for the action, we may also have to check to see whether IO is allowed to go to the MDS or not. [jlayton: fix merge conflict due to lack of localio patches here] Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17NFSv4: Label stateids with the typeTrond Myklebust1-1/+2
In order to more easily distinguish what kind of stateid we are dealing with, introduce a type that can be used to label the stateid structure. The label will be useful both for debugging, but also when dealing with operations like SETATTR, READ and WRITE that can take several different types of stateid as arguments. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09nfs: have flexfiles mirror keep creds for both ro and rw layoutsJeff Layton1-2/+5
A mirror can be shared between multiple layouts, even with different iomodes. That makes stats gathering simpler, but it causes a problem when we get different creds in READ vs. RW layouts. The current code drops the newer credentials onto the floor when this occurs. That's problematic when you fetch a READ layout first, and then a RW. If the READ layout doesn't have the correct creds to do a write, then writes will fail. We could just overwrite the READ credentials with the RW ones, but that would break the ability for the server to fence the layout for reads if things go awry. We need to be able to revert to the earlier READ creds if the RW layout is returned afterward. The simplest fix is to just keep two sets of creds per mirror. One for READ layouts and one for RW, and then use the appropriate set depending on the iomode of the layout segment. Also fix up some RCU nits that sparse found. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09nfs: get a reference to the credential in ff_layout_alloc_lsegJeff Layton1-45/+2
We're just as likely to have allocation problems here as we would if we delay looking up the credential like we currently do. Fix the code to get a rpc_cred reference early, as soon as the mirror is set up. This allows us to eliminate the mirror early if there is a problem getting an rpc credential. This also allows us to drop the uid/gid from the layout_mirror struct as well. In the event that we find an existing mirror where this one would go, we swap in the new creds unconditionally, and drop the reference to the old one. Note that the old ff_layout_update_mirror_cred function wouldn't set this pointer unless the DS version was 3, but we don't know what the DS version is at this point. I'm a little unclear on why it did that as you still need creds to talk to v4 servers as well. I have the code set it regardless of the DS version here. Also note the change to using generic creds instead of calling lookup_cred directly. With that change, we also need to populate the group_info pointer in the acred as some functions expect that to never be NULL. Instead of allocating one every time however, we can allocate one when the module is loaded and share it since the group_info is refcounted. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09nfs: have ff_layout_get_ds_cred take a reference to the credJeff Layton1-4/+26
In later patches, we're going to want to allow the creds to be updated when we get a new layout with updated creds. Have this function take a reference to the cred that is later put once the call has been dispatched. Also, prepare for this change by ensuring we follow RCU rules when getting a reference to the cred as well. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09nfs: don't call nfs4_ff_layout_prepare_ds from ff_layout_get_ds_credJeff Layton1-5/+1
All the callers already call that function before calling into here, so it ends up being a no-op anyway. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-03-16nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failedJeff Layton1-0/+2
I hit the following oops out of the blue while testing with flexfiles: BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8 IP: [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles] PGD 44031067 PUD 5062d067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: nfsv3 nfs_layout_flexfiles tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bonding ipmi_devintf ipmi_msghandler snd_hda_codec_generic virtio_balloon ppdev snd_hda_intel snd_hda_controller snd_hda_codec iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_core parport_pc snd_hwdep parport snd_seq snd_seq_device snd_pcm snd_timer acpi_cpufreq snd soundcore i2c_piix4 xfs libcrc32c joydev virtio_net virtio_console qxl drm_kms_helper ttm crc32c_intel drm virtio_pci serio_raw ata_generic virtio_ring virtio pata_acpi CPU: 0 PID: 19138 Comm: test5 Not tainted 4.1.9-100.pd.90.el7.x86_64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014 task: ffff88007b70cf00 ti: ffff88004cc44000 task.ti: ffff88004cc44000 RIP: 0010:[<ffffffffa048f6b8>] [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles] RSP: 0018:ffff88004cc47890 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff880050932300 RCX: ffff88006978f488 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003e0e8540 RBP: ffff88004cc47908 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88007ff8c758 R11: 0000000000000005 R12: ffff88003e0e8540 R13: 0000000000000000 R14: ffff88006978f488 R15: ffff88004431cc80 FS: 00007fea40c7c740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e8 CR3: 0000000044318000 CR4: 00000000000406f0 Stack: ffffffffa048c934 ffff880050932310 0000000100000001 ffff88006978f510 ffff88006978f3c8 ffff88003e56cd90 ffff88004cc479d0 00000020a052aff0 000000000004b000 ffff88004cc47908 ffff880050932300 ffff88004cc479d0 Call Trace: [<ffffffffa048c934>] ? ff_layout_write_pagelist+0x64/0x220 [nfs_layout_flexfiles] [<ffffffffa057a3bf>] pnfs_generic_pg_writepages+0xaf/0x1b0 [nfsv4] [<ffffffffa051ab57>] nfs_pageio_doio+0x27/0x60 [nfs] [<ffffffffa051bfe4>] nfs_pageio_complete_mirror+0x54/0xa0 [nfs] [<ffffffffa051c7ad>] nfs_pageio_complete+0x2d/0x90 [nfs] [<ffffffffa052032d>] nfs_writepage_locked+0x8d/0xe0 [nfs] [<ffffffff811e4630>] ? page_referenced_one+0x1a0/0x1a0 [<ffffffffa05210e7>] nfs_wb_single_page+0xf7/0x190 [nfs] [<ffffffffa05108d1>] nfs_launder_page+0x41/0x90 [nfs] [<ffffffff811b8930>] invalidate_inode_pages2_range+0x340/0x3a0 [<ffffffff811b89a7>] invalidate_inode_pages2+0x17/0x20 [<ffffffffa0513e1e>] nfs_release+0x9e/0xb0 [nfs] [<ffffffffa050fa1d>] nfs_file_release+0x3d/0x60 [nfs] [<ffffffff8122481c>] __fput+0xdc/0x1e0 [<ffffffff8122496e>] ____fput+0xe/0x10 [<ffffffff810bde67>] task_work_run+0xa7/0xe0 [<ffffffff810af735>] get_signal+0x565/0x600 [<ffffffff811a9815>] ? __filemap_fdatawrite_range+0x65/0x90 [<ffffffff810144a7>] do_signal+0x37/0x730 [<ffffffffa0569921>] ? nfs4_file_fsync+0x81/0x150 [nfsv4] [<ffffffff81254dbb>] ? vfs_fsync_range+0x3b/0xb0 [<ffffffff811446a6>] ? __audit_syscall_exit+0x1e6/0x280 [<ffffffff81014bff>] do_notify_resume+0x5f/0xa0 [<ffffffff8178ec3c>] int_signal+0x12/0x17 Code: 48 8b 40 70 8b 00 83 f8 03 74 20 83 f8 04 75 13 55 48 89 ce 48 89 d7 48 89 e5 e8 14 0f 0e 00 5d c3 66 90 0f 0b 66 0f 1f 44 00 00 <48> 8b 82 e8 00 00 00 c3 66 66 66 66 90 55 48 89 e5 41 57 41 56 RIP [<ffffffffa048f6b8>] nfs4_ff_find_or_create_ds_client+0x48/0x50 [nfs_layout_flexfiles] RSP <ffff88004cc47890> CR2: 00000000000000e8 When the DS connection attempt fails, nfs4_ff_layout_prepare_ds marks it for the error but then just returns the ds as if it were usable. The comments though say: /* Upon return, either ds is connected, or ds is NULL */ Ensure that we set the return pointer to NULL in the event that the connection attempt fails. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-28NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSETrond Myklebust1-1/+1
NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a layoutreturn is needed, either due to a layout recall or to a layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order to clarify its purpose. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-22Merge branch 'bugfixes'Trond Myklebust1-59/+40
* bugfixes: pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
2016-01-21pNFS/flexfiles: Improve merging of errors in LAYOUTRETURNTrond Myklebust1-59/+40
When we hit 22 errors, we start to overflow the memory buffers allocated to the LAYOUTRETURN errors. The issue is that currently, RPC call reply ordering determines how successful we are in merging errors that refer to contiguous READ or WRITE requests. Fix is to use an insertion sort to help detect contiguity. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28pNFS/flexfiles: Don't prevent flexfiles client from retrying LAYOUTGETTrond Myklebust1-12/+4
Fix a bug in which flexfiles clients are falling back to I/O through the MDS even when the FF_FLAGS_NO_IO_THRU_MDS flag is set. The flexfiles client will always report errors through the LAYOUTRETURN and/or LAYOUTERROR mechanisms, so it should normally be safe for it to retry the LAYOUTGET until it fails or succeeds. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-02NFSv4.1/flexfiles: Mark layout for return if the mirrors are invalidTrond Myklebust1-15/+30
If a read-write layout has an invalid mirror, then we should mark it as invalid, and return it. If a read-only layout has an invalid mirror, then mark it as invalid and check if there is still at least one valid mirror before we return it. Note: Also fix incorrect use of pnfs_generic_mark_devid_invalid(). We really want nfs4_mark_deviceid_unavailable(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-02NFSv4.1/flexfiles: RW layouts are valid only if all mirrors are validTrond Myklebust1-2/+28
Unlike read layouts, the writeable layout cannot fall back to using only one of the mirrors. It need to write to all of them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-09-02NFSv4.1/flexfiles: Fix incorrect usage of pnfs_generic_mark_devid_invalid()Trond Myklebust1-2/+2
Unlike the files layout, flexfiles does not test for the NFS_DEVICEID_INVALID flag. Instead it relies on NFS_DEVICEID_UNAVAILABLE. Fix is to replace with nfs4_mark_deviceid_unavailable(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-28NFSv4.1/flexfiles: Fix a protocol error in layoutreturnTrond Myklebust1-2/+5
According to the flexfiles protocol, the layoutreturn should specify an array of errors in the following format: struct ff_ioerr4 { offset4 ffie_offset; length4 ffie_length; stateid4 ffie_stateid; device_error4 ffie_errors<>; }; This patch fixes up the code to ensure that our ffie_errors is indeed encoded as an array (albeit with only a single entry). Reported-by: Tom Haynes <thomas.haynes@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-26nfs: always update creds in mirror, even when we have an already connected dsJeff Layton1-2/+2
A ds can be associated with more than one mirror, but we currently skip setting a mirror's credentials if we find that it's already set up with a connected client. The upshot is that we can end up sending DS writes with MDS credentials instead of properly setting them up. Fix nfs4_ff_layout_prepare_ds to always verify that the mirror's credentials are set up, even when we have a DS that's already connected. Reported-by: Tom Haynes <thomas.haynes@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-26nfs: fix potential credential leak in ff_layout_update_mirror_credJeff Layton1-1/+2
If we have two tasks racing to update a mirror's credentials, then they can end up leaking one (or more) sets of credentials. The first task will set mirror->cred and then the second task will just overwrite it. Use a cmpxchg to ensure that the creds are only set once. If we get to the point where we would set mirror->cred and find that they're already set, then we just release the creds that were just found. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-27NFSv4.1: Convert pNFS deviceid to use kfree_rcu()Trond Myklebust1-1/+1
Use of synchronize_rcu() when unmounting and potentially freeing a lot of deviceids is problematic. There really is no reason why we can't just use kfree_rcu() here. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-10pnfs/flexfiles: Do not dprintk after the freeTom Haynes1-1/+1
Found by 0-DAY kernel test infrastructure: fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:13-16: ERROR: reference preceded by free on line 518 fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:26-29: ERROR: reference preceded by free on line 518 fs/nfs/flexfilelayout/flexfilelayoutdev.c:520:39-42: ERROR: reference preceded by free on line 518 fs/nfs/flexfilelayout/flexfilelayoutdev.c:521:3-6: ERROR: reference preceded by free on line 518 Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-03pnfs/flexfiles: Add the FlexFile Layout DriverTom Haynes1-0/+552
The flexfile layout is a new layout that extends the file layout. It is currently being drafted as a specification at https://datatracker.ietf.org/doc/draft-ietf-nfsv4-layout-types/ Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Tao Peng <bergwolf@primarydata.com>