summaryrefslogtreecommitdiff
path: root/fs/btrfs/volumes.c
AgeCommit message (Collapse)AuthorFilesLines
2018-08-06btrfs: fix in-memory value of total_devices after seed device deletionAnand Jain1-0/+3
In case of deleting the seed device the %cur_devices (seed) and the %fs_devices (parent) are different. Now, as the parent fs_devices::total_devices also maintains the total number of devices including the seed device, so decrement its in-memory value for the successful seed delete. We are already updating its corresponding on-disk btrfs_super_block::number_devices value. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: prune unused includesDavid Sterba1-3/+0
Remove includes if none of the interfaces and exports is used in the given source file. Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: drop unnecessary variable in btrfs_init_new_deviceAnand Jain1-4/+1
There is only usage of the declared devices variable, instead use its value directly. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: use a temporary variable for fs_devices in btrfs_init_new_deviceAnand Jain1-21/+21
There are many instances of the %fs_info->fs_devices pointer dereferences, use a temporary variable instead. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: tree-checker: Verify block_group_itemQu Wenruo1-1/+1
A crafted image with invalid block group items could make free space cache code to cause panic. We could detect such invalid block group item by checking: 1) Item size Known fixed value. 2) Block group size (key.offset) We have an upper limit on block group item (10G) 3) Chunk objectid Known fixed value. 4) Type Only 4 valid type values, DATA, METADATA, SYSTEM and DATA|METADATA. No more than 1 bit set for profile type. 5) Used space No more than the block group size. This should allow btrfs to detect and refuse to mount the crafted image. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199849 Reported-by: Xu Wen <wen.xu@gatech.edu> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Tested-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Remove fs_info from btrfs_force_chunk_allocNikolay Borisov1-1/+1
It can be referenced from the passed transaction handle. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Remove fs_info from check_system_chunkNikolay Borisov1-1/+1
It can be referenced from trans since the function is always called within a transaction. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Remove fs_info from btrfs_alloc_chunkNikolay Borisov1-4/+3
It can be referenced from trans since the function is always called within a transaction. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Remove fs_info from btrfs_remove_block_groupNikolay Borisov1-1/+1
This function is always called with a valid transaction handle from where we can reference fs_info. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-08-06btrfs: Remove fs_info from btrfs_make_block_groupNikolay Borisov1-2/+2
This function is always called with a valid transaction handle from where we can reference the fs_info. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-07-13btrfs: restore uuid_mutex in btrfs_open_devicesDavid Sterba1-0/+2
Commit 542c5908abfe84f7b4c1 ("btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices") switched to device_list_mutex as we need that for the device list traversal, but we also need uuid_mutex to protect access to fs_devices::opened to be consistent with other users of that. Fixes: 542c5908abfe84f7b4c1 ("btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-30btrfs: Remove fs_info argument from btrfs_uuid_tree_addLu Fengqi1-3/+2
This function always takes a transaction handle which contains a reference to the fs_info. Use that and remove the extra argument. Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: add prefix "balance:" for log messagesAnand Jain1-14/+24
Kernel logs are very important for the forensic investigations of the issues in general make it easy to use it. This patch adds 'balance:' prefix so that it can be easily searched. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: use common variable for fs_devices in btrfs_destroy_dev_replace_tgtdevAnand Jain1-5/+7
Use a local btrfs_fs_devices variable to access the structure. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: drop uuid_mutex in btrfs_destroy_dev_replace_tgtdevAnand Jain1-2/+0
Delete the uuid_mutex lock here as this thread accesses the btrfs_fs_devices::devices only (counters or called functions do a list traversal). And the device_list_mutex lock is already taken. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devicesAnand Jain1-2/+3
btrfs_open_devices() is using the uuid_mutex, but as btrfs_open_devices is just limited to openning all the devices under for given fsid, so we don't need uuid_mutex. Instead it should hold the device_list_mutex as it updates the members of the btrfs_fs_devices and btrfs_device and not the whole fs_devs list. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: document uuid_mutex uasge in read_chunk_treeAnand Jain1-0/+4
read_chunk_tree() calls read_one_dev(), but for seed device we have to search the fs_uuids list, so we need the uuid_mutex. Add a comment comment, so that we can improve this part. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: use existing cur_devices, cleanup btrfs_rm_deviceAnand Jain1-4/+9
Instead of de-referencing the device->fs_devices use cur_devices which points to the same fs_devices and does not change. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: reduce uuid_mutex critical section while scanning devicesAnand Jain1-7/+5
The generic block device lookup or cleanup does not need the uuid mutex, that's only for the device_list_add. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: remove redundant btrfs_balance_control::fs_infoDavid Sterba1-4/+3
The fs_info is always available from the context so we don't need to store it in the structure. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: Remove devid parameter from btrfs_rmap_blockNikolay Borisov1-5/+2
This function is used in only one place and devid argument is always passed 0. So just remove it, similarly to how it was removed in the userspace code. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: do reverse path readahead in btrfs_shrink_deviceGu Jinxiang1-1/+1
In btrfs_shrink_device, before btrfs_search_slot, path->reada is set to READA_FORWARD. But I think READA_BACK is correct. Since: 1. key.offset is set to (u64)-1 2. after btrfs_search_slot, btrfs_previous_item is called So, for readahead previous items, READA_BACK is the correct one. Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move btrfs_raid_mindev_errorvalues to btrfs_raid_attr tableAnand Jain1-16/+8
Add a new member struct btrfs_raid_attr::mindev_error so that btrfs_raid_array can maintain the error code to return if the minimum number of devices condition is not met while trying to delete a device in the given raid. And so we can drop btrfs_raid_mindev_error. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move btrfs_raid_group values to btrfs_raid_attr tableAnand Jain1-11/+8
Add a new member struct btrfs_raid_attr::bg_flag so that btrfs_raid_array can maintain the bit map flag of the raid type, and so we can drop btrfs_raid_group. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move btrfs_raid_type_names values to btrfs_raid_attr tableAnand Jain1-0/+15
Add a new member struct btrfs_raid_attr::raid_name so that btrfs_raid_array can maintain the name of the raid type, and so we can drop btrfs_raid_type_names. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: open code set_balance_controlDavid Sterba1-17/+8
The helper is quite simple and I'd like to see the locking in the caller. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: use mutex in btrfs_resume_balance_asyncDavid Sterba1-3/+3
While the spinlock does not cause problems, using the mutex is more correct and consistent with others. The global status of balance is eg. checked from btrfs_pause_balance or btrfs_cancel_balance with mutex. Resuming balance happens during mount or ro->rw remount. In the former case, no other user of the balance_ctl exists, in the latter, balance cannot run until the ro/rw transition is finished. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: drop lock parameter from update_ioctl_balance_args and renameDavid Sterba1-1/+1
The parameter controls locking of the stats part but we can lock it unconditionally, as this only happens once when balance starts. This is not performance critical. Add the prefix for an exported function. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move and comment read-only check in btrfs_cancel_balanceDavid Sterba1-3/+10
Balance cannot be started on a read-only filesystem and will have to finish/exit before eg. going to read-only via remount. In case the filesystem is forcibly set to read-only after an error, balance will finish anyway and if the cancel call is too fast it will just wait for that to happen. The last case is when the balance is paused after mount but it's read-only and cancelling would want to delete the item. The test is moved after the check if balance is running at all, as it looks more logical to report "no balance running" instead of "read-only filesystem". Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: track running balance in a simpler wayDavid Sterba1-8/+10
Currently fs_info::balance_running is 0 or 1 and does not use the semantics of atomics. The pause and cancel check for 0, that can happen only after __btrfs_balance exits for whatever reason. Parallel calls to balance ioctl may enter btrfs_ioctl_balance multiple times but will block on the balance_mutex that protects the fs_info::flags bit. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: kill btrfs_fs_info::volume_mutexDavid Sterba1-22/+7
Mutual exclusion of device add/rm and balance was done by the volume mutex up to version 3.7. The commit 5ac00addc7ac091109 ("Btrfs: disallow mutually exclusive admin operations from user mode") added a bit that essentially tracked the same information. The status bit has an advantage over a mutex that it can be set without restrictions of function context, so it started to be used in the mount-time resuming of balance or device replace. But we don't really need to track the same information in two ways. 1) After the previous cleanups, the main ioctl handlers for add/del/resize copy the EXCL_OP bit next to the volume mutex, here it's clearly safe. 2) Resuming balance during mount or after rw remount will set only the EXCL_OP bit and the volume_mutex is held in the kernel thread that calls btrfs_balance. 3) Resuming device replace during mount or after rw remount is done after balance and is excluded by the EXCL_OP bit. It does not take the volume_mutex at all and completely relies on the EXCL_OP bit. 4) The resuming of balance and dev-replace cannot hapen at the same time as the ioctls cannot be started in parallel. Nevertheless, a crafted image could trigger that and a warning is printed. 5) Balance is normally excluded by EXCL_OP and also uses own mutex to protect against concurrent access to its status data. There's some trickery to maintain the right lock nesting in case we need to reexamine the status in btrfs_ioctl_balance. The volume_mutex is removed and the unlock/lock sequence is left in place as we might expect other waiters to proceed. 6) Similar to 5, the unlock/lock sequence is kept in btrfs_cancel_balance to allow waiters to continue. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: remove wrong use of volume_mutex from btrfs_dev_replace_startDavid Sterba1-4/+0
The volume mutex does not protect against anything in this case, the comment about scrub is right but not related to locking and looks confusing. The comment in btrfs_find_device_missing_or_by_path is wrong and confusing too. The device_list_mutex is not held here to protect device lookup, but in this case device replace cannot run in parallel with device removal (due to exclusive op protection), so we don't need further locking here. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: cleanup helpers that reset balance stateDavid Sterba1-16/+13
The function __cancel_balance name is confusing with the cancel operation of balance and it really resets the state of balance back to zero. The unset_balance_control helper is called only from one place and simple enough to be inlined. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: add sanity check when resuming balance after mountDavid Sterba1-1/+13
Replace a WARN_ON with a proper check and message in case something goes really wrong and resumed balance cannot set up its exclusive status. The check is a user friendly assertion, I don't expect to ever happen under normal circumstances. Also document that the paused balance starts here and owns the exclusive op status. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move clearing of EXCL_OP out of __cancel_balanceDavid Sterba1-6/+7
Make the clearning visible in the callers so we can pair it with the test_and_set part. Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move volume_mutex to callers of btrfs_rm_deviceDavid Sterba1-2/+0
Move locking and unlocking next to the BTRFS_FS_EXCL_OP bit manipulation so it's obvious that the two happen at the same time. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: move btrfs_init_dev_replace_tgtdev to dev-replace.c and make staticDavid Sterba1-99/+0
The function logically belongs there and there's only a single caller, no need to export it. No code changes. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: export and rename free_deviceDavid Sterba1-12/+12
The function will be used outside of volumes.c, the allocation btrfs_alloc_device is also exported. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: make success path out of btrfs_init_dev_replace_tgtdev more clearDavid Sterba1-1/+7
This is a preparatory cleanup that will make clear that the only successful way out of btrfs_init_dev_replace_tgtdev will also set the device_out to a valid pointer. With this guarantee, the callers can be simplified. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: cleanup btrfs_rm_device() promote fs_devices pointerAnand Jain1-7/+6
This function uses fs_info::fs_devices number of time, however we declare and use it only at the end, instead do it in the beginning of the function and use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: cleanup find_device() drop list_head pointerAnand Jain1-2/+1
find_device() declares struct list_head *head pointer and used only once, instead just use it directly. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: rename __btrfs_open_devices to open_fs_devicesAnand Jain1-4/+3
__btrfs_open_devices() is un-exported drop __ prefix and rename it to open_fs_devices(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: rename __btrfs_close_devices to close_fs_devicesAnand Jain1-6/+6
__btrfs_close_devices() is un-exported, drop the __ prefix and rename it to close_fs_devices(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: cleanup __btrfs_open_devices() drop head pointerAnand Jain1-2/+1
__btrfs_open_devices() declares struct list_head *head, however head is used only once, instead use btrfs_fs_devices::devices directly. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: rename struct btrfs_fs_devices::listAnand Jain1-8/+8
btrfs_fs_devices::list is the list of BTRFS fsid in the kernel, a generic name 'list' makes it's search very difficult, rename it to fs_list. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-28btrfs: add comment about BTRFS_FS_EXCL_OPAnand Jain1-0/+35
Adds comments about BTRFS_FS_EXCL_OP to existing comments about the device locks. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ minor updates ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-17btrfs: fix crash when trying to resume balance without the resume flagAnand Jain1-0/+9
We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance() only, which isn't called during the remount. So when resuming from the paused balance we hit the bug: kernel: kernel BUG at fs/btrfs/volumes.c:3890! :: kernel: balance_kthread+0x51/0x60 [btrfs] kernel: kthread+0x111/0x130 :: kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8 Reproducer: On a mounted filesystem: btrfs balance start --full-balance /btrfs btrfs balance pause /btrfs mount -o remount,ro /dev/sdb /btrfs mount -o remount,rw /dev/sdb /btrfs To fix this set the BTRFS_BALANCE_RESUME flag in btrfs_resume_balance_async(). CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-12btrfs: replace GPL boilerplate by SPDX -- sourcesDavid Sterba1-14/+2
Remove GPL boilerplate text (long, short, one-line) and keep the rest, ie. personal, company or original source copyright statements. Add the SPDX header. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: split dev-replace locking helpers for read and writeDavid Sterba1-9/+9
The current calls are unclear in what way btrfs_dev_replace_lock takes the locks, so drop the argument, split the helpers and use similar naming as for read and write locks. Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31btrfs: use lockdep_assert_held for mutexesDavid Sterba1-5/+5
Using lockdep_assert_held is preferred, replace mutex_is_locked. Signed-off-by: David Sterba <dsterba@suse.com>