summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-09-23f2fs: use more free segments until SSR is activatedJaegeuk Kim1-2/+4
Previously, f2fs activates SSR if the # of free segments reaches to the # of overprovisioned segments. In this case, SSR starts to use dirty segments only, so that the overprovisoned space cannot be selected for new data. This means that we have no chance to utilizae the overprovisioned space at all. This patch fixes that by allowing LFS allocations until the # of free segments reaches to the last threshold, reserved space. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: change the ipu_policy option to enable combinationsJaegeuk Kim3-27/+20
This patch changes the ipu_policy setting to use any combination of orthogonal policies. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix to search whole dirty segmap when get_victimChao Yu1-2/+2
In ->get_victim we get max_search value from dirty_i->nr_dirty without protection of seglist_lock, after that, nr_dirty can be increased/decreased before we hold seglist_lock lock. Then in main loop we attempt to traverse all dirty section one time to find victim section, but it's not accurate to use max_search as the total loop count, because we might lose checking several sections or check sections redundantly for the case of nr_dirty are increased or decreased previously. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix to clean previous mount option when remount_fsChao Yu1-0/+3
In manual of mount, we descript remount as below: "mount -o remount,rw /dev/foo /dir After this call all old mount options are replaced and arbitrary stuff from fstab is ignored, except the loop= option which is internally generated and maintained by the mount command." Previously f2fs do not clear up old mount options when remount_fs, so we have no chance of disabling previous option (e.g. flush_merge). Fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: skip punching hole in special conditionChao Yu1-0/+7
Now punching hole in directory is not supported in f2fs, so let's limit file type in punch_hole(). In addition, in punch_hole if offset is exceed file size, we should skip punching hole. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: support large sector sizeChao Yu5-16/+25
Block size in f2fs is 4096 bytes, so theoretically, f2fs can support 4096 bytes sector device at maximum. But now f2fs only support 512 bytes size sector, so block device such as zRAM which uses page cache as its block storage space will not be mounted successfully as mismatch between sector size of zRAM and sector size of f2fs supported. In this patch we support large sector size in f2fs, so block device with sector size of 512/1024/2048/4096 bytes can be supported in f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix to truncate blocks past EOF in ->setattrChao Yu1-5/+12
By using FALLOC_FL_KEEP_SIZE in ->fallocate of f2fs, we can fallocate block past EOF without changing i_size of inode. These blocks past EOF will not be truncated in ->setattr as we truncate them only when change the file size. We should give a chance to truncate blocks out of filesize in setattr(). Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: update i_size when __allocate_data_blockJaegeuk Kim1-0/+8
The f2fs_direct_IO uses __allocate_data_block, but inside the allocation path, we should update i_size at the changed time to update its inode page. Otherwise, we can get wrong i_size after roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: use MAX_BIO_BLOCKS(sbi)Jaegeuk Kim5-9/+8
This patch cleans up a simple macro. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: remove redundant operation during roll-forward recoveryJaegeuk Kim2-25/+22
If same data is updated multiple times, we don't need to redo whole the operations. Let's just update the lastest one. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: do not skip latest inode informationJaegeuk Kim1-1/+10
In f2fs_sync_file, if there is no written appended writes, it skips to write its node blocks. But, if there is up-to-date inode page, we should write it to update its metadata during the roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix roll-forward missing scenariosJaegeuk Kim1-11/+60
We can summarize the roll forward recovery scenarios as follows. [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) -> Update the latest inode(x). 2. inode(x) | CP | inode(F) | dnode(F) -> No problem. 3. inode(x) | CP | dnode(F) | inode(x) -> Recover to the latest dnode(F), and drop the last inode(x) 4. inode(x) | CP | dnode(F) | inode(F) -> No problem. 5. CP | inode(x) | dnode(F) -> The inode(DF) was missing. Should drop this dnode(F). 6. CP | inode(DF) | dnode(F) -> No problem. 7. CP | dnode(F) | inode(DF) -> If f2fs_iget fails, then goto next to find inode(DF). 8. CP | dnode(F) | inode(x) -> If f2fs_iget fails, then goto next to find inode(DF). But it will fail due to no inode(DF). So, this patch adds some missing points such as #1, #5, #7, and #8. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix conditions to remain recovery information in f2fs_sync_fileJaegeuk Kim5-40/+56
This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC, /* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a) x o o o o b) x x x x o c) x o o x o If f2fs_sync_file stops ------^, it should write inode(F) --------------^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP | alloc | dnode(F) | inode(x) | inode(DF) a) o x x x x b) x x x o c) o o x o If f2fs_sync_file stops -------^, it should write inode(DF) --------------^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we doesn't need to recover that inode. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: introduce a flag to represent each nat entry informationJaegeuk Kim2-10/+31
This patch introduces a flag in the nat entry structure to merge various information such as checkpointed and fsync_done marks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: use meta_inode cache to improve roll-forward speedJaegeuk Kim4-42/+58
Previously, all the dnode pages should be read during the roll-forward recovery. Even worsely, whole the chain was traversed twice. This patch removes that redundant and costly read operations by using page cache of meta_inode and readahead function as well. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: fix double lock for inode page during roll-foward recoveryJaegeuk Kim1-7/+21
If the inode is same and its data index are needed to truncate, we can fall into double lock for its inode page via get_dnode_of_data. Error case is like this. 1. write data 1, 2, 3, 4, 5 in inode #4. 2. write data 100, 102, 103, 104, 105 in dnode #6 of inode #4. 3. sync 4. update data 100->106 in dnode #6. 5. fsync inode #4. 6. power-cut -> Then, 1. go back to #3's checkpoint 2. in do_recover_data, get_dnode_of_data() gets inode #4. 3. detect 100->106 in dnode #6. 4. check_index_in_prev_nodes tries to truncate 100 in dnode #6. 5. to trigger truncate_hole, get_dnode_of_data should grab inode #4. 6. detect *kernel hang* This patch should resolve that bug. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: fix a race condition in next_free_nidHuang Ying1-2/+4
The nm_i->fcnt checking is executed before spin_lock, so if another thread delete the last free_nid from the list, the wrong nid may be gotten. So fix the race condition by moving the nm_i->fnct checking into spin_lock. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: use nm_i->next_scan_nid as default for next_free_nidHuang Ying1-1/+2
Now, if there is no free nid in nm_i->free_nid_list, 0 may be saved into next_free_nid of checkpoint, this may cause useless scanning for next mount. nm_i->next_scan_nid should be a better default value than 0. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: give an option to enable in-place-updates during fsync to usersJaegeuk Kim7-10/+33
If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file only starts to try in-place-updates. And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it keeps out-of-order manner. Otherwise, it triggers in-place-updates. This may be used by storage showing very high random write performance. For example, it can be used when, Seq. writes (Data) + wait + Seq. writes (Node) is pretty much slower than, Rand. writes (Data) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: expand counting dirty pages in the inode page cacheJaegeuk Kim7-26/+39
Previously f2fs only counts dirty dentry pages, but there is no reason not to expand the scope. This patch changes the names on the management of dirty pages and to count dirty pages in each inode info as well. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-11f2fs: remove lengthy inode->i_inoJaegeuk Kim1-7/+8
This patch is to remove lengthy name by adding a new variable. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: fix negative value for lseek offsetJaegeuk Kim1-0/+2
If application throws negative value of lseek with SEEK_DATA|SEEK_HOLE, previous f2fs went into BUG_ON in get_dnode_of_data, which was reported by Tommi Rantala. He could make a simple code to detect this having: lseek(fd, -17595150933902LL, SEEK_DATA); This patch should resolve that bug. Reported-by: Tommi Rentala <tt.rantala@gmail.com> [Jaegeuk Kim: relocate the condition as suggested by Chao] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: avoid node page to be written twice in gc_node_segmentHuang Ying1-0/+6
In gc_node_segment, if node page gc is run concurrently with node page writeback, and check_valid_map and get_node_page run after page locked and before cur_valid_map is updated as below, it is possible for the page to be written twice unnecessarily. sync_node_pages try_lock_page ... check_valid_map f2fs_write_node_page ... write_node_page do_write_page allocate_data_block ... refresh_sit_entry /* update cur_valid_map */ ... ... unlock_page get_node_page ... set_page_dirty ... f2fs_put_page unlock_page This can be solved via calling check_valid_map after get_node_page again. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: use lock-less list(llist) to simplify the flush cmd managementGu Zheng2-25/+12
We use flush cmd control to collect many flush cmds, and flush them together. In this case, we use two list to manage the flush cmds (collect and dispatch), and one spin lock is used to protect this. In fact, the lock-less list(llist) is very suitable to this case, and we use simplify this routine. - v2: -use llist_for_each_entry_safe to fix possible use-after-free issue. -remove the unused field from struct flush_cmd. Thanks for Yu's suggestion. - Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: refactor flush_sit_entries codes for reducing SIT writesChao Yu4-71/+186
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT writes"), we descripte the issue as below: "Although building NAT journal in cursum reduce the read/write work for NAT block, but previous design leave us lower performance when write checkpoint frequently for these cases: 1. if journal in cursum has already full, it's a bit of waste that we flush all nat entries to page for persistence, but not to cache any entries. 2. if journal in cursum is not full, we fill nat entries to journal util journal is full, then flush the left dirty entries to disk without merge journaled entries, so these journaled entries may be flushed to disk at next checkpoint but lost chance to flushed last time." Actually, we have the same problem in using SIT journal area. In this patch, firstly we will update sit journal with dirty entries as many as possible. Secondly if there is no space in sit journal, we will remove all entries in journal and walk through the whole dirty entry bitmap of sit, accounting dirty sit entries located in same SIT block to sit entry set. All entry sets are linked to list sit_entry_set in sm_info, sorted ascending order by count of entries in set. Later we flush entries in set which have fewest entries into journal as many as we can, and then flush dense set with merged entries to disk. In this way we can use sit journal area more effectively, also we will reduce SIT update, result in gaining in performance and saving lifetime of flash device. In my testing environment, it shows this patch can help to reduce SIT block update obviously. virtual machine + hard disk: fsstress -p 20 -n 400 -l 5 sit page num cp count sit pages/cp based 2006.50 1349.75 1.486 patched 1566.25 1463.25 1.070 Our latency of merging op is small when handling a great number of dirty SIT entries in flush_sit_entries: latency(ns) dirty sit count 36038 2151 49168 2123 37174 2232 Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: remove unneeded sit_i in macro SIT_BLOCK_OFFSET/START_SEGNOChao Yu2-7/+7
sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO is not used, remove it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: need fsck.f2fs if the recovery was failedJaegeuk Kim1-0/+3
If the roll-forward recovery was failed, we'd better conduct fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: handle bug cases by letting fsck.f2fs initiateJaegeuk Kim1-1/+9
This patch adds to handle corner buggy cases for fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: add BUG cases to initiate fsck.f2fsJaegeuk Kim2-5/+37
This patch replaces BUG cases with f2fs_bug_on to remain fsck.f2fs information. And it implements some void functions to initiate fsck.f2fs too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: need fsck.f2fs when f2fs_bug_on is triggeredJaegeuk Kim11-63/+70
If any f2fs_bug_on is triggered, fsck.f2fs is needed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-10f2fs: retain inconsistency information to initiate fsck.f2fsJaegeuk Kim4-0/+6
This patch adds sbi->need_fsck to conduct fsck.f2fs later. This flag can only be removed by fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SBJaegeuk Kim14-114/+103
This patch adds three inline functions to clean up dirty casting codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-03Merge tag 'for-f2fs-3.17-rc4' of ↵Linus Torvalds19-231/+261
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs bug fixes from Jaegeuk Kim: "This series includes patches to: - fix recovery routines - fix bugs related to inline_data/xattr - fix when casting the dentry names - handle EIO or ENOMEM correctly - fix memory leak - fix lock coverage" * tag 'for-f2fs-3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (28 commits) f2fs: reposition unlock_new_inode to prevent accessing invalid inode f2fs: fix wrong casting for dentry name f2fs: simplify by using a literal f2fs: truncate stale block for inline_data f2fs: use macro for code readability f2fs: introduce need_do_checkpoint for readability f2fs: fix incorrect calculation with total/free inode num f2fs: remove rename and use rename2 f2fs: skip if inline_data was converted already f2fs: remove rewrite_node_page f2fs: avoid double lock in truncate_blocks f2fs: prevent checkpoint during roll-forward f2fs: add WARN_ON in f2fs_bug_on f2fs: handle EIO not to break fs consistency f2fs: check s_dirty under cp_mutex f2fs: unlock_page when node page is redirtied out f2fs: introduce f2fs_cp_error for readability f2fs: give a chance to mount again when encountering errors f2fs: trigger release_dirty_inode in f2fs_put_super f2fs: don't skip checkpoint if there is no dirty node pages ...
2014-09-03Merge branch 'for-linus' of ↵Linus Torvalds4-19/+37
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key subsystem fixes from James Morris: "Fixes for the keys subsystem, one of which addresses a use-after-free bug" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: PEFILE: Relax the check on the length of the PKCS#7 cert KEYS: Fix use-after-free in assoc_array_gc() KEYS: Fix public_key asymmetric key subtype name KEYS: Increase root_maxkeys and root_maxbytes sizes
2014-09-03ARC: [mm] Fix compilation breakageNoam Camus1-1/+1
Structure name and variable name were erroneously interchanged Signed-off-by: Noam Camus <noamc@ezchip.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> [ Also removed pointless cast from "void *". - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-03Merge tag 'arm64-fixes' of ↵Linus Torvalds9-30/+40
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull more arm64 fixes from Will Deacon: "Another handful of arm64 fixes here. They address some issues found by running smatch on the arch code (ignoring the false positives) and also stop 32-bit Android from losing track of its stack. There's one additional irq migration fix in the pipeline, but it came in after I'd tagged and tested this set. - a few fixes for real issues found by smatch (after Dan's talk at KS) - revert the /proc/cpuinfo changes merged during the merge window. We've opened a can of worms here, so we need to find out where we stand before we change this interface. - implement KSTK_ESP for compat tasks, otherwise 32-bit Android gets confused wondering where its [stack] has gone - misc fixes (fpsimd context handling, crypto, ...)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "arm64: cpuinfo: print info for all CPUs" arm64: fix bug for reloading FPSIMD state after cpu power off arm64: report correct stack pointer in KSTK_ESP for compat tasks arm64: Add brackets around user_stack_pointer() arm64: perf: don't rely on layout of pt_regs when grabbing sp or pc arm64: ptrace: fix compat reg getter/setter return values arm64: ptrace: fix compat hardware watchpoint reporting arm64: Remove unused variable in head.S arm64/crypto: remove redundant update of data
2014-09-03Merge tag 'pci-v3.17-fixes-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "This fixes an ARM allmodconfig build problem: Remove module option for ST Microelectronics SPEAr13xx" * tag 'pci-v3.17-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: spear: Remove module option
2014-09-03Merge branch 'leds-fixes-for-3.17' of ↵Linus Torvalds3-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds Pull LED fix from Bryan Wu: "Hugh, Jiri and many other people found a kernel oops due to a LED change merged recently. Now the right fix might just revert it and avoid the kernel oops" * 'leds-fixes-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: Revert "leds: convert blink timer to workqueue"
2014-09-03PEFILE: Relax the check on the length of the PKCS#7 certDavid Howells1-16/+33
Relax the check on the length of the PKCS#7 cert as it appears that the PE file wrapper size gets rounded up to the nearest 8. The debugging output looks like this: PEFILE: ==> verify_pefile_signature() PEFILE: ==> pefile_parse_binary() PEFILE: checksum @ 110 PEFILE: header size = 200 PEFILE: cert = 968 @547be0 [68 09 00 00 00 02 02 00 30 82 09 56 ] PEFILE: sig wrapper = { 968, 200, 2 } PEFILE: Signature data not PKCS#7 The wrapper is the first 8 bytes of the hex dump inside []. This indicates a length of 0x968 bytes, including the wrapper header - so 0x960 bytes of payload. The ASN.1 wrapper begins [ ... 30 82 09 56 ]. That indicates an object of size 0x956 - a four byte discrepency, presumably just padding for alignment purposes. So we just check that the ASN.1 container is no bigger than the payload and reduce the recorded size appropriately. Whilst we're at it, allow shorter PKCS#7 objects that manage to squeeze within 127 or 255 bytes. It's just about conceivable if no X.509 certs are included in the PKCS#7 message. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Peter Jones <pjones@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-03KEYS: Fix use-after-free in assoc_array_gc()David Howells1-1/+1
An edit script should be considered inaccessible by a function once it has called assoc_array_apply_edit() or assoc_array_cancel_edit(). However, assoc_array_gc() is accessing the edit script just after the gc_complete: label. Reported-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Andreea-Cristina Bernat <bernat.ada@gmail.com> cc: shemming@brocade.com cc: paulmck@linux.vnet.ibm.com Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-03KEYS: Fix public_key asymmetric key subtype nameDavid Howells1-0/+1
The length of the name of an asymmetric key subtype must be stored in struct asymmetric_key_subtype::name_len so that it can be matched by a search for "<subkey_name>:<partial_fingerprint>". Fix the public_key subtype to have name_len set. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-03KEYS: Increase root_maxkeys and root_maxbytes sizesSteve Dickson1-2/+2
Now that NFS client uses the kernel key ring facility to store the NFSv4 id/gid mappings, the defaults for root_maxkeys and root_maxbytes need to be substantially increased. These values have been soak tested: https://bugzilla.redhat.com/show_bug.cgi?id=1033708#c73 Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-02Revert "leds: convert blink timer to workqueue"Jiri Kosina3-14/+14
This reverts commit 8b37e1bef5a6b60e949e28a4db3006e4b00bd758. It's broken as it changes led_blink_set() in a way that it can now sleep (while synchronously waiting for workqueue to be cancelled). That's a problem, because it's possible that this function gets called from atomic context (tpt_trig_timer() takes a readlock and thus disables preemption). This has been brought up 3 weeks ago already [1] but no proper fix has materialized, and I keep seeing the problem since 3.17-rc1. [1] https://lkml.org/lkml/2014/8/16/128 BUG: sleeping function called from invalid context at kernel/workqueue.c:2650 in_atomic(): 1, irqs_disabled(): 0, pid: 2335, name: wpa_supplicant 5 locks held by wpa_supplicant/2335: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff814c7c92>] rtnl_lock+0x12/0x20 #1: (&wdev->mtx){+.+.+.}, at: [<ffffffffc06e649c>] cfg80211_mgd_wext_siwessid+0x5c/0x180 [cfg80211] #2: (&local->mtx){+.+.+.}, at: [<ffffffffc0817dea>] ieee80211_prep_connection+0x17a/0x9a0 [mac80211] #3: (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffc08081ed>] ieee80211_vif_use_channel+0x5d/0x2a0 [mac80211] #4: (&trig->leddev_list_lock){.+.+..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] CPU: 0 PID: 2335 Comm: wpa_supplicant Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffff8800360b5a50 ffff8800751f76d8 ffffffff8159e97f ffff8800360b5a30 ffff8800751f76e8 ffffffff810739a5 ffff8800751f77b0 ffffffff8106862f ffffffff810685d0 0aa2209200000000 ffff880000000004 ffff8800361c59d0 Call Trace: [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff810739a5>] __might_sleep+0xe5/0x120 [<ffffffff8106862f>] flush_work+0x5f/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36c0>] ? cfg80211_wext_giwessid+0x50/0x50 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584fa0>] ? ioctl_standard_iw_point+0x3e0/0x3e0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff815a67fb>] ? sysret_check+0x1b/0x56 [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f wlan0: send auth to 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: authenticated wlan0: associate with 00:0b:6b:3c:8c:e4 (try 1/3) wlan0: RX AssocResp from 00:0b:6b:3c:8c:e4 (capab=0x431 status=0 aid=2) wlan0: associated IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready cfg80211: Calling CRDA for country: NA wlan0: Limiting TX power to 27 (27 - 0) dBm as advertised by 00:0b:6b:3c:8c:e4 ================================= [ INFO: inconsistent lock state ] 3.17.0-rc3 #1 Not tainted --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ((&(&led_cdev->blink_work)->work)){+.?...}, at: [<ffffffff810685d0>] flush_work+0x0/0x270 {SOFTIRQ-ON-W} state was registered at: [<ffffffff81094dbe>] __lock_acquire+0x30e/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211] [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211] [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211] [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211] [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211] [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211] [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211] [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211] [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211] [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211] [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211] [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211] [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211] [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211] [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211] [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0 [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0 [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100 [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0 [<ffffffff814cfb29>] dev_ioctl+0x329/0x620 [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0 [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520 [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0 [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f irq event stamp: 493416 hardirqs last enabled at (493416): [<ffffffff81068a5f>] __cancel_work_timer+0x6f/0x100 hardirqs last disabled at (493415): [<ffffffff81067e9f>] try_to_grab_pending+0x1f/0x160 softirqs last enabled at (493408): [<ffffffff81053ced>] _local_bh_enable+0x1d/0x50 softirqs last disabled at (493409): [<ffffffff81054c75>] irq_exit+0xa5/0xb0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock((&(&led_cdev->blink_work)->work)); <Interrupt> lock((&(&led_cdev->blink_work)->work)); *** DEADLOCK *** 2 locks held by swapper/0/0: #0: (((&tpt_trig->timer))){+.-...}, at: [<ffffffff810b4c50>] call_timer_fn+0x0/0x180 #1: (&trig->leddev_list_lock){.+.?..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211] stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3 #1 Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008 ffffffff8246eb30 ffff88007c203b00 ffffffff8159e97f ffffffff81a194c0 ffff88007c203b50 ffffffff81599c29 0000000000000001 ffffffff00000001 ffff880000000000 0000000000000006 ffffffff81a194c0 ffffffff81093ad0 Call Trace: <IRQ> [<ffffffff8159e97f>] dump_stack+0x4d/0x66 [<ffffffff81599c29>] print_usage_bug+0x1f4/0x205 [<ffffffff81093ad0>] ? check_usage_backwards+0x140/0x140 [<ffffffff810944d3>] mark_lock+0x223/0x2b0 [<ffffffff81094d60>] __lock_acquire+0x2b0/0x1a30 [<ffffffff81096c81>] lock_acquire+0x91/0x110 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068608>] flush_work+0x38/0x270 [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80 [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90 [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff8109469d>] ? trace_hardirqs_on_caller+0xad/0x1c0 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100 [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10 [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40 [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211] [<ffffffff810b4cc5>] call_timer_fn+0x75/0x180 [<ffffffff810b4c50>] ? process_timeout+0x10/0x10 [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211] [<ffffffff810b50ac>] run_timer_softirq+0x1fc/0x2f0 [<ffffffff81054805>] __do_softirq+0x115/0x2e0 [<ffffffff81054c75>] irq_exit+0xa5/0xb0 [<ffffffff810049b3>] do_IRQ+0x53/0xf0 [<ffffffff815a74af>] common_interrupt+0x6f/0x6f <EOI> [<ffffffff8147b56e>] ? cpuidle_enter_state+0x6e/0x180 [<ffffffff8147b732>] cpuidle_enter+0x12/0x20 [<ffffffff8108bba0>] cpu_startup_entry+0x330/0x360 [<ffffffff8158fb51>] rest_init+0xc1/0xd0 [<ffffffff8158fa90>] ? csum_partial_copy_generic+0x170/0x170 [<ffffffff81af3ff2>] start_kernel+0x44f/0x45a [<ffffffff81af399c>] ? set_init_arg+0x53/0x53 [<ffffffff81af35ad>] x86_64_start_reservations+0x2a/0x2c [<ffffffff81af36a0>] x86_64_start_kernel+0xf1/0xf4 Cc: Vincent Donnefort <vdonnefort@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Bryan Wu <cooloney@gmail.com>
2014-09-02f2fs: reposition unlock_new_inode to prevent accessing invalid inodeChao Yu2-16/+6
As the race condition on the inode cache, following scenario can appear: [Thread a] [Thread b] ->f2fs_mkdir ->f2fs_add_link ->__f2fs_add_link ->init_inode_metadata failed here ->gc_thread_func ->f2fs_gc ->do_garbage_collect ->gc_data_segment ->f2fs_iget ->iget_locked ->wait_on_inode ->unlock_new_inode ->move_data_page ->make_bad_inode ->iput When we fail in create/symlink/mkdir/mknod/tmpfile, the new allocated inode should be set as bad to avoid being accessed by other thread. But in above scenario, it allows f2fs to access the invalid inode before this inode was set as bad. This patch fix the potential problem, and this issue was found by code review. change log from v1: o Add condition judgment in gc_data_segment() suggested by Changman Lee. o use iget_failed to simplify code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-01Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq handling fixlet from Thomas Gleixner: "Just an export for an interrupt flow handler which is now used in gpio modules" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irq: Export handle_fasteoi_irq
2014-09-01Revert "arm64: cpuinfo: print info for all CPUs"Will Deacon1-18/+22
It turns out that vendors are relying on the format of /proc/cpuinfo, and we've even spotted out-of-tree hacks attempting to make it look identical to the format used by arch/arm/. That means we can't afford to churn this interface in mainline, so revert the recent reformatting of the file for arm64 pending discussions on the list to find out what people actually want. This reverts commit d7a49086f263164a2c4c178eb76412d48cd671d7. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-09-01arm64: fix bug for reloading FPSIMD state after cpu power offLeo Yan1-0/+1
Now arm64 defers reloading FPSIMD state, but this optimization also introduces the bug after cpu resume back from low power mode. The reason is after the cpu has been powered off, s/w need set the cpu's fpsimd_last_state to NULL so that it will force to reload FPSIMD state for the thread, otherwise there has the chance to meet the condition for both the task's fpsimd_state.cpu field contains the id of the current cpu, and the cpu's fpsimd_last_state per-cpu variable points to the task's fpsimd_state, so finally kernel will skip to reload the context during it return back to userland. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Leo Yan <leoy@marvell.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-09-01Linux 3.17-rc3Linus Torvalds1-1/+1
2014-09-01Merge tag 'xtensa-20140830' of git://github.com/czankel/xtensa-linuxLinus Torvalds24-217/+497
Pull Xtensa updates from Chris Zankel: "Xtensa improvements for 3.17: - support highmem on cores with aliasing data cache. Enable highmem on kc705 by default - simplify addition of new core variants (no need to modify Kconfig / Makefiles) - improve robustness of unaligned access handler and its interaction with window overflow/underflow exception handlers - deprecate atomic and spill registers syscalls - clean up Kconfig: remove orphan MATH_EMULATION, sort 'select' statements - wire up renameat2 syscall. Various fixes: - fix address checks in dma_{alloc,free}_coherent (runtime BUG) - fix access to THREAD_RA/THREAD_SP/THREAD_DS (debug build breakage) - fix TLBTEMP_BASE_2 region handling in fast_second_level_miss (runtime unrecoverable exception) - fix a6 and a7 handling in fast_syscall_xtensa (runtime userspace register clobbering) - fix kernel/user jump out of fast_unaligned (potential runtime unrecoverabl exception) - replace termios IOCTL code definitions with constants (userspace build breakage)" * tag 'xtensa-20140830' of git://github.com/czankel/xtensa-linux: (25 commits) xtensa: deprecate fast_xtensa and fast_spill_registers syscalls xtensa: don't allow overflow/underflow on unaligned stack xtensa: fix a6 and a7 handling in fast_syscall_xtensa xtensa: allow single-stepping through unaligned load/store xtensa: move invalid unaligned instruction handler closer to its users xtensa: make fast_unaligned store restartable xtensa: add double exception fixup handler for fast_unaligned xtensa: fix kernel/user jump out of fast_unaligned xtensa: configure kc705 for highmem xtensa: support highmem in aliasing cache flushing code xtensa: support aliasing cache in kmap xtensa: support aliasing cache in k[un]map_atomic xtensa: implement clear_user_highpage and copy_user_highpage xtensa: fix TLBTEMP_BASE_2 region handling in fast_second_level_miss xtensa: allow fixmap and kmap span more than one page table xtensa: make fixmap region addressing grow with index xtensa: fix access to THREAD_RA/THREAD_SP/THREAD_DS xtensa: add renameat2 syscall xtensa: fix address checks in dma_{alloc,free}_coherent xtensa: replace IOCTL code definitions with constants ...
2014-09-01unicore32: Fix build errorGuenter Roeck1-4/+5
unicore32 builds fail with arch/unicore32/kernel/signal.c: In function ‘setup_frame’: arch/unicore32/kernel/signal.c:257: error: ‘usig’ undeclared (first use in this function) arch/unicore32/kernel/signal.c:279: error: ‘usig’ undeclared (first use in this function) arch/unicore32/kernel/signal.c: In function ‘handle_signal’: arch/unicore32/kernel/signal.c:306: warning: unused variable ‘tsk’ arch/unicore32/kernel/signal.c: In function ‘do_signal’: arch/unicore32/kernel/signal.c:376: error: implicit declaration of function ‘get_signsl’ make[1]: *** [arch/unicore32/kernel/signal.o] Error 1 make: *** [arch/unicore32/kernel/signal.o] Error 2 Bisect points to commit 649671c90eaf ("unicore32: Use get_signal() signal_setup_done()"). This code never even compiled. Reverting the patch does not work, since previously used functions no longer exist, so try to fix it up. Compile tested only. Fixes: 649671c90eaf ("unicore32: Use get_signal() signal_setup_done()") Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>