Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds tracepoints which would be useful for analyzing segment
usage from a perspective of high level sufile manipulation (check, alloc,
free). sufile is an important in-place updated metadata file, so
analyzing the behavior would be useful for performance turning.
example of usage (a case of allocation):
$ sudo bin/tpoint nilfs2:nilfs2_segment_usage_allocated
Tracing nilfs2:nilfs2_segment_usage_allocated. Ctrl-C to end.
segctord-17800 [002] ...1 10671.867294: nilfs2_segment_usage_allocated: sufile = ffff880054f908a8 segnum = 2
segctord-17800 [002] ...1 10675.073477: nilfs2_segment_usage_allocated: sufile = ffff880054f908a8 segnum = 3
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benixon Dhas <benixon.dhas@wdc.com>
Cc: TK Kato <TK.Kato@wdc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch adds a tracepoint for transaction events of nilfs. With the
tracepoint, these events can be tracked: begin, abort, commit, trylock,
lock, and unlock. Basically, these events have corresponding functions
e.g. begin event corresponds nilfs_transaction_begin(). The unlock event
is an exception. It corresponds to the iteration in
nilfs_transaction_lock().
Only one tracepoint is introcued: nilfs2_transaction_transition. The
above events are distinguished with newly introduced enum. With this
tracepoint, we can analyse a critical section of segment constructoin.
Sample output by tpoint of perf-tools:
cp-4457 [000] ...1 63.266220: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800bf5ccc58 count = 1 flags = 9 state = BEGIN
cp-4457 [000] ...1 63.266221: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800bf5ccc58 count = 0 flags = 9 state = COMMIT
cp-4457 [000] ...1 63.266221: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800bf5ccc58 count = 0 flags = 9 state = COMMIT
segctord-4371 [001] ...1 68.261196: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 0 flags = 10 state = TRYLOCK
segctord-4371 [001] ...1 68.261280: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 0 flags = 10 state = LOCK
segctord-4371 [001] ...1 68.261877: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 1 flags = 10 state = BEGIN
segctord-4371 [001] ...1 68.262116: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 0 flags = 18 state = COMMIT
segctord-4371 [001] ...1 68.265032: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 0 flags = 18 state = UNLOCK
segctord-4371 [001] ...1 132.376847: nilfs2_transaction_transition: sb = ffff8802112b8800 ti = ffff8800b889bdf8 count = 0 flags = 10 state = TRYLOCK
This patch also does trivial cleaning of comma usage in collection stage
transition event for consistent coding style.
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch adds a tracepoint for tracking stage transition of block
collection in segment construction. With the tracepoint, we can analysis
the behavior of segment construction in depth. It would be useful for
bottleneck detection and debugging, etc.
The tracepoint is created with the standard trace API of linux (like ext3,
ext4, f2fs and btrfs). So we can analysis with existing tools easily. Of
course, more detailed analysis will be possible if we can create nilfs
specific analysis tools.
Below is an example of event dump with Brendan Gregg's perf-tools
(https://github.com/brendangregg/perf-tools). Time consumption between
each stage can be obtained.
$ sudo bin/tpoint nilfs2:nilfs2_collection_stage_transition
Tracing nilfs2:nilfs2_collection_stage_transition. Ctrl-C to end.
segctord-14875 [003] ...1 28311.067794: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_INIT
segctord-14875 [003] ...1 28311.068139: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_GC
segctord-14875 [003] ...1 28311.068139: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_FILE
segctord-14875 [003] ...1 28311.068486: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_IFILE
segctord-14875 [003] ...1 28311.068540: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_CPFILE
segctord-14875 [003] ...1 28311.068561: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_SUFILE
segctord-14875 [003] ...1 28311.068565: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_DAT
segctord-14875 [003] ...1 28311.068573: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_SR
segctord-14875 [003] ...1 28311.068574: nilfs2_collection_stage_transition: sci = ffff8800ce6de000 stage = ST_DONE
For capturing transition correctly, this patch adds wrappers for the
member scnt of nilfs_cstage. With this change, every transition of the
stage can produce trace event in a correct manner.
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As a nilfs2 volume ages, the amount of available disk space decreases
little by little due to bloat of DAT (disk address translation) metadata
file. Even if we delete all files in a file system and free their block
addresses from the DAT file through a garbage collection, empty DAT blocks
are not freed.
This fixes the issue by extending the deallocator of block addresses so
that empty data blocks and empty bitmap blocks of DAT are deleted.
The following comparison shows the effect of this patch. Each shows disk
amount information of a nilfs2 volume that we cleaned out by deleting all
files and running gc after having filled 90% of its capacity.
Before:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 500105212 3022844 472072192 1% /test
After:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 500105212 16380 475078656 1% /test
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This adds delete functions for data blocks of metadata files using bitmap
based allocator. nilfs_palloc_delete_entry_block() deletes an entry block
(e.g. block storing dat entries), and nilfs_palloc_delete_bitmap_block()
deletes a bitmap block, respectively.
These helpers are intended to be used in the successive change on
deallocator of block addresses ("nilfs2: free unused dat file blocks
during garbage collection").
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This unfolds nilfs_palloc_group_is_in() helper function into
nilfs_palloc_freev() function to simplify a range check and an index
calculation repeatedy performed in a loop of the function.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The current implementation of nilfs_palloc_find_available_slot() function
is overkill. The underlying bit search routine is well optimized, so this
uses it more simply in nilfs_palloc_find_available_slot().
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the bitmap based allocator implementation, nilfs_mdt_bgl_lock() helper
is frequently used to get a spinlock protecting a target block group.
This reduces its usage and simplifies arguments of some related functions
by directly passing a pointer to the spinlock.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This uses nilfs_warning() to replace "printk(KERN_WARNING ...);" in the
bitmap based allocator implementation of nilfs2. The warning messages are
modified to include the device name and the inode number in each message.
This makes it clear which metadata file of which device has output
warnings such as "entry number xxxx already freed".
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove unneeded NULL test.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The commit 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
fixed the access to /proc/self/fd from sub-threads, but introduced another
problem: a sub-thread can't access /proc/<tid>/fd/ or /proc/thread-self/fd
if generic_permission() fails.
Change proc_fd_permission() to check same_thread_group(pid_task(), current).
Fixes: 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
Reported-by: "Jin, Yihua" <yihua.jin@intel.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
For now in task_name() we ignore the return code of string_escape_str()
call. This is not good if buffer suddenly becomes not big enough. Do the
proper error handling there.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
sync_file_range(2) is documented to issue writeback only for pages that
are not currently being written. After all the system call has been
created for userspace to be able to issue background writeout and so
waiting for in-flight IO is undesirable there. However commit
ee53a891f474 ("mm: do_sync_mapping_range integrity fix") switched
do_sync_mapping_range() and thus sync_file_range() to issue writeback in
WB_SYNC_ALL mode since do_sync_mapping_range() was used by other code
relying on WB_SYNC_ALL semantics.
These days do_sync_mapping_range() went away and we can switch
sync_file_range(2) back to issuing WB_SYNC_NONE writeback. That should
help PostgreSQL avoid large latency spikes when flushing data in the
background.
Andres measured a 20% increase in transactions per second on an SSD disk.
Signed-off-by: Jan Kara <jack@suse.com>
Reported-by: Andres Freund <andres@anarazel.de>
Tested-By: Andres Freund <andres@anarazel.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There are many places which use mapping_gfp_mask to restrict a more
generic gfp mask which would be used for allocations which are not
directly related to the page cache but they are performed in the same
context.
Let's introduce a helper function which makes the restriction explicit and
easier to track. This patch doesn't introduce any functional changes.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
__GFP_WAIT was used to signal that the caller was in atomic context and
could not sleep. Now it is possible to distinguish between true atomic
context and callers that are not willing to sleep. The latter should
clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
__GFP_WAIT behaves differently, there is a risk that people will clear the
wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
indicate what it does -- setting it allows all reclaim activity, clearing
them prevents it.
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
sleep and avoiding waking kswapd
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".
Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.
This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.
This patch then converts a number of sites
o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.
o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.
o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.
o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.
The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.
The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
"We have a lot of subvolume quota improvements in here, along with big
piles of cleanups from Dave Sterba and Anand Jain and others.
Josef pitched in a batch of allocator fixes based on production use
here at FB. We found that mount -o ssd_spread greatly improved our
performance on hardware raid5/6, but it exposed some CPU bottlenecks
in the allocator. These patches make a huge difference"
* 'for-linus-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (100 commits)
Btrfs: fix hole punching when using the no-holes feature
Btrfs: find_free_extent: Do not erroneously skip LOOP_CACHING_WAIT state
btrfs: Fix a data space underflow warning
btrfs: qgroup: Fix a rebase bug which will cause qgroup double free
btrfs: qgroup: Fix a race in delayed_ref which leads to abort trans
btrfs: clear PF_NOFREEZE in cleaner_kthread()
btrfs: qgroup: Don't copy extent buffer to do qgroup rescan
btrfs: add balance filters limits, stripes and usage to supported mask
btrfs: extend balance filter usage to take minimum and maximum
btrfs: add balance filter for stripes
btrfs: extend balance filter limit to take minimum and maximum
btrfs: fix use after free iterating extrefs
btrfs: check unsupported filters in balance arguments
Btrfs: fix regression running delayed references when using qgroups
Btrfs: fix regression when running delayed references
Btrfs: don't do extra bitmap search in one bit case
Btrfs: keep track of largest extent in bitmaps
Btrfs: don't keep trying to build clusters if we are fragmented
Btrfs: cut down on loops through the allocator
Btrfs: don't continue setting up space cache when enospc
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Add support for the CSUM_SEED feature which will allow future
userspace utilities to change the file system's UUID without rewriting
all of the file system metadata.
A number of miscellaneous fixes, the most significant of which are in
the ext4 encryption support. Anyone wishing to use the encryption
feature should backport all of the ext4 crypto patches up to 4.4 to
get fixes to a memory leak and file system corruption bug.
There are also cleanups in ext4's feature test macros and in ext4's
sysfs support code"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (26 commits)
fs/ext4: remove unnecessary new_valid_dev check
ext4: fix abs() usage in ext4_mb_check_group_pa
ext4: do not allow journal_opts for fs w/o journal
ext4: explicit mount options parsing cleanup
ext4, jbd2: ensure entering into panic after recording an error in superblock
[PATCH] fix calculation of meta_bg descriptor backups
ext4: fix potential use after free in __ext4_journal_stop
jbd2: fix checkpoint list cleanup
ext4: fix xfstest generic/269 double revoked buffer bug with bigalloc
ext4: make the bitmap read routines return real error codes
jbd2: clean up feature test macros with predicate functions
ext4: clean up feature test macros with predicate functions
ext4: call out CRC and corruption errors with specific error codes
ext4: store checksum seed in superblock
ext4: reserve code points for the project quota feature
ext4: promote ext4 over ext2 in the default probe order
jbd2: gate checksum calculations on crc driver presence, not sb flags
ext4: use private version of page_zero_new_buffers() for data=journal mode
ext4 crypto: fix bugs in ext4_encrypted_zeroout()
ext4 crypto: replace some BUG_ON()'s with error checks
...
|
|
The iput() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
If ubifs_tnc_next_ent() returns something else than -ENOENT
we leak file->private_data.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
|
|
As currently new_valid_dev always returns 1, so new_valid_dev check is not
needed, remove it.
Signed-off-by: Yaowei Bai <bywxiaobai@163.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracking updates from Steven Rostedt:
"Most of the changes are clean ups and small fixes. Some of them have
stable tags to them. I searched through my INBOX just as the merge
window opened and found lots of patches to pull. I ran them through
all my tests and they were in linux-next for a few days.
Features added this release:
----------------------------
- Module globbing. You can now filter function tracing to several
modules. # echo '*:mod:*snd*' > set_ftrace_filter (Dmitry Safonov)
- Tracer specific options are now visible even when the tracer is not
active. It was rather annoying that you can only see and modify
tracer options after enabling the tracer. Now they are in the
options/ directory even when the tracer is not active. Although
they are still only visible when the tracer is active in the
trace_options file.
- Trace options are now per instance (although some of the tracer
specific options are global)
- New tracefs file: set_event_pid. If any pid is added to this file,
then all events in the instance will filter out events that are not
part of this pid. sched_switch and sched_wakeup events handle next
and the wakee pids"
* tag 'trace-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
tracefs: Fix refcount imbalance in start_creating()
tracing: Put back comma for empty fields in boot string parsing
tracing: Apply tracer specific options from kernel command line.
tracing: Add some documentation about set_event_pid
ring_buffer: Remove unneeded smp_wmb() before wakeup of reader benchmark
tracing: Allow dumping traces without tracking trace started cpus
ring_buffer: Fix more races when terminating the producer in the benchmark
ring_buffer: Do no not complete benchmark reader too early
tracing: Remove redundant TP_ARGS redefining
tracing: Rename max_stack_lock to stack_trace_max_lock
tracing: Allow arch-specific stack tracer
recordmcount: arm64: Replace the ignored mcount call into nop
recordmcount: Fix endianness handling bug for nop_mcount
tracepoints: Fix documentation of RCU lockdep checks
tracing: ftrace_event_is_function() can return boolean
tracing: is_legal_op() can return boolean
ring-buffer: rb_event_is_commit() can return boolean
ring-buffer: rb_per_cpu_empty() can return boolean
ring_buffer: ring_buffer_empty{cpu}() can return boolean
ring-buffer: rb_is_reader_page() can return boolean
...
|
|
Pull MTD updates from Brian Norris:
"Core:
- WARN (in some cases) when a struct mtd_info is registered multiple
times; in the past this was "supported", but it's still error prone
for future development. There's only one ugly case of this left in
the tree (that we're aware of) and the owners are aware of the
problems there.
- fix potential deadlock in the blkdev removal path NOTE: the
(potential) deadlock was introduced in a for-stable patch. This
one is also marked for -stable.
- ioctl(BLKPG) compat_ioctl support; resolves issues with 32-bit user
space vs 64-bit kernel space
- Set MTD parent device correctly throughout the tree, so the tree
structure appears correctly in sysfs; many drivers were missing
this (soft) requirement
- Move device tree partitions (ofpart) into a dedicated 'partitions'
subnode; this helps to disambiguate whether a node is a partition
or some other auxiliary data
- Improve error handling for partitioning failures
NAND:
- General: Increase timeout period, for corner-case systems with
less-than-accurate jiffies
- Fix OF-based autoloading of several NAND drivers when built as
modules
- pxa3xx_nand:
- Rework timing configuration to be more dynamic
- Refactor PM support
- brcmnand: prepare for NorthStar 2 support (ARM64, 16-bit NAND
chips)
- sunxi_nand: refactoring and a few bug fixes
- vf610: new NAND driver
- FSMC: add SW BCH support; support common NAND DT bindings
- lpc32xx_slc: refactor and improve timing calculations logic
- denali: support for rev 5.1
SPI NOR:
- Layering improvements
- Added Winbond lock/unlock support
- Added mtd_is_locked() (i.e., ioctl(MEMISLOCKED)) support
- Increase full-chip-erase timeout linearly with flash size
- fsl-quadspi: fix compile for non-ARM architectures
- New flash support"
* tag 'for-linus-20151106' of git://git.infradead.org/linux-mtd: (169 commits)
mtd: don't WARN about overloaded users of mtd->reboot_notifier.notifier_call
mtd: nand: sunxi: avoid retrieving data before ECC pass
mtd: nand: sunxi: fix sunxi_nfc_hw_ecc_read/write_chunk()
mtd: blkdevs: fix potential deadlock + lockdep warnings
mtd: ofpart: move ofpart partitions to a dedicated dt node
doc: dt: mtd: support partitions in a special 'partitions' subnode
mtd: brcmnand: Force 8bit mode before doing nand_scan_ident()
mtd: brcmnand: factor out CFG and CFG_EXT bitfields
mtd: mtdpart: Do not fail mtd probe when parsing partitions fails
mtd: fsl-quadspi: fix macro collision problems with READ/WRITE
mtd: warn when registering the same master many times
mtd: fixup corner case error handling in mtd_device_parse_register()
mtd: tests: Replace timeval with ktime_t
mtd: fsmc_nand: Add BCH4 SW ECC support for SPEAr600
mtd: nand: vf610_nfc: use nand_check_erased_ecc_chunk() helper
mtd: nand: increase ready wait timeout and report timeouts
mtd: docg3: off by one in doc_register_sysfs()
mtd: pxa3xx_nand: clean up the pxa3xx timings
mtd: pxa3xx_nand: rework flash detection and timing setup
mtd: pxa3xx_nand: add helpers to setup the timings
...
|
|
Merge patch-bomb from Andrew Morton:
- inotify tweaks
- some ocfs2 updates (many more are awaiting review)
- various misc bits
- kernel/watchdog.c updates
- Some of mm. I have a huge number of MM patches this time and quite a
lot of it is quite difficult and much will be held over to next time.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
selftests: vm: add tests for lock on fault
mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage
mm: introduce VM_LOCKONFAULT
mm: mlock: add new mlock system call
mm: mlock: refactor mlock, munlock, and munlockall code
kasan: always taint kernel on report
mm, slub, kasan: enable user tracking by default with KASAN=y
kasan: use IS_ALIGNED in memory_is_poisoned_8()
kasan: Fix a type conversion error
lib: test_kasan: add some testcases
kasan: update reference to kasan prototype repo
kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile
kasan: various fixes in documentation
kasan: update log messages
kasan: accurately determine the type of the bad access
kasan: update reported bug types for kernel memory accesses
kasan: update reported bug types for not user nor kernel memory accesses
mm/kasan: prevent deadlock in kasan reporting
mm/kasan: don't use kasan shadow pointer in generic functions
mm/kasan: MODULE_VADDR is not available on all archs
...
|
|
This fixes a bug from commit f3f86e33dc3d ("vfs: Fix pathological
performance case for __alloc_fd()").
v2: refactor to share fd bitmap copying code
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
/proc/pid/oom_adj exists solely to avoid breaking existing userspace
binaries that write to the tunable.
Add a comment in the only possible location within the kernel tree to
describe the situation and motivation for keeping it around.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Don't build clear_soft_dirty_pmd() if transparent huge pages are not
enabled.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
As mentioned in the commit 56eecdb912b5 ("mm: Use ptep/pmdp_set_numa()
for updating _PAGE_NUMA bit"), architectures like ppc64 don't do tlb
flush in set_pte/pmd functions.
So when dealing with existing pte in clear_soft_dirty, the pte must be
cleared before being modified.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
filemap_fdatawait() is a function to wait for on-going writeback to
complete but also consume and clear error status of the mapping set during
writeback.
The latter functionality is critical for applications to detect writeback
error with system calls like fsync(2)/fdatasync(2).
However filemap_fdatawait() is also used by sync(2) or FIFREEZE ioctl,
which don't check error status of individual mappings.
As a result, fsync() may not be able to detect writeback error if events
happen in the following order:
Application System admin
----------------------------------------------------------
write data on page cache
Run sync command
writeback completes with error
filemap_fdatawait() clears error
fsync returns success
(but the data is not on disk)
This patch adds filemap_fdatawait_keep_errors() for call sites where
writeback error is not handled so that they don't clear error status.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Cc: Fengguang Wu <fengguang.wu@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently there's no easy way to get per-process usage of hugetlb pages,
which is inconvenient because userspace applications which use hugetlb
typically want to control their processes on the basis of how much memory
(including hugetlb) they use. So this patch simply provides easy access
to the info via /proc/PID/status.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Joern Engel <joern@logfs.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently /proc/PID/smaps provides no usage info for vma(VM_HUGETLB),
which is inconvenient when we want to know per-task or per-vma base
hugetlb usage. To solve this, this patch adds new fields for hugetlb
usage like below:
Size: 20480 kB
Rss: 0 kB
Pss: 0 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 0 kB
Anonymous: 0 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 18432 kB
Private_Hugetlb: 2048 kB
Swap: 0 kB
KernelPageSize: 2048 kB
MMUPageSize: 2048 kB
Locked: 0 kB
VmFlags: rd wr mr mw me de ht
[hughd@google.com: fix Private_Hugetlb alignment ]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Joern Engel <joern@logfs.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If the remote locking fail, we run a local vfs unlock that should work and
return success to userland when we didn't actually lock at all. We need
to tell the application that tried to lock that it didn't get it, not that
all went well.
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
readahead_pages in ocfs2_duplicate_clusters_by_page is defined but not
used, so clean it up.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A node can mount multiple ocfs2 volumes. And if thread names are same for
each volume/domain, it will bring inconvenience when analyzing problems
because we have to identify which volume/domain the messages belong to.
Since thread name will be printed to messages, so add volume uuid or dlm
name to thread name can benefit problem analysis.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In ocfs2_mknod_locked if '__ocfs2_mknod_locke d' returns an error, we
should reclaim the inode successfully claimed above, otherwise, the
inode never be reused. The case is described below:
ocfs2_mknod
ocfs2_mknod_locked
ocfs2_claim_new_inode
Successfully claim the inode
__ocfs2_mknod_locked
ocfs2_journal_access_di
Failed because of -ENOMEM or other reasons, the inode
lockres has not been initialized yet.
iput(inode)
ocfs2_evict_inode
ocfs2_delete_inode
ocfs2_inode_lock
ocfs2_inode_lock_full_nested
__ocfs2_cluster_lock
Return -EINVAL because of the inode
lockres has not been initialized.
So the following operations are not performed
ocfs2_wipe_inode
ocfs2_remove_inode
ocfs2_free_dinode
ocfs2_free_suballoc_bits
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is a race case between mount and delete node/cluster, which will
lead o2hb_thread to malfunctioning dead loop.
o2hb_thread
{
o2nm_depend_this_node();
<<<<<< race window, node may have already been deleted, and then
enter the loop, o2hb thread will be malfunctioning
because of no configured nodes found.
while (!kthread_should_stop() &&
!reg->hr_unclean_stop && !reg->hr_aborted_start) {
}
So check the return value of o2nm_depend_this_node() is needed. If node
has been deleted, do not enter the loop and let mount fail.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We have no need to take inode mutex, rw and inode lock if it is not dio
entry when recover orphans. Optimize it by adding a flag
OCFS2_INODE_DIO_ORPHAN_ENTRY to ocfs2_inode_info to reduce contention.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
dio entry will only do truncate in case of ORPHAN_NEED_TRUNCATE. So do
not include it when doing normal orphan scan to reduce contention.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently cluster allocation is always trying to find a victim chain (a
chian has most space), and this may lead to poor performance because of
discontiguous allocation in some scenarios.
Our test case is block size 4k, cluster size 1M and mount option with
localalloc=2048 (2G), since a gd is 32256M (about 31.5G) and a localalloc
window is only 2G, creating 50G file will result in 2G from gd0, 2G from
gd1, ...
One way to improve performance is enlarge localalloc window size (max
31104M), but this will make end user feel that about 30G is suddenly
"missing", and localalloc currently do not support steal, which means one
node cannot use another node's localalloc even it is not used in fact. So
using the last gd to record the allocation and continues with the gd if it
has enough space for a localalloc window can make the allocation as more
contiguous as possible.
Our test result is below (evaluated in IOPS), which is using iometer
running in VM, dynamic vhd virtual disk stored in ocfs2.
IO model Original After Improved(%)
16K60%Write100%Random 703 876 24.59%
8K90%Write100%Random 735 827 12.59%
4K100%Write100%Random 859 915 6.52%
4K100%Read100%Random 2092 2600 24.30%
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Tested-by: Norton Zhu <norton.zhu@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A simplified test case is (this case from Ryan):
1) dd if=/dev/zero of=/mnt/hello bs=512 count=1 oflag=direct;
2) truncate /mnt/hello -s 2097152
file 'hello' is not exist before test. After this command,
file 'hello' should be all zero. But 512~4096 is some random data.
Setting bh state to new when get a new block, if so,
direct_io_worker()->dio_zero_block() will fill-in the unused portion
of the block with zero.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If ocfs2_is_overwrite failed, ocfs2_direct_IO_write mays till return
success to the caller.
Signed-off-by: Norton.Zhu <norton.zhu@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
fs/logfs/dev_bdev.c: In function '__bdev_writeseg':
include/linux/kernel.h:601:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void) (&_min1 == &_min2); \
fs/logfs/dev_bdev.c:84:14: note: in expansion of macro 'min'
max_pages = min(nr_pages, BIO_MAX_PAGES);
fs/logfs/dev_bdev.c: In function 'do_erase':
include/linux/kernel.h:601:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void) (&_min1 == &_min2); \
fs/logfs/dev_bdev.c:174:14: note: in expansion of macro 'min'
max_pages = min(nr_pages, BIO_MAX_PAGES);
Lets use min_t and mention the type.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The comment here says that it is checking for invalid bits. But, the mask
is *actually* checking to ensure that _any_ valid bit is set, which is
quite different.
Without this check, an unexpected bit could get set on an inotify object.
Since these bits are also interpreted by the fsnotify/dnotify code, there
is the potential for an object to be mishandled inside the kernel. For
instance, can we be sure that setting the dnotify flag FS_DN_RENAME on an
inotify watch is harmless?
Add the actual check which was intended. Retain the existing inotify bits
are being added to the watch. Plus, this is existing behavior which would
be nice to preserve.
I did a quick sniff test that inotify functions and that my
'inotify-tools' package passes 'make check'.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There was a report that my patch:
inotify: actually check for invalid bits in sys_inotify_add_watch()
broke CRIU.
The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/* to
figure out how to rebuild inotify watches and then passes those flags
directly back in to the inotify API. One of those flags
(FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the inotify
API. It is used inside the kernel to _implement_ inotify but it is not
and has never been part of the API.
My patch above ensured that we only allow bits which are part of the API
(IN_ALL_EVENTS). This broke CRIU.
FS_EVENT_ON_CHILD is really internal to the kernel. It is set _anyway_ on
all inotify marks. So, CRIU was really just trying to set a bit that was
already set.
This patch hides that bit from fdinfo. CRIU will not see the bit, not try
to set it, and should work as before. We should not have been exposing
this bit in the first place, so this is a good patch independent of the
CRIU problem.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: Andrey Wagin <avagin@gmail.com>
Acked-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem update from James Morris:
"This is mostly maintenance updates across the subsystem, with a
notable update for TPM 2.0, and addition of Jarkko Sakkinen as a
maintainer of that"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits)
apparmor: clarify CRYPTO dependency
selinux: Use a kmem_cache for allocation struct file_security_struct
selinux: ioctl_has_perm should be static
selinux: use sprintf return value
selinux: use kstrdup() in security_get_bools()
selinux: use kmemdup in security_sid_to_context_core()
selinux: remove pointless cast in selinux_inode_setsecurity()
selinux: introduce security_context_str_to_sid
selinux: do not check open perm on ftruncate call
selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default
KEYS: Merge the type-specific data with the payload data
KEYS: Provide a script to extract a module signature
KEYS: Provide a script to extract the sys cert list from a vmlinux file
keys: Be more consistent in selection of union members used
certs: add .gitignore to stop git nagging about x509_certificate_list
KEYS: use kvfree() in add_key
Smack: limited capability for changing process label
TPM: remove unnecessary little endian conversion
vTPM: support little endian guests
char: Drop owner assignment from i2c_driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns hardlink capability check fix from Eric Biederman:
"This round just contains a single patch. There has been a lot of
other work this period but it is not quite ready yet, so I am pushing
it until 4.5.
The remaining change by Dirk Steinmetz wich fixes both Gentoo and
Ubuntu containers allows hardlinks if we have the appropriate
capabilities in the user namespace. Security wise it is really a
gimme as the user namespace root can already call setuid become that
user and create the hardlink"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
namei: permit linking with CAP_FOWNER in userns
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull pstore updates from Tony Luck:
"Half dozen small cleanups plus change to allow pstore backend drivers
to be unloaded"
* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore: fix code comment to match code
efi-pstore: fix kernel-doc argument name
pstore: Fix return type of pstore_is_mounted()
pstore: add pstore unregister
pstore: add a helper function pstore_register_kmsg
pstore: add vmalloc error check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"Most part of the patches include enhancing the stability and
performance of in-memory extent caches feature.
In addition, it introduces several new features and configurable
points:
- F2FS_GOING_DOWN_METAFLUSH ioctl to test power failures
- F2FS_IOC_WRITE_CHECKPOINT ioctl to trigger checkpoint by users
- background_gc=sync mount option to do gc synchronously
- periodic checkpoints
- sysfs entry to control readahead blocks for free nids
And the following bug fixes have been merged.
- fix SSA corruption by collapse/insert_range
- correct a couple of gc behaviors
- fix the results of f2fs_map_blocks
- fix error case handling of volatile/atomic writes"
* tag 'for-f2fs-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (54 commits)
f2fs: fix to skip shrinking extent nodes
f2fs: fix error path of ->symlink
f2fs: fix to clear GCed flag for atomic written page
f2fs: don't need to submit bio on error case
f2fs: fix leakage of inmemory atomic pages
f2fs: refactor __find_rev_next_{zero}_bit
f2fs: support fiemap for inline_data
f2fs: flush dirty data for bmap
f2fs: relocate the tracepoint for background_gc
f2fs crypto: fix racing of accessing encrypted page among
f2fs: export ra_nid_pages to sysfs
f2fs: readahead for free nids building
f2fs: support lower priority asynchronous readahead in ra_meta_pages
f2fs: don't tag REQ_META for temporary non-meta pages
f2fs: add a tracepoint for f2fs_read_data_pages
f2fs: set GFP_NOFS for grab_cache_page
f2fs: fix SSA updates resulting in corruption
Revert "f2fs: do not skip dentry block writes"
f2fs: add F2FS_GOING_DOWN_METAFLUSH to test power-failure
f2fs: merge meta writes as many possible
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm update from David Teigland:
"This includes one simple fix to make posix locks interruptible by
signals in cases where a signal handler is used"
* tag 'dlm-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: make posix locks interruptible
|