Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 9d5afec6b8bd46d6ed821aa1579634437f58ef1f upstream.
On a ppc64 machine, when mounting a fuzzed ext2 image (generated by
fsfuzzer) the following call trace is seen,
VFS: brelse: Trying to free free buffer
WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40
.__brelse.part.6+0x20/0x40 (unreliable)
.ext4_find_entry+0x384/0x4f0
.ext4_lookup+0x84/0x250
.lookup_slow+0xdc/0x230
.walk_component+0x268/0x400
.path_lookupat+0xec/0x2d0
.filename_lookup+0x9c/0x1d0
.vfs_statx+0x98/0x140
.SyS_newfstatat+0x48/0x80
system_call+0x58/0x6c
This happens because the directory that ext4_find_entry() looks up has
inode->i_size that is less than the block size of the filesystem. This
causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not
reading any of the directory file's blocks. This renders the entries in
bh_use[] array to continue to have garbage data. buffer_uptodate() on
bh_use[0] can then return a zero value upon which brelse() function is
invoked.
This commit fixes the bug by returning -ENOENT when the directory file
has no associated blocks.
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e81d44778d1d57bbaef9e24c4eac7c8a7a401d40 upstream.
The commit 6050d47adcad: "ext4: bail out from make_indexed_dir() on
first error" could end up leaking bh2 in the error path.
[ Also avoid renaming bh2 to bh, which just confuses things --tytso ]
Signed-off-by: yangsheng <yngsion@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e12fb97222fc41e8442896934f76d39ef99b590a upstream.
Previously commit 14ece1028b3ed53ffec1b1213ffc6acaf79ad77c added a
support for for syncing parent directory of newly created inodes to
make sure that the inode is not lost after a power failure in
no-journal mode.
However this does not work in majority of cases, namely:
- if the directory has inline data
- if the directory is already indexed
- if the directory already has at least one block and:
- the new entry fits into it
- or we've successfully converted it to indexed
So in those cases we might lose the inode entirely even after fsync in
the no-journal mode. This also includes ext2 default mode obviously.
I've noticed this while running xfstest generic/321 and even though the
test should fail (we need to run fsck after a crash in no-journal mode)
I could not find a newly created entries even when if it was fsynced
before.
Fix this by adjusting the ext4_add_entry() successful exit paths to set
the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the
parent directory as well.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Frank Mayhar <fmayhar@google.com>
[bwh: Backported to 3.2: inline data is not supported]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit c9b92530a723ac5ef8e352885a1862b18f31b2f5 upstream.
Instead of checking whether the handle is valid, we check if journal
is enabled. This avoids taking the s_orphan_lock mutex in all cases
when there is no journal in use, including the error paths where
ext4_orphan_del() is called with a handle set to NULL.
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
[bwh: Adjust context to apply after commit 0e9a9a1ad619
('ext4: avoid hang when mounting non-journal filesystems with orphan list')
and commit e2bfb088fac0
('ext4: don't orphan or truncate the boot loader inode')]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 6050d47adcadbb53582434d919ed7f038d936712 upstream.
When ext4_handle_dirty_dx_node() or ext4_handle_dirty_dirent_node()
fail, there's really something wrong with the fs and there's no point in
continuing further. Just return error from make_indexed_dir() in that
case. Also initialize frames array so that if we return early due to
error, dx_release() doesn't try to dereference uninitialized memory
(which could happen also due to error in do_split()).
Coverity-id: 741300
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.2:
- We have ext4_handle_dirty_metadata() not
ext4_handle_dirty_{dx,dirent}_node()]
- do_split() returns errors by reference not with ERR_PTR()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit f4bb2981024fc91b23b4d09a8817c415396dbabb upstream.
If there is a corrupted file system which has directory entries that
point at reserved, metadata inodes, prohibit them from being used by
treating them the same way we treat Boot Loader inodes --- that is,
mark them to be bad inodes. This prohibits them from being opened,
deleted, or modified via chmod, chown, utimes, etc.
In particular, this prevents a corrupted file system which has a
directory entry which points at the journal inode from being deleted
and its blocks released, after which point Much Hilarity Ensues.
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit e2bfb088fac03c0f621886a04cffc7faa2b49b1d upstream.
The boot loader inode (inode #5) should never be visible in the
directory hierarchy, but it's possible if the file system is corrupted
that there will be a directory entry that points at inode #5. In
order to avoid accidentally trashing it, when such a directory inode
is opened, the inode will be marked as a bad inode, so that it's not
possible to modify (or read) the inode from userspace.
Unfortunately, when we unlink this (invalid/illegal) directory entry,
we will put the bad inode on the ophan list, and then when try to
unlink the directory, we don't actually remove the bad inode from the
orphan list before freeing in-memory inode structure. This means the
in-memory orphan list is corrupted, leading to a kernel oops.
In addition, avoid truncating a bad inode in ext4_destroy_inode(),
since truncating the boot loader inode is not a smart thing to do.
Reported-by: Sami Liedes <sami.liedes@iki.fi>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 0e9a9a1ad619e7e987815d20262d36a2f95717ca upstream.
When trying to mount a file system which does not contain a journal,
but which does have a orphan list containing an inode which needs to
be truncated, the mount call with hang forever in
ext4_orphan_cleanup() because ext4_orphan_del() will return
immediately without removing the inode from the orphan list, leading
to an uninterruptible loop in kernel code which will busy out one of
the CPU's on the system.
This can be trivially reproduced by trying to mount the file system
found in tests/f_orphan_extents_inode/image.gz from the e2fsprogs
source tree. If a malicious user were to put this on a USB stick, and
mount it on a Linux desktop which has automatic mounts enabled, this
could be considered a potential denial of service attack. (Not a big
deal in practice, but professional paranoids worry about such things,
and have even been known to allocate CVE numbers for such problems.)
-js: This is a fix for CVE-2013-2015.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 64cb927371cd2ec43758d8a094a003d27bc3d0dc upstream.
Both ext3 and ext4 htree_dirblock_to_tree() is just filling the
in-core rbtree for use by call_filldir(). All updates of ->f_pos are
done by the latter; bumping it here (on error) is obviously wrong - we
might very well have it nowhere near the block we'd found an error in.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 6a08f447facb4f9e29fcc30fb68060bb5a0d21c2 upstream.
ext4_special_inode_operations have their own ifdef CONFIG_EXT4_FS_XATTR
to mask those methods. And ext4_iget also always sets it, so there is
an inconsistency.
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
commit 7e936b737211e6b54e34b71a827e56b872e958d8 upstream.
A hard-linked directory to its parent can cause the VFS to deadlock,
and is a sign of a corrupted file system. So detect this case in
ext4_lookup(), before the rmdir() lockup scenario can take place.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue:
vfs: add d_prune dentry operation
vfs: protect i_nlink
filesystems: add set_nlink()
filesystems: add missing nlink wrappers
logfs: remove unnecessary nlink setting
ocfs2: remove unnecessary nlink setting
jfs: remove unnecessary nlink setting
hypfs: remove unnecessary nlink setting
vfs: ignore error on forced remount
readlinkat: ensure we return ENOENT for the empty pathname for normal lookups
vfs: fix dentry leak in simple_fill_super()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (97 commits)
jbd2: Unify log messages in jbd2 code
jbd/jbd2: validate sb->s_first in journal_get_superblock()
ext4: let ext4_ext_rm_leaf work with EXT_DEBUG defined
ext4: fix a syntax error in ext4_ext_insert_extent when debugging enabled
ext4: fix a typo in struct ext4_allocation_context
ext4: Don't normalize an falloc request if it can fit in 1 extent.
ext4: remove comments about extent mount option in ext4_new_inode()
ext4: let ext4_discard_partial_buffers handle unaligned range correctly
ext4: return ENOMEM if find_or_create_pages fails
ext4: move vars to local scope in ext4_discard_partial_page_buffers_no_lock()
ext4: Create helper function for EXT4_IO_END_UNWRITTEN and i_aiodio_unwritten
ext4: optimize locking for end_io extent conversion
ext4: remove unnecessary call to waitqueue_active()
ext4: Use correct locking for ext4_end_io_nolock()
ext4: fix race in xattr block allocation path
ext4: trace punch_hole correctly in ext4_ext_map_blocks
ext4: clean up AGGRESSIVE_TEST code
ext4: move variables to their scope
ext4: fix quota accounting during migration
ext4: migrate cleanup
...
|
|
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Replace direct i_nlink updates with the respective updater function
(inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
The tmp_inode should have same uid/gid as the original inode.
Otherwise new metadata blocks will be accounted to wrong quota-id,
which will result in a quota leak after the inode migration is
completed.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
If a directory with more than EXT4_LINK_MAX subdirectories, the nlink
count is set to 1. Subsequently, if any subdirectories are deleted,
ext4_dec_count() decrements the i_nlink count, which may go to 0
temporarily before being incremented back to 1.
While this is done under i_mutex, which prevents races for directory
and inode operations that check i_nlink, the temporary i_nlink == 0
case is exposed to userspace via stat() and similar calls that do not
hold i_mutex.
Instead, change the code to not decrement i_nlink count for any
directories that do not already have i_nlink larger than 2.
Reported-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
ext4_dx_add_entry manipulates bh2 and frames[0].bh, which are two buffer_heads
that point to directory blocks assigned to the directory inode. However, the
function calls ext4_handle_dirty_metadata with the inode of the file that's
being added to the directory, not the directory inode itself. Therefore,
correct the code to dirty the directory buffers with the directory inode, not
the file inode.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
|
|
ext4_mkdir calls ext4_handle_dirty_metadata with dir_block and the inode "dir".
Unfortunately, dir_block belongs to the newly created directory (which is
"inode"), not the parent directory (which is "dir"). Fix the incorrect
association.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
|
|
When ext4_rename performs a directory rename (move), dir_bh is a
buffer that is modified to update the '..' link in the directory being
moved (old_inode). However, ext4_handle_dirty_metadata is called with
the old parent directory inode (old_dir) and dir_bh, which is
incorrect because dir_bh does not belong to the parent inode. Fix
this error.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Add a new REQ_PRIO to let requests preempt others in the cfq I/O schedule,
and lave REQ_META purely for marking requests as metadata in blktrace.
All existing callers of REQ_META except for XFS are updated to also
set REQ_PRIO for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Replace all occurnanced of the undocumented READ_META with READ | REQ_META
and remove the unused WRITE_META define.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Commit df5e6223407e ("ext4: fix deadlock in ext4_symlink() in ENOSPC
conditions") recalculated the number of credits needed for a long
symlink, in the process of splitting it into two transactions. However,
the first credit calculation under-counted because if selinux is
enabled, credits are needed to create the selinux xattr as well.
Overrunning the reservation will result in an OOPS in
jbd2_journal_dirty_metadata() due to this assert:
J_ASSERT_JH(jh, handle->h_buffer_credits > 0);
Fix this by increasing the reservation size.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (60 commits)
ext4: prevent memory leaks from ext4_mb_init_backend() on error path
ext4: use EXT4_BAD_INO for buddy cache to avoid colliding with valid inode #
ext4: use ext4_msg() instead of printk in mballoc
ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info
ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree()
ext4: use the correct error exit path in ext4_init_inode_table()
ext4: add missing kfree() on error return path in add_new_gdb()
ext4: change umode_t in tracepoint headers to be an explicit __u16
ext4: fix races in ext4_sync_parent()
ext4: Fix overflow caused by missing cast in ext4_fallocate()
ext4: add action of moving index in ext4_ext_rm_idx for Punch Hole
ext4: simplify parameters of reserve_backup_gdb()
ext4: simplify parameters of add_new_gdb()
ext4: remove lock_buffer in bclean() and setup_new_group_blocks()
ext4: simplify journal handling in setup_new_group_blocks()
ext4: let setup_new_group_blocks() set multiple bits at a time
ext4: fix a typo in ext4_group_extend()
ext4: let ext4_group_add_blocks() handle 0 blocks quickly
ext4: let ext4_group_add_blocks() return an error code
ext4: rename ext4_add_groupblocks() to ext4_group_add_blocks()
...
Fix up conflict in fs/ext4/inode.c: commit aacfc19c626e ("fs: simplify
the blockdev_direct_IO prototype") had changed the ext4_ind_direct_IO()
function for the new simplified calling convention, while commit
dae1e52cb126 ("ext4: move ext4_ind_* functions from inode.c to
indirect.c") moved the function to another file.
|
|
Replace the ->check_acl method with a ->get_acl method that simply reads an
ACL from disk after having a cache miss. This means we can replace the ACL
checking boilerplate code with a single implementation in namei.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and simplify the living hell out of callers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Compilation of ext4/namei.c brought up an error and warning messages
when compiled with -DDX_DEBUG
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
The comment from Al Viro about possible race in the ext4_orphan_add() is
not justified. There is no race possible as we always have either i_mutex
locked, or the inode can not be referenced from outside hence the
J_ASSERS should not be hit from the reason described in comment.
This commit replaces it with notion that we are holding i_mutex so it
should not be possible for i_nlink to be changed while waiting for
s_orphan_lock.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Fix for a null pointer bug found while running punch hole tests
Signed-off-by: Allison Henderson <achender@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
ext4_symlink() cannot call __page_symlink() with transaction open.
__page_symlink() calls ext4_write_begin() which can wait for
transaction commit if we are running out of space thus causing a
deadlock. Also error recovery in ext4_truncate_failed_write() does not
count with the transaction being already started (although I'm not
aware of any particular deadlock here).
Fix the problem by stopping a transaction before calling
__page_symlink() (we have to be careful and put inode to orphan list
so that it gets deleted in case of crash) and starting another one
after __page_symlink() returns for addition of symlink into a
directory.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
When make_indexed_dir() fails (e.g. because of ENOSPC) after it has
allocated block for index tree root, we did not properly mark all
changed buffers dirty. This lead to only some of these buffers being
written out and thus effectively corrupting the directory.
Fix the issue by marking all changed data dirty even in the error
failure case.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits)
ext4: fix a BUG in mb_mark_used during trim.
ext4: unused variables cleanup in fs/ext4/extents.c
ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep()
ext4: add more tracepoints and use dev_t in the trace buffer
ext4: don't kfree uninitialized s_group_info members
ext4: add missing space in printk's in __ext4_grp_locked_error()
ext4: add FITRIM to compat_ioctl.
ext4: handle errors in ext4_clear_blocks()
ext4: unify the ext4_handle_release_buffer() api
ext4: handle errors in ext4_rename
jbd2: add COW fields to struct jbd2_journal_handle
jbd2: add the b_cow_tid field to journal_head struct
ext4: Initialize fsync transaction ids in ext4_new_inode()
ext4: Use single thread to perform DIO unwritten convertion
ext4: optimize ext4_bio_write_page() when no extent conversion is needed
ext4: skip orphan cleanup if fs has unknown ROCOMPAT features
ext4: use the nblocks arg to ext4_truncate_restart_trans()
ext4: fix missing iput of root inode for some mount error paths
ext4: make FIEMAP and delayed allocation play well together
ext4: suppress verbose debugging information if malloc-debug is off
...
Fi up conflicts in fs/ext4/super.c due to workqueue changes
|
|
- Add more ext4 tracepoints.
- Change ext4 tracepoints to use dev_t field with MAJOR/MINOR macros
so that we can save 4 bytes in the ring buffer on some platforms.
- Add sync_mode to ext4_da_writepages, ext4_da_write_pages, and
ext4_da_writepages_result tracepoints. Also remove for_reclaim
field from ext4_da_writepages since it is usually not very useful.
Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Checking return code from ext4_journal_get_write_access() is important
with snapshots, because this function invokes COW, so may return new
errors, such as ENOSPC.
We move the call to ext4_journal_get_write_access earlier in the
function, to simplify error handling in the case that this function
returns returns an error.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Now that VFS check for inode->i_nlink == 0 and returns proper
error, remove similar check from file system
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Call ext4_std_error() in various places when we can't bail out
cleanly, so the file system can be marked as in error.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Check return value of ext4_journal_get_write_access,
ext4_journal_dirty_metadata and ext4_mark_inode_dirty. Move brelse()
under 'out_stop' to release bh properly in case of journal error.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Where the file pointer is available, use ext4_error_file() instead of
ext4_error_inode().
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This function gets called a lot for large directories, and the answer
is almost always "no, no, there's no problem". This means using
unlikely() is a good thing.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
IS_ERR() already implies unlikely(), so it can be omitted here.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
There should be a check for the NUL character instead of '0'.
Fortunately the only thing that cares about this is NFS serving, which
is why we didn't notice this in the merge window testing.
Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Conflicts:
fs/ext4/inode.c
fs/ext4/mballoc.c
include/trace/events/ext4.h
|
|
Use the search_dirblock() in ext4_dx_find_entry(). It makes the code
easier to read, and it takes advantage of common code. It also saves
100 bytes or so of text space.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Brad Spengler <spender@grsecurity.net>
|
|
If the first block of htree directory is missing '.' or '..' but is
otherwise a valid directory, and we do a lookup for '.' or '..', it's
possible to dereference an uninitialized memory pointer in
ext4_htree_next_block().
We avoid this by moving the special case from ext4_dx_find_entry() to
ext4_find_entry(); this also means we can optimize ext4_find_entry()
slightly when NFS looks up "..".
Thanks to Brad Spengler for pointing a Clang warning that led me to
look more closely at this code. The warning was harmless, but it was
useful in pointing out code that was too ugly to live. This warning was
also reported by Roman Borisov.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Brad Spengler <spender@grsecurity.net>
|
|
Clones an existing reference to inode; caller must already hold one.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
commit 3d0518f4, "ext4: New rec_len encoding for very
large blocksizes" made several changes to this path, but from
a perf perspective, un-inlining ext4_rec_len_from_disk() seems
most significant. This function is called from ext4_check_dir_entry(),
which on a file-creation workload is called extremely often.
I tested this with bonnie:
# bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32
(this does 200 iterations) and got this for the file creations:
ext4 stock: Average = 21206.8 files/s
ext4 inlined: Average = 22346.7 files/s (+5%)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Also start passing the line number to ext4_check_dir since we're going
to need it in upcoming patch.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
No real bugs found, just removed some dead code.
Found by gcc 4.6's new warnings.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Add a new ext4 state to tell us when a file has been newly created; use
that state in ext4_sync_file in no-journal mode to tell us when we need
to sync the parent directory as well as the inode and data itself. This
fixes a problem in which a panic or power failure may lose the entire
file even when using fsync, since the parent directory entry is lost.
Addresses-Google-Bug: #2480057
Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
This patch was generated using:
#!/usr/bin/perl -i
while (<>) {
s/[ ]+$//;
print;
}
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|