summaryrefslogtreecommitdiff
path: root/fs/proc/proc_sysctl.c
AgeCommit message (Collapse)AuthorFilesLines
2012-01-25sysctl: A more obvious version of grab_header.Eric W. Biederman1-4/+4
Instead of relying on sysct_head_next(NULL) to magically return the right header for the root directory instead explicitly transform NULL into the root directories header. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Remove the now unused ctl_table parent field.Eric W. Biederman1-11/+1
While useful at one time for selinux and the sysctl sanity checks those users no longer use the parent field and we can safely remove it. Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmil.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Improve the sysctl sanity checksEric W. Biederman1-136/+86
- Stop validating subdirectories now that we only register leaf tables - Cleanup and improve the duplicate filename check. * Run the duplicate filename check under the sysctl_lock to guarantee we never add duplicate names. * Reduce the duplicate filename check to nearly O(M*N) where M is the number of entries in tthe table we are registering and N is the number of entries in the directory before we got there. - Move the duplicate filename check into it's own function and call it directtly from __register_sysctl_table - Kill the config option as the sanity checks are now cheap enough the config option is unnecessary. The original reason for the config option was because we had a huge table used to verify the proc filename to binary sysctl mapping. That table has now evolved into the binary_sysctl translation layer and is no longer part of the sysctl_check code. - Tighten up the permission checks. Guarnateeing that files only have read or write permissions. - Removed redudant check for parents having a procname as now everything has a procname. - Generalize the backtrace logic so that we print a backtrace from any failure of __register_sysctl_table that was not caused by a memmory allocation failure. The backtrace allows us to track down who erroneously registered a sysctl table. Bechmark before (CONFIG_SYSCTL_CHECK=y): make-dummies 0 999 -> 12s rmmod dummy -> 0.08s Bechmark before (CONFIG_SYSCTL_CHECK=n): make-dummies 0 999 -> 0.7s rmmod dummy -> 0.06s make-dummies 0 99999 -> 1m13s rmmod dummy -> 0.38s Benchmark after: make-dummies 0 999 -> 0.65s rmmod dummy -> 0.055s make-dummies 0 9999 -> 1m10s rmmod dummy -> 0.39s The sysctl sanity checks now impose no measurable cost. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: register only tables of sysctl filesEric W. Biederman1-18/+147
Split the registration of a complex ctl_table array which may have arbitrary numbers of directories (->child != NULL) and tables of files into a series of simpler registrations that only register tables of files. Graphically: register('dir', { + file-a + file-b + subdir1 + file-c + subdir2 + file-d + file-e }) is transformed into: wrapper->subheaders[0] = register('dir', {file1-a, file1-b}) wrapper->subheaders[1] = register('dir/subdir1', {file-c}) wrapper->subheaders[2] = register('dir/subdir2', {file-d, file-e}) return wrapper This guarantees that __register_sysctl_table will only see a simple ctl_table array with all entries having (->child == NULL). Care was taken to pass the original simple ctl_table arrays to __register_sysctl_table whenever possible. This change is derived from a similar patch written by Lucrian Grijincu. Inspired-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Add ctl_table chains into cstring pathsEric W. Biederman1-0/+9
For any component of table passed to __register_sysctl_paths that actually serves as a path, add that to the cstring path that is passed to __register_sysctl_table. The result is that for most calls to __register_sysctl_paths we only pass a table to __register_sysctl_table that contains no child directories. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Add support for register sysctl tables with a normal cstring path.Eric W. Biederman1-10/+84
Make __register_sysctl_table the core sysctl registration operation and make it take a char * string as path. Now that binary paths have been banished into the real of backwards compatibility in kernel/binary_sysctl.c where they can be safely ignored there is no longer a need to use struct ctl_path to represent path names when registering ctl_tables. Start the transition to using normal char * strings to represent pathnames when registering sysctl tables. Normal strings are easier to deal with both in the internal sysctl implementation and for programmers registering sysctl tables. __register_sysctl_paths is turned into a backwards compatibility wrapper that converts a ctl_path array into a normal char * string. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Create local copies of directory names used in pathsEric W. Biederman1-3/+8
Creating local copies of directory names is a good idea for two reasons. - The dynamic names used by callers must be copied into new strings by the callers today to ensure the strings do not change between register and unregister of the sysctl table. - Sysctl directories have a potentially different lifetime than the time between register and unregister of any particular sysctl table. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Remove the unnecessary sysctl_set parent concept.Eric W. Biederman1-3/+8
In sysctl_net register the two networking roots in the proper order. In register_sysctl walk the sysctl sets in the reverse order of the sysctl roots. Remove parent from ctl_table_set and setup_sysctl_set as it is no longer needed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Implement retire_sysctl_setEric W. Biederman1-0/+4
This adds a small helper retire_sysctl_set to remove the intimate knowledge about the how a sysctl_set is implemented from net/sysct_net.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Make the directories have nlink == 1Eric W. Biederman1-1/+0
I goofed when I made sysctl directories have nlink == 0. nlink == 0 means the directory has been deleted. nlink == 1 meands a directory does not count subdirectories. Use the default nlink == 1 for sysctl directories. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Move the implementation into fs/proc/proc_sysctl.cEric W. Biederman1-0/+622
Move the core sysctl code from kernel/sysctl.c and kernel/sysctl_check.c into fs/proc/proc_sysctl.c. Currently sysctl maintenance is hampered by the sysctl implementation being split across 3 files with artificial layering between them. Consolidate the entire sysctl implementation into 1 file so that it is easier to see what is going on and hopefully allowing for simpler maintenance. For functions that are now only used in fs/proc/proc_sysctl.c remove their declarations from sysctl.h and make them static in fs/proc/proc_sysctl.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: Register the base sysctl table like any other sysctl table.Eric W. Biederman1-1/+2
Simplify the code by treating the base sysctl table like any other sysctl table and register it with register_sysctl_table. To ensure this table is registered early enough to avoid problems call sysctl_init from proc_sys_init. Rename sysctl_net.c:sysctl_init() to net_sysctl_init() to avoid name conflicts now that kernel/sysctl.c:sysctl_init() is no longer static. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-01-25sysctl: remove impossible condition checkLucas De Marchi1-12/+2
Remove checks for conditions that will never happen. If procname is NULL the loop would already had bailed out, so there's no need to check it again. At the same time this also compacts the function find_in_table() by refactoring it to be easier to read. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2011-11-03Merge branch 'akpm' (Andrew's incoming - part two)Linus Torvalds1-0/+46
Says Andrew: "60 patches. That's good enough for -rc1 I guess. I have quite a lot of detritus to be rechecked, work through maintainers, etc. - most of the remains of MM - rtc - various misc - cgroups - memcg - cpusets - procfs - ipc - rapidio - sysctl - pps - w1 - drivers/misc - aio" * akpm: (60 commits) memcg: replace ss->id_lock with a rwlock aio: allocate kiocbs in batches drivers/misc/vmw_balloon.c: fix typo in code comment drivers/misc/vmw_balloon.c: determine page allocation flag can_sleep outside loop w1: disable irqs in critical section drivers/w1/w1_int.c: multiple masters used same init_name drivers/power/ds2780_battery.c: fix deadlock upon insertion and removal drivers/power/ds2780_battery.c: add a nolock function to w1 interface drivers/power/ds2780_battery.c: create central point for calling w1 interface w1: ds2760 and ds2780, use ida for id and ida_simple_get() to get it pps gpio client: add missing dependency pps: new client driver using GPIO pps: default echo function include/linux/dma-mapping.h: add dma_zalloc_coherent() sysctl: make CONFIG_SYSCTL_SYSCALL default to n sysctl: add support for poll() RapidIO: documentation update drivers/net/rionet.c: fix ethernet address macros for LE platforms RapidIO: fix potential null deref in rio_setup_device() RapidIO: add mport driver for Tsi721 bridge ...
2011-11-03sysctl: add support for poll()Lucas De Marchi1-0/+45
Adding support for poll() in sysctl fs allows userspace to receive notifications of changes in sysctl entries. This adds a infrastructure to allow files in sysctl fs to be pollable and implements it for hostname and domainname. [akpm@linux-foundation.org: s/declare/define/ for definitions] Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Greg KH <gregkh@suse.de> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-03procfs: report EISDIR when reading sysctl dirs in procPavel Emelyanov1-0/+1
On reading sysctl dirs we should return -EISDIR instead of -EINVAL. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-02filesystems: add missing nlink wrappersMiklos Szeredi1-1/+1
Replace direct i_nlink updates with the respective updater function (inc_nlink, drop_nlink, clear_nlink, inode_dec_link_count). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2011-07-20->permission() sanitizing: don't pass flags to ->permission()Al Viro1-1/+1
not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20->permission() sanitizing: MAY_NOT_BLOCKAl Viro1-1/+1
Duplicate the flags argument into mask bitmap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20proc_sys_permission() is OK in RCU modeAl Viro1-3/+0
nothing blocking there, since all instances of sysctl ->permissions() method are non-blocking - both of them, that is. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-16Merge branch 'next' into for-linusJames Morris1-1/+0
2011-03-08unfuck proc_sysctl ->d_compare()Al Viro1-2/+5
a) struct inode is not going to be freed under ->d_compare(); however, the thing PROC_I(inode)->sysctl points to just might. Fortunately, it's enough to make freeing that sucker delayed, provided that we don't step on its ->unregistering, clear the pointer to it in PROC_I(inode) before dropping the reference and check if it's NULL in ->d_compare(). b) I'm not sure that we *can* walk into NULL inode here (we recheck dentry->seq between verifying that it's still hashed / fetching dentry->d_inode and passing it to ->d_compare() and there's no negative hashed dentries in /proc/sys/*), but if we can walk into that, we really should not have ->d_compare() return 0 on it! Said that, I really suspect that this check can be simply killed. Nick? Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-02-01security/selinux: fix /proc/sys/ labelingLucian Adrian Grijincu1-1/+0
This fixes an old (2007) selinux regression: filesystem labeling for /proc/sys returned -r--r--r-- unknown /proc/sys/fs/file-nr instead of -r--r--r-- system_u:object_r:sysctl_fs_t:s0 /proc/sys/fs/file-nr Events that lead to breaking of /proc/sys/ selinux labeling: 1) sysctl was reimplemented to route all calls through /proc/sys/ commit 77b14db502cb85a031fe8fde6c85d52f3e0acb63 [PATCH] sysctl: reimplement the sysctl proc support 2) proc_dir_entry was removed from ctl_table: commit 3fbfa98112fc3962c416452a0baf2214381030e6 [PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables 3) selinux still walked the proc_dir_entry tree to apply labeling. Because ctl_tables don't have a proc_dir_entry, we did not label /proc/sys/ inodes any more. To achieve this the /proc/sys/ inodes were marked private and private inodes were ignored by selinux. commit bbaca6c2e7ef0f663bc31be4dad7cf530f6c4962 [PATCH] selinux: enhance selinux to always ignore private inodes commit 86a71dbd3e81e8870d0f0e56b87875f57e58222b [PATCH] sysctl: hide the sysctl proc inodes from selinux Access control checks have been done by means of a special sysctl hook that was called for read/write accesses to any /proc/sys/ entry. We don't have to do this because, instead of walking the proc_dir_entry tree we can walk the dentry tree (as done in this patch). With this patch: * we don't mark /proc/sys/ inodes as private * we don't need the sysclt security hook * we walk the dentry tree to find the path to the inode. We have to strip the PID in /proc/PID/ entries that have a proc_dir_entry because selinux does not know how to label paths like '/1/net/rpc/nfsd.fh' (and defaults to 'proc_t' labeling). Selinux does know of '/net/rpc/nfsd.fh' (and applies the 'sysctl_rpc_t' label). PID stripping from the path was done implicitly in the previous code because the proc_dir_entry tree had the root in '/net' in the example from above. The dentry tree has the root in '/1'. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-01-07fs: provide rcu-walk aware permission i_opsNick Piggin1-1/+4
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: rcu-walk aware d_revalidate methodNick Piggin1-0/+3
Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: dcache reduce branches in lookup pathNick Piggin1-2/+2
Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: rcu-walk for path lookupNick Piggin1-0/+4
Perform common cases of path lookups without any stores or locking in the ancestor dentry elements. This is called rcu-walk, as opposed to the current algorithm which is a refcount based walk, or ref-walk. This results in far fewer atomic operations on every path element, significantly improving path lookup performance. It also avoids cacheline bouncing on common dentries, significantly improving scalability. The overall design is like this: * LOOKUP_RCU is set in nd->flags, which distinguishes rcu-walk from ref-walk. * Take the RCU lock for the entire path walk, starting with the acquiring of the starting path (eg. root/cwd/fd-path). So now dentry refcounts are not required for dentry persistence. * synchronize_rcu is called when unregistering a filesystem, so we can access d_ops and i_ops during rcu-walk. * Similarly take the vfsmount lock for the entire path walk. So now mnt refcounts are not required for persistence. Also we are free to perform mount lookups, and to assume dentry mount points and mount roots are stable up and down the path. * Have a per-dentry seqlock to protect the dentry name, parent, and inode, so we can load this tuple atomically, and also check whether any of its members have changed. * Dentry lookups (based on parent, candidate string tuple) recheck the parent sequence after the child is found in case anything changed in the parent during the path walk. * inode is also RCU protected so we can load d_inode and use the inode for limited things. * i_mode, i_uid, i_gid can be tested for exec permissions during path walk. * i_op can be loaded. When we reach the destination dentry, we lock it, recheck lookup sequence, and increment its refcount and mountpoint refcount. RCU and vfsmount locks are dropped. This is termed "dropping rcu-walk". If the dentry refcount does not match, we can not drop rcu-walk gracefully at the current point in the lokup, so instead return -ECHILD (for want of a better errno). This signals the path walking code to re-do the entire lookup with a ref-walk. Aside from the final dentry, there are other situations that may be encounted where we cannot continue rcu-walk. In that case, we drop rcu-walk (ie. take a reference on the last good dentry) and continue with a ref-walk. Again, if we can drop rcu-walk gracefully, we return -ECHILD and do the whole lookup using ref-walk. But it is very important that we can continue with ref-walk for most cases, particularly to avoid the overhead of double lookups, and to gain the scalability advantages on common path elements (like cwd and root). The cases where rcu-walk cannot continue are: * NULL dentry (ie. any uncached path element) * parent with d_inode->i_op->permission or ACLs * dentries with d_revalidate * Following links In future patches, permission checks and d_revalidate become rcu-walk aware. It may be possible eventually to make following links rcu-walk aware. Uncached path elements will always require dropping to ref-walk mode, at the very least because i_mutex needs to be grabbed, and objects allocated. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_compare for rcu-walkNick Piggin1-6/+7
Change d_compare so it may be called from lock-free RCU lookups. This does put significant restrictions on what may be done from the callback, however there don't seem to have been any problems with in-tree fses. If some strange use case pops up that _really_ cannot cope with the rcu-walk rules, we can just add new rcu-unaware callbacks, which would cause name lookup to drop out of rcu-walk mode. For in-tree filesystems, this is just a mechanical change. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07fs: change d_delete semanticsNick Piggin1-1/+1
Change d_delete from a dentry deletion notification to a dentry caching advise, more like ->drop_inode. Require it to be constant and idempotent, and not take d_lock. This is how all existing filesystems use the callback anyway. This makes fine grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2010-10-26fs: do not assign default i_ino in new_inodeChristoph Hellwig1-0/+2
Instead of always assigning an increasing inode number in new_inode move the call to assign it into those callers that actually need it. For now callers that need it is estimated conservatively, that is the call is added to all filesystems that do not assign an i_ino by themselves. For a few more filesystems we can avoid assigning any inode number given that they aren't user visible, and for others it could be done lazily when an inode number is actually needed, but that's left for later patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-10-15llseek: automatically add .llseek fopArnd Bergmann1-0/+1
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-08-10remove inode_setattrChristoph Hellwig1-3/+12
Replace inode_setattr with opencoded variants of it in all callers. This moves the remaining call to vmtruncate into the filesystem methods where it can be replaced with the proper truncate sequence. In a few cases it was obvious that we would never end up calling vmtruncate so it was left out in the opencoded variant: spufs: explicitly checks for ATTR_SIZE earlier btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier ufs: contains an opencoded simple_seattr + truncate that sets the filesize just above In addition to that ncpfs called inode_setattr with handcrafted iattrs, which allowed to trim down the opencoded variant. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-11-11sysctl: Don't look at ctl_name and strategy in the generic codeEric W. Biederman1-2/+2
The ctl_name and strategy fields are unused, now that sys_sysctl is a compatibility wrapper around /proc/sys. No longer looking at them in the generic code is effectively what we are doing now and provides the guarantee that during further cleanups we can just remove references to those fields and everything will work ok. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-09-24sysctl: remove "struct file *" argument of ->proc_handlerAlexey Dobriyan1-1/+1
It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-27constify dentry_operations: procfsAl Viro1-2/+2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-01-05zero i_uid/i_gid on inode allocationAl Viro1-1/+0
... and don't bother in callers. Don't bother with zeroing i_blocks, while we are at it - it's already been zeroed. i_mode is not worth the effort; it has no common default value. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-11-17Fix broken ownership of /proc/sys/ filesAl Viro1-0/+1
D'oh... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reported-and-tested-by: Peter Palfrader <peter@palfrader.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23Merge branch 'proc' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc * 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits) proc: remove fs/proc/proc_misc.c proc: move /proc/vmcore creation to fs/proc/vmcore.c proc: move pagecount stuff to fs/proc/page.c proc: move all /proc/kcore stuff to fs/proc/kcore.c proc: move /proc/schedstat boilerplate to kernel/sched_stats.h proc: move /proc/modules boilerplate to kernel/module.c proc: move /proc/diskstats boilerplate to block/genhd.c proc: move /proc/zoneinfo boilerplate to mm/vmstat.c proc: move /proc/vmstat boilerplate to mm/vmstat.c proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c proc: move /proc/buddyinfo boilerplate to mm/vmstat.c proc: move /proc/vmallocinfo to mm/vmalloc.c proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c proc: move /proc/slab_allocators boilerplate to mm/slab.c proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c proc: move /proc/stat to fs/proc/stat.c proc: move rest of /proc/partitions code to block/genhd.c proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c proc: move /proc/devices code to fs/proc/devices.c proc: move rest of /proc/locks to fs/locks.c ...
2008-10-23proc: spread __initAlexey Dobriyan1-2/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23[PATCH] move executable checking into ->permission()Miklos Szeredi1-2/+8
For execute permission on a regular files we need to check if file has any execute bits at all, regardless of capabilites. This check is normally performed by generic_permission() but was also added to the case when the filesystem defines its own ->permission() method. In the latter case the filesystem should be responsible for performing this check. Move the check from inode_permission() inside filesystems which are not calling generic_permission(). Create a helper function execute_ok() that returns true if the inode is a directory or if any execute bits are present in i_mode. Also fix up the following code: - coda control file is never executable - sysctl files are never executable - hfs_permission seems broken on MAY_EXEC, remove - hfsplus_permission is eqivalent to generic_permission(), remove Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-10-23[PATCH] fix ->llseek for more directoriesChristoph Hellwig1-0/+1
With this patch all directory fops instances that have a readdir that doesn't take the BKL are switched to generic_file_llseek. Signed-off-by: Christoph Hellwig <hch@lst.de>
2008-10-10proc: make grab_header() staticAdrian Bunk1-1/+1
Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-10proc: proc_sys_root tweakAlexey Dobriyan1-2/+2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-07-27[PATCH] sanitize ->permission() prototypeAl Viro1-1/+1
* kill nameidata * argument; map the 3 bits in ->flags anybody cares about to new MAY_... ones and pass with the mask. * kill redundant gfs2_iop_permission() * sanitize ecryptfs_permission() * fix remaining places where ->permission() instances might barf on new MAY_... found in mask. The obvious next target in that direction is permission(9) folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-27[PATCH] sanitize proc_sysctlAl Viro1-235/+192
* keep references to ctl_table_head and ctl_table in /proc/sys inodes * grab the former during operations, use the latter for access to entry if that succeeds * have ->d_compare() check if table should be seen for one who does lookup; that allows us to avoid flipping inodes - if we have the same name resolve to different things, we'll just keep several dentries and ->d_compare() will reject the wrong ones. * have ->lookup() and ->readdir() scan the table of our inode first, then walk all ctl_table_header and scan ->attached_by for those that are attached to our directory. * implement ->getattr(). * get rid of insane amounts of tree-walking * get rid of the need to know dentry in ->permission() and of the contortions induced by that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-29sysctl: add the ->permissions callback on the ctl_table_rootPavel Emelyanov1-2/+2
When reading from/writing to some table, a root, which this table came from, may affect this table's permissions, depending on who is working with the table. The core hunk is at the bottom of this patch. All the rest is just pushing the ctl_table_root argument up to the sysctl_perm() function. This will be mostly (only?) used in the net sysctls. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29sysctl: merge equal proc_sys_read and proc_sys_writePavel Emelyanov1-39/+11
Many (most of) sysctls do not have a per-container sense. E.g. kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on and so forth. Besides, tuning then from inside a container is not even secure. On the other hand, hiding them completely from the container's tasks sometimes causes user-space to stop working. When developing net sysctl, the common practice was to duplicate a table and drop the write bits in table->mode, but this approach was not very elegant, lead to excessive memory consumption and was not suitable in general. Here's the alternative solution. To facilitate the per-container sysctls ctl_table_root-s were introduced. Each root contains a list of ctl_table_header-s that are visible to different namespaces. The idea of this set is to add the permissions() callback on the ctl_table_root to allow ctl root limit permissions to the same ctl_table-s. The main user of this functionality is the net-namespaces code, but later this will (should) be used by more and more namespaces, containers and control groups. Actually, this idea's core is in a single hunk in the third patch. First two patches are cleanups for sysctl code, while the third one mostly extends the arguments set of some sysctl functions. This patch: These ->read and ->write callbacks act in a very similar way, so merge these paths to reduce the number of places to patch later and shrink the .text size (a bit). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-15Embed a struct path into struct nameidata instead of nd->{dentry,mnt}Jan Blunck1-1/+1
This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 6894212 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08procfs: constify function pointer tablesJan Engelhardt1-2/+2
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-By: David Howells <dhowells@redhat.com> Acked-by: Bryan Wu <bryan.wu@analog.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-26Fix pointer mismatches in proc_sysctl.cDavid Howells1-2/+4
Fix pointer mismatches in proc_sysctl.c. The proc_handler() method returns a size_t through an arg pointer, but is given a pointer to a ssize_t to return into. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>