summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2012-04-02blkcg: restructure statistics printingTejun Heo2-374/+243
blkcg stats handling is a mess. None of the stats has much to do with blkcg core but they are all implemented in blkcg core. Code sharing is achieved by mixing common code with hard-coded cases for each stat counter. This patch restructures statistics printing such that * Common logic exists as helper functions and specific print functions use the helpers to implement specific cases. * Printing functions serving multiple counters don't require hardcoded switching on specific counters. * Printing uses read_seq_string callback (other methods will be phased out). This change enables further cleanups and relocating stats code to the policy implementation it belongs to. Signed-off-by: Tejun Heo <tj@kernel.org>
2012-04-02blkcg: introduce blkg_stat and blkg_rwstatTejun Heo2-207/+293
blkcg uses u64_stats_sync to avoid reading wrong u64 statistic values on 32bit archs and some stat counters have subtypes to distinguish read/writes and sync/async IOs. The stat code paths are confusing and involve a lot of going back and forth between blkcg core and specific policy implementations, and synchronization and subtype handling are open coded in blkcg core. This patch introduces struct blkg_stat and blkg_rwstat which, with accompanying operations, encapsulate stat updating and accessing with proper synchronization. blkg_stat is simple u64 counter with 64bit read-access protection. blkg_rwstat is the one with rw and [a]sync subcounters and takes @rw flags to distinguish IO subtypes (%REQ_WRITE and %REQ_SYNC) and replaces stat_sub_type indexed arrays. All counters in blkio_group_stats and blkio_group_stats_cpu are replaced with either blkg_stat or blkg_rwstat along with all users. This does add one u64_stats_sync per counter and increase stats_sync operations but they're empty/noops on 64bit archs and blkcg doesn't have too many counters, especially with DEBUG_BLK_CGROUP off. While the currently resulting code isn't necessarily simpler at the moment, this will enable further clean up of blkcg stats code. - BLKIO_STAT_{READ|WRITE|SYNC|ASYNC|TOTAL} renamed to BLKG_RWSTAT_{READ|WRITE|SYNC|ASYNC|TOTAL}. - blkg_stat_add() replaces blkio_add_stat() and blkio_check_and_dec_stat(). Note that BUG_ON() on underflow in the latter function no longer exists. It's *way* better to have underflowed stat counters than oopsing. - blkio_group_stats->dequeue is now a proper u64 stat counter instead of ulong. - reset_stats() updated to clear each stat counters individually and BLKG_STATS_DEBUG_CLEAR_{START|SIZE} are removed. - Some functions reconstruct rw flags from direction and sync booleans. This will be removed by future patches. Signed-off-by: Tejun Heo <tj@kernel.org>
2012-04-02blkcg: BLKIO_STAT_CPU_SECTORS doesn't have subcountersTejun Heo1-3/+6
BLKIO_STAT_CPU_SECTORS doesn't need read/write/sync/async subcounters and is counted by blkio_group_stats_cpu->sectors; however, it still holds a member in blkio_group_stats_cpu->stat_arr_cpu. Rearrange stat_type_cpu and define BLKIO_STAT_CPU_ARR_NR and use it for stat_arr_cpu[] size so that only SERVICE_BYTES and SERVICED have subcounters. Signed-off-by: Tejun Heo <tj@kernel.org>
2012-04-02blkcg: remove unused @pol and @plid parametersTejun Heo4-16/+9
@pol to blkg_to_pdata() and @plid to blkg_lookup_create() are no longer necessary. Drop them. Signed-off-by: Tejun Heo <tj@kernel.org>
2012-04-01Merge branch 'for-3.5' of ../cgroup into block/for-3.5/core-mergedTejun Heo10589-294804/+504795
cgroup/for-3.5 contains the following changes which blk-cgroup needs to proceed with the on-going cleanup. * Dynamic addition and removal of cftypes to make config/stat file handling modular for policies. * cgroup removal update to not wait for css references to drain to fix blkcg removal hang caused by cfq caching cfqgs. Pull in cgroup/for-3.5 into block/for-3.5/core. This causes the following conflicts in block/blk-cgroup.c. * 761b3ef50e "cgroup: remove cgroup_subsys argument from callbacks" conflicts with blkiocg_pre_destroy() addition and blkiocg_attach() removal. Resolved by removing @subsys from all subsys methods. * 676f7c8f84 "cgroup: relocate cftype and cgroup_subsys definitions in controllers" conflicts with ->pre_destroy() and ->attach() updates and removal of modular config. Resolved by dropping forward declarations of the methods and applying updates to the relocated blkio_subsys. * 4baf6e3325 "cgroup: convert all non-memcg controllers to the new cftype interface" builds upon the previous item. Resolved by adding ->base_cftypes to the relocated blkio_subsys. Signed-off-by: Tejun Heo <tj@kernel.org>
2012-04-01cgroup: make css->refcnt clearing on cgroup removal optionalTejun Heo3-9/+80
Currently, cgroup removal tries to drain all css references. If there are active css references, the removal logic waits and retries ->pre_detroy() until either all refs drop to zero or removal is cancelled. This semantics is unusual and adds non-trivial complexity to cgroup core and IMHO is fundamentally misguided in that it couples internal implementation details (references to internal data structure) with externally visible operation (rmdir). To userland, this is a behavior peculiarity which is unnecessary and difficult to expect (css refs is otherwise invisible from userland), and, to policy implementations, this is an unnecessary restriction (e.g. blkcg wants to hold css refs for caching purposes but can't as that becomes visible as rmdir hang). Unfortunately, memcg currently depends on ->pre_destroy() retrials and cgroup removal vetoing and can't be immmediately switched to the new behavior. This patch introduces the new behavior of not waiting for css refs to drain and maintains the old behavior for subsystems which have __DEPRECATED_clear_css_refs set. Once, memcg is updated, we can drop the code paths for the old behavior as proposed in the following patch. Note that the following patch is incorrect in that dput work item is in cgroup and may lose some of dputs when multiples css's are released back-to-back, and __css_put() triggers check_for_release() when refcnt reaches 0 instead of 1; however, it shows what part can be removed. http://thread.gmane.org/gmane.linux.kernel.containers/22559/focus=75251 Note that, in not-too-distant future, cgroup core will start emitting warning messages for subsys which require the old behavior, so please get moving. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
2012-04-01cgroup: use negative bias on css->refcnt to block css_tryget()Tejun Heo2-56/+75
When a cgroup is about to be removed, cgroup_clear_css_refs() is called to check and ensure that there are no active css references. This is currently achieved by dropping the refcnt to zero iff it has only the base ref. If all css refs could be dropped to zero, ref clearing is successful and CSS_REMOVED is set on all css. If not, the base ref is restored. While css ref is zero w/o CSS_REMOVED set, any css_tryget() attempt on it busy loops so that they are atomic w.r.t. the whole css ref clearing. This does work but dropping and re-instating the base ref is somewhat hairy and makes it difficult to add more logic to the put path as there are two of them - the regular css_put() and the reversible base ref clearing. This patch updates css ref clearing such that blocking new css_tryget() and putting the base ref are separate operations. CSS_DEACT_BIAS, defined as INT_MIN, is added to css->refcnt and css_tryget() busy loops while refcnt is negative. After all css refs are deactivated, if they were all one, ref clearing succeeded and CSS_REMOVED is set and the base ref is put using the regular css_put(); otherwise, CSS_DEACT_BIAS is subtracted from the refcnts and the original postive values are restored. css_refcnt() accessor which always returns the unbiased positive reference counts is added and used to simplify refcnt usages. While at it, relocate and reformat comments in cgroup_has_css_refs(). This separates css->refcnt deactivation and putting the base ref, which enables the next patch to make ref clearing optional. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: implement cgroup_rm_cftypes()Tejun Heo2-10/+45
Implement cgroup_rm_cftypes() which removes an array of cftypes from a subsystem. It can be called whether the target subsys is attached or not. cgroup core will remove the specified file from all existing cgroups. This will be used to improve sub-subsys modularity and will be helpful for unified hierarchy. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: introduce struct cfentTejun Heo2-36/+78
This patch adds cfent (cgroup file entry) which is the association between a cgroup and a file. This is in-cgroup representation of files under a cgroup directory. This simplifies walking walking cgroup files and thus cgroup_clear_directory(), which is now implemented in two parts - cgroup_rm_file() and a loop around it. cgroup_rm_file() will be used to implement cftype removal and cfent is scheduled to serve cgroup specific per-file data (e.g. for sysfs-like "sever" semantics). v2: - cfe was freed from cgroup_rm_file() which led to use-after-free if the file had openers at the time of removal. Moved to cgroup_diput(). - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs wasn't empty after removing all files. This triggered spuriously if some files were open during directory clearing. Removed. v3: - In cgroup_diput(), WARN_ONCE(!list_empty(&cfe->node)) could be spuriously triggered for root cgroups because they don't go through cgroup_clear_directory() on unmount. Don't trigger WARN for root cgroups. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com>
2012-04-01cgroup: relocate __d_cgrp() and __d_cft()Tejun Heo1-10/+10
Move the two macros upwards as they'll be used earlier in the file. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: remove cgroup_add_file[s]()Tejun Heo2-47/+20
No controller is using cgroup_add_files[s](). Unexport them, and convert cgroup_add_files() to handle NULL entry terminated array instead of taking count explicitly and continue creation on failure for internal use. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: convert memcg controller to the new cftype interfaceTejun Heo2-15/+13
Convert memcg to use the new cftype based interface. kmem support abuses ->populate() for mem_cgroup_sockets_init() so it can't be removed at the moment. tcp_memcontrol is updated so that tcp_files[] is registered via a __initcall. This change also allows removing the forward declaration of tcp_files[]. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Hugh Dickins <hughd@google.com> Cc: Greg Thelen <gthelen@google.com>
2012-04-01memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAPTejun Heo1-34/+31
Instead of conditioning creation of memsw files on do_swap_account, always create the files if compiled-in and fail read/write attempts with -EOPNOTSUPP if !do_swap_account. This is suggested by KAMEZAWA to simplify memcg file creation so that it can use cgroup->subsys_cftypes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: convert all non-memcg controllers to the new cftype interfaceTejun Heo8-75/+28
Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio, net_cls and device controllers to use the new cftype based interface. Termination entry is added to cftype arrays and populate callbacks are replaced with cgroup_subsys->base_cftypes initializations. This is functionally identical transformation. There shouldn't be any visible behavior change. memcg is rather special and will be converted separately. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <paul@paulmenage.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Vivek Goyal <vgoyal@redhat.com>
2012-04-01cgroup: relocate cftype and cgroup_subsys definitions in controllersTejun Heo4-83/+65
blk-cgroup, netprio_cgroup, cls_cgroup and tcp_memcontrol unnecessarily define cftype array and cgroup_subsys structures at the top of the file, which is unconventional and necessiates forward declaration of methods. This patch relocates those below the definitions of the methods and removes the forward declarations. Note that forward declaration of tcp_files[] is added in tcp_memcontrol.c for tcp_init_cgroup(). This will be removed soon by another patch. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: merge cft_release_agent cftype array into the base files arrayTejun Heo1-12/+7
Now that cftype can express whether a file should only be on root, cft_release_agent can be merged into the base files cftypes array. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: implement cgroup_add_cftypes() and friendsTejun Heo2-3/+162
Currently, cgroup directories are populated by subsys->populate() callback explicitly creating files on each cgroup creation. This level of flexibility isn't needed or desirable. It provides largely unused flexibility which call for abuses while severely limiting what the core layer can do through the lack of structure and conventions. Per each cgroup file type, the only distinction that cgroup users is making is whether a cgroup is root or not, which can easily be expressed with flags. This patch introduces cgroup_add_cftypes(). These deal with cftypes instead of individual files - controllers indicate that certain types of files exist for certain subsystem. Newly added CFTYPE_*_ON_ROOT flags indicate whether a cftype should be excluded or created only on the root cgroup. cgroup_add_cftypes() can be called any time whether the target subsystem is currently attached or not. cgroup core will create files on the existing cgroups as necessary. Also, cgroup_subsys->base_cftypes is added to ease registration of the base files for the subsystem. If non-NULL on subsys init, the cftypes pointed to by ->base_cftypes are automatically registered on subsys init / load. Further patches will convert the existing users and remove the file based interface. Note that this interface allows dynamic addition of files to an active controller. This will be used for sub-controller modularity and unified hierarchy in the longer term. This patch implements the new mechanism but doesn't apply it to any user. v2: replaced DECLARE_CGROUP_CFTYPES[_COND]() with cgroup_subsys->base_cftypes, which works better for cgroup_subsys which is loaded as module. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: build list of all cgroups under a given cgroupfs_rootTejun Heo2-0/+12
Build a list of all cgroups anchored at cgroupfs_root->allcg_list and going through cgroup->allcg_node. The list is protected by cgroup_mutex and will be used to improve cgroup file handling. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()Tejun Heo1-4/+2
cgroup_populate_dir() currently clears all files and then repopulate the directory; however, the clearing part is only useful when it's called from cgroup_remount(). Relocate the invocation to cgroup_remount(). This is to prepare for further cgroup file handling updates. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
2012-04-01cgroup: deprecate remount option changesTejun Heo2-0/+14
This patch marks the following features for deprecation. * Rebinding subsys by remount: Never reached useful state - only works on empty hierarchies. * release_agent update by remount: release_agent itself will be replaced with conventional fsnotify notification. v2: Lennart pointed out that "name=" is necessary for mounts w/o any controller attached. Drop "name=" deprecation. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Lennart Poettering <mzxreary@0pointer.de>
2012-04-01Linux 3.4-rc1v3.4-rc1Linus Torvalds1-2/+2
2012-04-01Merge branch 's3-for-3.4' of ↵Linus Torvalds3-81/+8
git://git.kernel.org/pub/scm/linux/kernel/git/amit/virtio-console Pull virtio S3 support patches from Amit Shah: "Turns out S3 is not different from S4 for virtio devices: the device is assumed to be reset, so the host and guest state are to be assumed to be out of sync upon resume. We handle the S4 case with exactly the same scenario, so just point the suspend/resume routines to the freeze/restore ones. Once that is done, we also use the PM API's macro to initialise the sleep functions. A couple of cleanups are included: there's no need for special thaw processing in the balloon driver, so that's addressed in patches 1 and 2. Testing: both S3 and S4 support have been tested using these patches using a similar method used earlier during S4 patch development: a guest is started with virtio-blk as the only disk, a virtio network card, a virtio-serial port and a virtio balloon device. Ping from guest to host, dd /dev/zero to a file on the disk, and IO from the host on the virtio-serial port, all at once, while exercising S4 and S3 (separately) were tested. They all continue to work fine after resume. virtio balloon values too were tested by inflating and deflating the balloon." Pulling from Amit, since Rusty is off getting married (and presumably shaving people). * 's3-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/amit/virtio-console: virtio-pci: switch to PM ops macro to initialise PM functions virtio-pci: S3 support virtio-pci: drop restore_common() virtio: drop thaw PM operation virtio: balloon: Allow stats update after restore from S4
2012-04-01Merge branch 'for-linus' of ↵Linus Torvalds41-1640/+1249
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second try at vfs part d#2 from Al Viro: "Miklos' first series (with do_lookup() rewrite split into edible chunks) + assorted bits and pieces. The 'untangling of do_lookup()' series is is a splitup of what used to be a monolithic patch from Miklos, so this series is basically "how do I convince myself that his patch is correct (or find a hole in it)". No holes found and I like the resulting cleanup, so in it went..." Changes from try 1: Fix a boot problem with selinux, and commit messages prettied up a bit. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits) vfs: fix out-of-date dentry_unhash() comment vfs: split __lookup_hash untangling do_lookup() - take __lookup_hash()-calling case out of line. untangling do_lookup() - switch to calling __lookup_hash() untangling do_lookup() - merge d_alloc_and_lookup() callers untangling do_lookup() - merge failure exits in !dentry case untangling do_lookup() - massage !dentry case towards __lookup_hash() untangling do_lookup() - get rid of need_reval in !dentry case untangling do_lookup() - eliminate a loop. untangling do_lookup() - expand the area under ->i_mutex untangling do_lookup() - isolate !dentry stuff from the rest of it. vfs: move MAY_EXEC check from __lookup_hash() vfs: don't revalidate just looked up dentry vfs: fix d_need_lookup/d_revalidate order in do_lookup ext3: move headers to fs/ext3/ migrate ext2_fs.h guts to fs/ext2/ext2.h new helper: ext2_image_size() get rid of pointless includes of ext2_fs.h ext2: No longer export ext2_fs.h to user space mtdchar: kill persistently held vfsmount ...
2012-04-01Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2-9/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar. * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix incorrect usage of for_each_cpu_mask() in select_fallback_rq() sched: Fix __schedule_bug() output when called from an interrupt sched/arch: Introduce the finish_arch_post_lock_switch() scheduler callback
2012-04-01Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds52-584/+2531
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates and fixes from Ingo Molnar: "It's mostly fixes, but there's also two late items: - preliminary GTK GUI support for perf report - PMU raw event format descriptors in sysfs, to be parsed by tooling The raw event format in sysfs is a new ABI. For example for the 'CPU' PMU we have: aldebaran:~> ll /sys/bus/event_source/devices/cpu/format/* -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/any -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/cmask -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/edge -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/event -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/inv -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/offcore_rsp -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/pc -r--r--r--. 1 root root 4096 Mar 31 10:29 /sys/bus/event_source/devices/cpu/format/umask those lists of fields contain a specific format: aldebaran:~> cat /sys/bus/event_source/devices/cpu/format/offcore_rsp config1:0-63 So, those who wish to specify raw events can now use the following event format: -e cpu/cmask=1,event=2,umask=3 Most people will not want to specify any events (let alone raw events), they'll just use whatever default event the tools use. But for more obscure PMU events that have no cross-architecture generic events the above syntax is more usable and a bit more structured than specifying hex numbers." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) perf tools: Remove auto-generated bison/flex files perf annotate: Fix off by one symbol hist size allocation and hit accounting perf tools: Add missing ref-cycles event back to event parser perf annotate: addr2line wants addresses in same format as objdump perf probe: Finder fails to resolve function name to address tracing: Fix ent_size in trace output perf symbols: Handle NULL dso in dso__name_len perf symbols: Do not include libgen.h perf tools: Fix bug in raw sample parsing perf tools: Fix display of first level of callchains perf tools: Switch module.h into export.h perf: Move mmap page data_head offset assertion out of header perf: Fix mmap_page capabilities and docs perf diff: Fix to work with new hists design perf tools: Fix modifier to be applied on correct events perf tools: Fix various casting issues for 32 bits perf tools: Simplify event_read_id exit path tracing: Fix ftrace stack trace entries tracing: Move the tracing_on/off() declarations into CONFIG_TRACING perf report: Add a simple GTK2-based 'perf report' browser ...
2012-04-01Merge tag 'parisc-misc' of ↵Linus Torvalds2-6/+28
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 Pull PARISC misc updates from James Bottomley: "This is a couple of minor updates (fixing lws futex locking and removing some obsolete cpu_*_map calls)." * tag 'parisc-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] remove references to cpu_*_map. [PARISC] futex: Use same lock set as lws calls
2012-04-01Merge tag 'scsi-misc' of ↵Linus Torvalds63-941/+4407
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 Pull SCSI updates from James Bottomley: "This is primarily another round of driver updates (lpfc, bfa, fcoe, ipr) plus a new ufshcd driver. There shouldn't be anything controversial in here (The final deletion of scsi proc_ops which caused some build breakage has been held over until the next merge window to give us more time to stabilise it). I'm afraid, with me moving continents at exactly the wrong time, anything submitted after the merge window opened has been held over to the next merge window." * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (63 commits) [SCSI] ipr: Driver version 2.5.3 [SCSI] ipr: Increase alignment boundary of command blocks [SCSI] ipr: Increase max concurrent oustanding commands [SCSI] ipr: Remove unnecessary memory barriers [SCSI] ipr: Remove unnecessary interrupt clearing on new adapters [SCSI] ipr: Fix target id allocation re-use problem [SCSI] atp870u, mpt2sas, qla4xxx use pci_dev->revision [SCSI] fcoe: Drop the rtnl_mutex before calling fcoe_ctlr_link_up [SCSI] bfa: Update the driver version to 3.0.23.0 [SCSI] bfa: BSG and User interface fixes. [SCSI] bfa: Fix to avoid vport delete hang on request queue full scenario. [SCSI] bfa: Move service parameter programming logic into firmware. [SCSI] bfa: Revised Fabric Assigned Address(FAA) feature implementation. [SCSI] bfa: Flash controller IOC pll init fixes. [SCSI] bfa: Serialize the IOC hw semaphore unlock logic. [SCSI] bfa: Modify ISR to process pending completions [SCSI] bfa: Add fc host issue lip support [SCSI] mpt2sas: remove extraneous sas_log_info messages [SCSI] libfc: fcoe_transport_create fails in single-CPU environment [SCSI] fcoe: reduce contention for fcoe_rx_list lock [v2] ...
2012-04-01vfs: fix out-of-date dentry_unhash() commentJ. Bruce Fields1-1/+1
64252c75a2196a0cf1e0d3777143ecfe0e3ae650 "vfs: remove dget() from dentry_unhash()" changed the implementation but not the comment. Cc: Sage Weil <sage@newdream.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01vfs: split __lookup_hashMiklos Szeredi1-64/+44
Split __lookup_hash into two component functions: lookup_dcache - tries cached lookup, returns whether real lookup is needed lookup_real - calls i_op->lookup This eliminates code duplication between d_alloc_and_lookup() and d_inode_lookup(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - take __lookup_hash()-calling case out of line.Al Viro1-15/+16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - switch to calling __lookup_hash()Al Viro1-67/+46
now we have __lookup_hash() open-coded if !dentry case; just call the damn thing instead... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - merge d_alloc_and_lookup() callersAl Viro1-3/+3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - merge failure exits in !dentry caseAl Viro1-15/+8
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - massage !dentry case towards __lookup_hash()Al Viro1-25/+20
Reorder if-else cases for starters... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - get rid of need_reval in !dentry caseAl Viro1-6/+1
Everything arriving into if (!dentry) will have need_reval = 1. Indeed, the only way to get there with need_reval reset to 0 would be via if (unlikely(d_need_lookup(dentry))) goto unlazy; if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE)) { status = d_revalidate(dentry, nd); if (unlikely(status <= 0)) { if (status != -ECHILD) need_reval = 0; goto unlazy; ... unlazy: /* no assignments to dentry */ if (dentry && unlikely(d_need_lookup(dentry))) { dput(dentry); dentry = NULL; } and if d_need_lookup() had already been false the first time around, it will remain false on the second call as well. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - eliminate a loop.Al Viro1-4/+8
d_lookup() *will* fail after successful d_invalidate(), if we are holding i_mutex all along. IOW, we don't need to jump back to l: - we know what path will be taken there and can do that (i.e. d_alloc_and_lookup()) directly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - expand the area under ->i_mutexAl Viro1-2/+4
keep holding ->i_mutex over revalidation parts Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01untangling do_lookup() - isolate !dentry stuff from the rest of it.Al Viro1-1/+16
Duplicate the revalidation-related parts into if (!dentry) branch. Next step will be to pull them under i_mutex. This and the next 8 commits are more or less a splitup of patch by Miklos; folks, when you are working with something that convoluted, carve your patches up into easily reviewed steps, especially when a lot of codepaths involved are rarely hit... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01vfs: move MAY_EXEC check from __lookup_hash()Miklos Szeredi1-6/+5
The only caller of __lookup_hash() that needs the exec permission check on parent is lookup_one_len(). All lookup_hash() callers already checked permission in LOOKUP_PARENT walk. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01vfs: don't revalidate just looked up dentryMiklos Szeredi1-3/+1
__lookup_hash() calls ->lookup() if the dentry needs lookup and on success revalidates the dentry (all under dir->i_mutex). While this is harmless it doesn't make a lot of sense. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01vfs: fix d_need_lookup/d_revalidate order in do_lookupMiklos Szeredi1-2/+2
Doing revalidate on a dentry which has not yet been looked up makes no sense. Move the d_need_lookup() check before d_revalidate(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01ext3: move headers to fs/ext3/Al Viro23-668/+437
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01migrate ext2_fs.h guts to fs/ext2/ext2.hAl Viro6-656/+634
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01new helper: ext2_image_size()Al Viro3-8/+30
... implemented that way since the next commit will leave it almost alone in ext2_fs.h - most of the file (including struct ext2_super_block) is going to move to fs/ext2/ext2.h. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01get rid of pointless includes of ext2_fs.hAl Viro4-9/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01ext2: No longer export ext2_fs.h to user spaceThierry Reding2-63/+10
Since the on-disk format has been stable for quite some time, users should either use the headers provided by libext2fs or keep a private copy of this header. For the full discussion, see this thread: https://lkml.org/lkml/2012/3/21/516 While at it, this commit removes all __KERNEL__ guards, which are now unnecessary. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Cc: Ted Ts'o <tytso@mit.edu> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Andreas Dilger <aedilger@gmail.com> Cc: linux-ext4@vger.kernel.org
2012-04-01mtdchar: kill persistently held vfsmountAl Viro1-37/+16
... and mtdchar_notifier along with it; just have ->drop_inode() that will unconditionally get evict them instead of dances on mtd device removal and use simple_pin_fs() instead of kern_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01pstore: trim pstore_get_inode()Al Viro1-18/+8
move mode-dependent parts to callers, kill unused arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01aio: take final put_ioctx() into callers of io_destroy()Al Viro1-6/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-01aio: merge aio_cancel_all() with wait_for_all_aios()Al Viro1-15/+7
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>