summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2010-05-21Merge branch 'perf/core' of ↵Steven Rostedt32-182/+303
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into trace/tip/tracing/core-7 Conflicts: include/linux/ftrace_event.h include/trace/ftrace.h kernel/trace/trace_event_perf.c kernel/trace/trace_kprobe.c kernel/trace/trace_syscalls.c Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-21perf: Remove more code from the fastpathPeter Zijlstra1-2/+0
Sanity checks cost instructions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.852926930@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21perf: Optimize the !vmalloc backed bufferPeter Zijlstra1-1/+1
Reduce code and data by using the knowledge that for !PERF_USE_VMALLOC data_order is always 0. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.795019386@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21perf: Optimize perf_output_copy()Peter Zijlstra1-0/+3
Reduce the clutter in perf_output_copy() by keeping an interator in perf_output_handle. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.742809176@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21perf: Fix wakeup storm for RO mmap()sPeter Zijlstra1-1/+1
RO mmap()s don't update the tail pointer, so comparing against it for determining the written data size doesn't really do any good. Keep track of when we last did a wakeup, and compare against that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.684479310@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to ↵Peter Zijlstra3-11/+15
track events Avoid the swevent hash-table by using per-tracepoint hlists. Also, avoid conditionals on the fast path by ordering with probe unregister so that we should never get on the callback path without the data being there. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100521090710.473188012@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21perf, trace: Optimize tracepoints by removing IRQ-disable from ↵Peter Zijlstra2-16/+10
perf/tracepoint interaction Improves performance. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1274259525.5605.10352.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-20Merge branch 'perf/urgent' of ↵Ingo Molnar45-399/+607
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
2010-05-18perf: Optimize perf_output_*() by avoiding local_xchg()Peter Zijlstra1-0/+1
Since the x86 XCHG ins implies LOCK, avoid the use by using a sequence count instead. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18perf: Optimize the hotpath by converting the perf output buffer to local_tPeter Zijlstra1-8/+7
Since there is now only a single writer, we can use local_t instead and avoid all these pesky LOCK insn. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18perf: Optimize the perf_output() path by removing IRQ-disablesPeter Zijlstra1-3/+2
Since we can now assume there is only a single writer to each buffer, we can remove per-cpu lock thingy and use a simply nest-count to the same effect. This removes the need to disable IRQs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18perf/ftrace: Optimize perf/tracepoint interaction for single eventsPeter Zijlstra3-5/+8
When we've got but a single event per tracepoint there is no reason to try and multiplex it so don't. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18Merge branch 'x86-pat-for-linus' of ↵Linus Torvalds1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pat: Update the page flags for memtype atomically instead of using memtype_lock x86, pat: In rbt_memtype_check_insert(), update new->type only if valid x86, pat: Migrate to rbtree only backend for pat memtype management x86, pat: Preparatory changes in pat.c for bigger rbtree change rbtree: Add support for augmented rbtrees
2010-05-18Merge branch 'core-hweight-for-linus' of ↵Linus Torvalds4-31/+74
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hweight: Use a 32-bit popcnt for __arch_hweight32() arch, hweight: Fix compilation errors x86: Add optimized popcnt variants bitops: Optimize hweight() by making use of compile-time evaluation
2010-05-18Merge branch 'x86-irq-for-linus' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, acpi/irq: Define gsi_end when X86_IO_APIC is undefined x86, irq: Kill io_apic_renumber_irq x86, acpi/irq: Handle isa irqs that are not identity mapped to gsi's. x86, ioapic: Simplify probe_nr_irqs_gsi. x86, ioapic: Optimize pin_2_irq x86, ioapic: Move nr_ioapic_registers calculation to mp_register_ioapic. x86, ioapic: In mpparse use mp_register_ioapic x86, ioapic: Teach mp_register_ioapic to compute a global gsi_end x86, ioapic: Fix the types of gsi values x86, ioapic: Fix io_apic_redir_entries to return the number of entries. x86, ioapic: Only export mp_find_ioapic and mp_find_ioapic_pin in io_apic.h x86, acpi/irq: Generalize mp_config_acpi_legacy_irqs x86, acpi/irq: Fix acpi_sci_ioapic_setup so it has both bus_irq and gsi x86, acpi/irq: pci device dev->irq is an isa irq not a gsi x86, acpi/irq: Teach acpi_get_override_irq to take a gsi not an isa_irq x86, acpi/irq: Introduce apci_isa_irq_to_gsi
2010-05-18Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds1-0/+30
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hypervisor: add missing <linux/module.h> Modify the VMware balloon driver for the new x86_hyper API x86, hypervisor: Export the x86_hyper* symbols x86: Clean up the hypervisor layer x86, HyperV: fix up the license to mshyperv.c x86: Detect running on a Microsoft HyperV system x86, cpu: Make APERF/MPERF a normal table-driven flag x86, k8: Fix build error when K8_NB is disabled x86, cacheinfo: Disable index in all four subcaches x86, cacheinfo: Make L3 cache info per node x86, cacheinfo: Reorganize AMD L3 cache structure x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments x86, cacheinfo: Unify AMD L3 cache index disable checking cpufreq: Unify sysfs attribute definition macros powernow-k8: Fix frequency reporting x86, cpufreq: Add APERF/MPERF support for AMD processors x86: Unify APERF/MPERF support powernow-k8: Add core performance boost support x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfo Fix up trivial conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c and drivers/cpufreq/cpufreq_ondemand.c
2010-05-18Merge branch 'tracing-core-for-linus' of ↵Linus Torvalds6-77/+116
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix "integer as NULL pointer" warning. tracing: Fix tracepoint.h DECLARE_TRACE() to allow more than one header tracing: Make the documentation clear on trace_event boot option ring-buffer: Wrap open-coded WARN_ONCE tracing: Convert nop macros to static inlines tracing: Fix sleep time function profiling tracing: Show sample std dev in function profiling tracing: Add documentation for trace commands mod, traceon/traceoff ring-buffer: Make benchmark handle missed events ring-buffer: Make non-consuming read less expensive with lots of cpus. tracing: Add graph output support for irqsoff tracer tracing: Have graph flags passed in to ouput functions tracing: Add ftrace events for graph tracer tracing: Dump either the oops's cpu source or all cpus buffers tracing: Fix uninitialized variable of tracing/trace output
2010-05-18Merge branch 'sched-core-for-linus' of ↵Linus Torvalds8-117/+166
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits) stop_machine: Move local variable closer to the usage site in cpu_stop_cpu_callback() sched, wait: Use wrapper functions sched: Remove a stale comment ondemand: Make the iowait-is-busy time a sysfs tunable ondemand: Solve a big performance issue by counting IOWAIT time as busy sched: Intoduce get_cpu_iowait_time_us() sched: Eliminate the ts->idle_lastupdate field sched: Fold updating of the last_update_time_info into update_ts_time_stats() sched: Update the idle statistics in get_cpu_idle_time_us() sched: Introduce a function to update the idle statistics sched: Add a comment to get_cpu_idle_time_us() cpu_stop: add dummy implementation for UP sched: Remove rq argument to the tracepoints rcu: need barrier() in UP synchronize_sched_expedited() sched: correctly place paranioa memory barriers in synchronize_sched_expedited() sched: kill paranoia check in synchronize_sched_expedited() sched: replace migration_thread with cpu_stop stop_machine: reimplement using cpu_stop cpu_stop: implement stop_cpu[s]() sched: Fix select_idle_sibling() logic in select_task_rq_fair() ...
2010-05-18Merge branch 'perf-core-for-linus' of ↵Linus Torvalds8-95/+128
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits) perf tools: Add mode to build without newt support perf symbols: symbol inconsistency message should be done only at verbose=1 perf tui: Add explicit -lslang option perf options: Type check all the remaining OPT_ variants perf options: Type check OPT_BOOLEAN and fix the offenders perf options: Check v type in OPT_U?INTEGER perf options: Introduce OPT_UINTEGER perf tui: Add workaround for slang < 2.1.4 perf record: Fix bug mismatch with -c option definition perf options: Introduce OPT_U64 perf tui: Add help window to show key associations perf tui: Make <- exit menus too perf newt: Add single key shortcuts for zoom into DSO and threads perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed perf newt: Fix the 'A'/'a' shortcut for annotate perf newt: Make <- exit the ui_browser x86, perf: P4 PMU - fix counters management logic perf newt: Make <- zoom out filters perf report: Report number of events, not samples perf hist: Clarify events_stats fields usage ... Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
2010-05-18Merge branch 'oprofile-for-linus' of ↵Linus Torvalds6-57/+59
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits) oprofile/x86: make AMD IBS hotplug capable oprofile/x86: notify cpus only when daemon is running oprofile/x86: reordering some functions oprofile/x86: stop disabled counters in nmi handler oprofile/x86: protect cpu hotplug sections oprofile/x86: remove CONFIG_SMP macros oprofile/x86: fix uninitialized counter usage during cpu hotplug oprofile/x86: remove duplicate IBS capability check oprofile/x86: move IBS code oprofile/x86: return -EBUSY if counters are already reserved oprofile/x86: moving shutdown functions oprofile/x86: reserve counter msrs pairwise oprofile/x86: rework error handler in nmi_setup() oprofile: update file list in MAINTAINERS file oprofile: protect from not being in an IRQ context oprofile: remove double ring buffering ring-buffer: Add lost event count to end of sub buffer tracing: Show the lost events in the trace_pipe output ring-buffer: Add place holder recording of dropped events tracing: Fix compile error in module tracepoints when MODULE_UNLOAD not set ...
2010-05-18Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds6-22/+81
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits) rcu: remove all rcu head initializations, except on_stack initializations rcu head introduce rcu head init on stack Debugobjects transition check rcu: fix build bug in RCU_FAST_NO_HZ builds rcu: RCU_FAST_NO_HZ must check RCU dyntick state rcu: make SRCU usable in modules rcu: improve the RCU CPU-stall warning documentation rcu: reduce the number of spurious RCU_SOFTIRQ invocations rcu: permit discontiguous cpu_possible_mask CPU numbering rcu: improve RCU CPU stall-warning messages rcu: print boot-time console messages if RCU configs out of ordinary rcu: disable CPU stall warnings upon panic rcu: enable CPU_STALL_VERBOSE by default rcu: slim down rcutiny by removing rcu_scheduler_active and friends rcu: refactor RCU's context-switch handling rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk rcu: shrink rcutiny by making synchronize_rcu_bh() be inline rcu: fix now-bogus rcu_scheduler_active comments. rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality. rcu: ignore offline CPUs in last non-dyntick-idle CPU check ...
2010-05-18Merge branch 'core-iommu-for-linus' of ↵Linus Torvalds1-12/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/amd-iommu: Add amd_iommu=off command line option iommu-api: Remove iommu_{un}map_range functions x86/amd-iommu: Implement ->{un}map callbacks for iommu-api x86/amd-iommu: Make amd_iommu_iova_to_phys aware of multiple page sizes x86/amd-iommu: Make iommu_unmap_page and fetch_pte aware of page sizes x86/amd-iommu: Make iommu_map_page and alloc_pte aware of page sizes kvm: Change kvm_iommu_map_pages to map large pages VT-d: Change {un}map_range functions to implement {un}map interface iommu-api: Add ->{un}map callbacks to iommu_ops iommu-api: Add iommu_map and iommu_unmap functions iommu-api: Rename ->{un}map function pointers to ->{un}map_range
2010-05-18Merge branch 'perf/core' of ↵Steven Rostedt12-94/+142
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into trace/tip/tracing/core-6 Conflicts: include/trace/ftrace.h kernel/trace/trace_kprobe.c Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-18Merge branch 'for-linus' of ↵Linus Torvalds3-23/+14
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: amiga - Floppy platform device conversion m68k: amiga - Sound platform device conversion m68k: amiga - Frame buffer platform device conversion m68k: amiga - Zorro host bridge platform device conversion m68k: amiga - Zorro bus modalias support platform: Make platform resource input parameters const m68k: invoke oom-killer from page fault serial167: Kill unused variables m68k: Implement generic_find_next_{zero_,}le_bit() m68k: hp300 - Checkpatch cleanup m68k: Remove trailing spaces in messages m68k: Simplify param.h by using <asm-generic/param.h> m68k: Remove BKL from rtc implementations
2010-05-17m68k: amiga - Zorro host bridge platform device conversionGeert Uytterhoeven1-9/+0
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-05-17m68k: amiga - Zorro bus modalias supportGeert Uytterhoeven2-12/+10
Add Amiga Zorro bus modalias and uevent support Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-05-17platform: Make platform resource input parameters constGeert Uytterhoeven1-2/+4
Make the platform resource input parameters of platform_device_add_resources() and platform_device_register_simple() const, as the resources are copied and never modified. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17atomic_t: Remove volatile from atomic_t definitionAnton Blanchard1-2/+2
When looking at a performance problem on PowerPC, I noticed some awful code generation: c00000000051fc98: 3b 60 00 01 li r27,1 ... c00000000051fca0: 3b 80 00 00 li r28,0 ... c00000000051fcdc: 93 61 00 70 stw r27,112(r1) c00000000051fce0: 93 81 00 74 stw r28,116(r1) c00000000051fce4: 81 21 00 70 lwz r9,112(r1) c00000000051fce8: 80 01 00 74 lwz r0,116(r1) c00000000051fcec: 7d 29 07 b4 extsw r9,r9 c00000000051fcf0: 7c 00 07 b4 extsw r0,r0 c00000000051fcf4: 7c 20 04 ac lwsync c00000000051fcf8: 7d 60 f8 28 lwarx r11,0,r31 c00000000051fcfc: 7c 0b 48 00 cmpw r11,r9 c00000000051fd00: 40 c2 00 10 bne- c00000000051fd10 c00000000051fd04: 7c 00 f9 2d stwcx. r0,0,r31 c00000000051fd08: 40 c2 ff f0 bne+ c00000000051fcf8 c00000000051fd0c: 4c 00 01 2c isync We create two constants, write them out to the stack, read them straight back in and sign extend them. What a mess. It turns out this bad code is a result of us defining atomic_t as a volatile int. We removed the volatile attribute from the powerpc atomic_t definition years ago, but commit ea435467500612636f8f4fb639ff6e76b2496e4b (atomic_t: unify all arch definitions) added it back in. To dig up an old quote from Linus: > The fact is, volatile on data structures is a bug. It's a wart in the C > language. It shouldn't be used. > > Volatile accesses in *code* can be ok, and if we have "atomic_read()" > expand to a "*(volatile int *)&(x)->value", then I'd be ok with that. > > But marking data structures volatile just makes the compiler screw up > totally, and makes code for initialization sequences etc much worse. And screw up it does :) With the volatile removed, we see much more reasonable code generation: c00000000051f5b8: 3b 60 00 01 li r27,1 ... c00000000051f5c0: 3b 80 00 00 li r28,0 ... c00000000051fc7c: 7c 20 04 ac lwsync c00000000051fc80: 7c 00 f8 28 lwarx r0,0,r31 c00000000051fc84: 7c 00 d8 00 cmpw r0,r27 c00000000051fc88: 40 c2 00 10 bne- c00000000051fc98 c00000000051fc8c: 7f 80 f9 2d stwcx. r28,0,r31 c00000000051fc90: 40 c2 ff f0 bne+ c00000000051fc80 c00000000051fc94: 4c 00 01 2c isync Six instructions less. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-17atomic_t: Cast to volatile when accessing atomic variablesAnton Blanchard1-1/+1
In preparation for removing volatile from the atomic_t definition, this patch adds a volatile cast to all the atomic read functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-22/+22
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: rtnetlink: make SR-IOV VF interface symmetric sctp: delete active ICMP proto unreachable timer when free transport tcp: fix MD5 (RFC2385) support
2010-05-16rtnetlink: make SR-IOV VF interface symmetricChris Wright1-4/+19
Now we have a set of nested attributes: IFLA_VFINFO_LIST (NESTED) IFLA_VF_INFO (NESTED) IFLA_VF_MAC IFLA_VF_VLAN IFLA_VF_TX_RATE This allows a single set to operate on multiple attributes if desired. Among other things, it means a dump can be replayed to set state. The current interface has yet to be released, so this seems like something to consider for 2.6.34. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16tcp: fix MD5 (RFC2385) supportEric Dumazet1-18/+3
TCP MD5 support uses percpu data for temporary storage. It currently disables preemption so that same storage cannot be reclaimed by another thread on same cpu. We also have to make sure a softirq handler wont try to use also same context. Various bug reports demonstrated corruptions. Fix is to disable preemption and BH. Reported-by: Bhaskar Dutta <bhaskie@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15Fix the regression created by "set S_DEAD on unlink()..." commitAl Viro1-0/+14
1) i_flags simply doesn't work for mount/unlink race prevention; we may have many links to file and rm on one of those obviously shouldn't prevent bind on top of another later on. To fix it right way we need to mark _dentry_ as unsuitable for mounting upon; new flag (DCACHE_CANT_MOUNT) is protected by d_flags and i_mutex on the inode in question. Set it (with dont_mount(dentry)) in unlink/rmdir/etc., check (with cant_mount(dentry)) in places in namespace.c that used to check for S_DEAD. Setting S_DEAD is still needed in places where we used to set it (for directories getting killed), since we rely on it for readdir/rmdir race prevention. 2) rename()/mount() protection has another bogosity - we unhash the target before we'd checked that it's not a mountpoint. Fixed. 3) ancient bogosity in pivot_root() - we locked i_mutex on the right directory, but checked S_DEAD on the different (and wrong) one. Noticed and fixed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-15tracing: Comment the use of event_mutex with trace event flagsSteven Rostedt1-1/+8
The flags variable is protected by the event_mutex when modifying, but the event_mutex is not held when reading the variable. This is due to the fact that the reads occur in critical sections where taking a mutex (or even a spinlock) is not wanted. But the two flags that exist (enable and filter_active) have the code written as such to handle the reads to not need a lock. The enable flag is used just to know if the event is enabled or not and its use is always under the event_mutex. Whether or not the event is actually enabled is really determined by the tracepoint being registered. The flag is just a way to let the code know if the tracepoint is registered. The filter_active is different. It is read without the lock. If it is set, then the event probes jump to the filter code. There can be a slight mismatch between filters available and filter_active. If the flag is set but no filters are available, the code safely jumps to a filter nop. If the flag is not set and the filters are available, then the filters are skipped. This is acceptable since filters are usually set before tracing or they are set by humans, which would not notice the slight delay that this causes. v2: Fixed typo: "cacheing" -> "caching" Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Combine event filter_active and enable into single flags fieldSteven Rostedt1-2/+19
The filter_active and enable both use an int (4 bytes each) to set a single flag. We can save 4 bytes per event by combining the two into a single integer. text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4894944 1018052 861512 6774508 675eec vmlinux.id 4894871 1012292 861512 6768675 674823 vmlinux.flags This gives us another 5K in savings. The modification of both the enable and filter fields are done under the event_mutex, so it is still safe to combine the two. Note: Although Mathieu gave his Acked-by, he would like it documented that the reads of flags are not protected by the mutex. The way the code works, these reads will not break anything, but will have a residual effect. Since this behavior is the same even before this patch, describing this situation is left to another patch, as this patch does not change the behavior, but just brought it to Mathieu's attention. v2: Updated the event trace self test to for this change. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Remove duplicate id information in event structureSteven Rostedt2-9/+8
Now that the trace_event structure is embedded in the ftrace_event_call structure, there is no need for the ftrace_event_call id field. The id field is the same as the trace_event type field. Removing the id and re-arranging the structure brings down the tracepoint footprint by another 5K. text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4895024 1023812 861512 6780348 6775bc vmlinux.print 4894944 1018052 861512 6774508 675eec vmlinux.id Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Move print functions into event classSteven Rostedt3-41/+24
Currently, every event has its own trace_event structure. This is fine since the structure is needed anyway. But the print function structure (trace_event_functions) is now separate. Since the output of the trace event is done by the class (with the exception of events defined by DEFINE_EVENT_PRINT), it makes sense to have the class define the print functions that all events in the class can use. This makes a bigger deal with the syscall events since all syscall events use the same class. The savings here is another 30K. text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4900382 1048964 861512 6810858 67ecea vmlinux.init 4900446 1049028 861512 6810986 67ed6a vmlinux.preprint 4895024 1023812 861512 6780348 6775bc vmlinux.print To accomplish this, and to let the class know what event is being printed, the event structure is embedded in the ftrace_event_call structure. This should not be an issues since the event structure was created for each event anyway. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Allow events to share their print functionsSteven Rostedt4-14/+32
Multiple events may use the same method to print their data. Instead of having all events have a pointer to their print funtions, the trace_event structure now points to a trace_event_functions structure that will hold the way to print ouf the event. The event itself is now passed to the print function to let the print function know what kind of event it should print. This opens the door to consolidating the way several events print their output. text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4900382 1048964 861512 6810858 67ecea vmlinux.init 4900446 1049028 861512 6810986 67ed6a vmlinux.preprint This change slightly increases the size but is needed for the next change. v3: Fix the branch tracer events to handle this change. v2: Fix the new function graph tracer event calls to handle this change. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Move raw_init from events to classSteven Rostedt3-8/+5
The raw_init function pointer in the event is used to initialize various kinds of events. The type of initialization needed is usually classed to the kind of event it is. Two events with the same class will always have the same initialization function, so it makes sense to move this to the class structure. Perhaps even making a special system structure would work since the initialization is the same for all events within a system. But since there's no system structure (yet), this will just move it to the class. text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4900375 1053380 861512 6815267 67fe23 vmlinux.fields 4900382 1048964 861512 6810858 67ecea vmlinux.init The text grew very slightly, but this is a constant growth that happened with the changing of the C files that call the init code. The bigger savings is the data which will be saved the more events share a class. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Move fields from event to class structureSteven Rostedt4-15/+19
Move the defined fields from the event to the class structure. Since the fields of the event are defined by the class they belong to, it makes sense to have the class hold the information instead of the individual events. The events of the same class would just hold duplicate information. After this change the size of the kernel dropped another 3K: text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4900252 1057412 861512 6819176 680d68 vmlinux.regs 4900375 1053380 861512 6815267 67fe23 vmlinux.fields Although the text increased, this was mainly due to the C files having to adapt to the change. This is a constant increase, where new tracepoints will not increase the Text. But the big drop is in the data size (as well as needed allocations to hold the fields). This will give even more savings as more tracepoints are created. Note, if just TRACE_EVENT()s are used and not DECLARE_EVENT_CLASS() with several DEFINE_EVENT()s, then the savings will be lost. But we are pushing developers to consolidate events with DEFINE_EVENT() so this should not be an issue. The kprobes define a unique class to every new event, but are dynamic so it should not be a issue. The syscalls however have a single class but the fields for the individual events are different. The syscalls use a metadata to define the fields. I moved the fields list from the event to the metadata and added a "get_fields()" function to the class. This function is used to find the fields. For normal events and kprobes, get_fields() just returns a pointer to the fields list_head in the class. For syscall events, it returns the fields list_head in the metadata for the event. v2: Fixed the syscall fields. The syscall metadata needs a list of fields for both enter and exit. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Remove per event trace registeringSteven Rostedt3-124/+57
This patch removes the register functions of TRACE_EVENT() to enable and disable tracepoints. The registering of a event is now down directly in the trace_events.c file. The tracepoint_probe_register() is now called directly. The prototypes are no longer type checked, but this should not be an issue since the tracepoints are created automatically by the macros. If a prototype is incorrect in the TRACE_EVENT() macro, then other macros will catch it. The trace_event_class structure now holds the probes to be called by the callbacks. This removes needing to have each event have a separate pointer for the probe. To handle kprobes and syscalls, since they register probes in a different manner, a "reg" field is added to the ftrace_event_class structure. If the "reg" field is assigned, then it will be called for enabling and disabling of the probe for either ftrace or perf. To let the reg function know what is happening, a new enum (trace_reg) is created that has the type of control that is needed. With this new rework, the 82 kernel events and 618 syscall events has their footprint dramatically lowered: text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4914025 1088868 861512 6864405 68be15 vmlinux.class 4918492 1084612 861512 6864616 68bee8 vmlinux.tracepoint 4900252 1057412 861512 6819176 680d68 vmlinux.regs The size went from 6863829 to 6819176, that's a total of 44K in savings. With tracepoints being continuously added, this is critical that the footprint becomes minimal. v5: Added #ifdef CONFIG_PERF_EVENTS around a reference to perf specific structure in trace_events.c. v4: Fixed trace self tests to check probe because regfunc no longer exists. v3: Updated to handle void *data in beginning of probe parameters. Also added the tracepoint: check_trace_callback_type_##call(). v2: Changed the callback probes to pass void * and typecast the value within the function. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Let tracepoints have data passed to tracepoint callbacksSteven Rostedt2-30/+79
This patch adds data to be passed to tracepoint callbacks. The created functions from DECLARE_TRACE() now need a mandatory data parameter. For example: DECLARE_TRACE(mytracepoint, int value, value) Will create the register function: int register_trace_mytracepoint((void(*)(void *data, int value))probe, void *data); As the first argument, all callbacks (probes) must take a (void *data) parameter. So a callback for the above tracepoint will look like: void myprobe(void *data, int value) { } The callback may choose to ignore the data parameter. This change allows callbacks to register a private data pointer along with the function probe. void mycallback(void *data, int value); register_trace_mytracepoint(mycallback, mydata); Then the mycallback() will receive the "mydata" as the first parameter before the args. A more detailed example: DECLARE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status)); /* In the C file */ DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status)); [...] trace_mytracepoint(status); /* In a file registering this tracepoint */ int my_callback(void *data, int status) { struct my_struct my_data = data; [...] } [...] my_data = kmalloc(sizeof(*my_data), GFP_KERNEL); init_my_data(my_data); register_trace_mytracepoint(my_callback, my_data); The same callback can also be registered to the same tracepoint as long as the data registered is different. Note, the data must also be used to unregister the callback: unregister_trace_mytracepoint(my_callback, my_data); Because of the data parameter, tracepoints declared this way can not have no args. That is: DECLARE_TRACE(mytracepoint, TP_PROTO(void), TP_ARGS()); will cause an error. If no arguments are needed, a new macro can be used instead: DECLARE_TRACE_NOARGS(mytracepoint); Since there are no arguments, the proto and args fields are left out. This is part of a series to make the tracepoint footprint smaller: text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4914025 1088868 861512 6864405 68be15 vmlinux.class 4918492 1084612 861512 6864616 68bee8 vmlinux.tracepoint Again, this patch also increases the size of the kernel, but lays the ground work for decreasing it. v5: Fixed net/core/drop_monitor.c to handle these updates. v4: Moved the DECLARE_TRACE() DECLARE_TRACE_NOARGS out of the #ifdef CONFIG_TRACE_POINTS, since the two are the same in both cases. The __DECLARE_TRACE() is what changes. Thanks to Frederic Weisbecker for pointing this out. v3: Made all register_* functions require data to be passed and all callbacks to take a void * parameter as its first argument. This makes the calling functions comply with C standards. Also added more comments to the modifications of DECLARE_TRACE(). v2: Made the DECLARE_TRACE() have the ability to pass arguments and added a new DECLARE_TRACE_NOARGS() for tracepoints that do not need any arguments. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracepoints: Add check trace callback typeMathieu Desnoyers1-1/+6
This check is meant to be used by tracepoint users which do a direct cast of callbacks to (void *) for direct registration, thus bypassing the register_trace_##name and unregister_trace_##name checks. This permits to ensure that the callback type matches the function type at the call site, but without generating any code. Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100430165959.GA25605@Krystal> CC: Ingo Molnar <mingo@elte.hu> CC: Andrew Morton <akpm@linux-foundation.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Peter Zijlstra <peterz@infradead.org> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Lai Jiangshan <laijs@cn.fujitsu.com> CC: Li Zefan <lizf@cn.fujitsu.com> CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14tracing: Create class struct for eventsSteven Rostedt3-28/+28
This patch creates a ftrace_event_class struct that event structs point to. This class struct will be made to hold information to modify the events. Currently the class struct only holds the events system name. This patch slightly increases the size, but this change lays the ground work of other changes to make the footprint of tracepoints smaller. With 82 standard tracepoints, and 618 system call tracepoints (two tracepoints per syscall: enter and exit): text data bss dec hex filename 4913961 1088356 861512 6863829 68bbd5 vmlinux.orig 4914025 1088868 861512 6864405 68be15 vmlinux.class This patch also cleans up some stale comments in ftrace.h. v2: Fixed missing semi-colon in macro. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-05-14Merge branch 'sched/core' of ↵Steven Rostedt23-133/+211
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into trace/tip/tracing/core-4
2010-05-13Merge commit 'v2.6.34-rc7' into tracing/coreIngo Molnar23-49/+67
Merge reason: Update from -rc5 to -rc7. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-12revert "procfs: provide stack information for threads" and its fixup commitsRobin Holt1-1/+0
Originally, commit d899bf7b ("procfs: provide stack information for threads") attempted to introduce a new feature for showing where the threadstack was located and how many pages are being utilized by the stack. Commit c44972f1 ("procfs: disable per-task stack usage on NOMMU") was applied to fix the NO_MMU case. Commit 89240ba0 ("x86, fs: Fix x86 procfs stack information for threads on 64-bit") was applied to fix a bug in ia32 executables being loaded. Commit 9ebd4eba7 ("procfs: fix /proc/<pid>/stat stack pointer for kernel threads") was applied to fix a bug which had kernel threads printing a userland stack address. Commit 1306d603f ('proc: partially revert "procfs: provide stack information for threads"') was then applied to revert the stack pages being used to solve a significant performance regression. This patch nearly undoes the effect of all these patches. The reason for reverting these is it provides an unusable value in field 28. For x86_64, a fork will result in the task->stack_start value being updated to the current user top of stack and not the stack start address. This unpredictability of the stack_start value makes it worthless. That includes the intended use of showing how much stack space a thread has. Other architectures will get different values. As an example, ia64 gets 0. The do_fork() and copy_process() functions appear to treat the stack_start and stack_size parameters as architecture specific. I only partially reverted c44972f1 ("procfs: disable per-task stack usage on NOMMU") . If I had completely reverted it, I would have had to change mm/Makefile only build pagewalk.o when CONFIG_PROC_PAGE_MONITOR is configured. Since I could not test the builds without significant effort, I decided to not change mm/Makefile. I only partially reverted 89240ba0 ("x86, fs: Fix x86 procfs stack information for threads on 64-bit") . I left the KSTK_ESP() change in place as that seemed worthwhile. Signed-off-by: Robin Holt <holt@sgi.com> Cc: Stefani Seibold <stefani@seibold.net> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-12dma-mapping: fix dma_sync_single_range_*FUJITA Tomonori1-2/+2
dma_sync_single_range_for_cpu() and dma_sync_single_range_for_device() use a wrong address with a partial synchronization. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-12rcu: remove all rcu head initializations, except on_stack initializationsPaul E. McKenney1-1/+0
Remove all rcu head inits. We don't care about the RCU head state before passing it to call_rcu() anyway. Only leave the "on_stack" variants so debugobjects can keep track of objects on stack. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-05-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-0/+4
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Fix FDDI and TR config checks in ipv4 arp and LLC. IPv4: unresolved multicast route cleanup mac80211: remove association work when processing deauth request ar9170: wait for asynchronous firmware loading ipv4: udp: fix short packet and bad checksum logging phy: Fix initialization in micrel driver. sctp: Fix a race between ICMP protocol unreachable and connect() veth: Dont kfree_skb() after dev_forward_skb() IPv6: fix IPV6_RECVERR handling of locally-generated errors net/gianfar: drop recycled skbs on MTU change iwlwifi: work around passive scan issue