summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-10-19xfs: store xfs_mount in the buftarg instead of in the xfs_bufDave Chinner5-21/+20
Each buffer contains both a buftarg pointer and a mount pointer. If we add a mount pointer into the buftarg, we can avoid needing the b_mount field in every buffer and grab it from the buftarg when needed instead. This shrinks the xfs_buf by 8 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: introduced uncached buffer read primitveDave Chinner2-0/+37
To avoid the need to use cached buffers for single-shot or buffers cached at the filesystem level, introduce a new buffer read primitive that bypasses the cache an reads directly from disk. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: rename xfs_buf_get_nodaddr to be more appropriateDave Chinner6-11/+14
xfs_buf_get_nodaddr() is really used to allocate a buffer that is uncached. While it is not directly assigned a disk address, the fact that they are not cached is a more important distinction. With the upcoming uncached buffer read primitive, we should be consistent with this disctinction. While there, make page allocation in xfs_buf_get_nodaddr() safe against memory reclaim re-entrancy into the filesystem by allowing a flags parameter to be passed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: don't use vfs writeback for pure metadata modificationsDave Chinner12-86/+65
Under heavy multi-way parallel create workloads, the VFS struggles to write back all the inodes that have been changed in age order. The bdi flusher thread becomes CPU bound, spending 85% of it's time in the VFS code, mostly traversing the superblock dirty inode list to separate dirty inodes old enough to flush. We already keep an index of all metadata changes in age order - in the AIL - and continued log pressure will do age ordered writeback without any extra overhead at all. If there is no pressure on the log, the xfssyncd will periodically write back metadata in ascending disk address offset order so will be very efficient. Hence we can stop marking VFS inodes dirty during transaction commit or when changing timestamps during transactions. This will keep the inodes in the superblock dirty list to those containing data or unlogged metadata changes. However, the timstamp changes are slightly more complex than this - there are a couple of places that do unlogged updates of the timestamps, and the VFS need to be informed of these. Hence add a new function xfs_trans_ichgtime() for transactional changes, and leave xfs_ichgtime() for the non-transactional changes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Alex Elder <aelder@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-10-19xfs: lockless per-ag lookupsDave Chinner3-11/+23
When we start taking a reference to the per-ag for every cached buffer in the system, kernel lockstat profiling on an 8-way create workload shows the mp->m_perag_lock has higher acquisition rates than the inode lock and has significantly more contention. That is, it becomes the highest contended lock in the system. The perag lookup is trivial to convert to lock-less RCU lookups because perag structures never go away. Hence the only thing we need to protect against is tree structure changes during a grow. This can be done simply by replacing the locking in xfs_perag_get() with RCU read locking. This removes the mp->m_perag_lock completely from this path. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: remove debug assert for per-ag reference countingDave Chinner1-2/+0
When we start taking references per cached buffer to the the perag it is cached on, it will blow the current debug maximum reference count assert out of the water. The assert has never caught a bug, and we have tracing to track changes if there ever is a problem, so just remove it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: reduce the number of CIL lock round trips during commitDave Chinner1-105/+127
When commiting a transaction, we do a lock CIL state lock round trip on every single log vector we insert into the CIL. This is resulting in the lock being as hot as the inode and dcache locks on 8-way create workloads. Rework the insertion loops to bring the number of lock round trips to one per transaction for log vectors, and one more do the busy extents. Also change the allocation of the log vector buffer not to zero it as we copy over the entire allocated buffer anyway. This patch also includes a structural cleanup to the CIL item insertion provided by Christoph Hellwig. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: eliminate some newly-reported gcc warningsPoyo VL1-2/+2
Ionut Gabriel Popescu <poyo_vl@yahoo.com> submitted a simple change to eliminate some "may be used uninitialized" warnings when building XFS. The reported condition seems to be something that GCC did not used to recognize or report. The warnings were produced by: gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux) Signed-off-by: Ionut Gabriel Popescu <poyo_vl@yahoo.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: remove the ->kill_root btree operationChristoph Hellwig4-88/+44
The implementation os ->kill_root only differ by either simply zeroing out the now unused buffer in the btree cursor in the inode allocation btree or using xfs_btree_setbuf in the allocation btree. Initially both of them used xfs_btree_setbuf, but the use in the ialloc btree was removed early on because it interacted badly with xfs_trans_binval. In addition to zeroing out the buffer in the cursor xfs_btree_setbuf updates the bc_ra array in the btree cursor, and calls xfs_trans_brelse on the buffer previous occupying the slot. The bc_ra update should be done for the alloc btree updated too, although the lack of it does not cause serious problems. The xfs_trans_brelse call on the other hand is effectively a no-op in the end - it keeps decrementing the bli_recur refcount until it hits zero, and then just skips out because the buffer will always be dirty at this point. So removing it for the allocation btree is just fine. So unify the code and move it to xfs_btree.c. While we're at it also replace the call to xfs_btree_setbuf with a NULL bp argument in xfs_btree_del_cursor with a direct call to xfs_trans_brelse given that the cursor is beeing freed just after this and the state updates are superflous. After this xfs_btree_setbuf is only used with a non-NULL bp argument and can thus be simplified. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: stop using xfs_qm_dqtobp in xfs_qm_dqflushChristoph Hellwig1-88/+76
In xfs_qm_dqflush we know that q_blkno must be initialized already from a previous xfs_qm_dqread. So instead of calling xfs_qm_dqtobp we can simply read the quota buffer directly. This also saves us from a duplicate xfs_qm_dqcheck call check and allows xfs_qm_dqtobp to be simplified now that it is always called for a newly initialized inode. In addition to that properly unwind all locks in xfs_qm_dqflush when xfs_qm_dqcheck fails. This mirrors a similar cleanup in the inode lookup done earlier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: simplify xfs_qm_dqusage_adjustChristoph Hellwig1-142/+61
There is no need to have the users and group/project quota locked at the same time. Get rid of xfs_qm_dqget_noattach and just do a xfs_qm_dqget inside xfs_qm_quotacheck_dqadjust for the quota we are operating on right now. The new version of xfs_qm_quotacheck_dqadjust holds the inode lock over it's operations, which is not a problem as it simply increments counters and there is no concern about log contention during mount time. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>
2010-10-19xfs: Introduce XFS_IOC_ZERO_RANGEDave Chinner6-9/+27
XFS_IOC_ZERO_RANGE is the equivalent of an atomic XFS_IOC_UNRESVSP/ XFS_IOC_RESVSP call pair. It enabled ranges of written data to be turned into zeroes without requiring IO or having to free and reallocate the extents in the range given as would occur if we had to punch and then preallocate them separately. This enables applications to zero parts of files very quickly without changing the layout of the files in any way. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-10-19xfs: use range primitives for xfs page cache operationsDave Chinner1-16/+15
While XFS passes ranges to operate on from the core code, the functions being called ignore the either the entire range or the end of the range. This is historical because when the function were written linux didn't have the necessary range operations. Update the functions to use the correct operations. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-10-15Linux 2.6.36-rc8v2.6.36-rc8Linus Torvalds1-2/+2
2010-10-15Un-inline the core-dump helper functionsLinus Torvalds2-32/+40
Tony Luck reports that the addition of the access_ok() check in commit 0eead9ab41da ("Don't dump task struct in a.out core-dumps") broke the ia64 compile due to missing the necessary header file includes. Rather than add yet another include (<asm/unistd.h>) to make everything happy, just uninline the silly core dump helper functions and move the bodies to fs/exec.c where they make a lot more sense. dump_seek() in particular was too big to be an inline function anyway, and none of them are in any way performance-critical. And we really don't need to mess up our include file headers more than they already are. Reported-and-tested-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds13-77/+104
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: ehea: Fix a checksum issue on the receive path net: allow FEC driver to use fixed PHY support tg3: restore rx_dropped accounting b44: fix carrier detection on bind net: clear heap allocations for privileged ethtool actions NET: wimax, fix use after free ATM: iphase, remove sleep-inside-atomic ATM: mpc, fix use after free ATM: solos-pci, remove use after free net/fec: carrier off initially to avoid root mount failure r8169: use device model DMA API r8169: allocate with GFP_KERNEL flag when able to sleep
2010-10-14Don't dump task struct in a.out core-dumpsLinus Torvalds3-22/+6
akiphie points out that a.out core-dumps have that odd task struct dumping that was never used and was never really a good idea (it goes back into the mists of history, probably the original core-dumping code). Just remove it. Also do the access_ok() check on dump_write(). It probably doesn't matter (since normal filesystems all seem to do it anyway), but he points out that it's normally done by the VFS layer, so ... [ I suspect that we should possibly do "vfs_write()" instead of calling ->write directly. That also does the whole fsnotify and write statistics thing, which may or may not be a good idea. ] And just to be anal, do this all for the x86-64 32-bit a.out emulation code too, even though it's not enabled (and won't currently even compile) Reported-by: akiphie <akiphie@lavabit.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-14Merge branch 'fixes' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: ioat2: fix performance regression
2010-10-14Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linuxLinus Torvalds1-2/+0
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink
2010-10-14Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds3-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ring-buffer: Fix typo of time extends per page perf, MIPS: Support cross compiling of tools/perf for MIPS perf: Fix incorrect copy_from_user() usage
2010-10-14Merge master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds11-17/+44
* master.kernel.org:/home/rmk/linux-2.6-arm: ARM: relax ioremap prohibition (309caa9) for -final and -stable ARM: 6440/1: ep93xx: DMA: fix channel_disable cpuimx27: fix i2c bus selection cpuimx27: fix compile when ULPI is selected ARM: 6435/1: Fix HWCAP_TLS flag for ARM11MPCore/Cortex-A9 ARM: 6436/1: AT91: Fix power-saving in idle-mode on 926T processors ARM: fix section mismatch warnings in Versatile Express ARM: 6412/1: kprobes-decode: add support for MOVW instruction ARM: 6419/1: mmu: Fix MT_MEMORY and MT_MEMORY_NONCACHED pte flags ARM: 6416/1: errata: faulty hazard checking in the Store Buffer may lead to data corruption
2010-10-14Merge branch 'omap-fixes-for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: omap: iommu-load cam register before flushing the entry
2010-10-14Merge branch 'drm-fixes' of ↵Linus Torvalds12-30/+40
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: Silent spurious error message drm/radeon/kms: fix bad cast/shift in evergreen.c drm/radeon/kms: make TV/DFP table info less verbose drm/radeon/kms: leave certain CP int bits enabled drm/radeon/kms: avoid corner case issue with unmappable vram V2
2010-10-14Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds3-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, numa: For each node, register the memory blocks actually used x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order x86, mce, therm_throt.c: Fix missing curly braces in error handling logic
2010-10-14ioat2: fix performance regressionDan Williams1-1/+1
Commit 0793448 "DMAENGINE: generic channel status v2" changed the interface for how dma channel progress is retrieved. It inadvertently exported an internal helper function ioat_tx_status() instead of ioat_dma_tx_status(). The latter polls the hardware to get the latest completion state, while the helper just evaluates the current state without touching hardware. The effect is that we end up waiting for completion timeouts or descriptor allocation errors before the completion state is updated. iperf (before fix): [SUM] 0.0-41.3 sec 364 MBytes 73.9 Mbits/sec iperf (after fix): [SUM] 0.0- 4.5 sec 499 MBytes 940 Mbits/sec This is a regression starting with 2.6.35. Cc: <stable@kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Linus Walleij <linus.walleij@stericsson.com> Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Reported-by: Richard Scobie <richard@sauce.co.nz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-10-14ehea: Fix a checksum issue on the receive pathBreno Leitao2-1/+9
Currently we set all skbs with CHECKSUM_UNNECESSARY, even those whose protocol we don't know. This patch just add the CHECKSUM_COMPLETE tag for non TCP/UDP packets. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-13nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlinkJ. Bruce Fields1-2/+0
As of commit 43a9aa64a2f4330a9cb59aaf5c5636566bce067c "NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR", we sometimes call fh_unlock on a filehandle that isn't fully initialized. We should fix up the callers, but as a quick fix it is also sufficient just to remove this assertion. Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-13net: allow FEC driver to use fixed PHY supportGreg Ungerer1-14/+27
At least one board using the FEC driver does not have a conventional PHY attached to it, it is directly connected to a somewhat simple ethernet switch (the board is the SnapGear/LITE, and the attached 4-port ethernet switch is a RealTek RTL8305). This switch does not present the usual register interface of a PHY, it presents nothing. So a PHY scan will find nothing - it finds ID's of 0 for each PHY on the attached MII bus. After the FEC driver was changed to use phylib for supporting PHYs it no longer works on this particular board/switch setup. Add code support to use a fixed phy if no PHY is found on the MII bus. This is based on the way the cpmac.c driver solved this same problem. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-13ARM: relax ioremap prohibition (309caa9) for -final and -stableRussell King1-2/+6
... but produce a big warning about the problem as encouragement for people to fix their drivers. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-13Merge branch 'for-rmk' of git://git.pengutronix.de/git/imx/linux-2.6Russell King2-1/+2
2010-10-13ARM: 6440/1: ep93xx: DMA: fix channel_disableMika Westerberg1-1/+1
When channel_disable() is called, it disables per channel interrupts and waits until channels state becomes STATE_STALL, and then disables the channel. Now, if the DMA transfer is disabled while the channel is in STATE_NEXT we will not wait anything and disable the channel immediately. This seems to cause weird data corruption for example in audio transfers. Fix is to wait while we are in STATE_NEXT or STATE_ON and only then disable the channel. Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi> Acked-by: Ryan Mallon <ryan@bluewatersys.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-12Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Move TSC reset out of vmcb_init KVM: x86: Fix SVM VMCB reset
2010-10-12ring-buffer: Fix typo of time extends per pageSteven Rostedt1-1/+1
Time stamps for the ring buffer are created by the difference between two events. Each page of the ring buffer holds a full 64 bit timestamp. Each event has a 27 bit delta stamp from the last event. The unit of time is nanoseconds, so 27 bits can hold ~134 milliseconds. If two events happen more than 134 milliseconds apart, a time extend is inserted to add more bits for the delta. The time extend has 59 bits, which is good for ~18 years. Currently the time extend is committed separately from the event. If an event is discarded before it is committed, due to filtering, the time extend still exists. If all events are being filtered, then after ~134 milliseconds a new time extend will be added to the buffer. This can only happen till the end of the page. Since each page holds a full timestamp, there is no reason to add a time extend to the beginning of a page. Time extends can only fill a page that has actual data at the beginning, so there is no fear that time extends will fill more than a page without any data. When reading an event, a loop is made to skip over time extends since they are only used to maintain the time stamp and are never given to the caller. As a paranoid check to prevent the loop running forever, with the knowledge that time extends may only fill a page, a check is made that tests the iteration of the loop, and if the iteration is more than the number of time extends that can fit in a page a warning is printed and the ring buffer is disabled (all of ftrace is also disabled with it). There is another event type that is called a TIMESTAMP which can hold 64 bits of data in the theoretical case that two events happen 18 years apart. This code has not been implemented, but the name of this event exists, as well as the structure for it. The size of a TIMESTAMP is 16 bytes, where as a time extend is only 8 bytes. The macro used to calculate how many time extends can fit on a page used the TIMESTAMP size instead of the time extend size cutting the amount in half. The following test case can easily trigger the warning since we only need to have half the page filled with time extends to trigger the warning: # cd /sys/kernel/debug/tracing/ # echo function > current_tracer # echo 'common_pid < 0' > events/ftrace/function/filter # echo > trace # echo 1 > trace_marker # sleep 120 # cat trace Enabling the function tracer and then setting the filter to only trace functions where the process id is negative (no events), then clearing the trace buffer to ensure that we have nothing in the buffer, then write to trace_marker to add an event to the beginning of a page, sleep for 2 minutes (only 35 seconds is probably needed, but this guarantees the bug), and then finally reading the trace which will trigger the bug. This patch fixes the typo and prevents the false positive of that warning. Reported-by: Hans J. Koch <hjk@linutronix.de> Tested-by: Hans J. Koch <hjk@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-12perf, MIPS: Support cross compiling of tools/perf for MIPSDeng-Cheng Zhu1-0/+12
Changes: v4: Fix the cosmetic issue of redundant dot-ops v3: Change rmb() to use SYNC v2: Include mips unistd.h and define rmb()/cpu_relax() in tools/perf/perf.h Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-12drm/radeon/kms: Silent spurious error messageJean Delvare1-4/+1
I see the following error message in my kernel log from time to time: radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait After investigation, it turns out that there's nothing to be afraid of and everything works as intended. So remove the spurious log message. Signed-off-by: Jean Delvare <khali@linux-fr.org> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-12drm/radeon/kms: fix bad cast/shift in evergreen.cAlex Deucher1-1/+1
Missing parens. fixes: https://bugs.freedesktop.org/show_bug.cgi?id=30718 Reported-by: Dave Gilbert <freedesktop@treblig.org> Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-12drm/radeon/kms: make TV/DFP table info less verboseAlex Deucher2-22/+22
Make TV standard and DFP table revisions debug only. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-12drm/radeon/kms: leave certain CP int bits enabledAlex Deucher2-2/+2
These bits are used for internal communication and should be left enabled. This may fix s/r issues on some systems. Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-12drm/radeon/kms: avoid corner case issue with unmappable vram V2Jerome Glisse9-1/+14
We should not allocate any object into unmappable vram if we have no means to access them which on all GPU means having the CP running and on newer GPU having the blit utility working. This patch limit the vram allocation to visible vram until we have acceleration up and running. Note that it's more than unlikely that we run into any issue related to that as when acceleration is not woring userspace should allocate any object in vram beside front buffer which should fit in visible vram. V2 use real_vram_size as mc_vram_size could be bigger than the actual amount of vram [airlied: fixup r700_cp_stop case] Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-12perf: Fix incorrect copy_from_user() usageJohn Blackwood1-3/+1
perf events: repair incorrect use of copy_from_user This makes the perf_event_period() return 0 instead of -EFAULT on success. Signed-off-by: John Blackwood<john.blackwood@ccur.com> Signed-off-by: Joe Korty <joe.korty@ccur.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100928220311.GA18145@tsunami.ccur.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-12fanotify: disable fanotify syscallsEric Paris2-2/+1
This patch disables the fanotify syscalls by just not building them and letting the cond_syscall() statements in kernel/sys_ni.c redirect them to sys_ni_syscall(). It was pointed out by Tvrtko Ursulin that the fanotify interface did not include an explicit prioritization between groups. This is necessary for fanotify to be usable for hierarchical storage management software, as they must get first access to the file, before inotify-like notifiers see the file. This feature can be added in an ABI compatible way in the next release (by using a number of bits in the flags field to carry the info) but it was suggested by Alan that maybe we should just hold off and do it in the next cycle, likely with an (new) explicit argument to the syscall. I don't like this approach best as I know people are already starting to use the current interface, but Alan is all wise and noone on list backed me up with just using what we have. I feel this is needlessly ripping the rug out from under people at the last minute, but if others think it needs to be a new argument it might be the best way forward. Three choices: Go with what we got (and implement the new feature next cycle). Add a new field right now (and implement the new feature next cycle). Wait till next cycle to release the ABI (and implement the new feature next cycle). This is number 3. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-12tg3: restore rx_dropped accountingEric Dumazet2-3/+5
commit 511d22247be7 (tg3: 64 bit stats on all arches), overlooked the rx_dropped accounting. We use a full "struct rtnl_link_stats64" to hold rx_dropped value, but forgot to report it in tg3_get_stats64(). Use an "unsigned long" instead to shrink "struct tg3" by 176 bytes, and report this value to stats readers. Increment rx_dropped counter for oversized frames. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Michael Chan <mchan@broadcom.com> CC: Matt Carlson <mcarlson@broadcom.com> Acked-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-12b44: fix carrier detection on bindPaul Fertser1-2/+2
For carrier detection to work properly when binding the driver with a cable unplugged, netif_carrier_off() should be called after register_netdev(), not before. Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-12x86, numa: For each node, register the memory blocks actually usedYinghai Lu1-3/+5
Russ reported SGI UV is broken recently. He said: | The SRAT table shows that memory range is spread over two nodes. | | SRAT: Node 0 PXM 0 100000000-800000000 | SRAT: Node 1 PXM 1 800000000-1000000000 | SRAT: Node 0 PXM 0 1000000000-1080000000 | |Previously, the kernel early_node_map[] would show three entries |with the proper node. | |[ 0.000000] 0: 0x00100000 -> 0x00800000 |[ 0.000000] 1: 0x00800000 -> 0x01000000 |[ 0.000000] 0: 0x01000000 -> 0x01080000 | |The problem is recent community kernel early_node_map[] shows |only two entries with the node 0 entry overlapping the node 1 |entry. | | 0: 0x00100000 -> 0x01080000 | 1: 0x00800000 -> 0x01000000 After looking at the changelog, Found out that it has been broken for a while by following commit |commit 8716273caef7f55f39fe4fc6c69c5f9f197f41f1 |Author: David Rientjes <rientjes@google.com> |Date: Fri Sep 25 15:20:04 2009 -0700 | | x86: Export srat physical topology Before that commit, register_active_regions() is called for every SRAT memory entry right away. Use nodememblk_range[] instead of nodes[] in order to make sure we capture the actual memory blocks registered with each node. nodes[] contains an extended range which spans all memory regions associated with a node, but that does not mean that all the memory in between are included. Reported-by: Russ Anderson <rja@sgi.com> Tested-by: Russ Anderson <rja@sgi.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4CB27BDF.5000800@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> 2.6.33 .34 .35 .36 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-11net: clear heap allocations for privileged ethtool actionsKees Cook1-3/+3
Several other ethtool functions leave heap uncleared (potentially) by drivers. Some interfaces appear safe (eeprom, etc), in that the sizes are well controlled. In some situations (e.g. unchecked error conditions), the heap will remain unchanged in areas before copying back to userspace. Note that these are less of an issue since these all require CAP_NET_ADMIN. Cc: stable@kernel.org Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11NET: wimax, fix use after freeJiri Slaby1-13/+13
Stanse found that i2400m_rx frees skb, but still uses skb->len even though it has skb_len defined. So use skb_len properly in the code. And also define it unsinged int rather than size_t to solve compilation warnings. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Cc: linux-wimax@intel.com Acked-by: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11ATM: iphase, remove sleep-inside-atomicJiri Slaby2-7/+1
Stanse found that ia_init_one locks a spinlock and inside of that it calls ia_start which calls: * request_irq * tx_init which does kmalloc(GFP_KERNEL) Both of them can thus sleep and result in a deadlock. I don't see a reason to have a per-device spinlock there which is used only there and inited right before the lock location. So remove it completely. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11ATM: mpc, fix use after freeJiri Slaby1-1/+1
Stanse found that mpc_push frees skb and then it dereferences it. It is a typo, new_skb should be dereferenced there. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11ATM: solos-pci, remove use after freeJiri Slaby1-3/+5
Stanse found we do in console_show: kfree_skb(skb); return skb->len; which is not good. Fix that by remembering the len and use it in the function instead. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Chas Williams <chas@cmf.nrl.navy.mil> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-11Merge branch 'rc-fixes' of ↵Linus Torvalds4-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: kbuild: fix oldnoconfig to do the right thing kconfig: Temporarily disable dependency warnings kconfig: delay symbol direct dependency initialization