summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-04-26Btrfs: fix missing mutex_unlock in btrfs_del_dir_entries_in_log()Tsutomu Itoh1-2/+5
It is necessary to unlock mutex_lock before it return an error when btrfs_alloc_path() fails. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-04-26eCryptfs: Add reference counting to lower filesTyler Hicks6-59/+92
For any given lower inode, eCryptfs keeps only one lower file open and multiplexes all eCryptfs file operations through that lower file. The lower file was considered "persistent" and stayed open from the first lookup through the lifetime of the inode. This patch keeps the notion of a single, per-inode lower file, but adds reference counting around the lower file so that it is closed when not currently in use. If the reference count is at 0 when an operation (such as open, create, etc.) needs to use the lower file, a new lower file is opened. Since the file is no longer persistent, all references to the term persistent file are changed to lower file. Locking is added around the sections of code that opens the lower file and assign the pointer in the inode info, as well as the code the fputs the lower file when all eCryptfs users are done with it. This patch is needed to fix issues, when mounted on top of the NFSv3 client, where the lower file is left silly renamed until the eCryptfs inode is destroyed. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-26eCryptfs: dput dentries returned from dget_parentTyler Hicks1-2/+2
Call dput on the dentries previously returned by dget_parent() in ecryptfs_rename(). This is needed for supported eCryptfs mounts on top of the NFSv3 client. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-26eCryptfs: Remove extra d_delete in ecryptfs_rmdirTyler Hicks1-2/+0
vfs_rmdir() already calls d_delete() on the lower dentry. That was being duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference when NFSv3 was the lower filesystem. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
2011-04-26x86, setup: When probing memory with e801, use ax/bx as a pairH. Peter Anvin1-1/+1
When we use BIOS function e801 to probe memory, we should use ax/bx (or cx/dx) as a pair, not mix and match. This was a typo during the translation from assembly code, and breaks at least one set of machines in the field (which return cx = dx = 0). Reported-and-tested-by: Chris Samuel <chris@csamuel.org> Fix-proposed-by: Thomas Meyer <thomas@m3y3r.de> Link: http://lkml.kernel.org/r/1303566747.12067.10.camel@localhost.localdomain
2011-04-24NFSv4: Ensure that clientid and session establishment can time outTrond Myklebust2-6/+8
The following patch ensures that we do not get permanently trapped in the RPC layer when trying to establish a new client id or session. This again ensures that the state manager can finish in a timely fashion when the last filesystem to reference the nfs_client exits. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24SUNRPC: Allow RPC calls to return ETIMEDOUT instead of EIOTrond Myklebust2-2/+6
On occasion, it is useful for the NFS layer to distinguish between soft timeouts and other EIO errors due to (say) encoding errors, or authentication errors. The following patch ensures that the default behaviour of the RPC layer remains to return EIO on soft timeouts (until we have audited all the callers). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24NFSv4.1: Don't loop forever in nfs4_proc_create_sessionTrond Myklebust4-53/+40
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end up looping forever inside nfs4_proc_create_session, and so the usual mechanisms for detecting if the nfs_client is dead don't work. Fix this by ensuring that we loop inside the nfs4_state_manager thread instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24[SCSI] pmcraid: reject negative request sizeDan Rosenberg1-0/+3
There's a code path in pmcraid that can be reached via device ioctl that causes all sorts of ugliness, including heap corruption or triggering the OOM killer due to consecutive allocation of large numbers of pages. Not especially relevant from a security perspective, since users must have CAP_SYS_ADMIN to open the character device. First, the user can call pmcraid_chr_ioctl() with a type PMCRAID_PASSTHROUGH_IOCTL. A pmcraid_passthrough_ioctl_buffer is copied in, and the request_size variable is set to buffer->ioarcb.data_transfer_length, which is an arbitrary 32-bit signed value provided by the user. If a negative value is provided here, bad things can happen. For example, pmcraid_build_passthrough_ioadls() is called with this request_size, which immediately calls pmcraid_alloc_sglist() with a negative size. The resulting math on allocating a scatter list can result in an overflow in the kzalloc() call (if num_elem is 0, the sglist will be smaller than expected), or if num_elem is unexpectedly large the subsequent loop will call alloc_pages() repeatedly, a high number of pages will be allocated and the OOM killer might be invoked. Prevent this value from being negative in pmcraid_ioctl_passthrough(). Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-24[SCSI] put stricter guards on queue dead checksJames Bottomley1-8/+8
SCSI uses request_queue->queuedata == NULL as a signal that the queue is dying. We set this state in the sdev release function. However, this allows a small window where we release the last reference but haven't quite got to this stage yet and so something will try to take a reference in scsi_request_fn and oops. It's very rare, but we had a report here, so we're pushing this as a bug fix The actual fix is to set request_queue->queuedata to NULL in scsi_remove_device() before we drop the reference. This causes correct automatic rejects from scsi_request_fn as people who hold additional references try to submit work and prevents anything from getting a new reference to the sdev that way. Cc: stable@kernel.org Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-24[SCSI] scsi_dh: fix reference counting in scsi_dh_activate error pathMike Snitzer1-3/+6
Commit db422318cbca55168cf965f655471dbf8be82433 ([SCSI] scsi_dh: propagate SCSI device deletion) introduced a regression where the device reference is not dropped prior to scsi_dh_activate's early return from the error path. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@kernel.org # 2.6.38 Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-24[SCSI] mpt2sas: prevent heap overflows and unchecked readsDan Rosenberg1-2/+21
At two points in handling device ioctls via /dev/mpt2ctl, user-supplied length values are used to copy data from userspace into heap buffers without bounds checking, allowing controllable heap corruption and subsequently privilege escalation. Additionally, user-supplied values are used to determine the size of a copy_to_user() as well as the offset into the buffer to be read, with no bounds checking, allowing users to read arbitrary kernel memory. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Acked-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-24Merge branch 'dcache-cleanup'Linus Torvalds2-28/+18
* dcache-cleanup: vfs: get rid of insane dentry hashing rules
2011-04-24Merge branch 'upstream-linus' of ↵Linus Torvalds8-11/+103
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: ahci_start_engine compliant to AHCI spec ata: pata_at91.c bugfix for initial_timing initialisation ata: pata_at91.c bugfix for high master clock ahci: AHCI-mode SATA patch for Intel Panther Point DeviceIDs ata_piix: IDE-mode SATA patch for Intel Panther Point DeviceIDs libata: Pioneer DVR-216D can't do SETXFER ahci: don't enable port irq before handler is registered libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65 libata: Kill unused ATA_DFLAG_{H|D}IPM flags ahci: EM supported message type sysfs attribute
2011-04-24Merge branch 'for-linus' of git://git.infradead.org/ubifs-2.6Linus Torvalds2-8/+47
* 'for-linus' of git://git.infradead.org/ubifs-2.6: UBIFS: fix master node recovery UBIFS: fix false assertion warning in case of I/O failures UBIFS: fix false space checking failure
2011-04-24libata: ahci_start_engine compliant to AHCI specJian Peng1-0/+21
At the end of section 10.1 of AHCI spec (rev 1.3), it states Software shall not set PxCMD.ST to 1 until it is determined that a functoinal device is present on the port as determined by PxTFD.STS.BSY=0, PxTFD.STS.DRQ=0 and PxSSTS.DET=3h Even though most AHCI host controller works without this check, specific controller will fail under this condition. Signed-off-by: Jian Peng <jipeng2005@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ata: pata_at91.c bugfix for initial_timing initialisationIgor Plyatov1-2/+12
The "struct ata_timing" must contain 10 members, but ".dmack_hold" member was forgotten for "initial_timing" initialisation. This patch fixes such a problem. Signed-off-by: Igor Plyatov <plyatov@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ata: pata_at91.c bugfix for high master clockIgor Plyatov1-1/+7
The AT91SAM9 microcontrollers with master clock higher then 105 MHz and PIO0, have overflow of the NCS_RD_PULSE value in the MSB. This lead to "NCS_RD_PULSE" pulse longer then "NRD_CYCLE" pulse and driver does not detect ATA device. Signed-off-by: Igor Plyatov <plyatov@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ahci: AHCI-mode SATA patch for Intel Panther Point DeviceIDsSeth Heasley1-0/+6
The previously submitted patch was word-wrapped. This patch adds the AHCI-mode SATA DeviceIDs for the Intel Panther Point PCH. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ata_piix: IDE-mode SATA patch for Intel Panther Point DeviceIDsSeth Heasley1-0/+8
The previously submitted patch was word-wrapped. This patch adds the IDE-mode SATA DeviceIDs for the Intel Panther Point PCH. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24libata: Pioneer DVR-216D can't do SETXFERJeff Mahoney1-0/+1
Commit 4a5610a04d415ed94af75bb1159d2621d62c8328 fixed an issue with the Pioneer DVR-212D not handling SETXFER correctly. An openSUSE user reported a similar issue with his DVR-216D that the NOSETXFER horkage worked around for him as well. This patch adds the DVR-216D (1.08) to the horkage list for NOSETXFER. The issue was reported at: https://bugzilla.novell.com/show_bug.cgi?id=679143 Reported-by: Volodymyr Kyrychenko <vladimir.kirichenko@gmail.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ahci: don't enable port irq before handler is registeredMaxime Bizon2-3/+16
The ahci_pmp_attach() & ahci_pmp_detach() unmask port irqs, but they are also called during port initialization, before ahci host irq handler is registered. On ce4100 platform, this sometimes triggers "irq 4: nobody cared" message when loading driver. Fixed this by not touching the register if the port is in frozen state, and mark all uninitialized port as frozen. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65Tejun Heo3-3/+6
NVIDIA mcp65 familiy of controllers cause command timeouts when DIPM is used. Implement ATA_FLAG_NO_DIPM and apply it. This problem was reported by Stefan Bader in the following thread. http://thread.gmane.org/gmane.linux.ide/48841 stable: applicable to 2.6.37 and 38. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stefan Bader <stefan.bader@canonical.com> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24libata: Kill unused ATA_DFLAG_{H|D}IPM flagsTejun Heo1-2/+0
ATA_DFLAG_{H|D}IPM flags are no longer used. Kill them. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24ahci: EM supported message type sysfs attributeHannes Reinecke2-0/+26
This patch adds an sysfs attribute 'em_message_supported' to the ahci host device which prints out the supported enclosure management message types. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2011-04-24kconfig: Avoid buffer underrun in choice inputBen Hutchings1-1/+1
Commit 40aee729b350 ('kconfig: fix default value for choice input') fixed some cases where kconfig would select the wrong option from a choice with a single valid option and thus enter an infinite loop. However, this broke the test for user input of the form 'N?', because when kconfig selects the single valid option the input is zero-length and the test will read the byte before the input buffer. If this happens to contain '?' (as it will in a mips build on Debian unstable today) then kconfig again enters an infinite loop. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: stable@kernel.org [2.6.17+] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-24vfs: get rid of insane dentry hashing rulesLinus Torvalds2-28/+18
The dentry hashing rules have been really quite complicated for a long while, in odd ways. That made functions like __d_drop() very fragile and non-obvious. In particular, whether a dentry was hashed or not was indicated with an explicit DCACHE_UNHASHED bit. That's despite the fact that the hash abstraction that the dentries use actually have a 'is this entry hashed or not' model (which is a simple test of the 'pprev' pointer). The reason that was done is because we used the normal 'is this entry unhashed' model to mark whether the dentry had _ever_ been hashed in the dentry hash tables, and that logic goes back many years (commit b3423415fbc2: "dcache: avoid RCU for never-hashed dentries"). That, in turn, meant that __d_drop had totally different unhashing logic for the dentry hash table case and for the anonymous dcache case, because in order to use the "is this dentry hashed" logic as a flag for whether it had ever been on the RCU hash table, we had to unhash such a dentry differently so that we'd never think that it wasn't 'unhashed' and wouldn't be free'd correctly. That's just insane. It made the logic really hard to follow, when there were two different kinds of "unhashed" states, and one of them (the one that used "list_bl_unhashed()") really had nothing at all to do with being unhashed per se, but with a very subtle lifetime rule instead. So turn all of it around, and make it logical. Instead of having a DENTRY_UNHASHED bit in d_flags to indicate whether the dentry is on the hash chains or not, use the hash chain unhashed logic for that. Suddenly "d_unhashed()" just uses "list_bl_unhashed()", and everything makes sense. And for the lifetime rule, just use an explicit DENTRY_RCUACCEES bit. If we ever insert the dentry into the dentry hash table so that it is visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now needs the RCU lifetime rules. Now suddently that test at dentry free time makes sense too. And because unhashing now is sane and doesn't depend on where the dentry got unhashed from (because the dentry hash chain details doesn't have some subtle side effects), we can re-unify the __d_drop() logic and use common code for the unhashing. Also fix one more open-coded hash chain bit_spin_lock() that I missed in the previous chain locking cleanup commit. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-24Merge branch 'pm-fixes' of ↵Linus Torvalds6-4/+34
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM: Add missing syscore_suspend() and syscore_resume() calls PM: Fix error code paths executed after failing syscore_suspend()
2011-04-24vfs: get rid of 'struct dcache_hash_bucket' abstractionLinus Torvalds1-24/+21
It's a useless abstraction for 'hlist_bl_head', and it doesn't actually help anything - quite the reverse. All the users end up having to know about the hlist_bl_head details anyway, using 'struct hlist_bl_node *' etc. So it just makes the code look confusing. And the cost of it is extra '&b->head' syntactic noise, but more importantly it spuriously makes the hash table dentry list look different from the per-superblock DCACHE_DISCONNECTED dentry list. As a result, the code ended up using ad-hoc locking for one case and special helper functions for what is really another totally identical case in the very same function. Make it all look and work the same. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-23Merge branch 'tty-linus' of ↵Linus Torvalds3-8/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 * 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: tty/n_gsm: fix bug in CRC calculation for gsm1 mode serial/imx: read cts state only after acking cts change irq parport_pc.c: correctly release the requested region for the IT887x
2011-04-23SECURITY: Move exec_permission RCU checks into security modulesAndi Kleen5-8/+14
Right now all RCU walks fall back to reference walk when CONFIG_SECURITY is enabled, even though just the standard capability module is active. This is because security_inode_exec_permission unconditionally fails RCU walks. Move this decision to the low level security module. This requires passing the RCU flags down the security hook. This way at least the capability module and a few easy cases in selinux/smack work with RCU walks with CONFIG_SECURITY=y Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-23Merge branch 'for-linus' of ↵Linus Torvalds15-40/+110
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Fix unused warnings when !SND_HDA_NEEDS_RESUME ALSA: hda - Add a fix-up for Acer dmic with ALC271x codec ASoC: add a module alias to the FSI driver ALSA: emu10k1 - Fix "Music" controls to "Synth" controls in documents ARM: s3c2440: gta02; Register dfbmcs320 device for BT audio interface ASoC: codecs: JZ4740: Fix OOPS ASoC: Fix output PGA enabling in wm_hubs CODECs ASoC: sn95031: decorate function with __devexit_p() ASoC: SAMSUNG: Fix the inverted clocks handling for pcm driver ASoC: sst_platform: Fix lock acquring ASoC: fsi: driver safely remove for against irq ASoC: fsi: modify vague PM control on probe ASoC: fsi: take care in failing case of dai register MAINTAINERS: Update Samsung ASoC maintainer's id ASoC: WM8903: HP and Line out PGA/mixer DAPM fixes ASoC: Set left channel volume update bits for WM8994 ASoC: fix config error path ASoC: check channel mismatch between cpu_dai and codec_dai ASoC: Tegra: Suspend/resume support
2011-04-23[PARISC] slub: fix panic with DISCONTIGMEMJames Bottomley1-0/+1
Slub makes assumptions about page_to_nid() which are violated by DISCONTIGMEM and !NUMA. This violation results in a panic because page_to_nid() can be non-zero for pages in the discontiguous ranges and this leads to a null return by get_node(). The assertion by the maintainer is that DISCONTIGMEM should only be allowed when NUMA is also defined. However, at least six architectures: alpha, ia64, m32r, m68k, mips, parisc violate this. The panic is a regression against slab, so just mark slub broken in the problem configuration to prevent users reporting these panics. Cc: stable@kernel.org Acked-by: David Rientjes <rientjes@google.com> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-04-22Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds3-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, x86: Update/fix Intel Nehalem cache events perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow x86, perf event: Turn off unstructured raw event access to offcore registers perf: Support Xeon E7's via the Westmere PMU driver
2011-04-22Merge branch 'irq-fixes-for-linus' of ↵Linus Torvalds1-12/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xtensa: Fixup irq conversion fallout and nmi_count
2011-04-22perf, x86: Update/fix Intel Nehalem cache eventsPeter Zijlstra1-4/+4
Change the Nehalem cache events to use retired memory instruction counters (similar to Westmere), this greatly improves the provided stats. Using: main () { int i; for (i = 0; i < 1000000000; i++) { asm("mov (%%rsp), %%rbx;" "mov %%rbx, (%%rsp);" : : : "rbx"); } } We find: $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores Performance counter stats for './loop_1b_loads+stores' (10 runs): 4,000,081,056 instructions:u # 0.000 IPC ( +- 0.000% ) 4,999,502,846 l1-dcache-loads:u ( +- 0.008% ) 1,000,034,832 l1-dcache-stores:u ( +- 0.000% ) 1.565184942 seconds time elapsed ( +- 0.005% ) The 5b is surprising - we'd expect 1b: $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores Performance counter stats for './loop_1b_loads+stores' (10 runs): 4,000,081,054 instructions:u # 0.000 IPC ( +- 0.000% ) 1,000,021,961 r10b:u ( +- 0.000% ) 1,000,030,951 l1-dcache-stores:u ( +- 0.000% ) 1.565055422 seconds time elapsed ( +- 0.003% ) Which this patch thus fixes. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stephane Eranian <eranian@google.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-22perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflowCyrill Gorcunov1-1/+1
It's not enough to simply disable event on overflow the cpuc->active_mask should be cleared as well otherwise counter may stall in "active" even in real being already disabled (which potentially may lead to the situation that user may not use this counter further). Don pointed out that: " I also noticed this patch fixed some unknown NMIs on a P4 when I stressed the box". Tested-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-22x86, perf event: Turn off unstructured raw event access to offcore registersIngo Molnar1-1/+5
Andi Kleen pointed out that the Intel offcore support patches were merged without user-space tool support to the functionality: | | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the | user space bits were not. This made it impossible to set the extra mask | and actually do the OFFCORE profiling | Andi submitted a preliminary patch for user-space support, as an extension to perf's raw event syntax: | | Some raw events -- like the Intel OFFCORE events -- support additional | parameters. These can be appended after a ':'. | | For example on a multi socket Intel Nehalem: | | perf stat -e r1b7:20ff -a sleep 1 | | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0 | that measures any access to DRAM on another socket. | But this kind of usability is absolutely unacceptable - users should not be expected to type in magic, CPU and model specific incantations to get access to useful hardware functionality. The proper solution is to expose useful offcore functionality via generalized events - that way users do not have to care which specific CPU model they are using, they can use the conceptual event and not some model specific quirky hexa number. We already have such generalization in place for CPU cache events, and it's all very extensible. "Offcore" events measure general DRAM access patters along various parameters. They are particularly useful in NUMA systems. We want to support them via generalized DRAM events: either as the fourth level of cache (after the last-level cache), or as a separate generalization category. That way user-space support would be very obvious, memory access profiling could be done via self-explanatory commands like: perf record -e dram ./myapp perf record -e dram-remote ./myapp ... to measure DRAM accesses or more expensive cross-node NUMA DRAM accesses. These generalized events would work on all CPUs and architectures that have comparable PMU features. ( Note, these are just examples: actual implementation could have more sophistication and more parameter - as long as they center around similarly simple usecases. ) Now we do not want to revert *all* of the current offcore bits, as they are still somewhat useful for generic last-level-cache events, implemented in this commit: e994d7d23a0b: perf: Fix LLC-* events on Intel Nehalem/Westmere But we definitely do not yet want to expose the unstructured raw events to user-space, until better generalization and usability is implemented for these hardware event features. ( Note: after generalization has been implemented raw offcore events can be supported as well: there can always be an odd event that is marginally useful but not useful enough to generalize. DRAM profiling is definitely *not* such a category so generalization must be done first. ) Furthermore, PERF_TYPE_RAW access to these registers was not intended to go upstream without proper support - it was a side-effect of the above e994d7d23a0b commit, not mentioned in the changelog. As v2.6.39 is nearing release we go for the simplest approach: disable the PERF_TYPE_RAW offcore hack for now, before it escapes into a released kernel and becomes an ABI. Once proper structure is implemented for these hardware events and users are offered usable solutions we can revisit this issue. Reported-by: Andi Kleen <ak@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-22perf: Support Xeon E7's via the Westmere PMU driverAndi Kleen1-0/+1
There's a new model number public, 47, for Xeon E7 (aka Westmere EX). Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-21Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds5-5/+20
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cd block: don't propagate unlisted DISK_EVENTs to userland elevator: check for ELEVATOR_INSERT_SORT_MERGE in !elvpriv case too
2011-04-21ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cdTejun Heo3-2/+12
check_events() implementations in both ide-gd and ide-cd are inadequate for in-kernel event polling. Both generate media change events continuously when certain conditions are met causing infinite event loop between the driver and userland event handler. As disk event now supports suppression of unlisted events, simply de-listing DISK_EVENT_MEDIA_CHANGE from disk->events resolves the problem. Internal handling around media revalidation will behave the same while userland will fall back to userland event polling after detecting the device doesn't support disk events. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jens Axboe <jaxboe@fusionio.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-21block: don't propagate unlisted DISK_EVENTs to userlandTejun Heo1-2/+6
DISK_EVENT_MEDIA_CHANGE is used for both userland visible event and internal event for revalidation of removeable devices. Some legacy drivers don't implement proper event detection and continuously generate events under certain circumstances. For example, ide-cd generates media changed continuously if there's no media in the drive, which can lead to infinite loop of events jumping back and forth between the driver and userland event handler. This patch updates disk event infrastructure such that it never propagates events not listed in disk->events to userland. Those events are processed the same for internal purposes but uevent generation is suppressed. This also ensures that userland only gets events which are advertised in the @events sysfs node lowering risk of confusion. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-21elevator: check for ELEVATOR_INSERT_SORT_MERGE in !elvpriv case tooJens Axboe1-1/+2
The sort insert is the one that goes to the IO scheduler. With the SORT_MERGE addition, we could bypass IO scheduler setup but still ask the IO scheduler to insert the request. This would cause an oops on switching IO schedulers through the sysfs interface, unless the disk just happened to be idle while it occured. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-04-21CIFS: Fix memory over bound bug in cifs_parse_mount_optionsPavel Shilovsky1-2/+3
While password processing we can get out of options array bound if the next character after array is delimiter. The patch adds a check if we reach the end. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-04-21Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds1-1/+3
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix duplicate message output
2011-04-21Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds4-56/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS Revert "x86, NUMA: Fix fakenuma boot failure"
2011-04-21raid5: fix build error, sector_t usageRandy Dunlap1-1/+1
Change <sectors> from unsigned long long to sector_t. This matches its source field. ERROR: "__udivdi3" [drivers/md/raid456.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds3-18/+9
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: console: Enable call to hvc_remove() on console port remove virtio_pci: Prevent double-free of pci regions after device hot-unplug virtio: Decrement avail idx on buffer detach
2011-04-21Merge branch 'drm-fixes' of ↵Linus Torvalds3-6/+17
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: agp: fix arbitrary kernel memory writes agp: fix OOM and buffer overflow drm/radeon/kms: fix IH writeback on r6xx+ on big endian machines
2011-04-21Merge branch 'drm-intel-fixes' of ↵Linus Torvalds2-37/+45
git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6 * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: drm/i915: Initialise g4x watermarks for disabled pipes drm/i915: Sanitize the output registers after resume drm/i915/tv: Fix modeset flickering introduced in 7f58aabc3 drm/i915/tv: Only poll for TV connections drm/i915/tv: Remember the detected TV type