summaryrefslogtreecommitdiff
path: root/drivers/pci/pci-sysfs.c
AgeCommit message (Collapse)AuthorFilesLines
2011-03-31Fix common misspellingsLucas De Marchi1-1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-24userns: security: make capabilities relative to the user namespaceSerge E. Hallyn1-1/+1
- Introduce ns_capable to test for a capability in a non-default user namespace. - Teach cap_capable to handle capabilities in a non-default user namespace. The motivation is to get to the unprivileged creation of new namespaces. It looks like this gets us 90% of the way there, with only potential uid confusion issues left. I still need to handle getting all caps after creation but otherwise I think I have a good starter patch that achieves all of your goals. Changelog: 11/05/2010: [serge] add apparmor 12/14/2010: [serge] fix capabilities to created user namespaces Without this, if user serge creates a user_ns, he won't have capabilities to the user_ns he created. THis is because we were first checking whether his effective caps had the caps he needed and returning -EPERM if not, and THEN checking whether he was the creator. Reverse those checks. 12/16/2010: [serge] security_real_capable needs ns argument in !security case 01/11/2011: [serge] add task_ns_capable helper 01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion 02/16/2011: [serge] fix a logic bug: the root user is always creator of init_user_ns, but should not always have capabilities to it! Fix the check in cap_capable(). 02/21/2011: Add the required user_ns parameter to security_capable, fixing a compile failure. 02/23/2011: Convert some macros to functions as per akpm comments. Some couldn't be converted because we can't easily forward-declare them (they are inline if !SECURITY, extern if SECURITY). Add a current_user_ns function so we can use it in capability.h without #including cred.h. Move all forward declarations together to the top of the #ifdef __KERNEL__ section, and use kernel-doc format. 02/23/2011: Per dhowells, clean up comment in cap_capable(). 02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable. (Original written and signed off by Eric; latest, modified version acked by him) [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: export current_user_ns() for ecryptfs] [serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-18Merge branch 'linux-next' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: label: remove #include of ACPI header to avoid warnings PCI: label: Fix compilation error when CONFIG_ACPI is unset PCI: pre-allocate additional resources to devices only after successful allocation of essential resources. PCI: introduce reset_resource() PCI: data structure agnostic free list function PCI: refactor io size calculation code PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH PCI hotplug: acpiphp: set current_state to D0 in register_slot PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs PCI: add more checking to ICH region quirks PCI: aer-inject: Override PCIe AER Mask Registers PCI: fix tlan build when CONFIG_PCI is not enabled PCI: remove quirk for pre-production systems PCI: Avoid potential NULL pointer dereference in pci_scan_bridge PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs PCI: sysfs: Fix failure path for addition of "vpd" attribute
2011-02-15pci: use security_capable() when checking capablities during config space readChris Wright1-1/+2
This reintroduces commit 47970b1b which was subsequently reverted as f00eaeea. The original change was broken and caused X startup failures and generally made privileged processes incapable of reading device dependent config space. The normal capable() interface returns true on success, but the LSM interface returns 0 on success. This thinko is now fixed in this patch, and has been confirmed to work properly. So, once again...Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file open to read device dependent config space") caused the capability check to bypass security modules and potentially auditing. Rectify this by calling security_capable() when checking the open file's capabilities for config space reads. Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: James Morris <jmorris@namei.org> Cc: Dave Airlie <airlied@gmail.com> Cc: Alex Riesen <raa.lkml@gmail.com> Cc: Sedat Dilek <sedat.dilek@googlemail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: James Morris <jmorris@namei.org>
2011-02-13Revert "pci: use security_capable() when checking capablities during config ↵Linus Torvalds1-2/+1
space read" This reverts commit 47970b1b2aa64464bc0a9543e86361a622ae7c03. It turns out it breaks several distributions. Looks like the stricter selinux checks fail due to selinux policies not being set to allow the access - breaking X, but also lspci. So while the change was clearly the RightThing(tm) to do in theory, in practice we have backwards compatibility issues making it not work. Reported-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: David Airlie <airlied@linux.ie> Acked-by: Alex Riesen <raa.lkml@gmail.com> Cc: Eric Paris <eparis@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-11pci: use security_capable() when checking capablities during config space readChris Wright1-1/+2
Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file open to read device dependent config space") caused the capability check to bypass security modules and potentially auditing. Rectify this by calling security_capable() when checking the open file's capabilities for config space reads. Reported-by: Eric Paris <eparis@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: James Morris <jmorris@namei.org>
2011-02-08PCI: sysfs: Fix failure path for addition of "vpd" attributeBen Hutchings1-1/+1
Commit 280c73d ("PCI: centralize the capabilities code in pci-sysfs.c") changed the initialisation of the "rom" and "vpd" attributes, and made the failure path for the "vpd" attribute incorrect. We must free the new attribute structure (attr), but instead we currently free dev->vpd->attr. That will normally be NULL, resulting in a memory leak, but it might be a stale pointer, resulting in a double-free. Found by inspection; compile-tested only. Cc: stable@kernel.org Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14PCI: sysfs: Update ROM to include default owner write accessAlex Williamson1-1/+1
The PCI sysfs ROM interface requires an enabling write to access the ROM image, but the default file mode is 0400. The original proposed patch adding sysfs ROM support was a true read-only interface, with the enabling bit coming in as a feature request. I suspect it was simply an oversight that the file mode didn't get updated to match the API. Acked-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-16PCI: fix offset check for sysfs mmapped filesDarrick J. Wong1-1/+1
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts. Running an strace of the X server shows that it's doing this: open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10 mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument) This code seems to be asking for a shared read/write mapping of 16MB worth of BAR0 starting at file offset 0, and letting the kernel assign a starting address. Unfortunately, this -EINVAL causes X not to start. Looking into dmesg, there's a complaint like so: process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000) ...with the following code in pci_mmap_fits: pci_start = (mmap_api == PCI_MMAP_SYSFS) ? pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0; if (start >= pci_start && start < pci_start + size && start + nr <= pci_start + size) It looks like the logic here is set up such that when the mmap call comes via sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the resource's start and end address, and the end of the vma to be no farther than the end. However, the sysfs PCI resource files always start at offset zero, which means that this test always fails for programs that mmap the sysfs files. Given the comment in the original commit 3b519e4ea618b6943a82931630872907f9ac2c2b, I _think_ the old procfs files require that the file offset be equal to the resource's base address when mmapping. I think what we want here is for pci_start to be 0 when mmap_api == PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox and Mach64 X drivers work again. Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com> Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-15PCI: sysfs: fix printk warningsRandy Dunlap1-1/+2
Cast pci_resource_start() and pci_resource_len() to u64 for printk. drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t' drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-11PCI: fix size checks for mmap() on /proc/bus/pci filesMartin Wilck1-6/+16
The checks for valid mmaps of PCI resources made through /proc/bus/pci files that were introduced in 9eff02e2042f96fb2aedd02e032eca1c5333d767 have several problems: 1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0, whereas under /sys/bus/pci/devices, the start of the resource corresponds to offset 0. This may lead to false negatives in pci_mmap_fits(), which implicitly assumes the /sys/bus/pci/devices layout. 2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads to false positives, because pci_mmap_fits() doesn't treat empty resources correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT) in this case!). 3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus WARNINGS for the first resources that don't fit until the correct one is found. On many controllers the first 2-4 BARs are used, and the others are empty. In this case, an mmap attempt will first fail on the non-empty BARs (including the "right" BAR because of 1.) and emit bogus WARNINGS because of 3., and finally succeed on the first empty BAR because of 2. This is certainly not the intended behaviour. This patch addresses all 3 issues. Updated with an enum type for the additional parameter for pci_mmap_fits(). Cc: stable@kernel.org Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: export SMBIOS provided firmware instance and label to sysfsNarendra K1-0/+5
This patch exports SMBIOS provided firmware instance and label of onboard PCI devices to sysfs. New files are: /sys/bus/pci/devices/.../label which contains the firmware name for the device in question, and /sys/bus/pci/devices/.../index which contains the firmware device type instance for the given device. Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: Allow read/write access to sysfs I/O port resourcesAlex Williamson1-0/+68
PCI sysfs resource files currently only allow mmap'ing. On x86 this works fine for memory backed BARs, but doesn't work at all for I/O port backed BARs. Add read/write to I/O port PCI sysfs resource files to allow userspace access to these device regions. Acked-by: Chris Wright <chrisw@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: pci-sysfs: remove casts from void*Kulikov Vasiliy1-1/+1
Remove unnesessary casts from void*. Signed-off-by: Kulikov Vasiliy <segooon@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-06-12Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"Jesse Barnes1-37/+0
This reverts commit 75568f8094eb0333e9c2109b23cbc8b82d318a3c. Since they're just a convenience anyway, remove these symlinks since they're causing duplicate filename errors in the wild. Acked-by: Alex Chiang <achiang@canonical.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-22Merge branch 'linux-next' of ↵Linus Torvalds1-1/+43
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (36 commits) PCI: hotplug: pciehp: Removed check for hotplug of display devices PCI: read memory ranges out of Broadcom CNB20LE host bridge PCI: Allow manual resource allocation for PCI hotplug bridges x86/PCI: make ACPI MCFG reserved error messages ACPI specific PCI hotplug: Use kmemdup PM/PCI: Update PCI power management documentation PCI: output FW warning in pci_read/write_vpd PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments PCI quirks: disable msi on AMD rs4xx internal gfx bridges PCI: Disable MSI for MCP55 on P5N32-E SLI x86/PCI: irq and pci_ids patch for additional Intel Cougar Point DeviceIDs PCI: aerdrv: trivial cleanup for aerdrv_core.c PCI: aerdrv: trivial cleanup for aerdrv.c PCI: aerdrv: introduce default_downstream_reset_link PCI: aerdrv: rework find_aer_service PCI: aerdrv: remove is_downstream PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS PCI: aerdrv: redefine PCI_ERR_ROOT_*_SRC PCI: aerdrv: rework do_recovery PCI: aerdrv: rework get_e_source() ...
2010-05-21pci: check caps from sysfs file open to read device dependent config spaceChris Wright1-1/+2
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN check to verify privileges before allowing a user to read device dependent config space. This is meant to protect from an unprivileged user potentially locking up the box. When assigning a PCI device directly to a guest with libvirt and KVM, the sysfs config space file is chown'd to the unprivileged user that the KVM guest will run as. The guest needs to have full access to the device's config space since it's responsible for driving the device. However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN check will not allow read access beyond the config header. With this patch we check privileges against the capabilities used when openining the sysfs file. The allows a privileged process to open the file and hand it to an unprivileged process, and the unprivileged process can still read all of the config space. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21sysfs: add struct file* to bin_attr callbacksChris Wright1-12/+30
This allows bin_attr->read,write,mmap callbacks to check file specific data (such as inode owner) as part of any privilege validation. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-11PCI: return correct value when writing to the "reset" attributeMichal Schmidt1-1/+6
A successful write() to the "reset" sysfs attribute should return the number of bytes written, not 0. Otherwise userspace (bash) retries the write over and over again. Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-11PCI: create function symlinks in /sys/bus/pci/slots/N/Alex Chiang1-0/+37
Create convenience symlinks in sysfs, linking slots to device functions, and vice versa. These links make it easier for users to figure out which devices actually live in what slots. For example: sapphire:/sys/bus/pci/slots # ls 1 10 2 3 4 5 6 7 8 9 sapphire:/sys/bus/pci/slots # ls -l 3 total 0 -r--r--r-- 1 root root 65536 Aug 18 14:10 address lrwxrwxrwx 1 root root 0 Aug 18 14:10 function0 -> ../../../../devices/pci0000:23/0000:23:01.0 lrwxrwxrwx 1 root root 0 Aug 18 14:10 function1 -> ../../../../devices/pci0000:23/0000:23:01.1 sapphire:/sys/bus/pci/slots # ls -l 3/function0/slot lrwxrwxrwx 1 root root 0 Aug 18 14:13 3/function0/slot -> ../../../bus/pci/slots/3 The original form of this patch was written by Matthew Wilcox, and was enhanced to include links from the sysfs slots/ directory pointing back at the device functions. Cc: willy@linux.intel.com Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-19sysfs: Initialised pci bus legacy_mem field before useMel Gorman1-1/+1
PPC64 is failing to boot the latest mmotm due to an uninitialised pointer in pci_create_legacy_files(). The surprise is that machines boot at all and it would appear to affect current mainline as well. This patch fixes the problem. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-08sysfs: fix for thinko with sysfs_bin_attr_init()Stephen Rothwell1-2/+2
After merging the final tree, today's linux-next build (powerpc allyesconfig) failed like this: drivers/pci/pci-sysfs.c: In function 'pci_create_legacy_files': drivers/pci/pci-sysfs.c:645: error: lvalue required as unary '&' operand drivers/pci/pci-sysfs.c:658: error: lvalue required as unary '&' operand Caused by commit "sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributes" interacting with commit "sysfs: Use one lockdep class per sysfs attribute") both from the driver-core tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-08sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributesEric W. Biederman1-0/+5
These are the non-static sysfs attributes that exist on my test machine. Fix them to use sysfs_attr_init or sysfs_bin_attr_init as appropriate. It simply requires making a sysfs attribute present to see this. So this is a little bit tedious but otherwise not too bad. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-01-05PCI: Check the node argument passed to cpumask_of_nodeDavid John1-2/+4
Commit e0cd516 "PCI: derive nearby CPUs from device's instead of bus' NUMA information" causes an null pointer dereference when reading from the sysfs attributes local_cpu* on Intel machines with no ACPI NUMA proximity info, since dev->numa_node gets set to -1 for all PCI devices, which then gets passed to cpumask_of_node. Add a check to prevent this. Signed-off-by: David John <davidjon@xenontk.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-05PCI: show dma_mask bits in /sysYinghai Lu1-0/+17
So we can catch if the driver sets an incorrect dma_mask. Reviewed-by: Grant Grundler <grundler@google.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-07PCI: derive nearby CPUs from device's instead of bus' NUMA informationAndreas Herrmann1-0/+8
In case of AMD CPU northbridge functions this NUMA information might differ. Here is an example from a 4-socket system. Currently Linux shows root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node 0 root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu* 0-3 00000000,0000000f which is not correct for northbridge functions as the local CPUs are those of the same socket. With this patch and a quirk for AMD CPU NB functions Linux can do better and correctly show root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node 2 root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu* 8-11 00000000,00000f00 Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: expose function reset capability in sysfsMichael S. Tsirkin1-0/+37
Some devices allow an individual function to be reset without affecting other functions in the same device: that's what pci_reset_function does. For devices that have this support, expose reset attribite in sysfs. This is useful e.g. for virtualization, where a qemu userspace process wants to reset the device when the guest is reset, to emulate machine reboot as closely as possible. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-23docbooks: add/fix PCI kernel-docRandy Dunlap1-4/+8
Add drivers/pci/*.c source files to DocBook/kernel-api.tmpl and update those pci/*.c source files that need kernel-doc fixes. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06PCI: allow PCI core hotplug to remove PCI root busAlex Chiang1-4/+0
There is no reason to prevent removal of root bus devices. A subsequent rescan will find them just fine. Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06PCI: Setup disabled bridges even if buses are addedYuji Shimada1-1/+1
This patch sets up disabled bridges even if buses have already been added. pci_assign_unassigned_resources is called after buses are added. pci_assign_unassigned_resources calls pci_bus_assign_resources. pci_bus_assign_resources calls pci_setup_bridge to configure BARs of bridges. Currently pci_setup_bridge returns immediately if the bus have already been added. So pci_assign_unassigned_resources can't configure BARs of bridges that were added in a disabled state; this patch fixes the issue. On logical hot-add, we need to prevent the kernel from re-initializing bridges that have already been initialized. To achieve this, pci_setup_bridge returns immediately if the bridge have already been enabled. We don't need to check whether the specified bus is a root bus or not. pci_setup_bridge is not called on a root bus, because a root bus does not have a bridge. The patch adds a new helper function, pci_is_enabled. I made the function name similar to pci_is_managed. The codes which use enable_cnt directly are changed to use pci_is_enabled. Acked-by: Alex Chiang <achiang@hp.com> Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-21PCI: Introduce /sys/bus/pci/devices/.../rescanAlex Chiang1-0/+19
This interface allows the user to force a rescan of the device's parent bus and all subordinate buses, and rediscover devices removed earlier from this part of the device tree. Cc: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-21PCI: Introduce /sys/bus/pci/devices/.../removeAlex Chiang1-0/+36
This patch adds an attribute named "remove" to a PCI device's sysfs directory. Writing a non-zero value to this attribute will remove the PCI device and any children of it. Trent Piepho wrote the original implementation and documentation. Thanks to Vegard Nossum for testing under kmemcheck and finding locking issues with the sysfs interface. Cc: Trent Piepho <xyzzy@speakeasy.org> Tested-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-21PCI: Introduce /sys/bus/pci/rescanAlex Chiang1-0/+26
This interface allows the user to force a rescan of all PCI buses in system, and rediscover devices that have been removed earlier. pci_bus_attrs implementation from Trent Piepho. Thanks to Vegard Nossum for discovering locking issues with the sysfs interface. Cc: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI: expose boot VGA device via sysfs.Dave Airlie1-2/+22
X really would like to know which VGA device was considered the boot device by the system. The x86 PCI fixups have support for discovering this but we provide no way to expose it to userspace. This adds a sysfs file per VGA class device which has the value 0 for non the boot device or unknown, and 1 if the VGA device is the boot device. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20PCI/alpha: pci sysfs resourcesIvan Kokshaysky1-2/+17
This closes http://bugzilla.kernel.org/show_bug.cgi?id=10893 which is a showstopper for X development on alpha. The generic HAVE_PCI_MMAP code (drivers/pci-sysfs.c) is not very useful since we have to deal with three different types of MMIO address spaces: sparse and dense mappings for old ev4/ev5 machines and "normal" 1:1 MMIO space (bwx) for ev56 and later. Also "write combine" mappings are meaningless on alpha - roughly speaking, alpha does write combining, IO reordering and other optimizations by default, unless user splits IO accesses with memory barriers. I think the cleanest way to deal with resource files on alpha is to convert the default no-op pci_create_resource_files() and pci_remove_resource_files() for !HAVE_PCI_MMAP case into __weak functions and override them with alpha specific ones. Another alpha hook is needed for "legacy_" resource files to handle sparse addressing (pci_adjust_legacy_attr). With the "standard" resourceN files on ev56/ev6 libpciaccess works "out of the box". Handling of resourceN_sparse/resourceN_dense files on older machines obviously requires some userland work. Sparse/dense stuff has been tested on sx164 (pca56/pyxis, normally uses bwx IO) with the kernel hacked into "cia compatible" mode. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-05PCI: return error on failure to read PCI ROMsTimothy S. Nelson1-2/+2
This patch makes the ROM reading code return an error to user space if the size of the ROM read is equal to 0. The patch also emits a warnings if the contents of the ROM are invalid, and documents the effects of the "enable" file on ROM reading. Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au> Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-10Merge branch 'cpus4096-for-linus' of ↵Linus Torvalds1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: [IA64] fix typo in cpumask_of_pcibus() x86: fix x86_32 builds for summit and es7000 arch's cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write cpumask: use cpumask_var_t in acpi-cpufreq.c cpumask: use work_on_cpu in acpi/cstate.c cpumask: convert struct cpufreq_policy to cpumask_var_t cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t x86: cleanup remaining cpumask_t ops in smpboot code cpumask: update pci_bus_show_cpuaffinity to use new cpumask API cpumask: update local_cpus_show to use new cpumask API ia64: cpumask fix for is_affinity_mask_valid()
2009-01-07PCI: revise VPD access interfaceStephen Hemminger1-30/+8
Change PCI VPD API which was only used by sysfs to something usable in drivers. * move iteration over multiple words to the low level * use conventional types for arguments * add exportable wrapper Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07PCI: define PCI resource names in an 'enum'Yu Zhao1-1/+3
This patch moves all definitions of the PCI resource names to an 'enum', and also replaces some hard-coded resource variables with symbol names. This change eases introduction of device specific resources. Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07PCI: Make settable sysfs attributes more consistentTrent Piepho1-20/+28
PCI devices have three settable boolean attributes, enable, broken_parity_status, and msi_bus. The store functions for these would silently interpret "0x01" as false, "1llogical" as true, and "true" would be (silently!) ignored and do nothing. This is inconsistent with typical sysfs handling of settable attributes, and just plain doesn't make much sense. So, use strict_strtoul(), which was created for this purpose. The store functions will treat a value of 0 as false, non-zero as true, and return -EINVAL for a parse failure. Additionally, is_enabled_store() and msi_bus_store() return -EPERM if CAP_SYS_ADMIN is lacking, rather than silently doing nothing. This is more typical behavior for sysfs attributes that need a capability. And msi_bus_store() will only print the "forced subordinate bus ..." warning if the MSI flag was actually forced to a different value. Signed-off-by: Trent Piepho <xyzzy@speakeasy.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07resource: allow MMIO exclusivity for device driversArjan van de Ven1-0/+3
Device drivers that use pci_request_regions() (and similar APIs) have a reasonable expectation that they are the only ones accessing their device. As part of the e1000e hunt, we were afraid that some userland (X or some bootsplash stuff) was mapping the MMIO region that the driver thought it had exclusively via /dev/mem or via various sysfs resource mappings. This patch adds the option for device drivers to cause their reserved regions to the "banned from /dev/mem use" list, so now both kernel memory and device-exclusive MMIO regions are banned. NOTE: This is only active when CONFIG_STRICT_DEVMEM is set. In addition to the config option, a kernel parameter iomem=relaxed is provided for the cases where developers want to diagnose, in the field, drivers issues from userspace. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07PCI: check mmap range of /proc/bus/pci files tooJesse Barnes1-1/+1
/proc/bus/pci allows you to mmap resource ranges too, so we should probably be checking to make sure the mapping is somewhat valid. Uses the same code as the recent sysfs mmap range checking patch from Linus. Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-04cpumask: update local_cpus_show to use new cpumask APIMike Travis1-6/+6
Impact: use new cpumask API to reduce stack usage Replace the local cpumask_t variable with a pointer to the const cpumask that needs to be printed. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-13cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and ↵Rusty Russell1-2/+2
cpulist_scnprintf to take pointers. Impact: change calling convention of existing cpumask APIs Most cpumask functions started with cpus_: these have been replaced by cpumask_ ones which take struct cpumask pointers as expected. These four functions don't have good replacement names; fortunately they're rarely used, so we just change them over. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: paulus@samba.org Cc: mingo@redhat.com Cc: tony.luck@intel.com Cc: ralf@linux-mips.org Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: cl@linux-foundation.org Cc: srostedt@redhat.com
2008-11-04PCI: fix range check on mmapped sysfs resource filesEd Swierk1-1/+1
pci_mmap_fits() returns the wrong answer if the sysfs resource file size is not a multiple of the page size. vm_end and vm_start are already page-aligned, so size - start < nr, causing mmap() to return EINVAL. Signed-off-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20PCI: Add ability to mmap legacy_io on some platformsBenjamin Herrenschmidt1-5/+88
This adds the ability to mmap legacy IO space to the legacy_io files in sysfs on platforms that support it. This will allow to clean up X to use this instead of /dev/mem for legacy IO accesses such as those performed by Int10. While at it I moved pci_create/remove_legacy_files() to pci-sysfs.c where I think they belong, thus making more things statis in there and cleaned up some spurrious prototypes in the ia64 pci.h file Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20PCI: centralize the capabilities code in pci-sysfs.cZhao, Yu1-55/+83
This patch centralizes functions used to add and remove sysfs entries for various capabilities. With this cleanup, the code is more readable and easier for adding new capability related functions. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20PCI: replace cfg space size (256/4096) by macros.Zhao, Yu1-5/+5
This is a cleanup that changes all PCI configuration space size representations to the macros (PCI_CFG_SPACE_SIZE and PCI_CFG_SPACE_EXP_SIZE). And the macros are also moved from drivers/pci/probe.c to drivers/pci/pci.h. Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-03Check mapped ranges on sysfs resource filesLinus Torvalds1-0/+19
This is loosely based on a patch by Jesse Barnes to check the user-space PCI mappings though the sysfs interfaces. Quoting Jesse's original explanation: It's fairly common for applications to map PCI resources through sysfs. However, with the current implementation, it's possible for an application to map far more than the range corresponding to the resourceN file it opened. This patch plugs that hole by checking the range at mmap time, similar to what is done on platforms like sparc64 in their lower level PCI remapping routines. It was initially put together to help debug the e1000e NVRAM corruption problem, since we initially thought an X driver might be walking past the end of one of its mappings and clobbering the NVRAM. It now looks like that's not the case, but doing the check is still important for obvious reasons. and this version of the patch differs in that it uses a helper function to clarify the code, and does all the checks in pages (instead of bytes) in order to avoid overflows when doing "<< PAGE_SHIFT" etc. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>