summaryrefslogtreecommitdiff
path: root/drivers/pci/pci.c
AgeCommit message (Collapse)AuthorFilesLines
2011-10-27pci: Clamp pcie_set_readrq() when using "performance" settingsBenjamin Herrenschmidt1-2/+16
When configuring the PCIe settings for "performance", we allow parents to have a larger Max Payload Size than children and rely on children Max Read Request Size to not be larger than their own MPS to avoid having the host bridge generate responses they can't cope with. However, various drivers in Linux call pci_set_readrq() with arbitrary values, assuming this to be a simple performance tweak. This breaks under our "performance" configuration. Fix that by making sure the value programmed by pcie_set_readrq() is never larger than the configured MPS for that device. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jon Mason <mason@myri.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14PCI / PM: Extend PME polling to all PCI devicesRafael J. Wysocki1-21/+20
The land of PCI power management is a land of sorrow and ugliness, especially in the area of signaling events by devices. There are devices that set their PME Status bits, but don't really bother to send a PME message or assert PME#. There are hardware vendors who don't connect PME# lines to the system core logic (they know who they are). There are PCI Express Root Ports that don't bother to trigger interrupts when they receive PME messages from the devices below. There are ACPI BIOSes that forget to provide _PRW methods for devices capable of signaling wakeup. Finally, there are BIOSes that do provide _PRW methods for such devices, but then don't bother to call Notify() for those devices from the corresponding _Lxx/_Exx GPE-handling methods. In all of these cases the kernel doesn't have a chance to receive a proper notification that it should wake up a device, so devices stay in low-power states forever. Worse yet, in some cases they continuously send PME Messages that are silently ignored, because the kernel simply doesn't know that it should clear the device's PME Status bit. This problem was first observed for "parallel" (non-Express) PCI devices on add-on cards and Matthew Garrett addressed it by adding code that polls PME Status bits of such devices, if they are enabled to signal PME, to the kernel. Recently, however, it has turned out that PCI Express devices are also affected by this issue and that it is not limited to add-on devices, so it seems necessary to extend the PME polling to all PCI devices, including PCI Express and planar ones. Still, it would be wasteful to poll the PME Status bits of devices that are known to receive proper PME notifications, so make the kernel (1) poll the PME Status bits of all PCI and PCIe devices enabled to signal PME and (2) disable the PME Status polling for devices for which correct PME notifications are received. Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-04PCI: Disable MPS configuration by defaultJon Mason1-1/+5
Add the ability to disable PCI-E MPS turning and using the BIOS configured MPS defaults. Due to the number of issues recently discovered on some x86 chipsets, make this the default behavior. Also, add the option for peer to peer DMA MPS configuration. Peer to peer DMA is outside the scope of this patch, but MPS configuration could prevent it from working by having the MPS on one root port different than the MPS on another. To work around this, simply make the system wide MPS the smallest possible value (128B). Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-10PCI: Remove MRRS modification from MPS setting codeJon Mason1-1/+1
Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has massive negative ramifications on some devices. Without knowing which devices have this issue, do not modify from the default value when walking the PCI-E bus in pcie_bus_safe mode. Also, make pcie_bus_safe the default procedure. Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Simon Kirby <sim@hostway.ca> Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> References: https://bugzilla.kernel.org/show_bug.cgi?id=42162 Signed-off-by: Jon Mason <mason@myri.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-21pci: fix new kernel-doc warning in pci.cRandy Dunlap1-1/+1
Fix new kernel-doc warning in pci.c: Warning(drivers/pci/pci.c:3259): No description found for parameter 'mps' Warning(drivers/pci/pci.c:3259): Excess function parameter 'rq' description in 'pcie_set_mps' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-01PCI: Set PCI-E Max Payload Size on fabricJon Mason1-0/+67
On a given PCI-E fabric, each device, bridge, and root port can have a different PCI-E maximum payload size. There is a sizable performance boost for having the largest possible maximum payload size on each PCI-E device. However, if improperly configured, fatal bus errors can occur. Thus, it is important to ensure that PCI-E payloads sends by a device are never larger than the MPS setting of all devices on the way to the destination. This can be achieved two ways: - A conservative approach is to use the smallest common denominator of the entire tree below a root complex for every device on that fabric. This means for example that having a 128 bytes MPS USB controller on one leg of a switch will dramatically reduce performances of a video card or 10GE adapter on another leg of that same switch. It also means that any hierarchy supporting hotplug slots (including expresscard or thunderbolt I suppose, dbl check that) will have to be entirely clamped to 128 bytes since we cannot predict what will be plugged into those slots, and we cannot change the MPS on a "live" system. - A more optimal way is possible, if it falls within a couple of constraints: * The top-level host bridge will never generate packets larger than the smallest TLP (or if it can be controlled independently from its MPS at least) * The device will never generate packets larger than MPS (which can be configured via MRRS) * No support of direct PCI-E <-> PCI-E transfers between devices without some additional code to specifically deal with that case Then we can use an approach that basically ignores downstream requests and focuses exclusively on upstream requests. In that case, all we need to care about is that a device MPS is no larger than its parent MPS, which allows us to keep all switches/bridges to the max MPS supported by their parent and eventually the PHB. In this case, your USB controller would no longer "starve" your 10GE Ethernet and your hotplug slots won't affect your global MPS. Additionally, the hotplugged devices themselves can be configured to a larger MPS up to the value configured in the hotplug bridge. To choose between the two available options, two PCI kernel boot args have been added to the PCI calls. "pcie_bus_safe" will provide the former behavior, while "pcie_bus_perf" will perform the latter behavior. By default, the latter behavior is used. NOTE: due to the location of the enablement, each arch will need to add calls to this function. This patch only enables x86. This patch includes a number of changes recommended by Benjamin Herrenschmidt. Tested-by: Jordan_Hargrave@dell.com Signed-off-by: Jon Mason <mason@myri.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22PCI: correct pcie_set_readrq write sizeJon Mason1-2/+2
When setting the PCI-E MRRS, pcie_set_readrq queries the current settings via a pci_read_config_word call but writes the modified result via a pci_write_config_dword. This results in writing 16 more bits than were queried. Also, the function description comment is slightly incorrect. Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-22PCI: ARI is a PCIe v2 featureChris Wright1-1/+6
The function pci_enable_ari() may mistakenly set the downstream port of a v1 PCIe switch in ARI Forwarding mode. This is a PCIe v2 feature, and with an SR-IOV device on that switch port believing the switch above is ARI capable it may attempt to use functions 8-255, translating into invalid (non-zero) device numbers for that bus. This has been seen to cause Completion Timeouts and general misbehaviour including hangs and panics. Cc: stable@kernel.org Acked-by: Don Dutile <ddutile@redhat.com> Tested-by: Don Dutile <ddutile@redhat.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-09PCI: conditional resource-reallocation through kernel parameter pci=reallocRam Pai1-0/+2
Multiple attempts to dynamically reallocate pci resources have unfortunately lead to regressions. Though we continue to fix the regressions and fine tune the dynamic-reallocation behavior, we have not reached a acceptable state yet. This patch provides a interim solution. It disables dynamic reallocation by default, but adds the ability to enable it through pci=realloc kernel command line parameter. Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-24Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: x86/PCI/ACPI: fix type mismatch PCI: fix new kernel-doc warning PCI: Fix warning in drivers/pci/probe.c on sparc64
2011-06-14x86/uv/x2apic: update for change in pci bridge handling.Dave Airlie1-2/+2
When I added 3448a19da479b6bd1e28e2a2be9fa16c6a6feb39 I forgot about the special uv handling code for this, so this patch fixes it up. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ingo Molnar Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-06-01PCI: fix new kernel-doc warningRandy Dunlap1-1/+1
Fix pci.c kernel-doc warnings: Warning(drivers/pci/pci.c:3292): No description found for parameter 'flags' Warning(drivers/pci/pci.c:3292): Excess function parameter 'change_bridge_flags' description in 'pci_set_vga_state' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-24Merge branch 'drm-core-next' of ↵Linus Torvalds1-11/+14
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits) drivers/gpu/drm/radeon/atom.c: fix warning drm/radeon/kms: bump kms version number drm/radeon/kms: properly set num banks for fusion asics drm/radeon/kms/atom: move dig phy init out of modesetting drm/radeon/kms/cayman: fix typo in register mask drm/radeon/kms: fix typo in spread spectrum code drm/radeon/kms: fix tile_config value reported to userspace on cayman. drm/radeon/kms: fix incorrect comparison in cayman setup code. drm/radeon/kms: add wait idle ioctl for eg->cayman drm/radeon/cayman: setup hdp to invalidate and flush when asked drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked agp/uninorth: Fix lockups with radeon KMS and >1x. drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices drm/radeon/kms: fixup eDP connector handling drm/radeon/kms: bail early for eDP in hotplug callback drm/radeon/kms: simplify hotplug handler logic drm/radeon/kms: rewrite DP handling drm/radeon/kms/atom: add support for setting DP panel mode drm/radeon/kms: atombios.h updates for DP panel mode ...
2011-05-21PCI: Add interfaces to store and load the device saved stateAlex Williamson1-0/+98
For KVM device assignment, we'd like to save off the state of a device prior to passing it to the guest and restore it later. We also want to allow pci_reset_funciton() to be called while the device is owned by the guest. This however overwrites and invalidates the struct pci_dev buffers, so we can't just manually call save and restore. Add generic interfaces for the saved state to be stored and reloaded back into struct pci_dev at a later time. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21PCI: Track the size of each saved capability data areaAlex Williamson1-5/+7
This will allow us to store and load it later. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-12PCI: add latency tolerance reporting enable/disable supportJesse Barnes1-0/+149
Latency tolerance reporting allows devices to send messages to the root complex indicating their latency tolerance for snooped & unsnooped memory transactions. Add support for enabling & disabling this feature, along with a routine to set the max latencies a device should send upstream. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-12PCI: add OBFF enable/disable supportJesse Barnes1-0/+92
OBFF (optimized buffer flush/fill), where supported, can help improve energy efficiency by giving devices information about when interrupts and other activity will have a reduced power impact. It requires support from both the device and system (i.e. not only does the device need to respond to OBFF messages, but the platform must be capable of generating and routing them to the end point). Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-12PCI: add ID-based ordering enable/disable supportJesse Barnes1-0/+53
Add support to allow drivers to enable/disable ID-based ordering. Where supported, ID-based ordering can significantly improve the latency of individual requests by preventing them from queueing up behind unrelated traffic. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-11PCI/PM: Add kerneldoc description of pci_pm_reset()Rafael J. Wysocki1-0/+15
The pci_pm_reset() function is not a very nice interface due to its limitations and conditional behavior (e.g. it doesn't affect devices in low-power states), but it cannot be simply dropped, because existing device drivers may depend on it. However, its behavior and limitations should be well documented, so add an appropriate kerneldoc comment to it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-04vgaarb: use bridges to control VGA routing where possible.Dave Airlie1-11/+14
So in a lot of modern systems, a GPU will always be below a parent bridge that won't share with any other GPUs. This means VGA arbitration on those GPUs can be controlled by using the bridge routing instead of io/mem decodes. The problem is locating which GPUs share which upstream bridges. This patch attempts to identify all the GPUs which can be controlled via bridges, and ones that can't. This patch endeavours to work out the bridge sharing semantics. When disabling GPUs via a bridge, it doesn't do irq callbacks or touch the io/mem decodes for the gpu. Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-03-21PCI: PCIe links may not get configured for ASPM under POWERSAVE modeNaga Chumbalkar1-0/+6
v3 -> v2: Moved ASPM enabling logic to pci_set_power_state() v2 -> v1: Preserved the logic in pci_raw_set_power_state() : Added ASPM enabling logic after scanning Root Bridge : http://marc.info/?l=linux-pci&m=130046996216391&w=2 v1 : http://marc.info/?l=linux-pci&m=130013164703283&w=2 The assumption made in commit 41cd766b065970ff6f6c89dd1cf55fa706c84a3d (PCI: Don't enable aspm before drivers have had a chance to veto it) that pci_enable_device() will result in re-configuring ASPM when aspm_policy is POWERSAVE is no longer valid. This is due to commit 97c145f7c87453cec90e91238fba5fe2c1561b32 (PCI: read current power state at enable time) which resets dev->current_state to D0. Due to this the call to pcie_aspm_pm_state_change() is never made. Note the equality check (below) that returns early: ./drivers/pci/pci.c: pci_raw_set_pci_power_state() 546 /* Check if we're already there */ 547 if (dev->current_state == state) 548 return 0; Therefore OSPM never configures the PCIe links for ASPM to turn them "on". Fix it by configuring ASPM from the pci_enable_device() code path. This also allows a driver such as the e1000e networking driver a chance to disable ASPM (L0s, L1), if need be, prior to enabling the device. A driver may perform this action if the device is known to mis-behave wrt ASPM. Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14PCI/PM: Report wakeup events before resuming devicesRafael J. Wysocki1-1/+1
Make wakeup events be reported by the PCI subsystem before attempting to resume devices or queuing up runtime resume requests for them, because wakeup events should be reported as soon as they have been detected. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14PCI/PM: Use pm_wakeup_event() directly for reporting wakeup eventsRafael J. Wysocki1-16/+0
After recent changes related to wakeup events pm_wakeup_event() automatically checks if the given device is configured to signal wakeup, so pci_wakeup_event() may be a static inline function calling pm_wakeup_event() directly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-23PCI: make pci_restore_state return voidJon Mason1-5/+2
pci_restore_state only ever returns 0, thus there is no benefit in having it return any value. Also, a large majority of the callers do not check the return code of pci_restore_state. Make the pci_restore_state a void return and avoid the overhead. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Jon Mason <jon.mason@exar.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-11PCI: read current power state at enable timeJesse Barnes1-0/+12
When we enable a PCI device, we avoid doing a lot of the initial setup work if the device's enable count is non-zero. If we don't fetch the power state though, we may later fail to set up MSI due to the unknown status. So pick it up before we short circuit the rest due to a pre-existing enable or mismatched enable/disable pair (as happens with VGA devices, which are special in a special way). Tested-by: Jesse Brandeburg <jesse.brandeburg@gmail.com> Reported-by: Dave Airlie <airlied@linux.ie> Tested-by: Dave Airlie <airlied@linux.ie> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-18PCI: Add support for polling PME state on suspended legacy PCI devicesMatthew Garrett1-0/+77
Not all hardware vendors hook up the PME line for legacy PCI devices, meaning that wakeup events get lost. The only way around this is to poll the devices to see if their state has changed, so add support for doing that on legacy PCI devices that aren't part of the core chipset. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-16PCI: Adjust confusing if indentation in pcie_get_readrqJulia Lawall1-1/+1
Indent the branch of an if. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable braces4@ position p1,p2; statement S1,S2; @@ ( if (...) { ... } | if (...) S1@p1 S2@p2 ) @script:python@ p1 << r.p1; p2 << r.p2; @@ if (p1[0].column == p2[0].column): cocci.print_main("branch",p1) cocci.print_secs("after",p2) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-08-06Merge branch 'linux-next' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits) PCI: update for owner removal from struct device_attribute PCI: Fix warnings when CONFIG_DMI unset PCI: Do not run NVidia quirks related to MSI with MSI disabled x86/PCI: use for_each_pci_dev() PCI: use for_each_pci_dev() PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc() PCI: export SMBIOS provided firmware instance and label to sysfs PCI: Allow read/write access to sysfs I/O port resources x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY} PCI: disable mmio during bar sizing PCI: MSI: Remove unsafe and unnecessary hardware access PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable PCI: kernel oops on access to pci proc file while hot-removal PCI: pci-sysfs: remove casts from void* ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe PCI hotplug: make sure child bridges are enabled at hotplug time PCI hotplug: shpchp: Removed check for hotplug of display devices PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device PCI: Don't enable aspm before drivers have had a chance to veto it ...
2010-07-30PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY}FUJITA Tomonori1-4/+0
In 2.6.34, we transformed the PCI DMA API into the generic device mode. The PCI DMA API is just the wrapper of the DMA API. So we don't need HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_SIZE or HAVE_ARCH_PCI_SET_DMA_SEGMENT_BOUNDARY (which enable architectures to have the own implementations). Both haven't been used anyway. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-19PM: Make it possible to avoid races between wakeup and system sleepRafael J. Wysocki1-1/+19
One of the arguments during the suspend blockers discussion was that the mainline kernel didn't contain any mechanisms making it possible to avoid races between wakeup and system suspend. Generally, there are two problems in that area. First, if a wakeup event occurs exactly when /sys/power/state is being written to, it may be delivered to user space right before the freezer kicks in, so the user space consumer of the event may not be able to process it before the system is suspended. Second, if a wakeup event occurs after user space has been frozen, it is not generally guaranteed that the ongoing transition of the system into a sleep state will be aborted. To address these issues introduce a new global sysfs attribute, /sys/power/wakeup_count, associated with a running counter of wakeup events and three helper functions, pm_stay_awake(), pm_relax(), and pm_wakeup_event(), that may be used by kernel subsystems to control the behavior of this attribute and to request the PM core to abort system transitions into a sleep state already in progress. The /sys/power/wakeup_count file may be read from or written to by user space. Reads will always succeed (unless interrupted by a signal) and return the current value of the wakeup events counter. Writes, however, will only succeed if the written number is equal to the current value of the wakeup events counter. If a write is successful, it will cause the kernel to save the current value of the wakeup events counter and to abort the subsequent system transition into a sleep state if any wakeup events are reported after the write has returned. [The assumption is that before writing to /sys/power/state user space will first read from /sys/power/wakeup_count. Next, user space consumers of wakeup events will have a chance to acknowledge or veto the upcoming system transition to a sleep state. Finally, if the transition is allowed to proceed, /sys/power/wakeup_count will be written to and if that succeeds, /sys/power/state will be written to as well. Still, if any wakeup events are reported to the PM core by kernel subsystems after that point, the transition will be aborted.] Additionally, put a wakeup events counter into struct dev_pm_info and make these per-device wakeup event counters available via sysfs, so that it's possible to check the activity of various wakeup event sources within the kernel. To illustrate how subsystems can use pm_wakeup_event(), make the low-level PCI runtime PM wakeup-handling code use it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: markgross <markgross@thegnar.org> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
2010-06-23virtio-pci: disable msi at startupMichael S. Tsirkin1-0/+1
virtio-pci resets the device at startup by writing to the status register, but this does not clear the pci config space, specifically msi enable status which affects register layout. This breaks things like kdump when they try to use e.g. virtio-blk. Fix by forcing msi off at startup. Since pci.c already has a routine to do this, we export and use it instead of duplicating code. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: linux-pci@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
2010-05-22Merge branch 'linux-next' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (36 commits) PCI: hotplug: pciehp: Removed check for hotplug of display devices PCI: read memory ranges out of Broadcom CNB20LE host bridge PCI: Allow manual resource allocation for PCI hotplug bridges x86/PCI: make ACPI MCFG reserved error messages ACPI specific PCI hotplug: Use kmemdup PM/PCI: Update PCI power management documentation PCI: output FW warning in pci_read/write_vpd PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments PCI quirks: disable msi on AMD rs4xx internal gfx bridges PCI: Disable MSI for MCP55 on P5N32-E SLI x86/PCI: irq and pci_ids patch for additional Intel Cougar Point DeviceIDs PCI: aerdrv: trivial cleanup for aerdrv_core.c PCI: aerdrv: trivial cleanup for aerdrv.c PCI: aerdrv: introduce default_downstream_reset_link PCI: aerdrv: rework find_aer_service PCI: aerdrv: remove is_downstream PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS PCI: aerdrv: redefine PCI_ERR_ROOT_*_SRC PCI: aerdrv: rework do_recovery PCI: aerdrv: rework get_e_source() ...
2010-05-20Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits) vlynq: make whole Kconfig-menu dependant on architecture add descriptive comment for TIF_MEMDIE task flag declaration. EEPROM: max6875: Header file cleanup EEPROM: 93cx6: Header file cleanup EEPROM: Header file cleanup agp: use NULL instead of 0 when pointer is needed rtc-v3020: make bitfield unsigned PCI: make bitfield unsigned jbd2: use NULL instead of 0 when pointer is needed cciss: fix shadows sparse warning doc: inode uses a mutex instead of a semaphore. uml: i386: Avoid redefinition of NR_syscalls fix "seperate" typos in comments cocbalt_lcdfb: correct sections doc: Change urls for sparse Powerpc: wii: Fix typo in comment i2o: cleanup some exit paths Documentation/: it's -> its where appropriate UML: Fix compiler warning due to missing task_struct declaration UML: add kernel.h include to signal.c ...
2010-05-19PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in commentsRoman Fietze1-1/+1
This fixes all occurrences of pci_enable_device and pci_disable_device in all comments. There are no code changes involved. Signed-off-by: Roman Fietze <roman.fietze@telemotive.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-11PCI: clearing wakeup flags not neededAlan Stern1-2/+0
This patch (as1353) removes a couple of unnecessary assignments from the PCI core. The should_wakeup flag is naturally initialized to 0; there's no need to clear it. Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-04-23Merge branch 'master' into for-nextJiri Kosina1-24/+21
2010-04-23PCI: Ensure we re-enable devices on resumeMatthew Garrett1-5/+1
If the firmware puts a device back into D0 state at resume time, we'll update its state in resume_noirq and thus skip the platform resume code. Calling that code twice should be safe and we ought to avoid getting to that point anyway, so remove the check and also allow the platform pci code to be called for D0. Fixes USB not being powered after resume on recent Lenovo machines. Acked-by: Alex Chiang <achiang@canonical.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-19PCI: cleanup error return for pcix get and set mmrbc functionsDean Nelson1-20/+16
pcix_get_mmrbc() returns the maximum memory read byte count (mmrbc), if successful, or an appropriate error value, if not. Distinguishing errors from correct values and understanding the meaning of an error can be somewhat confusing in that: correct values: 512, 1024, 2048, 4096 errors: -EINVAL -22 PCIBIOS_FUNC_NOT_SUPPORTED 0x81 PCIBIOS_BAD_VENDOR_ID 0x83 PCIBIOS_DEVICE_NOT_FOUND 0x86 PCIBIOS_BAD_REGISTER_NUMBER 0x87 PCIBIOS_SET_FAILED 0x88 PCIBIOS_BUFFER_TOO_SMALL 0x89 The PCIBIOS_ errors are returned from the PCI functions generated by the PCI_OP_READ() and PCI_OP_WRITE() macros. In a similar manner, pcix_set_mmrbc() also returns the PCIBIOS_ error values returned from pci_read_config_[word|dword]() and pci_write_config_word(). Following pcix_get_max_mmrbc()'s example, the following patch simply returns -EINVAL for all PCIBIOS_ errors encountered by pcix_get_mmrbc(), and -EINVAL or -EIO for those encountered by pcix_set_mmrbc(). This simplification was chosen in light of the fact that none of the current callers of these functions are interested in the specific type of error encountered. In the future, should this change, one could simply create a function that maps each PCIBIOS_ error to a corresponding unique errno value, which could be called by pcix_get_max_mmrbc(), pcix_get_mmrbc(), and pcix_set_mmrbc(). Additionally, this patch eliminates some unnecessary variables. Cc: stable@kernel.org Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-19PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functionsDean Nelson1-5/+6
An e1000 driver on a system with a PCI-X bus was always being returned a value of 135 from both pcix_get_mmrbc() and pcix_set_mmrbc(). This value reflects an error return of PCIBIOS_BAD_REGISTER_NUMBER from pci_bus_read_config_dword(,, cap + PCI_X_CMD,). This is because for a dword, the following portion of the PCI_OP_READ() macro: if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER; expands to: if (pos & 3) return PCIBIOS_BAD_REGISTER_NUMBER; And is always true for 'cap + PCI_X_CMD', which is 0xe4 + 2 = 0xe6. ('cap' is the result of calling pci_find_capability(, PCI_CAP_ID_PCIX).) The same problem exists for pci_bus_write_config_dword(,, cap + PCI_X_CMD,). In both cases, instead of calling _dword(), _word() should be called. Cc: stable@kernel.org Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-19PCI: kill off pci_register_set_vga_state() symbol export.Paul Mundt1-1/+0
When pci_register_set_vga_state() was made __init, the EXPORT_SYMBOL() was retained, which now leaves us with a section mismatch. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Cc: Mike Travis <travis@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-19PCI: fix return value from pcix_get_max_mmrbc()Dean Nelson1-1/+1
For the PCI_X_STATUS register, pcix_get_max_mmrbc() is returning an incorrect value, which is based on: (stat & PCI_X_STATUS_MAX_READ) >> 12 Valid return values are 512, 1024, 2048, 4096, which correspond to a 'stat' (masked and right shifted by 21) of 0, 1, 2, 3, respectively. A right shift by 11 would generate the correct return value when 'stat' (masked and right shifted by 21) has a value of 1 or 2. But for a value of 0 or 3 it's not possible to generate the correct return value by only right shifting. Fix is based on pcix_get_mmrbc()'s similar dealings with the PCI_X_CMD register. Cc: stable@kernel.org Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-16Fix typos in commentsThomas Weber1-1/+1
[Ss]ytem => [Ss]ystem udpate => update paramters => parameters orginal => original Signed-off-by: Thomas Weber <swirl@gmx.li> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-03-13dma-mapping: pci: move pci_set_dma_mask and pci_set_consistent_dma_mask to ↵FUJITA Tomonori1-28/+0
pci-dma-compat.h We can use pci-dma-compat.h to implement pci_set_dma_mask and pci_set_consistent_dma_mask as we do with the other PCI DMA API. We can remove HAVE_ARCH_PCI_SET_DMA_MASK too. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-13dma-mapping: dma-mapping.h: add dma_set_coherent_maskFUJITA Tomonori1-4/+3
dma_set_coherent_mask corresponds to pci_set_consistent_dma_mask. This is necessary to move to the generic device model DMA API from the PCI bus specific API in the long term. dma_set_coherent_mask works in the exact same way that pci_set_consistent_dma_mask does. So this patch also changes pci_set_consistent_dma_mask to call dma_set_coherent_mask. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-13dma-mapping: pci: convert pci_set_dma_mask to call dma_set_maskFUJITA Tomonori1-6/+4
This changes pci_set_dma_mask to call the generic DMA API, dma_set_mask. pci_set_dma_mask (in drivers/pci/pci.c) does the same things that dma_set_mask does on all the architectures that use pci_set_dma_mask; calls dma_supprted and sets dev->dma_mask. So we safely change pci_set_dma_mask to simply call dma_set_mask. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: James Bottomley <James.Bottomley@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Greg KH <greg@kroah.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-09Merge branch 'for-linus' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI/PM Runtime: Make runtime PM of PCI devices inactive by default
2010-03-08Driver core: create lock/unlock functions for struct deviceGreg Kroah-Hartman1-2/+2
In the future, we are going to be changing the lock type for struct device (once we get the lockdep infrastructure properly worked out) To make that changeover easier, and to possibly burry the lock in a different part of struct device, let's create some functions to lock and unlock a device so that no out-of-core code needs to be changed in the future. This patch creates the device_lock/unlock/trylock() functions, and converts all in-tree users to them. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jean Delvare <khali@linux-fr.org> Cc: Dave Young <hidave.darkstar@gmail.com> Cc: Ming Lei <tom.leiming@gmail.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <len.brown@intel.com> Cc: Magnus Damm <damm@igel.co.jp> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alex Chiang <achiang@hp.com> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrew Patterson <andrew.patterson@hp.com> Cc: Yu Zhao <yu.zhao@intel.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Wolfram Sang <w.sang@pengutronix.de> Cc: CHENG Renquan <rqcheng@smu.edu.sg> Cc: Oliver Neukum <oliver@neukum.org> Cc: Frans Pop <elendil@planet.nl> Cc: David Vrabel <david.vrabel@csr.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-08Merge branch 'x86-mrst-for-linus' of ↵Linus Torvalds1-0/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) x86, mrst: Fix whitespace breakage in apb_timer.c x86, mrst: Fix APB timer per cpu clockevent x86, mrst: Remove X86_MRST dependency on PCI_IOAPIC x86, olpc: Use pci subarch init for OLPC x86, pci: Add arch_init to x86_init abstraction x86, mrst: Add Kconfig dependencies for Moorestown x86, pci: Exclude Moorestown PCI code if CONFIG_X86_MRST=n x86, numaq: Make CONFIG_X86_NUMAQ depend on CONFIG_PCI x86, pci: Add sanity check for PCI fixed bar probing x86, legacy_irq: Remove duplicate vector assigment x86, legacy_irq: Remove left over nr_legacy_irqs x86, mrst: Platform clock setup code x86, apbt: Moorestown APB system timer driver x86, mrst: Add vrtc platform data setup code x86, mrst: Add platform timer info parsing code x86, mrst: Fill in PCI functions in x86_init layer x86, mrst: Add dummy legacy pic to platform setup x86/PCI: Moorestown PCI support x86, ioapic: Add dummy ioapic functions x86, ioapic: Early enable ioapic for timer irq ... Fixed up semantic conflict of new clocksources due to commit 17622339af25 ("clocksource: add argument to resume callback").
2010-03-06PCI/PM Runtime: Make runtime PM of PCI devices inactive by defaultRafael J. Wysocki1-0/+2
Make the run-time power management of PCI devices be inactive by default by calling pm_runtime_forbid() for each PCI device during its initialization. This setting may be overriden by the user space with the help of the /sys/devices/.../power/control interface. That's necessary to avoid breakage on systems where ACPI-based wake-up is known to fail for some devices. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>