summaryrefslogtreecommitdiff
path: root/drivers/pci/msi.c
AgeCommit message (Collapse)AuthorFilesLines
2013-08-13PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platformsThomas Petazzoni1-0/+16
Some platforms (e.g S390) don't use the generic hardirqs code and therefore do not defined HAVE_GENERIC_HARDIRQS. This prevents using the irq_set_chip_data() and irq_get_chip_data() functions that are used for the default implementations of the MSI operations. So, when CONFIG_GENERIC_HARDIRQS is not enabled, provide another default implementation of the MSI operations, that simply errors out. The architecture is responsible for implementing those operations (which is the case on S390), and cannot use the msi_chip infrastructure. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12PCI: Introduce new MSI chip infrastructureThierry Reding1-2/+25
The new struct msi_chip is used to associated an MSI controller with a PCI bus. It is automatically handed down from the root to its children during bus enumeration. This patch provides default (weak) implementations for the architecture- specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq() and arch_msi_check_device()) which check if a PCI device's bus has an attached MSI chip and forward the call appropriately. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12PCI: use weak functions for MSI arch-specific functionsThomas Petazzoni1-24/+24
Until now, the MSI architecture-specific functions could be overloaded using a fairly complex set of #define and compile-time conditionals. In order to prepare for the introduction of the msi_chip infrastructure, it is desirable to switch all those functions to use the 'weak' mechanism. This commit converts all the architectures that were overidding those MSI functions to use the new strategy. Note that we keep two separate, non-weak, functions default_teardown_msi_irqs() and default_restore_msi_irqs() for the default behavior of the arch_teardown_msi_irqs() and arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI code. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Daniel Price <daniel.price@gmail.com> Tested-by: Thierry Reding <thierry.reding@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-05-28PCI: Allocate only as many MSI vectors as requested by driverAlexander Gordeev1-2/+8
Because of the encoding of the "Multiple Message Capable" and "Multiple Message Enable" fields, a device can only advertise that it's capable of a power-of-two number of vectors, and the OS can only enable a power-of-two number. For example, a device that's limited internally to using 18 vectors would have to advertise that it's capable of 32. The 14 extra vectors consume vector numbers and IRQ descriptors even though the device can't actually use them. This fix introduces a 'msi_desc::nvec_used' field to address this issue. When non-zero, it is the actual number of MSIs the device will send, as requested by the device driver. This value should be used by architectures to set up and tear down only as many interrupt resources as the device will actually use. Note, although the existing 'msi_desc::multiple' field might seem redundant, in fact it is not. The number of MSIs advertised need not be the smallest power-of-two larger than the number of MSIs the device will send. Thus, it is not always possible to derive the former from the latter, so we need to keep them both to handle this case. [bhelgaas: changelog, rename to "nvec_used"] Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-30PCI: Set ->mask_pos correctlyDan Carpenter1-2/+4
The "+" operation has higher precedence than "?:" and ->msi_cap is always non-zero here so the original statement is equivalent to: entry->mask_pos = PCI_MSI_MASK_64; Which wasn't the intent. [bhelgaas: my fault from 78b5a310ce] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-24Merge branch 'pci/gavin-msi-cleanup' into nextBjorn Helgaas1-96/+80
* pci/gavin-msi-cleanup: vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability
2013-04-23PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASKBjorn Helgaas1-2/+2
PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the Table Offset register, not the flags ("Message Control" per spec) register. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop msi_mask_reg() and remove drivers/pci/msi.hBjorn Helgaas1-2/+2
msi_mask_reg() doesn't provide any useful abstraction, do drop it. Remove the now-empty drivers/pci/msi.h. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Use msix_table_size() directly, drop multi_msix_capable()Bjorn Helgaas1-2/+5
The users of multi_msix_capable() are really interested in the table size, so just say what we mean. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macrosBjorn Helgaas1-2/+2
msix_table_offset_reg() is used only once and adds a useless indirection, so just use the table offset directly. msix_pba_offset_reg() is unused, so just delete it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop is_64bit_address() and is_mask_bit_support() macrosBjorn Helgaas1-2/+2
is_64bit_address() and is_mask_bit_support() don't provide any useful abstraction, so drop them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop msi_data_reg() macroBjorn Helgaas1-6/+6
msi_data_reg() doesn't provide any useful abstraction, so drop it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macrosBjorn Helgaas1-8/+8
msi_lower_address_reg() and msi_upper_address_reg() don't provide any useful abstraction, so drop them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directlyBjorn Helgaas1-4/+4
Note the error in pci_msix_table_size() -- we used PCI_MSI_FLAGS to locate the PCI_MSIX_FLAGS word. No actual breakage because PCI_MSI_FLAGS and PCI_MSIX_FLAGS happen to be the same. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Use cached MSI/MSI-X offsets from dev, not from msi_descBjorn Helgaas1-14/+8
We always know the type (MSI vs MSI-X), so we can use the correct cached capability offset rather than relying on the copy in the msi_attrib. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Use cached MSI-X cap while enabling MSI-XGavin Shan1-20/+16
The patch uses the cached MSI-X capability offset in pci_dev instead of reading it from config space when enabling MSI-X interrupts. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Use cached MSI cap while enabling MSI interruptsGavin Shan1-13/+11
The patch uses the cached MSI capability offset in pci_dev instead of reading it from config space when enabling MSI interrupts. [bhelgaas: removed unrelated msi_control_reg() changes] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()Gavin Shan1-4/+1
The function pci_msi_check_device() is called while enabling MSI or MSI-X interrupts to make sure the PCI device can support MSI or MSI-X capability. This patch removes the check on MSI or MSI-X capability in the function and lets the caller do the check. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23PCI: Cache MSI/MSI-X capability offsets in struct pci_devGavin Shan1-23/+19
The patch caches the MSI and MSI-X capability offset in PCI device (struct pci_dev) so that we needn't read it from the config space upon enabling or disabling MSI or MSI-X interrupts. [bhelgaas: moved pm_cap size change to separate patch] Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-12PCI: Make local functions/structs staticBjorn Helgaas1-2/+2
This fixes "no previous prototype" warnings found via "make W=1". Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-01-24PCI/MSI: Enable multiple MSIs with pci_enable_msi_block_auto()Alexander Gordeev1-0/+26
The new function pci_enable_msi_block_auto() tries to allocate maximum possible number of MSIs up to the number the device supports. It generalizes a pattern when pci_enable_msi_block() is contiguously called until it succeeds or fails. Opposite to pci_enable_msi_block() which takes the number of MSIs to allocate as a input parameter, pci_enable_msi_block_auto() could be used by device drivers to obtain the number of assigned MSIs and the number of MSIs the device supports. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/c3de2419df94a0f95ca1a6f755afc421486455e6.1353324359.git.agordeev@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-11-30s390/pci: PCI adapter interrupts for MSI/MSI-XJan Glauber1-0/+6
Support PCI adapter interrupts using the Single-IRQ-mode. Single-IRQ-mode disables an adapter IRQ automatically after delivering it until the SIC instruction enables it again. This is used to reduce the number of IRQs for streaming workloads. Up to 64 MSI handlers can be registered per PCI function. A hash table is used to map interrupt numbers to MSI descriptors. The interrupt vector is scanned using the flogr instruction. Only MSI/MSI-X interrupts are supported, no legacy INTs. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-01-07x86/PCI: Expand the x86_msi_ops to have a restore MSIs.Konrad Rzeszutek Wilk1-2/+27
The MSI restore function will become a function pointer in an x86_msi_ops struct. It defaults to the implementation in the io_apic.c and msi.c. We piggyback on the indirection mechanism introduced by "x86: Introduce x86_msi_ops". Cc: x86@kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-pci@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-07PCI: msi: fix imbalanced refcount of msi irq sysfs objectsNeil Horman1-2/+12
This warning was recently reported to me: ------------[ cut here ]------------ WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60() Hardware name: VMware Virtual Platform kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is being called. Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10 vmw_pvscsi Pid: 630, comm: modprobe Tainted: G W 3.1.6-1.fc16.x86_64 #1 Call Trace: [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50 [<ffffffff810da293>] ? free_desc+0x63/0x70 [<ffffffff812a9aa0>] kobject_put+0x50/0x60 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3] [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3] [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90 [<ffffffff8138b63e>] driver_attach+0x1e/0x20 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0 [<ffffffffa0188000>] ? 0xffffffffa0187fff [<ffffffff8138c246>] driver_register+0x76/0x140 [<ffffffff815ca414>] ? printk+0x51/0x53 [<ffffffffa0188000>] ? 0xffffffffa0187fff [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3] [<ffffffff81002042>] do_one_initcall+0x42/0x180 [<ffffffff810aad71>] sys_init_module+0x91/0x200 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b ---[ end trace 44593438a59a9558 ]--- Using INTx interrupt, #Rx queues: 1. It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to be called. Because populate_msi_sysfs fails, we never registered any of the msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put on each of them, which gets flagged in the above stack trace. The fix is pretty straightforward. We can key of the parent pointer in the kobject. It is only set if the kobject_init_and_add succededs in populate_msi_sysfs. If anything fails there, each kobject has its parent reset to NULL Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Bjorn Helgaas <bhelgaas@google.com> CC: Greg Kroah-Hartman <gregkh@suse.de> CC: linux-pci@vger.kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-07PCI: msi: Disable msi interrupts when we initialize a pci deviceEric W. Biederman1-0/+10
I traced a nasty kexec on panic boot failure to the fact that we had screaming msi interrupts and we were not disabling the msi messages at kernel startup. The booting kernel had not enabled those interupts so was not prepared to handle them. I can see no reason why we would ever want to leave the msi interrupts enabled at boot if something else has enabled those interrupts. The pci spec specifies that msi interrupts should be off by default. Drivers are expected to enable the msi interrupts if they want to use them. Our interrupt handling code reprograms the interrupt handlers at boot and will not be be able to do anything useful with an unexpected interrupt. This patch applies cleanly all of the way back to 2.6.32 where I noticed the problem. Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-07PCI/sysfs: add per pci device msi[x] irq listing (v5)Neil Horman1-0/+111
This patch adds a per-pci-device subdirectory in sysfs called: /sys/bus/pci/devices/<device>/msi_irqs This sub-directory exports the set of msi vectors allocated by a given pci device, by creating a numbered sub-directory for each vector beneath msi_irqs. For each vector various attributes can be exported. Currently the only attribute is called mode, which tracks the operational mode of that vector (msi vs. msix) Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-01pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULEPaul Gortmaker1-0/+1
They were implicitly getting it from device.h --> module.h but we want to clean that up. So add the minimal header for these macros. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-29drivers: Final irq namespace conversionThomas Gleixner1-5/+5
Scripted with coccinelle. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-23PCI: Add mask bit definition for MSI-X tableSheng Yang1-2/+3
Then we can use it instead of magic number 1. Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-18msi: Introduce default_[teardown|setup]_msi_irqs with fallback.Thomas Gleixner1-2/+12
Introduce an override for the arch_[teardown|setup]_msi_irqs that can be utilized to fallback to the default arch_* code. If a platform wants to utilize the code paths defined in driver/pci/msi.c it has to define HAVE_DEFAULT_MSI_TEARDOWN_IRQS or HAVE_DEFAULT_MSI_SETUP_IRQS. Otherwise the old mechanism of over-ridding the arch_* works fine. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: x86@kernel.org
2010-10-12pci: Cleanup the irq_desc mess in msiThomas Gleixner1-15/+9
Handing down irq_desc to msi just so that msi can access irq_desc.irq_data.msi_desc is a pretty stupid idea. The calling code can hand down a pointer to msi_desc so msi code does not need to know about the irq descriptor at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12pci: Convert msi to new irq_chip functionsThomas Gleixner1-7/+7
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Russell King <linux@arm.linux.org.uk>
2010-07-30PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()Ben Hutchings1-5/+42
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove unsafe and unnecessary hardware access" changed read_msi_msg_desc() to return the last MSI message written instead of reading it from the device, since it may be called while the device is in a reduced power state. However, the pSeries platform code really does need to read messages from the device, since they are initially written by firmware. Therefore: - Restore the previous behaviour of read_msi_msg_desc() - Add new functions get_cached_msi_msg{,_desc}() which return the last MSI message written - Use the new functions where appropriate Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: MSI: Remove unsafe and unnecessary hardware accessBen Hutchings1-23/+11
During suspend on an SMP system, {read,write}_msi_msg_desc() may be called to mask and unmask interrupts on a device that is already in a reduced power state. At this point memory-mapped registers including MSI-X tables are not accessible, and config space may not be fully functional either. While a device is in a reduced power state its interrupts are effectively masked and its MSI(-X) state will be restored when it is brought back to D0. Therefore these functions can simply read and write msi_desc::msg for devices not in D0. Further, read_msi_msg_desc() should only ever be used to update a previously written message, so it can always read msi_desc::msg and never needs to touch the hardware. Tested-by: "Michael Chan" <mchan@broadcom.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30PCI: fix wrong memory address handling in MSI-XKenji Kaneshige1-1/+1
Use resource_size_t for MMIO address instead of unsigned long. Otherwise, higher 32-bits of MMIO address are cleared unexpectedly in x86-32 PAE. Acked-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2009-09-10PCI MSI: Style cleanupsHidetoshi Seto1-22/+22
Cleanups (nearly based on checkpatch). Before: total: 11 errors, 2 warnings, 0 checks, 842 lines checked After: total: 0 errors, 0 warnings, 0 checks, 842 lines checked v2: fix it's/its mistakes in comment Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: MSI-X cleanup, msix_setup_entries()Hidetoshi Seto1-23/+36
Cleanup based on the prototype from Matthew Milcox. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: MSI-X cleanup, msix_program_entries()Hidetoshi Seto1-10/+19
Cleanup based on the prototype from Matthew Milcox. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: MSI-X cleanup, msix_map_region()Hidetoshi Seto1-13/+19
Cleanup based on the prototype from Matthew Milcox. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: Relocate error path in init_msix_capability()Hidetoshi Seto1-18/+22
Move it from the middle of the function to the end. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: Unify msi_free_irqs() and msix_free_all_irqs()Hidetoshi Seto1-43/+31
Unify msi_free_irqs() and msix_free_all_irqs(), and rename it to a common void function free_msi_irqs(). And relocate the common function to where the prototype is located now. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: Use list_first_entry()Hidetoshi Seto1-1/+1
use list_first_entry() instead of list_entry(). Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI MSI: Remove attribute check from pci_disable_msi()Hidetoshi Seto1-8/+1
The msi_list never have MSI-X's msi_desc while MSI is enabled, and also it never have MSI's msi_desc while MSI-X is enabled. This patch remove check for MSI-X entry from the pci_disable_msi(), referring that pci_disable_msix() does not have any check for MSI entry. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29PCI MSI: Fix restoration of MSI/MSI-X mask states in suspend/resumeHidetoshi Seto1-7/+28
There are 2 problems on mask states in suspend/resume. [1]: It is better to restore the mask states of MSI/MSI-X to initial states (MSI is unmasked, MSI-X is masked) when we release the device. The pci_msi_shutdown() does the restoration of mask states for MSI, while the msi_free_irqs() does it for MSI-X. In other words, in the "disable" path both of MSI and MSI-X are handled, but in the "shutdown" path only MSI is handled. MSI: pci_disable_msi() => pci_msi_shutdown() [ mask states for MSI restored ] => msi_set_enable(dev, pos, 0); => msi_free_irqs() MSI-X: pci_disable_msix() => pci_msix_shutdown() => msix_set_enable(dev, 0); => msix_free_all_irqs => msi_free_irqs() [ mask states for MSI-X restored ] This patch moves the masking for MSI-X from msi_free_irqs() to pci_msix_shutdown(). This change has some positive side effects: - It prevents OS from touching mask states before reading preserved bits in the register, which can be happen if msi_free_irqs() is called from error path in msix_capability_init(). - It also prevents touching the register after turning off MSI-X in "disable" path, which can be a problem on some devices. [2]: We have cache of the mask state in msi_desc, which is automatically updated when msi/msix_mask_irq() is called. This cached states are used for the resume. But since what need to be restored in the resume is the states before the shutdown on the suspend, calling msi/msix_mask_irq() from pci_msi/msix_shutdown() is not appropriate. This patch introduces __msi/msix_mask_irq() that do mask as same as msi/msix_mask_irq() but does not update cached state, for use in pci_msi/msix_shutdown(). [updated: get rid of msi/msix_mask_irq_nocache() (proposed by Matthew Wilcox)] Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29PCI MSI: Unmask MSI if setup failedHidetoshi Seto1-0/+1
The initial state of mask register of MSI is unmasked. We set it masked before calling arch_setup_msi_irqs(). If arch_setup_msi_irq() fails, it is better to restore the state of the mask register. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29PCI MSI: shorten PCI_MSIX_ENTRY_* symbol namesHidetoshi Seto1-10/+8
These names are too long! Drop _OFFSET to save some bytes/lines. Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29PCI MSI: Return if alloc_msi_entry for MSI-X failedHidetoshi Seto1-2/+8
In current code it continues setup even if alloc_msi_entry() for MSI-X is failed due to lack of memory. It means arch_setup_msi_irqs() might be called with msi_desc entries less than its argument nvec. At least x86's arch_setup_msi_irqs() uses list_for_each_entry() for dev->msi_list that suspected to have entries same numbers as nvec, and it doesn't check the number of allocated vectors and passed arg nvec. Therefore it will result in success of pci_enable_msix(), with less vectors allocated than requested. This patch fixes the error route to return -ENOMEM, instead of continuing the setup (proposed by Matthew Wilcox). Note that there is no iounmap in msi_free_irqs() if no msi_disc is allocated. Reviewed-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-20PCI: make msi_free_irqs() to use msix_mask_irq() instead of open coded writeHidetoshi Seto1-4/+1
Use msix_mask_irq() instead of direct use of writel, so as not to clear preserved bits in the Vector Control register [31:1]. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-20PCI: Fix the NIU MSI-X problem in a better wayMatthew Wilcox1-18/+28
The previous MSI-X fix (8d181018532dd709ec1f789e374cda92d7b01ce1) had three bugs. First, it didn't move the write that disabled the vector. This led to writing garbage to the MSI-X vector (spotted by Michael Ellerman). It didn't fix the PCI resume case, and it had a race window where the device could generate an interrupt before the MSI-X registers were programmed (leading to a DMA to random addresses). Fortunately, the MSI-X capability has a bit to mask all the vectors. By setting this bit instead of clearing the enable bit, we can ensure the device will not generate spurious interrupts. Since the capability is now enabled, the NIU device will not have a problem with the reads and writes to the MSI-X registers being in the original order in the code. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>