summaryrefslogtreecommitdiff
path: root/drivers/pci/pcie/aer/aerdrv_errprint.c
AgeCommit message (Collapse)AuthorFilesLines
2018-06-11PCI/AER: Squash aerdrv_errprint.c into aerdrv.cBjorn Helgaas1-260/+0
Squash aerdrv_errprint.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-03PCI/AER: Decode Error Source Requester IDBjorn Helgaas1-7/+11
Decode the Requester ID from the AER Error Source Register into domain/ bus/device/function format to match other logging. In cases where the ID matches the device used for pci_err(), drop the extra ID completely so we don't print it twice. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-10PCI/AER: Add TLP header information to tracepointThomas Tai1-2/+2
When a PCIe AER error occurs, the TLP header information is printed in the kernel message but it is missing from the tracepoint. A userspace program can use this information in the tracepoint to better analyze problems. To enable the tracepoint: echo 1 > /sys/kernel/debug/tracing/events/ras/aer_event/enable Example tracepoint output: $ cat /sys/kernel/debug/tracing/trace aer_event: 0000:01:00.0 PCIe Bus Error: severity=Uncorrected, non-fatal, Completer Abort TLP Header={0x0,0x1,0x2,0x3} Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-05-08PCI/AER: Unify error bit printing for native and CPER reportingAlexandru Gagniuc1-7/+9
AER errors can be reported natively (Linux AER driver fields interrupts and reads error state directly from hardware) or via the ACPI/APEI/GHES/CPER path (platform firmware reads error state from hardware and sends it to Linux via ACPI interfaces). Previously the same error would produce different output depending on whether it was reported natively or via ACPI. The CPER path resulted in hard-to-understand messages, without a prefix. Instead use __aer_print_error() for both native AER and CPER to provide a more consistent log format. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19PCI: Tidy commentsBjorn Helgaas1-3/+0
Remove pointless comments that tell us the file name, remove blank line comments, follow multi-line comment conventions. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-02-01Merge branch 'pci/spdx' into nextBjorn Helgaas1-4/+1
* pci/spdx: PCI: Add SPDX GPL-2.0+ to replace implicit GPL v2 or later statement PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate PCI: Add SPDX GPL-2.0 to replace COPYING boilerplate PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate PCI: Add SPDX GPL-2.0 when no license was specified
2018-01-29PCI: Add SPDX GPL-2.0 to replace COPYING boilerplateBjorn Helgaas1-4/+1
Add SPDX GPL-2.0 to all PCI files that referred to the kernel default "COPYING" file, which specifies GPL version 2. Remove the boilerplate language referring to the GPL and "COPYING", relying on the assertion in b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") that the SPDX identifier may be used instead of the full boilerplate text. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-18PCI: Add wrappers for dev_printk()Frederick Lawler1-11/+11
Add PCI-specific dev_printk() wrappers and use them to simplify the code slightly. No functional change intended. Signed-off-by: Frederick Lawler <fred@fredlawl.com> [bhelgaas: squash into one patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-09-20PCI/AER: Remove duplicate AER severity translationTyler Baicar1-4/+2
Currently the AER severity is being translated twice in the code flow for PCIe errors. It is first translated in ghes_do_proc() before calling into the AER driver. Then it is translated again when the AER driver calls cper_print_aer(). This causes the severity that is used in cper_print_aer() to be incorrect. Remove the second translation that is in cper_print_aer() since this function is already receiving the correct AER severity. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Borislav Petkov <bp@suse.de>
2015-03-06PCI/AER: Avoid info leak in __print_tlp_header()Rasmus Villemoes1-10/+2
Commit fab4c256a58b ("PCI/AER: Add a TLP header print helper") introduced the helper function __print_tlp_header(), but contrary to the intention, the behaviour did change: Since we're taking the address of the parameter t, the first 4 or 8 bytes printed will be the value of the pointer t itself, and the remaining 12 or 8 bytes will be who-knows-what (something from the stack). We want to show the values of the four members of the struct aer_header_log_regs; that can be done without ugly and error-prone casts. On little-endian this should produce the same output as originally intended, and since no-one has complained about getting garbage output so far, I think big-endian should be ok too. Fixes: fab4c256a58b ("PCI/AER: Add a TLP header print helper") Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Borislav Petkov <bp@suse.de> CC: stable@vger.kernel.org # v3.14+
2014-09-25PCI/AER: Add additional PCIe AER error stringsChen, Gong1-2/+9
Add strings for all AER error bits defined in PCIe r3.0. [bhelgaas: changelog, drop designated initializer change] Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-23trace, AER: Move trace into unified interfaceChen, Gong1-3/+1
AER uses a separate trace interface by now. To make it consistent, move it into unified RAS trace interface. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-06-11PCI: Merge multi-line quoted stringsRyan Desfosses1-7/+3
Merge quoted strings that are broken across lines into a single entity. The compiler merges them anyway, but checkpatch complains about it, and merging them makes it easier to grep for strings. No functional change. [bhelgaas: changelog, do the same for everything under drivers/pci] Signed-off-by: Ryan Desfosses <ryan@desfo.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-12-14PCI/AER: Clean up error printing code a bitBorislav Petkov1-24/+27
Save one indentation level in aer_print_error() for the generic case where we have info->status of an error, disregard 80 cols rule a bit for the sake of better readability, fix alignment. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-12-14PCI/AER: Add a TLP header print helperBorislav Petkov1-23/+21
... and call it instead of duplicating the large printk format statement. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-05-30aerdrv: Move cper_print_aer() call out of interrupt contextLance Ortiz1-2/+2
The following warning was seen on 3.9 when a corrected PCIe error was being handled by the AER subsystem. WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90() This occurred because a call to pci_get_domain_bus_and_slot() was added to cper_print_pcie() to setup for the call to cper_print_aer(). The warning showed up because cper_print_pcie() is called in an interrupt context and pci_get* functions are not supposed to be called in that context. The solution is to move the cper_print_aer() call out of the interrupt context and into aer_recover_work_func() to avoid any warnings when calling pci_get* functions. Signed-off-by: Lance Ortiz <lance.ortiz@hp.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-01-04aerdrv: Cleanup log output for AERLance Ortiz1-28/+26
These changes make cper_print_aer more consistent with aer_print_error and clean things up by elimiating the use of the prefix variable and replacing it with dev_printk. Signed-off-by: Lance Ortiz <lance.ortiz@hp.com> Acked-by: Boris Petkov <bp@alien8.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-01-04aerdrv: Enhanced AER loggingLance Ortiz1-1/+8
This patch will provide a more reliable and easy way for user-space applications to have access to AER logs rather than reading them from the message buffer. It also provides a way to notify user-space when an AER event occurs. The aer driver is updated to generate a trace event of function 'aer_event' when a PCIe error is reported over the AER interface. The trace event was added to both the interrupt based aer path and the firmware first path. Signed-off-by: Lance Ortiz <lance.ortiz@hp.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Boris Petkov <bp@alien8.de> Signed-off-by: Tony Luck <tony.luck@intel.com>
2011-07-22PCI: PCIe AER: add aer_recover_queueHuang Ying1-1/+2
In addition to native PCIe AER, now APEI (ACPI Platform Error Interface) GHES (Generic Hardware Error Source) can be used to report PCIe AER errors too. To add support to APEI GHES PCIe AER recovery, aer_recover_queue is added to export the recovery function in native PCIe AER driver. Recoverable PCIe AER errors are reported via NMI in APEI GHES. Then APEI GHES uses irq_work to delay the error processing into an IRQ handler. But PCIe AER recovery can be very time-consuming, so aer_recover_queue, which can be used in IRQ handler, delays the real recovery action into the process context, that is, work queue. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-03-22ACPI, APEI, Add PCIe AER error information printing supportHuang Ying1-0/+59
The AER error information printing support is implemented in drivers/pci/pcie/aer/aer_print.c. So some string constants, functions and macros definitions can be re-used without being exported. The original PCIe AER error information printing function is not re-used directly because the overall format is quite different. And changing the original printing format may make some original users' scripts broken. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: Jesse Barnes <jbarnes@virtuousgeek.org> CC: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2011-03-22PCIe, AER, use pre-generated prefix in error information printingHuang Ying1-75/+48
When printing PCIe AER error information, each line is prefixed with PCIe device and driver information. In original implementation, the prefix is generated when each line is printed. In fact, all lines share the same prefix. So this patch pre-generated the prefix, and use that one when each line is printed. In addition to common prefix can be pre-generated, the trailing white spaces in string constants and NULLs in char * array constants can be removed too. These can reduce the object file size further. The size of object file before and after changing is as follow: text data bss dec before: 3038 0 0 3038 after: 2118 0 0 2118 Signed-off-by: Huang Ying <ying.huang@intel.com> CC: Jesse Barnes <jbarnes@virtuousgeek.org> CC: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-12-17PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)Stefan Assmann1-2/+2
Changing occurrences of variants of PCI-X and PCIe to the PCI-SIG terms listed in the "Trademark and Logo Usage Guidelines". http://www.pcisig.com/developers/procedures/logos/Trademark_and_Logo_Usage_Guidelines_updated_112206.pdf Patch is limited to drivers/pci/ and changes concern non-comment parts or anything that might be visible to the user. Signed-off-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: change error print formatHidetoshi Seto1-32/+36
Use dev_printk like format. Sample (real machine + dummy error injected by aer-inject): - Before: +------ PCI-Express Device Error ------+ Error Severity : Corrected PCIE Bus Error type : Data Link Layer Bad TLP : Receiver ID : 2800 VendorID=8086h, DeviceID=1096h, Bus=28h, Device=00h, Function=00h +------ PCI-Express Device Error ------+ Error Severity : Corrected PCIE Bus Error type : Data Link Layer Bad TLP : Bad DLLP : Receiver ID : 2801 VendorID=8086h, DeviceID=1096h, Bus=28h, Device=00h, Function=01h Error of this Agent(2801) is reported first - After: pcieport-driver 0000:00:02.0: AER: Multiple Corrected error received: id=2801 e1000e 0000:28:00.0: PCIE Bus Error: severity=Corrected, type=Data Link Layer, id=2800(Receiver ID) e1000e 0000:28:00.0: device [8086:1096] error status/mask=00000040/00000000 e1000e 0000:28:00.0: [ 6] Bad TLP e1000e 0000:28:00.1: PCIE Bus Error: severity=Corrected, type=Data Link Layer, id=2801(Receiver ID) e1000e 0000:28:00.1: device [8086:1096] error status/mask=000000c0/00000000 e1000e 0000:28:00.1: [ 6] Bad TLP e1000e 0000:28:00.1: [ 7] Bad DLLP e1000e 0000:28:00.1: Error of this Agent(2801) is reported first Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: flags to bitsHidetoshi Seto1-3/+3
Compact struct and codes. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: report multiple/first error on a deviceHidetoshi Seto1-4/+4
Multiple bits might be set in the Uncorrectable Error Status register. But aer_print_error_source() only report a error of the lowest bit set in the error status register. So print strings for all bits unmasked and set. And check First Error Pointer to mark the error occured first. This FEP is not valid when the corresponing bit of the Uncorrectable Error Status register is not set, or unimplemented or undefined. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: refer mask state in mask register properlyHidetoshi Seto1-2/+4
ERR_{,UN}CORRECTABLE_ERROR_MASK are set of error bits which linux know, set of PCI_ERR_COR_* and PCI_ERR_UNC_* defined in linux/pci_regs.h. This masks make aerdrv not to report errors of unknown bit, while aerdrv have ability to report such undefined errors as "Unknown Error Bit %2d". OTOH aerdrv_errprint does not have any check of setting in mask register. So it could report masked wrong error by finding bit in status without knowing that the bit is masked in the mask register. This patch changes aerdrv to use mask state in mask register propely instead of defined/hardcoded ERR_{,UN}CORRECTABLE_ERROR_MASK. This change prevents aerdrv from reporting masked error, and also enable reporting unknown errors. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: remove spinlock in aerdrv_errprint.cHidetoshi Seto1-20/+8
The static buffer errmsg_buff[] is used only for building error message in fixed format, and is protected by a spinlock. This patch removes this buffer and the spinlock. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: fix report of multiple errorsHidetoshi Seto1-11/+8
The flag AER_MULTI_ERROR_VALID_FLAG in info->flag does mean that the root port receives multiple error messages. Error messages can be posted from different devices, so it does not mean that each reported device has multiple errors. If there are multiple error devices and the root port has valid error source ID, it would be nice to report which device is the error source reported first. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: rework MASK macros in aerdrv_errprint.cHidetoshi Seto1-25/+21
Definitions of MASK macros in aerdrv_errprint.c are tricky and unsafe. For example, AER_AGENT_TRANSMITTER_MASK(_sev, _stat) does work like: static inline func(int _sev, int _stat) { if (_sev == AER_CORRECTABLE) return (_stat & (PCI_ERR_COR_REP_ROLL|PCI_ERR_COR_REP_TIMER)); else return (_stat & PCI_ERR_COR_REP_ROLL); } In case of else path here, for uncorrectable errors, testing bits in _stat by PCI_ERR_COR_* does not make sense because _stat should have only PCI_ERR_UNC_* bits originated in uncorrectable error status register. But at this time this is safe because uncorrectable error using bit position same to PCI_ERR_COR_REP_ROLL(= bit position 8) is not defined. Likewise, AER_AGENT_COMPLETER_MASK is always PCI_ERR_UNC_COMP_ABORT but it works because bit 15 of correctable error status is not defined. It means that these MASK macros will turn to be wrong once if new error is defined. (In fact, bit 15 of correctable is now defined in PCIe 2.1) This patch changes these MASK macros to be more strict, not to return PCI_ERR_COR_* bits for uncorrectable error status and vise versa. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: AER_PR for printing in aerdrv_errprint.cHidetoshi Seto1-19/+15
Add workaround macro to reduce the number of checkpatch warning: WARNING: printk() should include KERN_ facility level Before: total: 0 errors, 10 warnings, 247 lines checked After: total: 0 errors, 1 warnings, 243 lines checked Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-10PCI: pcie, aer: checkpatch style cleanup in pcie/aer/*Hidetoshi Seto1-25/+24
Before: drivers/pci/pcie/aer/aer_inject.c total: 4 errors, 4 warnings, 473 lines checked drivers/pci/pcie/aer/aerdrv.c total: 5 errors, 2 warnings, 333 lines checked drivers/pci/pcie/aer/aerdrv.h total: 1 errors, 0 warnings, 139 lines checked drivers/pci/pcie/aer/aerdrv_core.c total: 4 errors, 3 warnings, 872 lines checked drivers/pci/pcie/aer/aerdrv_errprint.c total: 12 errors, 11 warnings, 248 lines checked After: drivers/pci/pcie/aer/aer_inject.c total: 0 errors, 0 warnings, 466 lines checked drivers/pci/pcie/aer/aerdrv.c total: 0 errors, 0 warnings, 335 lines checked drivers/pci/pcie/aer/aerdrv.h total: 0 errors, 0 warnings, 139 lines checked drivers/pci/pcie/aer/aerdrv_core.c total: 0 errors, 0 warnings, 869 lines checked drivers/pci/pcie/aer/aerdrv_errprint.c total: 0 errors, 10 warnings, 247 lines checked Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Reviewed-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07PCI hotplug: aerdrv: fix a typo in error messageHidetoshi Seto1-1/+1
"TLP" is an acronym for "Transaction Layer Packet." Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2006-09-27PCI-Express AER implemetation: AER core and aerdriverZhang, Yanmin1-0/+248
Patch 3 implements the core part of PCI-Express AER and aerdrv port service driver. When a root port service device is probed, the aerdrv will call request_irq to register irq handler for AER error interrupt. When a device sends an PCI-Express error message to the root port, the root port will trigger an interrupt, by either MSI or IO-APIC, then kernel would run the irq handler. The handler collects root error status register and schedules a work. The work will call the core part to process the error based on its type (Correctable/non-fatal/fatal). As for Correctable errors, the patch chooses to just clear the correctable error status register of the device. As for the non-fatal error, the patch follows generic PCI error handler rules to call the error callback functions of the endpoint's driver. If the device is a bridge, the patch chooses to broadcast the error to downstream devices. As for the fatal error, the patch resets the pci-express link and follows generic PCI error handler rules to call the error callback functions of the endpoint's driver. If the device is a bridge, the patch chooses to broadcast the error to downstream devices. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>