summaryrefslogtreecommitdiff
path: root/drivers/edac/amd64_edac.h
AgeCommit message (Collapse)AuthorFilesLines
2020-05-22EDAC/amd64: Add AMD family 17h model 60h PCI IDsAlexander Monakov1-0/+3
Add support for AMD Renoir (4000-series Ryzen CPUs). Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Yazen Ghannam <yazen.ghannam@amd.com> Link: https://lkml.kernel.org/r/20200510204842.2603-4-amonakov@ispras.ru
2020-01-16EDAC/amd64: Add family ops for Family 19h Models 00h-0FhYazen Ghannam1-0/+3
Add family ops to support AMD Family 19h systems. Existing Family 17h functions can be used. Also, add Family 19h to the list of families to automatically load the module. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200110015651.14887-5-Yazen.Ghannam@amd.com
2019-11-06EDAC/amd64: Save max number of controllers to family typeYazen Ghannam1-0/+2
The maximum number of memory controllers is fixed within a family/model group. In most cases, this has been fixed at 2, but some systems may have up to 8. The struct amd64_family_type already contains family/model-specific information, and this can be used rather than adding model checks to various functions. Create a new field in struct amd64_family_type for max_mcs. Set this when setting other family type information, and use this when needing the maximum number of memory controllers possible for a system. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Robert Richter <rrichter@marvell.com> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20191106012448.243970-4-Yazen.Ghannam@amd.com
2019-09-07EDAC/amd64: Add PCI device IDs for family 17h, model 70hIsaac Vaughn1-0/+3
Add the new Family 17h Model 70h PCI IDs (device 18h functions 0 and 6) to the AMD64 EDAC module. [ bp: s/f17_base_addr_to_cs_size/f17_addr_mask_to_cs_size/g ] Signed-off-by: Isaac Vaughn <isaac.vaughn@knights.ucf.edu> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: James Morse <james.morse@arm.com> Cc: linux-edac@vger.kernel.org Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Robert Richter <rrichter@marvell.com> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20190906192131.8ced0ca112146f32d82b6cae@knights.ucf.edu
2019-08-23EDAC/amd64: Support asymmetric dual-rank DIMMsYazen Ghannam1-1/+2
Future AMD systems will support asymmetric dual-rank DIMMs. These are DIMMs where the ranks are of different sizes. The even rank will use the Primary Even Chip Select registers and the odd rank will use the Secondary Odd Chip Select registers. Recognize if a Secondary Odd Chip Select is being used. Use the Secondary Odd Address Mask when calculating the chip select size. [ bp: move csrow_sec_enabled() to the header, fix CS_ODD define and tone-down the capitalized words spelling. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20190821235938.118710-8-Yazen.Ghannam@amd.com
2019-08-23EDAC/amd64: Cache secondary Chip Select registersYazen Ghannam1-0/+4
AMD Family 17h systems have a set of secondary Chip Select Base Addresses and Address Masks. These do not represent unique Chip Selects, rather they are used in conjunction with the primary Chip Select registers in certain cases. Cache these secondary Chip Select registers for future use. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20190821235938.118710-7-Yazen.Ghannam@amd.com
2019-08-22EDAC/amd64: Support more than two controllers for chip selects handlingYazen Ghannam1-2/+3
The struct chip_select array that's used for saving chip select bases and masks is fixed at length of two. There should be one struct chip_select for each controller, so this array should be increased to support systems that may have more than two controllers. Increase the size of the struct chip_select array to eight, which is the largest number of controllers per die currently supported on AMD systems. Fix number of DIMMs and Chip Select bases/masks on Family17h, because AMD Family 17h systems support 2 DIMMs, 4 CS bases, and 2 CS masks per channel. Also, carve out the Family 17h+ reading of the bases/masks into a separate function. This effectively reverts the original bases/masks reading code to before Family 17h support was added. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20190821235938.118710-2-Yazen.Ghannam@amd.com
2019-04-25Revert "EDAC/amd64: Support more than two controllers for chip select handling"Borislav Petkov1-3/+2
This reverts commit 0a227af521d6df5286550b62f4b591417170b4ea. Unfortunately, this commit caused wrong detection of chip select sizes on some F17h client machines: --- 00-rc6+ 2019-02-14 14:28:03.126622904 +0100 +++ 01-rc4+ 2019-04-14 21:06:16.060614790 +0200 EDAC amd64: MC: 0: 0MB 1: 0MB -EDAC amd64: MC: 2: 16383MB 3: 16383MB +EDAC amd64: MC: 2: 0MB 3: 2097151MB EDAC amd64: MC: 4: 0MB 5: 0MB EDAC amd64: MC: 6: 0MB 7: 0MB EDAC MC: UMC1 chip selects: EDAC amd64: MC: 0: 0MB 1: 0MB -EDAC amd64: MC: 2: 16383MB 3: 16383MB +EDAC amd64: MC: 2: 0MB 3: 2097151MB EDAC amd64: MC: 4: 0MB 5: 0MB EDAC amd64: MC: 6: 0MB 7: 0M Revert it for now until it has been solved properly. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com>
2019-03-27EDAC/amd64: Support more than two controllers for chip select handlingYazen Ghannam1-2/+3
The struct chip_select array that's used for saving chip select bases and masks is fixed at length of two. There should be one struct chip_select for each controller, so this array should be increased to support systems that may have more than two controllers. Increase the size of the struct chip_select array to eight, which is the largest number of controllers per die currently supported on AMD systems. Also, carve out the Family 17h+ reading of the bases/masks into a separate function. This effectively reverts the original bases/masks reading code to before Family 17h support was added. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/20190228153558.127292-5-Yazen.Ghannam@amd.com
2019-03-27EDAC/amd64: Recognize x16 symbol sizeYazen Ghannam1-1/+1
Future AMD systems may support x16 symbol sizes. Recognize if a system is using x16 symbol size. Also, simplify the print statement. Note that a x16 syndrome vector table is not necessary like with x4 or x8 syndromes. This is because systems that support x16 symbol sizes are SMCA systems and in that case, the syndrome can be directly extracted from the MCA_SYND[Syndrome] field. [ bp: massage. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/20190228153558.127292-4-Yazen.Ghannam@amd.com
2019-03-27EDAC/amd64: Support more than two Unified Memory ControllersYazen Ghannam1-4/+2
The first few models of Family 17h all had 2 Unified Memory Controllers per Die, so this was treated as a fixed value. However, future systems may have more Unified Memory Controllers per Die. Related to this, the channel number and base address of a Unified Memory Controller were found by matching on fixed, known values. However, current and future systems follow this pattern for the channel number and base address of a Unified Memory Controller: 0xYXXXXX, where Y is the channel number. So matching on hardcoded values is not necessary. Set the number of Unified Memory Controllers at driver init time based on the family/model. Also, update the functions that find the channel number and base address of a Unified Memory Controller to support more than two. [ bp: Move num_umcs into the .c file and simplify comment. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/20190228153558.127292-3-Yazen.Ghannam@amd.com
2019-03-27EDAC/amd64: Add Family 17h Model 30h PCI IDsYazen Ghannam1-0/+3
Add the new Family 17h Model 30h PCI IDs to the AMD64 EDAC module. This also fixes a probe failure that appeared when some other PCI IDs for Family 17h Model 30h were added to the AMD NB code. Fixes: be3518a16ef2 (x86/amd_nb: Add PCI device IDs for family 17h, model 30h) Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kim Phillips <kim.phillips@amd.com> Cc: James Morse <james.morse@arm.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/20190228153558.127292-1-Yazen.Ghannam@amd.com
2018-08-27EDAC, amd64: Add Family 17h, models 10h-2fh supportMichael Jin1-0/+3
Add new device IDs for family 17h, models 10h-2fh. This is required by amd64_edac_mod in order to properly detect PCI device functions 0 and 6. Signed-off-by: Michael Jin <mikhail.jin@gmail.com> Reviewed-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20180816192840.31166-1-mikhail.jin@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-02-14EDAC, amd64: Bump driver versionBorislav Petkov1-1/+1
Last time we did that was when we enabled Bulldozer. Now, we enabled Zen so it is only natural ... :-) Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
2017-01-28EDAC, amd64: Add x86cpuid sanity check during initYazen Ghannam1-0/+1
Match one of the devices in amd64_cpuids[] before loading the module. This is an additional sanity check against users trying to load amd64_edac_mod on unsupported systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-9-git-send-email-Yazen.Ghannam@amd.com [ Get rid of err_ret label, make it a bit more readable this way. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-01-28EDAC, amd64: Remove unused printing macrosYazen Ghannam1-6/+0
amd64_{debug,notice} don't have any users, so remove them. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1485537863-2707-6-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2016-12-15edac: rename edac_core.h to edac_mc.hMauro Carvalho Chehab1-1/+1
Now, all left at edac_core.h are at drivers/edac/edac_mc.c, so rename it to edac_mc.h. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-12-01EDAC, amd64: Improve amd64-specific printing macrosBorislav Petkov1-2/+2
Prefix the warn and error macros with the respective string so that callers don't have to say "Error" or "Warning". We save us string length this way in the actual calls. While at it, shorten the calls in reserve_mc_sibling_devs(). Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
2016-11-29EDAC, amd64: Define and register UMC error decode functionYazen Ghannam1-0/+2
How we need to decode UMC errors is different from how we decode bus errors, so let's define a new function for this. We also need a way to determine the UMC channel since we're not guaranteed that there is a fixed relation between channel and MCA bank. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480359593-80369-1-git-send-email-Yazen.Ghannam@amd.com [ Fold in decode_synd_reg(), simplify. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-29EDAC, amd64: Add Fam17h debug outputYazen Ghannam1-0/+6
Read a few more UMC registers and provide debug output in order to be as similar as possible to older AMD systems. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1480344621-14966-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded K8 check and comments, fixup others. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-28EDAC, amd64: Add Fam17h scrubber supportYazen Ghannam1-0/+2
Fam17h has new register offsets and fields for setting up the DRAM scrubber so add support for this. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-17-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-28EDAC, amd64: Read MC registers on AMD Fam17hYazen Ghannam1-0/+13
Fam17h has a different set of registers and bitfields. Most of these registers are read through SMN (System Management Network) rather than PCI config space. Also, the derivation of various values is now different. Update amd64_edac to read the appropriate registers and extract the correct values for Fam17h. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-12-git-send-email-Yazen.Ghannam@amd.com [ Save us the indentation level in read_mc_regs(), add defines ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-28EDAC, amd64: Reserve correct PCI devices on AMD Fam17hYazen Ghannam1-1/+1
Fam17h needs PCI device functions 0 and 6 instead of 1 and 2 as on older systems. Update struct amd64_pvt to hold the new functions and reserve them if on Fam17h. Also, allocate an array of UMC structs within our newly allocated PVT struct. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-11-git-send-email-Yazen.Ghannam@amd.com [ init_one_instance() error handling, shorten lines, unbreak >80 cols lines. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-24EDAC, amd64: Add AMD Fam17h family type and opsYazen Ghannam1-1/+10
Add a family type and associated ops for Fam17h. Define a struct to hold all the UMC registers that we need. Make this a part of struct amd64_pvt in order to maximize code reuse in the rest of the driver. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-10-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2016-11-24EDAC, amd64: Extend ecc_enabled() to Fam17hYazen Ghannam1-0/+16
Update the ecc_enabled() function to work on Fam17h. This entails reading a different set of registers and using the SMN (System Management Network) rather than PCI devices. Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1479423463-8536-9-git-send-email-Yazen.Ghannam@amd.com [ Fixup ecc_en assignment and get_umc_base(). ] Signed-off-by: Borislav Petkov <bp@suse.de>
2016-05-09EDAC, amd64_edac: Drop pci_register_driver() useBorislav Petkov1-1/+1
- remove homegrown instances counting. - take F3 PCI device from amd_nb caching instead of F2 which was used with the PCI core. With those changes, the driver doesn't need to register a PCI driver and relies on the northbridges caching which we do anyway on AMD. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com>
2015-09-29EDAC, amd64_edac: Update copyright and remove changelogAravind Gopalakrishnan1-55/+1
Git provides us all the changelogs anyway. So trim the comments section here. Update the copyrights info while at it. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-29EDAC, amd64_edac: Extend scrub rate support to F15hM60hAravind Gopalakrishnan1-0/+2
The scrub rate control register has moved to function 2 in PCI config space and is at a different offset on family 0x15, models 0x60 and later. The minimum recommended scrub rate has also changed. (Refer to D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG). Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate this. Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com [ Cleanup conditionals. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-23EDAC: amd64: Use static attribute groupsTakashi Iwai1-22/+2
Instead of calling device_create_file() and device_remove_file() manually, pass the static attribute groups with the new edac_mc_add_mc_with_groups(). The conditional creation of inject sysfs files is done by a proper is_visible callback. Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: http://lkml.kernel.org/r/1423046938-18111-4-git-send-email-tiwai@suse.de Signed-off-by: Borislav Petkov <bp@suse.de>
2014-10-30amd64_edac: Add F15h M60h supportAravind Gopalakrishnan1-3/+12
This patch adds support for ECC error decoding for F15h M60h processor. Aside from the usual changes, the patch adds support for some new features in the processor: - DDR4(unbuffered, registered); LRDIMM DDR3 support - relevant debug messages have been modified/added to report these memory types - new dbam_to_cs mappers - if (F15h M60h && LRDIMM); we need a 'multiplier' value to find cs_size. This multiplier value is obtained from the per-dimm DCSM register. So, change the interface to accept a 'cs_mask_nr' value to facilitate this calculation - switch-casing determine_memory_type() - done to cleanse the function of too many if-else statements and improve readability - This is now called early in read_mc_regs() to cache dram_type Misc cleanup: - amd64_pci_table[] is condensed by using PCI_VDEVICE macro. Testing details: Tested the patch by injecting 'ECC' type errors using mce_amd_inj and error decoding works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1414617483-4941-1-git-send-email-Aravind.Gopalakrishnan@amd.com [ Boris: determine_memory_type() cleanups ] Signed-off-by: Borislav Petkov <bp@suse.de>
2014-09-23amd64_edac: Modify usage of amd64_read_dct_pci_cfg()Aravind Gopalakrishnan1-5/+0
Rationale behind this change: - F2x1xx addresses were stopped from being mapped explicitly to DCT1 from F15h (OR) onwards. They use _dct[0:1] mechanism to access the registers. So we should move away from using address ranges to select DCT for these families. - On newer processors, the address ranges used to indicate DCT1 (0x140, 0x1a0) have different meanings than what is assumed currently. Changes introduced: - amd64_read_dct_pci_cfg() now takes in dct value and uses it for 'selecting the dct' - Update usage of the function. Keep in mind that different families have specific handling requirements - Remove [k8|f10]_read_dct_pci_cfg() as they don't do much different from amd64_read_pci_cfg() - Move the k8 specific check to amd64_read_pci_cfg - Remove f15_read_dct_pci_cfg() and move logic to amd64_read_dct_pci_cfg() - Remove now needless .read_dct_pci_cfg Testing: - Tested on Fam 10h; Fam15h Models: 00h, 30h; Fam16h using 'EDAC_DEBUG' and mce_amd_inj - driver obtains info from F2x registers and caches it in pvt structures correctly - ECC decoding works fine Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1410799058-3149-1-git-send-email-aravind.gopalakrishnan@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2014-02-27amd64_edac: Add support for newer F16h modelsAravind Gopalakrishnan1-0/+3
Extend ECC decoding support for F16h M30h. Tested on F16h M30h with ECC turned on using mce_amd_inj module and the patch works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Link: http://lkml.kernel.org/r/1392913726-16961-1-git-send-email-Aravind.Gopalakrishnan@amd.com Tested-by: Arindam Nath <Arindam.Nath@amd.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2013-10-22bitops: Introduce a more generic BITMASK macroChen, Gong1-8/+0
GENMASK is used to create a contiguous bitmask([hi:lo]). It is implemented twice in current kernel. One is in EDAC driver, the other is in SiS/XGI FB driver. Move it to a more generic place for other usage. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Winischhofer <thomas@winischhofer.net> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-08-12amd64_edac: Get rid of boot_cpu_data accessesBorislav Petkov1-1/+3
Now that we cache (family, model, stepping) locally, use them instead of boot_cpu_data. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2013-08-12amd64_edac: Add ECC decoding support for newer F15h modelsAravind Gopalakrishnan1-4/+54
On newer models, support has been included for upto 4 DCT's, however, only DCT0 and DCT3 are currently configured (cf BKDG Section 2.10). Also, the routing DRAM Requests algorithm is different for F15h M30h. Thus it is cleaner to use a brand new function rather than adding quirks to the more generic f1x_match_to_this_node(). Refer to "2.10.5 DRAM Routing Requests" in the BKDG for further info. Tested on Fam15h M30h with ECC turned on using mce_amd_inj facility and verified to be functionally correct. While at it, verify if erratum workarounds for E505 and E637 still hold. From email conversations within AMD, the current status of the errata is: * Erratum 505: fixed in model 0x1, stepping 0x1 and later. * Erratum 637: not fixed. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Cleanups, corrections ] Signed-off-by: Borislav Petkov <bp@suse.de>
2013-04-19amd64_edac: Add Family 16h supportAravind Gopalakrishnan1-1/+3
Add code to handle DRAM ECC errors decoding for Fam16h. Tested on Fam16h with ECC turned on using the mce_amd_inj facility and works fine. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Boris: cleanups and clarifications ] Signed-off-by: Borislav Petkov <bp@suse.de>
2013-01-10amd64_edac: Fix type usage in NB IDs and memory rangesDaniel J Blueman1-3/+3
Use appropriate types for northbridge IDs and memory ranges. Mark immutable data const and keep within compilation unit on related structures. Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/1354265060-22956-2-git-send-email-daniel@numascale-asia.com [Boris: Drop arg change to node_to_amd_nb] Signed-off-by: Borislav Petkov <bp@alien8.de>
2013-01-10x86, AMD, NB: Add multi-domain supportDaniel J Blueman1-6/+0
Fix get_node_id to match northbridge IDs from the array of detected ones, allowing multi-server support such as with Numascale's NumaConnect, renaming to 'amd_get_node_id' for consistency. Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com> Link: http://lkml.kernel.org/r/1353997932-8475-1-git-send-email-daniel@numascale-asia.com [Boris: shorten lines to fit 80 cols] Signed-off-by: Borislav Petkov <bp@alien8.de>
2012-11-28amd64_edac: Use DBAM_DIMM macroBorislav Petkov1-1/+1
Instead of open-coding it, use the DBAM_DIMM macro in amd64_csrow_nr_pages() which we have already. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28amd64_edac: Reorganize error reporting pathBorislav Petkov1-1/+18
Rewrite CE/UE paths so that they use the same code and drop additional code duplication in handle_ue. Add a struct err_info which collects required info for the error reporting. This, in turn, helps slimming all edac_mc_handle_error() calls down to one. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28amd64_edac: Improve error injectionBorislav Petkov1-4/+19
When injecting DRAM ECC errors over the F3xB[8,C] interface, the machine does this by injecting the error in the next non-cached access. This takes relatively long time on a normal system so that in order for us to expedite it, we disable the caches around the injection. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-11-28amd64_edac: Cleanup error injection codeBorislav Petkov1-8/+9
Invert kstrtoul return value testing and win one indentation level. Also, shorten up macro names so that the lines can fit into 80 cols. No functional change. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-10-30EDAC: Change Boris' email addressBorislav Petkov1-1/+1
My @amd.com address will be invalid soon so move to private email address. Signed-off-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1351532410-4887-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-11amd64_edac: convert sysfs logic to use struct deviceMauro Carvalho Chehab1-7/+22
Now that the EDAC core supports struct device, there's no sense on having any logic at the EDAC core to simulate it. So, instead of adding such logic there, change the logic at amd64_edac to use it. Reviewed-by: Aristeu Rozanski <arozansk@redhat.com> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-04-26amd64_edac: Erratum #637 workaroundBorislav Petkov1-0/+1
F15h CPUs may report a non-DRAM address when reporting an error address belonging to a CC6 state save area. Add a workaround to detect this condition and compute the actual DRAM address of the error as documented in the Revision Guide for AMD Family 15h Models 00h-0Fh Processors. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-04-26amd64_edac: Factor in CC6 save areaBorislav Petkov1-0/+2
F15h and later use a portion of DRAM as a CC6 storage area. BIOS programs D18F1x[17C:140,7C:40] DRAM Base/Limit accordingly by subtracting the storage area from the DRAM limit setting. However, in order for edac to consider that part of DRAM too, we need to include it into the per-node range. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Fix DRAM base macrosBorislav Petkov1-4/+4
Return unsigned u8 values only. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Fix node id signednessBorislav Petkov1-2/+2
A node id can never be negative since we use it as an index into the DRAM ranges array. This also makes one of the BUG_ON conditions redundant. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Drop redundant declarationsBorislav Petkov1-8/+0
Those were moved to the mce_amd.h header. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2011-03-17amd64_edac: Enable driver on F15hBorislav Petkov1-5/+3
Add the PCI device ids required for driver registration. Remove pvt->ctl_name and use the family descriptor directly, instead. Then, bump driver version and fixup its format. Finally, enable DRAM ECC decoding on F15h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>