summaryrefslogtreecommitdiff
path: root/drivers/edac
AgeCommit message (Collapse)AuthorFilesLines
2010-10-24i7core_edac: Reduce args of i7core_register_mciHidetoshi Seto1-15/+9
We can check the number of channels in i7core_register_mci. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Introduce i7core_unregister_mciHidetoshi Seto1-32/+44
In i7core_probe, when setup of mci for 2nd or later socket failed, we should cleanup prepared mci for 1st socket or so before "put" of all devices. So let have i7core_unregister_mci that can be shared between here and i7core_remove. While here fix a typo "hanler". Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Use saved pointersHidetoshi Seto1-3/+2
We already have saved pointers. Use shorter ones. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Check probe counter in i7core_removeHidetoshi Seto1-0/+6
Prevent i7core_remove from running multiple times. Otherwise value proved will be negative and something will be wrong. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Call pci_dev_put() when alloc_i7core_dev() failedHidetoshi Seto1-1/+3
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Fix error path of i7core_register_mciHidetoshi Seto1-5/+11
Release resources properly. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Fix order of lines in i7core_register_mciHidetoshi Seto1-11/+9
The flag is_registered is not initialized until mci_bind_devs() is called. Refer it properly. The mci->dev and mci->edac_check is required in edac_mc_add_mc(), so prepare them just before the call. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Always do get/put for all devicesHidetoshi Seto1-13/+14
We already do 'get' for all sockets at once. So do 'put' in the same way. And let args of the 'get' function to void since it handles only the single, static and known size table pci_dev_table[]. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Introduce i7core_pci_ctl_create/releaseHidetoshi Seto1-20/+24
Have a couple of method. while here sort out lines in the i7core_register_mci() a bit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Introduce free_i7core_devHidetoshi Seto1-5/+9
Have a method to make a couple with alloc_i7core_dev() previously introduced. Using in pair will help proper resource handling. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Introduce alloc_i7core_devHidetoshi Seto1-10/+24
It's nice to have a method for a single purpose. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Reduce args of i7core_get_onedeviceHidetoshi Seto1-12/+9
Since we need to pass the index of the entry, pass the table itself instead of passing individual members of the table. While here make it static. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Fix the logic in i7core_remove()Hidetoshi Seto1-2/+2
commit 47251b4d960bdfa648b0d06dbc6d445f41cb3906 have changed the logic for unexplained reasons. It looks strange that it can release i7core_dev without calling i7core_put_devices() that releases i7core_dev->pdev. Fix the part. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Don't do the legacy PCI probe by defaultMauro Carvalho Chehab1-1/+6
The legacy PCI probe sometimes cause hangs. Better to have it disabled by default, and have a parameter to enable it. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: don't use a freed mci structMauro Carvalho Chehab3-5/+4
This is a nasty bug. Since kobject count will be reduced by zero by edac_mc_del_mc(), and this triggers the kobj release method, the mci memory will be freed automatically. So, all we have left is ctl_name, as shown by enabling debug: [ 80.822186] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link [ 80.832590] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance [ 80.843776] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing [ 80.855163] EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:3f:03.0 [ 80.862936] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 2089: (null): free structs [ 80.871134] EDAC DEBUG: in drivers/edac/edac_mc.c, line at 238: edac_mc_free() [ 80.878379] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 726: edac_mc_unregister_sysfs_main_kobj() [ 80.888043] EDAC DEBUG: in drivers/edac/i7core_edac.c, line at 1232: drivers/edac/i7core_edac.c: i7core_put_devices() Also, kfree(mci) shouldn't happen at the kobj.release, as it happens when edac_remove_sysfs_mci_device() is called, but the logic is: edac_remove_sysfs_mci_device(mci); edac_printk(KERN_INFO, EDAC_MC, "Removed device %d for %s %s: DEV %s\n", mci->mc_idx, mci->mod_name, mci->ctl_name, edac_dev_name(mci)); So, as the edac_printk() needs the mci struct, this generates an OOPS. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24edac_core: Print debug messages at release callsMauro Carvalho Chehab3-0/+5
This is important to track a nasty bug at the free logic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24edac_core: Don't let free(mci) happen while using itMauro Carvalho Chehab1-2/+0
A very nasty bug were happening on edac core, due to the way mci objects are freed. mci memory is freed when kobject count reaches zero, by edac_mci_control_release(). However, from the logs, this is clearly happening before the final usage of mci struct: [15799.607454] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 640: edac_mci_control_release() mci instance idx=0 releasing [15799.618773] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 769: edac_inst_grp_release() [15799.627326] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 894: edac_remove_mci_instance_attributes() end of seeking for group all_channel_counts [15799.640887] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 877: edac_remove_mci_instance_attributes() sysfs_attrib = ffffffffa01d7240 [15799.653412] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1020: edac_remove_sysfs_mci_device() remove_link [15799.663753] EDAC DEBUG: in drivers/edac/edac_mc_sysfs.c, line at 1024: edac_remove_sysfs_mci_device() remove_mci_instance Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24edac_core: Do a better job with node removalMauro Carvalho Chehab2-29/+43
Make sure we remove groups at the right order Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: explicitly remove PCI devices from the devices listMauro Carvalho Chehab1-4/+6
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: MCE NMI handling should stop firstMauro Carvalho Chehab1-1/+8
Otherwise, a NMI may happen causing a race condition and a panic. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Initialize all priv vars before start pollingMauro Carvalho Chehab1-12/+12
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Improve debug to seek for register/remove errorsMauro Carvalho Chehab1-2/+9
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: move #if PAGE_SHIFT to edac_core.hMauro Carvalho Chehab2-5/+3
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Properly mark const static vars as suchMauro Carvalho Chehab3-46/+83
There are two groups of sysfs attributes: one for rdimm and another for udimm. Instead of changing dynamically the unique static struct for handling udimm's, declare two vars and make them constant. This avoids the risk of having two or more memory controllers, each needing a different set of attributes. While here, use const on all places where it is applicable. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> edac_core: use const for constant sysfs arguments Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: move static vars to the beginning of the fileMauro Carvalho Chehab1-6/+5
While here, don't initialize probed with 0. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-24i7core_edac: Be sure that the edac pci handler will be properly releasedMauro Carvalho Chehab3-17/+28
With multi-sockets, more than one edac pci handler is enabled. Be sure to un-register all instances. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-10-22Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds15-758/+1014
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (21 commits) EDAC, MCE: Fix shift warning on 32-bit EDAC, MCE: Add a BIT_64() macro EDAC, MCE: Enable MCE decoding on F12h EDAC, MCE: Add F12h NB MCE decoder EDAC, MCE: Add F12h IC MCE decoder EDAC, MCE: Add F12h DC MCE decoder EDAC, MCE: Add support for F11h MCEs EDAC, MCE: Enable MCE decoding on F14h EDAC, MCE: Fix FR MCEs decoding EDAC, MCE: Complete NB MCE decoders EDAC, MCE: Warn about LS MCEs on F14h EDAC, MCE: Adjust IC decoders to F14h EDAC, MCE: Adjust DC decoders to F14h EDAC, MCE: Rename files EDAC, MCE: Rework MCE injection EDAC: Export edac sysfs class to users. EDAC, MCE: Pass complete MCE info to decoders EDAC, MCE: Sanitize error codes EDAC, MCE: Remove unused function parameter EDAC, MCE: Add HW_ERR prefix ...
2010-10-22Merge branch 'x86-amd-nb-for-linus' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd_nb: Enable GART support for AMD family 0x15 CPUs x86, amd: Use compute unit information to determine thread siblings x86, amd: Extract compute unit information for AMD CPUs x86, amd: Add support for CPUID topology extension of AMD CPUs x86, nmi: Support NMI watchdog on newer AMD CPU families x86, mtrr: Assume SYS_CFG[Tom2ForceMemTypeWB] exists on all future AMD CPUs x86, k8: Rename k8.[ch] to amd_nb.[ch] and CONFIG_K8_NB to CONFIG_AMD_NB x86, k8-gart: Decouple handling of garts and northbridges x86, cacheinfo: Fix dependency of AMD L3 CID x86, kvm: add new AMD SVM feature bits x86, cpu: Fix allowed CPUID bits for KVM guests x86, cpu: Update AMD CPUID feature bits x86, cpu: Fix renamed, not-yet-shipping AMD CPUID feature bit x86, AMD: Remove needless CPU family check (for L3 cache info) x86, tsc: Remove CPU frequency calibration on AMD
2010-10-21EDAC, MCE: Fix shift warning on 32-bitBorislav Petkov1-1/+1
Fix drivers/edac/mce_amd.c:262: warning: left shift count >= width of type on 32-bit builds. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add a BIT_64() macroBorislav Petkov1-0/+2
Add a macro for 64-bit vectors to use when accessing MSR contents. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Enable MCE decoding on F12hBorislav Petkov1-1/+1
Turn on MCE decoding on F12h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h NB MCE decoderBorislav Petkov1-2/+3
F12h is completely covered by the generic path. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h IC MCE decoderBorislav Petkov1-0/+1
... which is the same as for K8 and F10h. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add F12h DC MCE decoderBorislav Petkov1-7/+17
F12h DC MCE signatures are a subset of F10h's so reuse them. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add support for F11h MCEsBorislav Petkov1-3/+12
F11h has almost the same MCE signatures as K8 except DRAM ECC and MC5 bank errors. Reuse functionality from the other families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Enable MCE decoding on F14hBorislav Petkov1-4/+5
Now that all decoders have been taught about F14h, models < 0x10 MCEs, enable decoding on this family of CPUs. Also, issue a short informational message upon boot that MCE decoding gets enabled. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Fix FR MCEs decodingBorislav Petkov1-4/+10
Those are N/A on K8, so don't decode them there. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Complete NB MCE decodersBorislav Petkov3-56/+158
Add support for decoding F14h BU MCEs and improve decoding of the remaining families. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Warn about LS MCEs on F14hBorislav Petkov1-5/+13
F14h CPUs do not generate LS MCEs so exit early and warn the user in case this path is ever hit that something else might be going haywire. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust IC decoders to F14hBorislav Petkov2-48/+71
Add support for IC MCEs for F14h CPUs. K8 and F10h are almost identical so use one function for both. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Adjust DC decoders to F14hBorislav Petkov2-27/+171
Add a per-family data cache decoders. Since there is a certain overlap between the different DC MCE signatures, reuse functionality between the families as far as possible. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Rename filesBorislav Petkov5-3/+4
Drop "edac_" string from the filenames since they're prefixed with edac/ in their pathname anyway. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Rework MCE injectionBorislav Petkov7-207/+203
Add sysfs injection facilities for testing of the MCE decoding code. Remove large parts of amd64_edac_dbg.c, as a result, which did only NB MCE injection anyway and the new injection code supports that functionality already. Add an injection module so that MCE decoding code in production kernels like those in RHEL and SLES can be tested. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC: Export edac sysfs class to users.Borislav Petkov6-95/+71
Move toplevel sysfs class to the stub and make it available to non-modularized code too. Add proper refcounting of its users and move the registration functionality into the reference counting routines. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Pass complete MCE info to decodersBorislav Petkov4-47/+56
... instead of the MCi_STATUS info only for improved handling of certain types of errors later. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Sanitize error codesBorislav Petkov2-49/+19
Clean up error codes names, shorten to mnemonics, add RRRR boundary checking. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Remove unused function parameterBorislav Petkov3-7/+4
Remove remains from previous functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC, MCE: Add HW_ERR prefixBorislav Petkov1-17/+17
.. so that the user knows what she's looking at there in dmesg. Also, fix a minor cosmetic output inconsistency. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-21EDAC: Fix error returnBorislav Petkov1-1/+1
We should return a negative value when we cannot get the toplevel edac sysfs class. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2010-10-18Update broken web addresses in the kernel.Justin P. Mattock2-2/+2
The patch below updates broken web addresses in the kernel Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Finn Thain <fthain@telegraphics.com.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com> Cc: Mike Frysinger <vapier.adi@gmail.com> Acked-by: Ben Pfaff <blp@cs.stanford.edu> Acked-by: Hans J. Koch <hjk@linutronix.de> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Jiri Kosina <jkosina@suse.cz>