summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide/ras.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/ras.rst')
-rw-r--r--Documentation/admin-guide/ras.rst22
1 files changed, 21 insertions, 1 deletions
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index d71340e86c27..1b90c6f00a92 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -81,7 +81,7 @@ That defines some categories of errors:
still run, eventually replacing the affected hardware by a hot spare,
if available.
- Also, when an error happens on an userspace process, it is also possible to
+ Also, when an error happens on a userspace process, it is also possible to
kill such process and let userspace restart it.
The mechanism for handling non-fatal errors is usually complex and may
@@ -438,11 +438,13 @@ A typical EDAC system has the following structure under
│   │   ├── ce_count
│   │   ├── ce_noinfo_count
│   │   ├── dimm0
+ │   │   │   ├── dimm_ce_count
│   │   │   ├── dimm_dev_type
│   │   │   ├── dimm_edac_mode
│   │   │   ├── dimm_label
│   │   │   ├── dimm_location
│   │   │   ├── dimm_mem_type
+ │   │   │   ├── dimm_ue_count
│   │   │   ├── size
│   │   │   └── uevent
│   │   ├── max_location
@@ -457,11 +459,13 @@ A typical EDAC system has the following structure under
│   │   ├── ce_count
│   │   ├── ce_noinfo_count
│   │   ├── dimm0
+ │   │   │   ├── dimm_ce_count
│   │   │   ├── dimm_dev_type
│   │   │   ├── dimm_edac_mode
│   │   │   ├── dimm_label
│   │   │   ├── dimm_location
│   │   │   ├── dimm_mem_type
+ │   │   │   ├── dimm_ue_count
│   │   │   ├── size
│   │   │   └── uevent
│   │   ├── max_location
@@ -483,6 +487,22 @@ this ``X`` memory module:
This attribute file displays, in count of megabytes, the memory
that this csrow contains.
+- ``dimm_ue_count`` - Uncorrectable Errors count attribute file
+
+ This attribute file displays the total count of uncorrectable
+ errors that have occurred on this DIMM. If panic_on_ue is set
+ this counter will not have a chance to increment, since EDAC
+ will panic the system.
+
+- ``dimm_ce_count`` - Correctable Errors count attribute file
+
+ This attribute file displays the total count of correctable
+ errors that have occurred on this DIMM. This count is very
+ important to examine. CEs provide early indications that a
+ DIMM is beginning to fail. This count field should be
+ monitored for non-zero values and report such information
+ to the system administrator.
+
- ``dimm_dev_type`` - Device type attribute file
This attribute file will display what type of DRAM device is