Patchwork EDAC, skx_common: Add code to recognise new compound error code.

Submitter Luck, Tony
Date Feb. 5, 2019, 6:21 p.m.
Message ID <20190205182109.27828-1-tony.luck@intel.com>
Permalink /patch/718693/
State New

Comments

Luck, Tony - Feb. 5, 2019, 6:21 p.m.
New error code for systems that use DRAM as an extra level of cache.

New code looks like:

    000F 0010 1MMM CCCC

where the MMM and CCCC bits are used for the same purpose as the
original code. For this new class of errors the ADXL translation
will provide details of both the DIMM used as cache for the error
location and the component that is being cached.

Note: This new error code starts with Skylake. Older EDAC drivers do
not need to be updated.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---

[New SDM published today includes this new compound error code]

 drivers/edac/skx_common.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Borislav Petkov - Feb. 6, 2019, 10:10 a.m.
On Tue, Feb 05, 2019 at 10:21:09AM -0800, Tony Luck wrote:
> New error code for systems that use DRAM as an extra level of cache.
> 
> New code looks like:
> 
>     000F 0010 1MMM CCCC
> 
> where the MMM and CCCC bits are used for the same purpose as the
> original code. For this new class of errors the ADXL translation
> will provide details of both the DIMM used as cache for the error
> location and the component that is being cached.
> 
> Note: This new error code starts with Skylake. Older EDAC drivers do
> not need to be updated.
> 
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
> 
> [New SDM published today includes this new compound error code]
> 
>  drivers/edac/skx_common.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)

Applied, thanks.
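
For readers who want to try the mask check outside the kernel, here is a minimal standalone sketch of the test the patch extends. It is not the code in skx_common.c: the macro and function names are hypothetical, and the field positions (mmm in bits 6:4, cccc in bits 3:0) are an assumption read off the bit patterns quoted in the commit message. What mmm encodes is not shown in this excerpt, so the sketch only extracts it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values come from the patch. */
#define MEM_ERRCODE_MASK   0xef80u  /* ignores bit 12 (f) and bits 6:0 (mmm, cccc) */
#define MEM_ERRCODE_PLAIN  0x0080u  /* 000f 0000 1mmm cccc */
#define MEM_ERRCODE_CACHE  0x0280u  /* 000f 0010 1mmm cccc [RAM used as cache] */

/* Both patterns must lie entirely inside the bits selected by the mask. */
static_assert((MEM_ERRCODE_PLAIN & ~MEM_ERRCODE_MASK) == 0, "plain pattern outside mask");
static_assert((MEM_ERRCODE_CACHE & ~MEM_ERRCODE_MASK) == 0, "cache pattern outside mask");

/* True if errcode matches either memory compound error code pattern. */
static inline bool is_mem_errcode(uint16_t errcode)
{
	uint16_t fixed = errcode & MEM_ERRCODE_MASK;

	return fixed == MEM_ERRCODE_PLAIN || fixed == MEM_ERRCODE_CACHE;
}

/* True only for the new "RAM used as cache" pattern. */
static inline bool is_cached_mem_errcode(uint16_t errcode)
{
	return (errcode & MEM_ERRCODE_MASK) == MEM_ERRCODE_CACHE;
}

/* Assumed layout: mmm occupies bits 6:4. */
static inline unsigned int errcode_mmm(uint16_t errcode)
{
	return (errcode >> 4) & 0x7;
}

/* Assumed layout: cccc (channel) occupies bits 3:0. */
static inline unsigned int errcode_channel(uint16_t errcode)
{
	return errcode & 0xf;
}
```

Because the mask clears bit 12, a value such as 0x1291 (f set, new pattern, channel 1) is accepted in the same way as 0x0291. A value such as 0x0480 is rejected, since bit 10 is covered by the mask but set in neither pattern.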

Patch

diff --git a/drivers/edac/skx_common.c b/drivers/edac/skx_common.c
index 513523ad5649..0e96e7b5b0a7 100644
--- a/drivers/edac/skx_common.c
+++ b/drivers/edac/skx_common.c
@@ -494,9 +494,11 @@  static void skx_mce_output_error(struct mem_ctl_info *mci,
 	}
 
 	/*
-	 * According with Table 15-9 of the Intel Architecture spec vol 3A,
-	 * memory errors should fit in this mask:
+	 * According to Intel Architecture spec vol 3B,
+	 * Table 15-10 "IA32_MCi_Status [15:0] Compound Error Code Encoding"
+	 * memory errors should fit one of these masks:
 	 *	000f 0000 1mmm cccc (binary)
+	 *	000f 0010 1mmm cccc (binary)	[RAM used as cache]
 	 * where:
 	 *	f = Correction Report Filtering Bit. If 1, subsequent errors
 	 *	    won't be shown
@@ -504,7 +506,7 @@  static void skx_mce_output_error(struct mem_ctl_info *mci,
 	 *	cccc = channel
 	 * If the mask doesn't match, report an error to the parsing logic
 	 */
-	if (!((errcode & 0xef80) == 0x80)) {
+	if (!((errcode & 0xef80) == 0x80 || (errcode & 0xef80) == 0x280)) {
 		optype = "Can't parse: it is not a mem";
 	} else {
 		switch (optypenum) {