Patchwork: EDAC, dmc520: add DMC520 EDAC driver

Submitter Sasha Levin
Date Jan. 18, 2019, 4:23 p.m.
Message ID <20190118162324.17123-1-sashal@kernel.org>
Permalink /patch/704209/
State New

Comments

Sasha Levin - Jan. 18, 2019, 4:23 p.m.
From: Rui Zhao <ruizhao@microsoft.com>

New driver supports DRAM error detection and correction on DMC520
controller.

Validated on actual hardware: DRAM errors showed up once the DDR core
voltage was lowered down by 200+mV using test tool.

Signed-off-by: Rui Zhao <ruizhao@microsoft.com>
[sl: minor nits in commit message and code, added maintainers entry]
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 MAINTAINERS                |   6 +
 drivers/edac/Kconfig       |   7 +
 drivers/edac/Makefile      |   1 +
 drivers/edac/dmc520_edac.c | 495 +++++++++++++++++++++++++++++++++++++
 4 files changed, 509 insertions(+)
 create mode 100644 drivers/edac/dmc520_edac.c
Borislav Petkov - Jan. 21, 2019, 12:35 p.m.
On Fri, Jan 18, 2019 at 11:23:24AM -0500, Sasha Levin wrote:
> From: Rui Zhao <ruizhao@microsoft.com>
> 
> New driver supports DRAM error detection and correction on DMC520
> controller.

That's this thing, right?

https://developer.arm.com/products/system-ip/memory-controllers/corelink-dmc-520

> Validated on actual hardware:

Which is what exactly?

This looks like a driver for the memory controller IP, which could get
integrated into other platforms, so I'd prefer if this driver were called
<your_platform>_edac and the DMC520 were a generic piece of functionality
like the FSL memory controller IP:

mpc85xx_edac_mod-y                      := fsl_ddr_edac.o mpc85xx_edac.o
obj-$(CONFIG_EDAC_MPC85XX)              += mpc85xx_edac_mod.o

layerscape_edac_mod-y                   := fsl_ddr_edac.o layerscape_edac.o
obj-$(CONFIG_EDAC_LAYERSCAPE)           += layerscape_edac_mod.o

Thx.
James Morse - Jan. 21, 2019, 5:09 p.m.
Hi Sasha, Rui,

On 18/01/2019 16:23, Sasha Levin wrote:
> From: Rui Zhao <ruizhao@microsoft.com>
> New driver supports DRAM error detection and correction on DMC520
> controller.

> Validated on actual hardware: DRAM errors showed up once the DDR core
> voltage was lowered down by 200+mV using test tool.

That's quite cool!


> ---
>  MAINTAINERS                |   6 +
>  drivers/edac/Kconfig       |   7 +
>  drivers/edac/Makefile      |   1 +
>  drivers/edac/dmc520_edac.c | 495 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 509 insertions(+)
>  create mode 100644 drivers/edac/dmc520_edac.c

Where do I find the dt-binding for this?

It would be good if we can make this generic, so it works on all platforms with
a DMC520, possibly along with other components. (e.g. the a15 L2 driver posted
recently).

This will mostly be getting the DT right, as we can refactor the code when a
second user comes along, but can't change the DT format.


The TRM describes 'a set of interrupts'; which ones does the binding anticipate
are wired up for this? There are separate status bits for corrected and
uncorrected, and one pair for dram versus ram (not sure what this corresponds
with).

Do we care about the link-error interrupt?


It looks like your platform has wired corrected and uncorrected dram up as separate
SPIs, but not touched the ram interrupts. These are choices the SoC designers
made; we need to capture this stuff in the binding.

A system may have multiple memory controllers, they may share the interrupts.

(It looks like you've folded these corners out by not using IRQF_SHARED: this
driver only works for independent interrupts, which can run concurrently, but
work fine because the only register they both touch is interrupt_clr, which is
write-only.)


For these pre-v8.2-RAS things the expectation is firmware handles all this
stuff. I'm surprised your platform hasn't made the memory-controller
'secure-only', so only platform-firmware can touch it.

I can't see how this could be used on v8/aarch64, it would imply there is no
firmware, which suggests this is a UP system. (or firmware trusts linux not to
mess it up!)

I'm guessing this is for 32bit, where on your platform linux is running in
'secure'. This is a significant platform policy, to make this generic we'd need
to find a way of describing it. i.e., other platforms may have a dmc520, the DT
may describe where it is, but linux can't touch it.
I believe that today this depends on the bootloader knowing which nodes to
remove when starting linux.

I think this policy-bit can be done by having a soc-family/vendor specific
driver that knows which of the edac components can be used on this platform.
The altera driver does something along these lines in altr_sdram_probe().
Obviously if we can have all the data come from DT that is better.


> diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c
> new file mode 100644
> index 000000000000..5f14889074af
> --- /dev/null
> +++ b/drivers/edac/dmc520_edac.c
> @@ -0,0 +1,495 @@

> +/* Driver settings */
> +#define DMC520_EDAC_CHANS			1
> +#define DMC520_EDAC_ERR_GRAIN			1
> +#define DMC520_EDAC_INT_COUNT			2
> +#define DMC520_EDAC_BUS_WIDTH			8

Should these be in the DT?
If someone else has a dmc520 configured slightly differently, how can we get
both systems going, without having to change your system's DT?


> +static bool dmc520_is_ecc_enabled(struct dmc520_edac *edac)
> +{
> +	u32 reg_val = dmc520_read_reg(edac, REG_OFFSET_FEATURE_CONFIG);
> +
> +	return (FIELD_GET(REG_FIELD_DRAM_ECC_ENABLED, reg_val) != 0);
> +}

Ah, so there is firmware that sets this up and enables it ...


> +static int dmc520_edac_probe(struct platform_device *pdev)
> +{
> +	struct dmc520_edac *edac;
> +	struct mem_ctl_info *mci;
> +	struct edac_mc_layer layers[2];
> +	int ret, irq;
> +	struct resource *res;
> +	void __iomem *reg_base;
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	reg_base = devm_ioremap_resource(&pdev->dev, res);
> +	if (IS_ERR(reg_base))
> +		return PTR_ERR(reg_base);
> +
> +	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +	layers[0].size = dmc520_get_rank_count(reg_base);
> +	layers[0].is_virt_csrow = true;
> +
> +	layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +	layers[1].size = DMC520_EDAC_CHANS;
> +	layers[1].is_virt_csrow = false;

If you can read the rank count, why hard code the bank?
(which is what I assume channels corresponds to ... although this has confused
me before [0]).


> +	platform_set_drvdata(pdev, mci);
> +
> +	mci->pdev = &pdev->dev;
> +	mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR4;
> +	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;

> +	mci->scrub_cap = SCRUB_HW_SRC;
> +	mci->scrub_mode = SCRUB_NONE;

Is this saying the device supports error scrubbing, but it's disabled?
Do we know that?

Can the user try and turn it on? (I can't find anything that reads this!)

It doesn't look like we can configure scrubbing if it wasn't done at boot.
3.3.245 scrub_control0_now of the TRM has "Cannot be written to and only updated
when in CONFIG or LOW-POWER states". I assume we can't put this thing back into
config mode while we're running.


[...]

> +	/* Check ECC CE/UE errors */
> +	dmc520_handle_ecc_errors(mci, true, false);
> +	dmc520_handle_ecc_errors(mci, false, false);

Do we know overflow=false?


> +	/* Enable interrupts */
> +	dmc520_write_reg(edac,
> +			 DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
> +			 REG_OFFSET_INTERRUPT_CONTROL);


What if they were enabled before? (e.g. enabled by firmware, the bootloader or
kdump). If they're already enabled, can we race with the interrupt handler on
another CPU and get double reporting of the error counter?


Thanks,

James

[0] https://lore.kernel.org/lkml/d4ee6d3a-b3ac-2986-40db-8423de7e960a@arm.com/
James Morse - Jan. 23, 2019, 6:36 p.m.
Hi Rui,

On 23/01/2019 00:42, Rui Zhao wrote:
> On Monday, January 21, 2019 9:09 AM, James Morse wrote:
>> It would be good if we can make this generic, so it works on all platforms with
>> a DMC520, possibly along with other components. (e.g. the a15 L2 driver posted
>> recently).
>>
>> This will mostly be getting the DT right, as we can refactor the code when a
>> second user comes along, but can't change the DT format.

Agreed. It'd be good if we can move all platform-specific settings to DT such
that a second user can update the code and won't break the DT for the original
device. I'll think about the DT format to make it more generic.

When the time comes, could you post a dt-binding as the first patch? These add
the documentation under Documentation/devicetree/bindings, and need to be CC'd
to the DT folks.


>> The TRM describes 'a set of interrupts', which ones does the binding anticipate
>> are wired up for this? There are separate status bits for corrected and
>> uncorrected, and one pair for dram versus ram. (not sure what this corresponds
>> with).

>> Do we care about the link-error interrupt?

> Will add ram interrupts handling, and make it configurable in DT.

I don't think the code needs to change, we just need to make it clear these are
the dram interrupts, and they are different numbers.
This means someone can add the ram interrupts later, and not have to
disambiguate them to keep your platform going.

(it's this stuff that the binding document describes)


> Could you please
> share a bit more info on link-error interrupt? TRM doesn't have detailed info about
> it.

I only have the TRM too! The section titled 'RAS' on page 2-23 talks about 'link
error protection', so it might be relevant to someone. We don't care about this
now, but if someone does in the future, we will need to have a way of adding it
to the DT.

> Would like to know what's the impact if this error happens, and how to fit it
> with current reporting in EDAC core.

At a guess the interrupt triggers when link_err_count increases. (link_err has
an overflow bit, so the interrupt must be related to a counter).

If we could associate a link with a layer in edac, we could report errors
against that point. But I've no idea how 'links' correspond with 'ranks and banks'!


>>> +     layers[1].type = EDAC_MC_LAYER_CHANNEL;
>>> +     layers[1].size = DMC520_EDAC_CHANS;
>>> +     layers[1].is_virt_csrow = false;

>> If you can read the rank count, why hard code the bank?
> 
>> (which is what I assume channels corresponds to ... although this has confused
>> me before [0]).

> I misunderstood what channel meant. For bank, we can read config from the register.

It's an assumption. Channel seems to be an overloaded term with different meanings.
Fortunately edac drivers get to pick what their layers mean. It looks like all
this controls is the names that come out of sysfs.

rank/bank are part of the address decoding; as we can read both sizes, I think it
makes sense to use both, as it should further localise the error.


>>> +     mci->scrub_cap = SCRUB_HW_SRC;
>>> +     mci->scrub_mode = SCRUB_NONE;

>> Is this saying the device supports error scrubbing, but it's disabled?
>> Do we know that?

>> Can the user try and turn it on? (I can't find anything that reads this!)

> We don’t want to change what’s already configured.

I don't think we could if we wanted to. This thing has 'READY' and 'CONFIG'
modes, the scrubbing can only be written to in 'CONFIG' mode. I'm willing to bet
we need it to be in 'READY' to be executing the kernel from dram.


> Will change scrub_mode to HW.

Do we know it's enabled? This is something firmware has to set up; someone else's
platform may do it differently.

I think we should read one of the scrub control registers to find out if it's turned on.

But, I can't find what uses this value ...


>>> +     /* Enable interrupts */
>>> +     dmc520_write_reg(edac,
>>> +                      DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
>>> +                      REG_OFFSET_INTERRUPT_CONTROL);

>> What if they were enabled before? (e.g. enabled by firmware, the bootloader or
>> kdump). If they're already enabled, can we race with the interrupt handler on
>> another CPU and get double reporting of the error counter?

> We don't expect interrupts to be enabled by firmware or bootloader if this driver
> is enabled.

What about a previous instance of this driver? Linux supports kexec, kdump and
hibernate, all of which cause us to inherit a slightly used platform.


> If firmware enables it, they're supposed to handle the interrupt.

Ah, so you still have resident firmware!
How come your firmware trusts linux not to turn off the memory controller?!
These things are usually protected by trust zone so the OS can't pull the memory
from under firmware's feet.


Thanks,

James
Borislav Petkov - Jan. 23, 2019, 6:46 p.m.
On Wed, Jan 23, 2019 at 06:36:23PM +0000, James Morse wrote:
> > Would like to know what's the impact if this error happens, and how to fit it
> > with current reporting in EDAC core.
> 
> At a guess the interrupt triggers when link_err_count increases. (link_err has
> an overflow bit, so the interrupt must be related to a counter).
> 
> If we could associate a link with a layer in edac, we could report errors
> against that point. But I've no idea how 'links' correspond with 'ranks and banks'!

Well, I have no clue what kind of links you guys are talking about, but if
those are per-chance coherent links used by cores to communicate in a
coherent fabric, or cores and devices, what would showing those errors
to the user bring ya?

Or are ya talking about different kinds of links?

In any case, the first question to ask would be, can some agent or the
user do something with the information that X or Y link errors happened?

If not, then why bother?

If yes, then that's a different story.
Sasha Levin - Jan. 23, 2019, 6:50 p.m.
On Mon, Jan 21, 2019 at 01:35:47PM +0100, Borislav Petkov wrote:
>On Fri, Jan 18, 2019 at 11:23:24AM -0500, Sasha Levin wrote:
>> From: Rui Zhao <ruizhao@microsoft.com>
>>
>> New driver supports DRAM error detection and correction on DMC520
>> controller.
>
>That's this thing, right?
>
>https://developer.arm.com/products/system-ip/memory-controllers/corelink-dmc-520

Yup!

>> Validated on actual hardware:
>
>Which is what exactly?

A variant of Broadcom's SST100 board.

--
Thanks,
Sasha
Borislav Petkov - Jan. 23, 2019, 7:03 p.m.
On Wed, Jan 23, 2019 at 01:50:07PM -0500, Sasha Levin wrote:
> A variant of a Broadcom's SST100 board.

Is that some platform which people will use and run linux on and thus
would make sense to have an EDAC driver for or is this something
devel-only toy thing?

Searching a bit doesn't tell me a whole lot except some enablement for
some stingray SOC reference board...
Sasha Levin - Jan. 23, 2019, 7:09 p.m.
On Wed, Jan 23, 2019 at 08:03:54PM +0100, Borislav Petkov wrote:
>On Wed, Jan 23, 2019 at 01:50:07PM -0500, Sasha Levin wrote:
>> A variant of a Broadcom's SST100 board.
>
>Is that some platform which people will use and run linux on and thus
>would make sense to have an EDAC driver for or is this something
>devel-only toy thing?

It will have (a lot of) use in prod, and is not just a toy :)

>Searching a bit doesn't tell me a whole lot except some enablement for
>some stingray SOC reference board...

Right, the hardware we're working on is nothing more than reference
boards at this point.

--
Thanks,
Sasha
Rui Zhao - Jan. 23, 2019, 10:08 p.m.
Hi James,

On Wednesday, January 23, 2019 10:36 AM, James Morse wrote:
> When the time comes, could you post a dt-binding as the first patch? These add the documentation under Documentation/devicetree/bindings, and need to be CC'd to the DT folks.

Sure, will do.

>> Will change scrub_mode to HW.

> Do we know it's enabled? This is something firmware has to set up; someone else's platform may do it differently.
> I think we should read one of the scrub control registers to find out if it's turned on.
> But, I can't find what uses this value ...

Looks like if scrub_mode is set to SCRUB_SW_SRC, the MC core will do an arch-specific sw scrub on the page with the ce error.
We can configure the mode based on register settings.

>>>> +     /* Enable interrupts */
>>>> +     dmc520_write_reg(edac,
>>>> +                      DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
>>>> +                      REG_OFFSET_INTERRUPT_CONTROL);

>>> What if they were enabled before? (e.g. enabled by firmware, the 
>>> bootloader or kdump). If they're already enabled, can we race with 
>>> the interrupt handler on another CPU and get double reporting of the error counter?

>> We don't expect interrupts to be enabled by firmware or bootloader if 
>> this driver is enabled.

> What about a previous instance of this driver? Linux supports kexec, kdump and hibernate, all of which cause us to inherit a slightly used platform.

Agreed. We shouldn't make assumptions about the register value when the driver starts.

>> If firmware enables it, they're supposed to handle the interrupt.

> Ah, so you still have resident firmware!
> How come your firmware trusts linux not to turn off the memory controller?!
> These things are usually protected by trust zone so the OS can't pull the memory from under firmware's feet.

We have firmware to config the memory controller, and want to have an EDAC driver to report ECC status.
Could you please elaborate a bit on the security concern with this approach? Like, could some malicious app/driver
that can access memory controller registers cause issues?

What's the recommended approach if Linux won't be able to access memory controller registers? Have firmware do the ECC
status monitoring and some sort of driver to query ECC status from firmware?

Appreciate your insights on this.

Thanks,
Rui
James Morse - Feb. 5, 2019, 5:31 p.m.
Hi Boris,

On 23/01/2019 18:46, Borislav Petkov wrote:
> On Wed, Jan 23, 2019 at 06:36:23PM +0000, James Morse wrote:
>>> Would like to know what's the impact if this error happens, and how to fit it
>>> with current reporting in EDAC core.
>>
>> At a guess the interrupt triggers when link_err_count increases. (link_err has
>> an overflow bit, so the interrupt must be related to a counter).
>>
>> If we could associate a link with a layer in edac, we could report errors
>> against that point. But I've no idea how 'links' correspond with 'ranks and banks'!


> Well, I have no clue what kind of links you guys are talking but if
> those are per-chance coherent links used by cores to communicate in a
> coherent fabric, or cores and devices, what would showing those errors
> to the user bring ya?

(I mentioned this because it's the next interrupt in the register; it's an example
of something that may be added for another platform in the future, which affects
the DT and probing)


> Or are ya talking about different kinds of links?

... whatever the manual means by 'link', good point, it could be the
interconnect side.

'alert_mode_next', in the feature control register, talks about DIMM training,
and says 'dfi_err' is treated as a link error. DFI is defined earlier as the 'DDR
PHY interface', so these must be links between the DMC520 and DDR.


> In any case, the first question to ask would be, can some agent or the
> user do something with the information that X or Y link errors happened?
> 
> If not, then why bother?
> If yes, then that's a different story.

I agree. Surely if the DIMMs are socketed, link-errors are another reason to
replace the DIMM.

It looks like this doesn't matter on Rui's platform.


Thanks,

James
James Morse - Feb. 5, 2019, 5:31 p.m.
Hi Rui,

On 23/01/2019 22:08, Rui Zhao wrote:
> On Wednesday, January 23, 2019 10:36 AM, James Morse wrote:
>>> If firmware enables it, they're supposed to handle the interrupt.

>> Ah, so you still have resident firmware!
>> How come your firmware trusts linux not to turn off the memory controller?!
>> These things are usually protected by trust zone so the OS can't pull the memory from under firmware's feet.

> We have firmware to config the memory controller and want to have an EDAC driver to report ECC status.

> Could you please elaborate a bit on the security concern with this approach? Like, could some malicious app/driver
> that can access memory controller registers cause issues?

I'm remembering this:
https://lore.kernel.org/linux-arm-kernel/9b9c4cd5-4428-c08d-d4a3-7352c6c80583@arm.com/

Robin Murphy wrote:
| [ For anyone interested, it puts the DRAM controller into sleep mode.
| The kernel can't even panic if all the memory suddenly disappears :D ]

This would be a problem if your Secure-world software needs to keep
working, and depends on the memory behind this controller.

It might be that your secure-world software only uses some other memory, in
which case this wouldn't matter.
It may be linux _is_ your secure-world software, in which case it wouldn't
matter either.


> What's the recommended approach if Linux won't be able to access memory
> controller registers? Have firmware do the ECC status monitoring and some sort
> of driver to query ECC status from firmware?

If Linux runs in the normal-world, can't you use trust-zone to prevent Linux
from accessing the memory controller?

If you did this, you'd need to handle the UE interrupts in firmware, and
wouldn't be able to use this driver in linux. Your platform hasn't gone this
way, so I guess one of the above cases applies.


Thanks,

James
Rui Zhao - March 6, 2019, 5:20 a.m.
Hi James,

> On Tuesday, February 5, 2019 9:31 AM, James Morse wrote:

>> We have firmware to config the memory controller and want to have an EDAC driver to report ECC status.

>> Could you please elaborate a bit on the security concern with this
>> approach? Like, could some malicious app/driver that can access memory controller registers cause issues?

> I'm remembering this:
> https://lore.kernel.org/linux-arm-kernel/9b9c4cd5-4428-c08d-d4a3-7352c6c80583@arm.com/

> Robin Murphy wrote:
> | [ For anyone interested, it puts the DRAM controller into sleep mode.
> | The kernel can't even panic if all the memory suddenly disappears :D ]

> This would be a problem if your Secure-world software needs to keep working, and depends on the memory behind this controller.

> It might be that your secure-world software only uses some other memory, in which case this wouldn't matter.
> It may be linux _is_ your secure-world software, in which case it wouldn't matter either.

We had an internal discussion with our security team, and in our product we do trust Linux. I'll send an updated patch to move platform-specific settings like interrupt config to DT, and include a DT bindings doc for this driver.

Thanks,
Rui

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 32d444476a90..e8ec396a1475 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5429,6 +5429,12 @@  F:	Documentation/driver-api/edac.rst
 F:	drivers/edac/
 F:	include/linux/edac.h
 
+EDAC-DMC520
+M:	Rui Zhao <ruizhao@microsoft.com>
+L:	linux-edac@vger.kernel.org
+S:	Supported
+F:	drivers/edac/dmc520_edac.c
+
 EDAC-E752X
 M:	Mark Gross <mark.gross@intel.com>
 L:	linux-edac@vger.kernel.org
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index e286b5b99003..78ffdb9cfa6b 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -475,4 +475,11 @@  config EDAC_QCOM
 	  For debugging issues having to do with stability and overall system
 	  health, you should probably say 'Y' here.
 
+config EDAC_DMC520
+	tristate "ARM DMC-520 ECC"
+	depends on ARM64
+	help
+	  Support for error detection and correction on
+	  SoCs with the ARM DMC-520 DRAM controller.
+
 endif # EDAC
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 716096d08ea0..793d64f525d4 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -78,3 +78,4 @@  obj-$(CONFIG_EDAC_SYNOPSYS)		+= synopsys_edac.o
 obj-$(CONFIG_EDAC_XGENE)		+= xgene_edac.o
 obj-$(CONFIG_EDAC_TI)			+= ti_edac.o
 obj-$(CONFIG_EDAC_QCOM)			+= qcom_edac.o
+obj-$(CONFIG_EDAC_DMC520)		+= dmc520_edac.o
diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c
new file mode 100644
index 000000000000..5f14889074af
--- /dev/null
+++ b/drivers/edac/dmc520_edac.c
@@ -0,0 +1,495 @@ 
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/edac.h>
+#include <linux/io.h>
+#include <linux/of.h>
+#include <linux/interrupt.h>
+#include <linux/bitfield.h>
+#include <edac_mc.h>
+
+/* DMC-520 registers */
+#define REG_OFFSET_FEATURE_CONFIG		0x130
+#define REG_OFFSET_ECC_ERRC_COUNT_31_00		0x158
+#define REG_OFFSET_ECC_ERRC_COUNT_63_32		0x15C
+#define REG_OFFSET_ECC_ERRD_COUNT_31_00		0x160
+#define REG_OFFSET_ECC_ERRD_COUNT_63_32		0x164
+#define REG_OFFSET_FEATURE_CONTROL_NEXT		0x1F0
+#define REG_OFFSET_INTERRUPT_CONTROL		0x500
+#define REG_OFFSET_INTERRUPT_CLR		0x508
+#define REG_OFFSET_INTERRUPT_STATUS		0x510
+#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00	0x528
+#define REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32	0x52C
+#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00	0x530
+#define REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32	0x534
+#define REG_OFFSET_ADDRESS_CONTROL_NOW		0x1010
+#define REG_OFFSET_DECODE_CONTROL_NOW		0x1014
+#define REG_OFFSET_MEMORY_TYPE_NOW		0x1128
+
+/* DMC-520 types, masks and bitfields */
+#define MEMORY_TYPE_LPDDR3			0
+#define MEMORY_TYPE_DDR3			1
+#define MEMORY_TYPE_DDR4			2
+#define MEMORY_TYPE_LPDDR4			3
+
+#define MEMORY_DEV_WIDTH_X4			0
+#define MEMORY_DEV_WIDTH_X8			1
+#define MEMORY_DEV_WIDTH_X16			2
+#define MEMORY_DEV_WIDTH_X32			3
+
+#define DRAM_ECC_INT_CE_MASK			BIT(2)
+#define DRAM_ECC_INT_UE_MASK			BIT(3)
+#define DRAM_ECC_INT_CE_OVERFLOW_MASK		BIT(18)
+#define DRAM_ECC_INT_UE_OVERFLOW_MASK		BIT(19)
+
+#define REG_FIELD_DRAM_ECC_ENABLED		GENMASK(1, 0)
+#define REG_FIELD_MEMORY_TYPE			GENMASK(2, 0)
+#define REG_FIELD_DEVICE_WIDTH			GENMASK(9, 8)
+#define REG_FIELD_ADDRESS_CONTROL_COL		GENMASK(2, 0)
+#define REG_FIELD_ADDRESS_CONTROL_ROW		GENMASK(10, 8)
+#define REG_FIELD_ADDRESS_CONTROL_BANK		GENMASK(18, 16)
+#define REG_FIELD_ADDRESS_CONTROL_RANK		GENMASK(25, 24)
+#define REG_FIELD_ERR_INFO_LOW_VALID		BIT(0)
+#define REG_FIELD_ERR_INFO_LOW_COL		GENMASK(10, 1)
+#define REG_FIELD_ERR_INFO_LOW_ROW		GENMASK(28, 11)
+#define REG_FIELD_ERR_INFO_LOW_RANK		GENMASK(31, 29)
+#define REG_FIELD_ERR_INFO_HIGH_BANK		GENMASK(3, 0)
+#define REG_FIELD_ERR_INFO_HIGH_VALID		BIT(31)
+
+#define DRAM_ECC_MIN_INT_OVERFLOW_ERROR_COUNT	256
+#define DRAM_ADDRESS_CONTROL_MIN_COL_BITS	8
+#define DRAM_ADDRESS_CONTROL_MIN_ROW_BITS	11
+
+/* Driver settings */
+#define DMC520_EDAC_CHANS			1
+#define DMC520_EDAC_ERR_GRAIN			1
+#define DMC520_EDAC_INT_COUNT			2
+#define DMC520_EDAC_BUS_WIDTH			8
+
+#define EDAC_MSG_BUF_SIZE			128
+#define EDAC_MOD_NAME				"dmc520-edac"
+#define EDAC_CTL_NAME				"dmc520"
+
+struct ecc_error_info {
+	u32 col;
+	u32 row;
+	u32 bank;
+	u32 rank;
+};
+
+struct dmc520_edac {
+	void __iomem *reg_base;
+	char message[EDAC_MSG_BUF_SIZE];
+};
+
+static int dmc520_mc_idx;
+
+static u32 dmc520_read_reg(struct dmc520_edac *edac, u32 offset)
+{
+	return readl(edac->reg_base + offset);
+}
+
+static void dmc520_write_reg(struct dmc520_edac *edac, u32 val, u32 offset)
+{
+	writel(val, edac->reg_base + offset);
+}
+
+static u32 dmc520_calc_ecc_error(u32 value)
+{
+	u32 total = 0;
+
+	/* Each rank's error counter takes one byte */
+	while (value > 0) {
+		total += (value & 0xFF);
+		value >>= 8;
+	}
+	return total;
+}
+
+static u32 dmc520_get_ecc_error_count(struct dmc520_edac *edac, bool is_ce)
+{
+	u32 reg_offset_low, reg_offset_high;
+	u32 err_low, err_high;
+	u32 ce_count;
+
+	reg_offset_low = is_ce ? REG_OFFSET_ECC_ERRC_COUNT_31_00 :
+				 REG_OFFSET_ECC_ERRD_COUNT_31_00;
+	reg_offset_high = is_ce ? REG_OFFSET_ECC_ERRC_COUNT_63_32 :
+				  REG_OFFSET_ECC_ERRD_COUNT_63_32;
+
+	err_low = dmc520_read_reg(edac, reg_offset_low);
+	err_high = dmc520_read_reg(edac, reg_offset_high);
+
+	ce_count = dmc520_calc_ecc_error(err_low) +
+		   dmc520_calc_ecc_error(err_high);
+
+	/* Reset error counters */
+	dmc520_write_reg(edac, 0, reg_offset_low);
+	dmc520_write_reg(edac, 0, reg_offset_high);
+
+	return ce_count;
+}
+
+static bool dmc520_get_ecc_error_info(struct dmc520_edac *edac,
+				      bool is_ce,
+				      struct ecc_error_info *info)
+{
+	u32 reg_offset_low, reg_offset_high;
+	u32 reg_val_low, reg_val_high;
+	bool valid;
+
+	reg_offset_low = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_31_00 :
+				 REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_31_00;
+	reg_offset_high = is_ce ? REG_OFFSET_DRAM_ECC_ERRC_INT_INFO_63_32 :
+				  REG_OFFSET_DRAM_ECC_ERRD_INT_INFO_63_32;
+
+	reg_val_low = dmc520_read_reg(edac, reg_offset_low);
+	reg_val_high = dmc520_read_reg(edac, reg_offset_high);
+
+	valid = (FIELD_GET(REG_FIELD_ERR_INFO_LOW_VALID, reg_val_low) != 0) &&
+		(FIELD_GET(REG_FIELD_ERR_INFO_HIGH_VALID, reg_val_high) != 0);
+
+	if (info) {
+		if (valid) {
+			info->col = FIELD_GET(REG_FIELD_ERR_INFO_LOW_COL,
+					      reg_val_low);
+			info->row = FIELD_GET(REG_FIELD_ERR_INFO_LOW_ROW,
+					      reg_val_low);
+			info->rank = FIELD_GET(REG_FIELD_ERR_INFO_LOW_RANK,
+					       reg_val_low);
+			info->bank = FIELD_GET(REG_FIELD_ERR_INFO_HIGH_BANK,
+					       reg_val_high);
+		} else {
+			memset(info, 0, sizeof(struct ecc_error_info));
+		}
+	}
+
+	return valid;
+}
+
+static bool dmc520_is_ecc_enabled(struct dmc520_edac *edac)
+{
+	u32 reg_val = dmc520_read_reg(edac, REG_OFFSET_FEATURE_CONFIG);
+
+	return (FIELD_GET(REG_FIELD_DRAM_ECC_ENABLED, reg_val) != 0);
+}
+
+static enum mem_type dmc520_get_mtype(struct dmc520_edac *edac)
+{
+	enum mem_type mt;
+	u32 reg_val, type;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_MEMORY_TYPE_NOW);
+	type = FIELD_GET(REG_FIELD_MEMORY_TYPE, reg_val);
+
+	switch (type) {
+	case MEMORY_TYPE_LPDDR3:
+	case MEMORY_TYPE_DDR3:
+		mt = MEM_DDR3;
+		break;
+
+	case MEMORY_TYPE_DDR4:
+	case MEMORY_TYPE_LPDDR4:
+	default:
+		mt = MEM_DDR4;
+		break;
+	}
+	return mt;
+}
+
+static enum dev_type dmc520_get_dtype(struct dmc520_edac *edac)
+{
+	enum dev_type dt;
+	u32 reg_val, device_width;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_MEMORY_TYPE_NOW);
+	device_width = FIELD_GET(REG_FIELD_DEVICE_WIDTH, reg_val);
+
+	switch (device_width) {
+	case MEMORY_DEV_WIDTH_X4:
+		dt = DEV_X4;
+		break;
+
+	case MEMORY_DEV_WIDTH_X8:
+		dt = DEV_X8;
+		break;
+
+	case MEMORY_DEV_WIDTH_X16:
+		dt = DEV_X16;
+		break;
+
+	case MEMORY_DEV_WIDTH_X32:
+		dt = DEV_X32;
+		break;
+	}
+	return dt;
+}
+
+static u32 dmc520_get_rank_count(void __iomem *reg_base)
+{
+	u32 reg_val, rank_bits;
+
+	reg_val = readl(reg_base + REG_OFFSET_ADDRESS_CONTROL_NOW);
+	rank_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_RANK, reg_val);
+
+	return (1 << rank_bits);
+}
+
+static u64 dmc520_get_rank_size(struct dmc520_edac *edac)
+{
+	u32 reg_val, col_bits, row_bits, bank_bits;
+
+	reg_val = dmc520_read_reg(edac, REG_OFFSET_ADDRESS_CONTROL_NOW);
+
+	col_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_COL, reg_val) +
+		   DRAM_ADDRESS_CONTROL_MIN_COL_BITS;
+	row_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_ROW, reg_val) +
+		   DRAM_ADDRESS_CONTROL_MIN_ROW_BITS;
+	bank_bits = FIELD_GET(REG_FIELD_ADDRESS_CONTROL_BANK, reg_val);
+
+	return (u64)DMC520_EDAC_BUS_WIDTH << (col_bits + row_bits + bank_bits);
+}
+
+static void dmc520_handle_ecc_errors(struct mem_ctl_info *mci,
+				     bool is_ce,
+				     bool overflow)
+{
+	struct ecc_error_info info;
+	struct dmc520_edac *edac;
+	u32 cnt;
+
+	edac = mci->pvt_info;
+	dmc520_get_ecc_error_info(edac, is_ce, &info);
+
+	cnt = dmc520_get_ecc_error_count(edac, is_ce);
+
+	if (overflow)
+		cnt += DRAM_ECC_MIN_INT_OVERFLOW_ERROR_COUNT;
+
+	if (cnt > 0) {
+		snprintf(edac->message, ARRAY_SIZE(edac->message),
+			 "rank:%d bank:%d row:%d col:%d",
+			 info.rank, info.bank,
+			 info.row, info.col);
+
+		edac_mc_handle_error((is_ce ? HW_EVENT_ERR_CORRECTED :
+				     HW_EVENT_ERR_UNCORRECTED),
+				     mci, cnt, 0, 0, 0, info.rank, 0, -1,
+				     edac->message, "");
+	}
+}
+
+static irqreturn_t dmc520_edac_isr(int irq, void *data, bool is_ce)
+{
+	u32 i_mask, o_mask, status;
+	bool overflow;
+	struct mem_ctl_info *mci;
+	struct dmc520_edac *edac;
+
+	mci = data;
+	edac = mci->pvt_info;
+
+	i_mask = is_ce ? DRAM_ECC_INT_CE_MASK : DRAM_ECC_INT_UE_MASK;
+	o_mask = is_ce ? DRAM_ECC_INT_CE_OVERFLOW_MASK :
+			 DRAM_ECC_INT_UE_OVERFLOW_MASK;
+
+	status = dmc520_read_reg(edac, REG_OFFSET_INTERRUPT_STATUS);
+	overflow = ((status & o_mask) != 0);
+
+	dmc520_handle_ecc_errors(mci, is_ce, overflow);
+
+	dmc520_write_reg(edac, i_mask, REG_OFFSET_INTERRUPT_CLR);
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t dmc520_edac_ce_isr(int irq, void *data)
+{
+	return dmc520_edac_isr(irq, data, true);
+}
+
+static irqreturn_t dmc520_edac_ue_isr(int irq, void *data)
+{
+	return dmc520_edac_isr(irq, data, false);
+}
+
+static void dmc520_init_csrow(struct mem_ctl_info *mci)
+{
+	struct csrow_info *csi;
+	struct dimm_info *dimm;
+	int row, ch;
+	enum dev_type dt;
+	enum mem_type mt;
+	u64 rs;
+	u32 pages_per_rank;
+	struct dmc520_edac *edac = mci->pvt_info;
+
+	dt = dmc520_get_dtype(edac);
+	mt = dmc520_get_mtype(edac);
+	rs = dmc520_get_rank_size(edac);
+	pages_per_rank = rs >> PAGE_SHIFT;
+
+	for (row = 0; row < mci->nr_csrows; row++) {
+		csi = mci->csrows[row];
+
+		for (ch = 0; ch < csi->nr_channels; ch++) {
+			dimm		= csi->channels[ch]->dimm;
+			dimm->edac_mode	= EDAC_SECDED;
+			dimm->mtype	= mt;
+			dimm->nr_pages	= pages_per_rank / csi->nr_channels;
+			dimm->grain	= DMC520_EDAC_ERR_GRAIN;
+			dimm->dtype	= dt;
+		}
+	}
+}
+
+static int dmc520_edac_probe(struct platform_device *pdev)
+{
+	struct dmc520_edac *edac;
+	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
+	int ret, irq;
+	struct resource *res;
+	void __iomem *reg_base;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	reg_base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(reg_base))
+		return PTR_ERR(reg_base);
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = dmc520_get_rank_count(reg_base);
+	layers[0].is_virt_csrow = true;
+
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = DMC520_EDAC_CHANS;
+	layers[1].is_virt_csrow = false;
+
+	mci = edac_mc_alloc(dmc520_mc_idx++, ARRAY_SIZE(layers), layers,
+			    sizeof(struct dmc520_edac));
+	if (!mci) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "Failed to allocate memory for mc instance\n");
+		return -ENOMEM;
+	}
+
+	edac = mci->pvt_info;
+	edac->reg_base = reg_base;
+
+	if (!dmc520_is_ecc_enabled(edac)) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME, "ECC not enabled\n");
+		ret = -ENXIO;
+		goto err;
+	}
+
+	platform_set_drvdata(pdev, mci);
+
+	mci->pdev = &pdev->dev;
+	mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR4;
+	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
+	mci->scrub_cap = SCRUB_HW_SRC;
+	mci->scrub_mode = SCRUB_NONE;
+	mci->edac_cap = EDAC_FLAG_SECDED;
+	mci->ctl_name = EDAC_CTL_NAME;
+	mci->dev_name = dev_name(mci->pdev);
+	mci->mod_name = EDAC_MOD_NAME;
+	mci->ctl_page_to_phys = NULL;
+
+	edac_op_state = EDAC_OPSTATE_INT;
+
+	dmc520_init_csrow(mci);
+
+	ret = edac_mc_add_mc(mci);
+	if (ret) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "Failed to register with EDAC core\n");
+		goto err;
+	}
+
+	for (irq = 0; irq < DMC520_EDAC_INT_COUNT; ++irq) {
+		irq_handler_t handler;
+		int irq_id = platform_get_irq(pdev, irq);
+
+		if (irq_id < 0) {
+			edac_printk(KERN_ERR, EDAC_MOD_NAME,
+				    "Failed to get %s irq\n",
+				    irq == 0 ? "CE" : "UE");
+			ret = -ENODEV;
+			goto err_del_mc;
+		}
+
+		handler = (irq == 0 ? dmc520_edac_ce_isr :
+				      dmc520_edac_ue_isr);
+
+		ret = devm_request_irq(&pdev->dev,
+				       irq_id,
+				       handler,
+				       0,
+				       dev_name(&pdev->dev),
+				       mci);
+		if (ret < 0) {
+			edac_printk(KERN_ERR, EDAC_MOD_NAME,
+				    "Failed to request irq %d\n", irq_id);
+			goto err_del_mc;
+		}
+	}
+
+	/* Report and clear any ECC CE/UE errors logged before probe */
+	dmc520_handle_ecc_errors(mci, true, false);
+	dmc520_handle_ecc_errors(mci, false, false);
+
+	/* Enable interrupts */
+	dmc520_write_reg(edac,
+			 DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
+			 REG_OFFSET_INTERRUPT_CONTROL);
+
+	return 0;
+
+err_del_mc:
+	edac_mc_del_mc(&pdev->dev);
+err:
+	edac_mc_free(mci);
+
+	return ret;
+}
+
+static int dmc520_edac_remove(struct platform_device *pdev)
+{
+	struct dmc520_edac *edac;
+	struct mem_ctl_info *mci;
+
+	mci = platform_get_drvdata(pdev);
+	edac = mci->pvt_info;
+
+	/* Disable interrupts */
+	dmc520_write_reg(edac,
+			 DRAM_ECC_INT_CE_MASK | DRAM_ECC_INT_UE_MASK,
+			 REG_OFFSET_INTERRUPT_CONTROL);
+
+	edac_mc_del_mc(&pdev->dev);
+	edac_mc_free(mci);
+
+	return 0;
+}
+
+static const struct of_device_id dmc520_edac_driver_id[] = {
+	{ .compatible = "arm,dmc-520", },
+	{ /* end of table */ }
+};
+
+MODULE_DEVICE_TABLE(of, dmc520_edac_driver_id);
+
+static struct platform_driver dmc520_edac_driver = {
+	.driver = {
+		.name = "dmc520",
+		.of_match_table = dmc520_edac_driver_id,
+	},
+
+	.probe = dmc520_edac_probe,
+	.remove = dmc520_edac_remove,
+};
+
+module_platform_driver(dmc520_edac_driver);
+
+MODULE_AUTHOR("Rui Zhao <ruizhao@microsoft.com>");
+MODULE_DESCRIPTION("DMC-520 ECC driver");
+MODULE_LICENSE("GPL v2");