Patchwork dmaengine: imx-dma: fix wrong callback invoke

Submitter Leonid Iziumtsev
Date Jan. 15, 2019, 5:15 p.m.
Message ID <20190115171523.702-1-leonid.iziumtsev@gmail.com>
Permalink /patch/700543/
State New

Comments

Leonid Iziumtsev - Jan. 15, 2019, 5:15 p.m.
When the "ld_queue" list is not empty, the next descriptor migrates
into the "ld_active" list. The "desc" variable is overwritten during
that transition, and dmaengine_desc_get_callback_invoke() later uses
it as an argument. As a result, we invoke the wrong callback.

That behaviour has been in place since
commit fcaaba6c7136 ("dmaengine: imx-dma: fix callback path in tasklet").
But after commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job")
things got worse, since the possible delay between tasklet_schedule()
in the DMA irq handler and the actual tasklet function execution got
bigger. That gave more time for a new DMA request to be submitted and
put into the "ld_queue" list.

It has been noticed that this DMA issue causes problems for the
"mxc-mmc" driver. While stressing the system with heavy network traffic
and simultaneously writing to and reading from the SD card, a timeout
may happen:

10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)

That often leads to file system corruption.

Signed-off-by: Leonid Iziumtsev <leonid.iziumtsev@gmail.com>
---
 drivers/dma/imx-dma.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
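
The overwrite described in the commit message can be reproduced outside the
kernel. The following is only a minimal user-space sketch, not the driver
code: "struct desc", "struct chan", invoke_callback(), start_xfer() and
run_scenario() are hypothetical stand-ins, and a plain singly linked queue
replaces the kernel's list_head lists. It assumes, as the patch implies, that
the callback is invoked after the queued descriptor has been promoted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct imxdma_desc. */
struct desc {
	int id;             /* identifies the transfer this descriptor belongs to */
	struct desc *next;  /* simple singly linked queue, not a list_head */
};

/* Hypothetical stand-in for struct imxdma_channel. */
struct chan {
	struct desc *active;   /* models the head of "ld_active" */
	struct desc *queue;    /* models the head of "ld_queue" */
	int last_callback_id;  /* records whose callback was invoked */
	int last_started_id;   /* records which descriptor was started */
};

/* Models dmaengine_desc_get_callback_invoke(): just records the id. */
static void invoke_callback(struct chan *c, struct desc *d)
{
	c->last_callback_id = d->id;
}

/* Models imxdma_xfer_desc(): just records the id. */
static void start_xfer(struct chan *c, struct desc *d)
{
	c->last_started_id = d->id;
}

/* Pre-patch shape: "desc" is reused for the next queued descriptor. */
static void tasklet_buggy(struct chan *c)
{
	struct desc *desc = c->active;  /* the descriptor that just completed */

	assert(desc != NULL);
	c->active = NULL;               /* retire it (ld_active -> ld_free) */

	if (c->queue) {
		desc = c->queue;        /* BUG: completed descriptor is lost */
		c->queue = desc->next;
		desc->next = NULL;
		c->active = desc;
		start_xfer(c, desc);
	}

	invoke_callback(c, desc);       /* may now target the wrong transfer */
}

/* Post-patch shape: a separate "next_desc" keeps "desc" intact. */
static void tasklet_fixed(struct chan *c)
{
	struct desc *desc = c->active, *next_desc;

	assert(desc != NULL);
	c->active = NULL;

	if (c->queue) {
		next_desc = c->queue;
		c->queue = next_desc->next;
		next_desc->next = NULL;
		c->active = next_desc;
		start_xfer(c, next_desc);
	}

	invoke_callback(c, desc);       /* always the completed transfer */
}

/*
 * Complete descriptor 1, optionally with descriptor 2 already queued,
 * and return the id of the descriptor whose callback was invoked.
 */
static int run_scenario(void (*tasklet)(struct chan *), int with_queued)
{
	struct desc done = { .id = 1, .next = NULL };
	struct desc queued = { .id = 2, .next = NULL };
	struct chan c = {
		.active = &done,
		.queue = with_queued ? &queued : NULL,
		.last_callback_id = 0,
		.last_started_id = 0,
	};

	tasklet(&c);
	return c.last_callback_id;
}
```

Calling run_scenario(tasklet_buggy, 1) reports the callback of the queued
descriptor, while run_scenario(tasklet_fixed, 1) reports the completed one.
With an empty queue both variants behave the same, which is consistent with
the bug only showing up when a new request arrives before the tasklet runs.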
Vinod Koul - Jan. 20, 2019, 10:52 a.m.
On 15-01-19, 17:15, Leonid Iziumtsev wrote:
> When the "ld_queue" list is not empty, the next descriptor migrates
> into the "ld_active" list. The "desc" variable is overwritten during
> that transition, and dmaengine_desc_get_callback_invoke() later uses
> it as an argument. As a result, we invoke the wrong callback.
> 
> That behaviour has been in place since
> commit fcaaba6c7136 ("dmaengine: imx-dma: fix callback path in tasklet").
> But after commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job")
> things got worse, since the possible delay between tasklet_schedule()
> in the DMA irq handler and the actual tasklet function execution got
> bigger. That gave more time for a new DMA request to be submitted and
> put into the "ld_queue" list.
> 
> It has been noticed that this DMA issue causes problems for the
> "mxc-mmc" driver. While stressing the system with heavy network traffic
> and simultaneously writing to and reading from the SD card, a timeout
> may happen:
> 
> 10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)
> 
> That often leads to file system corruption.

This looks reasonable to me, and I think it should go to stable as well.
Fabio, can we get some testing done on this patch?

> 
> Signed-off-by: Leonid Iziumtsev <leonid.iziumtsev@gmail.com>
> ---
>  drivers/dma/imx-dma.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
> index c2fff3f6c9ca..4a09af3cd546 100644
> --- a/drivers/dma/imx-dma.c
> +++ b/drivers/dma/imx-dma.c
> @@ -618,7 +618,7 @@ static void imxdma_tasklet(unsigned long data)
>  {
>  	struct imxdma_channel *imxdmac = (void *)data;
>  	struct imxdma_engine *imxdma = imxdmac->imxdma;
> -	struct imxdma_desc *desc;
> +	struct imxdma_desc *desc, *next_desc;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&imxdma->lock, flags);
> @@ -648,10 +648,10 @@ static void imxdma_tasklet(unsigned long data)
>  	list_move_tail(imxdmac->ld_active.next, &imxdmac->ld_free);
>  
>  	if (!list_empty(&imxdmac->ld_queue)) {
> -		desc = list_first_entry(&imxdmac->ld_queue, struct imxdma_desc,
> -					node);
> +		next_desc = list_first_entry(&imxdmac->ld_queue,
> +					     struct imxdma_desc, node);
>  		list_move_tail(imxdmac->ld_queue.next, &imxdmac->ld_active);
> -		if (imxdma_xfer_desc(desc) < 0)
> +		if (imxdma_xfer_desc(next_desc) < 0)
>  			dev_warn(imxdma->dev, "%s: channel: %d couldn't xfer desc\n",
>  				 __func__, imxdmac->channel);
>  	}
> -- 
> 2.11.0
Fabio Estevam - Jan. 23, 2019, 11:43 a.m.
Hi Vinod,

On Sun, Jan 20, 2019 at 8:54 AM Vinod Koul <vkoul@kernel.org> wrote:

> This looks reasonable to me, and I think it should go to stable as well.
> Fabio, can we get some testing done on this patch?

I currently don't have access to a mx25pdk board. Will probably get
access to it next week.

Patch looks good though.
Vinod Koul - Feb. 4, 2019, 7:06 a.m.
On 15-01-19, 17:15, Leonid Iziumtsev wrote:
> When the "ld_queue" list is not empty, the next descriptor migrates
> into the "ld_active" list. The "desc" variable is overwritten during
> that transition, and dmaengine_desc_get_callback_invoke() later uses
> it as an argument. As a result, we invoke the wrong callback.
> 
> That behaviour has been in place since
> commit fcaaba6c7136 ("dmaengine: imx-dma: fix callback path in tasklet").
> But after commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job")
> things got worse, since the possible delay between tasklet_schedule()
> in the DMA irq handler and the actual tasklet function execution got
> bigger. That gave more time for a new DMA request to be submitted and
> put into the "ld_queue" list.
> 
> It has been noticed that this DMA issue causes problems for the
> "mxc-mmc" driver. While stressing the system with heavy network traffic
> and simultaneously writing to and reading from the SD card, a timeout
> may happen:
> 
> 10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)
> 
> That often leads to file system corruption.

Applied and tagged to stable, thanks

Patch

diff --git a/drivers/dma/imx-dma.c b/drivers/dma/imx-dma.c
index c2fff3f6c9ca..4a09af3cd546 100644
--- a/drivers/dma/imx-dma.c
+++ b/drivers/dma/imx-dma.c
@@ -618,7 +618,7 @@ static void imxdma_tasklet(unsigned long data)
 {
 	struct imxdma_channel *imxdmac = (void *)data;
 	struct imxdma_engine *imxdma = imxdmac->imxdma;
-	struct imxdma_desc *desc;
+	struct imxdma_desc *desc, *next_desc;
 	unsigned long flags;
 
 	spin_lock_irqsave(&imxdma->lock, flags);
@@ -648,10 +648,10 @@ static void imxdma_tasklet(unsigned long data)
 	list_move_tail(imxdmac->ld_active.next, &imxdmac->ld_free);
 
 	if (!list_empty(&imxdmac->ld_queue)) {
-		desc = list_first_entry(&imxdmac->ld_queue, struct imxdma_desc,
-					node);
+		next_desc = list_first_entry(&imxdmac->ld_queue,
+					     struct imxdma_desc, node);
 		list_move_tail(imxdmac->ld_queue.next, &imxdmac->ld_active);
-		if (imxdma_xfer_desc(desc) < 0)
+		if (imxdma_xfer_desc(next_desc) < 0)
 			dev_warn(imxdma->dev, "%s: channel: %d couldn't xfer desc\n",
 				 __func__, imxdmac->channel);
 	}