Patchwork [2/2] dmaengine: qcom_hidma: assign channel cookie correctly

login
register
mail settings
Submitter Yang Shunyong
Date Dec. 7, 2018, 4:29 a.m.
Message ID <ea6c6317dabffdcfa28cdb391b50edce98ce875d.1544156508.git.shunyong.yang@hxt-semitech.com>
Download mbox | patch
Permalink /patch/674913/
State New
Headers show

Comments

Yang Shunyong - Dec. 7, 2018, 4:29 a.m.
When dma_cookie_complete() is called in hidma_process_completed(),
dma_cookie_status() will return DMA_COMPLETE in hidma_tx_status(). Then,
hidma_txn_is_success() will be called to use channel cookie
mchan->last_success to do additional DMA status check. Current code
assigns mchan->last_success after dma_cookie_complete(). This causes
a race condition of dma_cookie_status() returns DMA_COMPLETE before
mchan->last_success is assigned correctly. The race will cause
hidma_tx_status() return DMA_ERROR but the transaction is actually a
success. Moreover, in async_tx case, it will cause a timeout panic
in async_tx_quiesce().

 Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for
 transaction
 ...
 Call trace:
 [<ffff000008089994>] dump_backtrace+0x0/0x1f4
 [<ffff000008089bac>] show_stack+0x24/0x2c
 [<ffff00000891e198>] dump_stack+0x84/0xa8
 [<ffff0000080da544>] panic+0x12c/0x29c
 [<ffff0000045d0334>] async_tx_quiesce+0xa4/0xc8 [async_tx]
 [<ffff0000045d03c8>] async_trigger_callback+0x70/0x1c0 [async_tx]
 [<ffff0000048b7d74>] raid_run_ops+0x86c/0x1540 [raid456]
 [<ffff0000048bd084>] handle_stripe+0x5e8/0x1c7c [raid456]
 [<ffff0000048be9ec>] handle_active_stripes.isra.45+0x2d4/0x550 [raid456]
 [<ffff0000048beff4>] raid5d+0x38c/0x5d0 [raid456]
 [<ffff000008736538>] md_thread+0x108/0x168
 [<ffff0000080fb1cc>] kthread+0x10c/0x138
 [<ffff000008084d34>] ret_from_fork+0x10/0x18

Cc: Joey Zheng <yu.zheng@hxt-semitech.com>
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
---
 drivers/dma/qcom/hidma.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
Sinan Kaya - Dec. 8, 2018, 8:28 p.m.
On 12/6/2018 11:29 PM, Shunyong Yang wrote:
> When dma_cookie_complete() is called in hidma_process_completed(),
> dma_cookie_status() will return DMA_COMPLETE in hidma_tx_status(). Then,
> hidma_txn_is_success() will be called to use channel cookie
> mchan->last_success to do additional DMA status check. Current code
> assigns mchan->last_success after dma_cookie_complete(). This causes
> a race condition of dma_cookie_status() returns DMA_COMPLETE before
> mchan->last_success is assigned correctly. The race will cause
> hidma_tx_status() return DMA_ERROR but the transaction is actually a
> success. Moreover, in async_tx case, it will cause a timeout panic
> in async_tx_quiesce().
> 
>   Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for
>   transaction
>   ...
>   Call trace:
>   [<ffff000008089994>] dump_backtrace+0x0/0x1f4
>   [<ffff000008089bac>] show_stack+0x24/0x2c
>   [<ffff00000891e198>] dump_stack+0x84/0xa8
>   [<ffff0000080da544>] panic+0x12c/0x29c
>   [<ffff0000045d0334>] async_tx_quiesce+0xa4/0xc8 [async_tx]
>   [<ffff0000045d03c8>] async_trigger_callback+0x70/0x1c0 [async_tx]
>   [<ffff0000048b7d74>] raid_run_ops+0x86c/0x1540 [raid456]
>   [<ffff0000048bd084>] handle_stripe+0x5e8/0x1c7c [raid456]
>   [<ffff0000048be9ec>] handle_active_stripes.isra.45+0x2d4/0x550 [raid456]
>   [<ffff0000048beff4>] raid5d+0x38c/0x5d0 [raid456]
>   [<ffff000008736538>] md_thread+0x108/0x168
>   [<ffff0000080fb1cc>] kthread+0x10c/0x138
>   [<ffff000008084d34>] ret_from_fork+0x10/0x18
> 
> Cc: Joey Zheng<yu.zheng@hxt-semitech.com>
> Signed-off-by: Shunyong Yang<shunyong.yang@hxt-semitech.com>


Acked-by: Sinan Kaya <okaya@kernel.org>

to both patches 1/2 and 2/2.
Yang Shunyong - Dec. 14, 2018, 1:03 a.m.
Hi, Sinan

On 2018/12/9 4:28, Sinan Kaya wrote:
> On 12/6/2018 11:29 PM, Shunyong Yang wrote:
>> When dma_cookie_complete() is called in hidma_process_completed(),
>> dma_cookie_status() will return DMA_COMPLETE in hidma_tx_status(). Then,
>> hidma_txn_is_success() will be called to use channel cookie
>> mchan->last_success to do additional DMA status check. Current code
>> assigns mchan->last_success after dma_cookie_complete(). This causes
>> a race condition of dma_cookie_status() returns DMA_COMPLETE before
>> mchan->last_success is assigned correctly. The race will cause
>> hidma_tx_status() return DMA_ERROR but the transaction is actually a
>> success. Moreover, in async_tx case, it will cause a timeout panic
>> in async_tx_quiesce().
>>
>>   Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for
>>   transaction
>>   ...
>>   Call trace:
>>   [<ffff000008089994>] dump_backtrace+0x0/0x1f4
>>   [<ffff000008089bac>] show_stack+0x24/0x2c
>>   [<ffff00000891e198>] dump_stack+0x84/0xa8
>>   [<ffff0000080da544>] panic+0x12c/0x29c
>>   [<ffff0000045d0334>] async_tx_quiesce+0xa4/0xc8 [async_tx]
>>   [<ffff0000045d03c8>] async_trigger_callback+0x70/0x1c0 [async_tx]
>>   [<ffff0000048b7d74>] raid_run_ops+0x86c/0x1540 [raid456]
>>   [<ffff0000048bd084>] handle_stripe+0x5e8/0x1c7c [raid456]
>>   [<ffff0000048be9ec>] handle_active_stripes.isra.45+0x2d4/0x550 [raid456]
>>   [<ffff0000048beff4>] raid5d+0x38c/0x5d0 [raid456]
>>   [<ffff000008736538>] md_thread+0x108/0x168
>>   [<ffff0000080fb1cc>] kthread+0x10c/0x138
>>   [<ffff000008084d34>] ret_from_fork+0x10/0x18
>>
>> Cc: Joey Zheng<yu.zheng@hxt-semitech.com>
>> Signed-off-by: Shunyong Yang<shunyong.yang@hxt-semitech.com>
> 
> 
> Acked-by: Sinan Kaya <okaya@kernel.org>
> 
> to both patches 1/2 and 2/2.
> 
> 
Thanks for the ACKs.

Shunyong.
Yang Shunyong - Jan. 4, 2019, 8:56 a.m.
Hi, Vinod,

Gentle ping. Would you please help to review these two patches and merge?

On 2018/12/14 9:03, Yang, Shunyong wrote:
> Hi, Sinan
> 
> On 2018/12/9 4:28, Sinan Kaya wrote:
>> On 12/6/2018 11:29 PM, Shunyong Yang wrote:
>>> When dma_cookie_complete() is called in hidma_process_completed(),
>>> dma_cookie_status() will return DMA_COMPLETE in hidma_tx_status(). Then,
>>> hidma_txn_is_success() will be called to use channel cookie
>>> mchan->last_success to do additional DMA status check. Current code
>>> assigns mchan->last_success after dma_cookie_complete(). This causes
>>> a race condition of dma_cookie_status() returns DMA_COMPLETE before
>>> mchan->last_success is assigned correctly. The race will cause
>>> hidma_tx_status() return DMA_ERROR but the transaction is actually a
>>> success. Moreover, in async_tx case, it will cause a timeout panic
>>> in async_tx_quiesce().
>>>
>>>   Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for
>>>   transaction
>>>   ...
>>>   Call trace:
>>>   [<ffff000008089994>] dump_backtrace+0x0/0x1f4
>>>   [<ffff000008089bac>] show_stack+0x24/0x2c
>>>   [<ffff00000891e198>] dump_stack+0x84/0xa8
>>>   [<ffff0000080da544>] panic+0x12c/0x29c
>>>   [<ffff0000045d0334>] async_tx_quiesce+0xa4/0xc8 [async_tx]
>>>   [<ffff0000045d03c8>] async_trigger_callback+0x70/0x1c0 [async_tx]
>>>   [<ffff0000048b7d74>] raid_run_ops+0x86c/0x1540 [raid456]
>>>   [<ffff0000048bd084>] handle_stripe+0x5e8/0x1c7c [raid456]
>>>   [<ffff0000048be9ec>] handle_active_stripes.isra.45+0x2d4/0x550 [raid456]
>>>   [<ffff0000048beff4>] raid5d+0x38c/0x5d0 [raid456]
>>>   [<ffff000008736538>] md_thread+0x108/0x168
>>>   [<ffff0000080fb1cc>] kthread+0x10c/0x138
>>>   [<ffff000008084d34>] ret_from_fork+0x10/0x18
>>>
>>> Cc: Joey Zheng<yu.zheng@hxt-semitech.com>
>>> Signed-off-by: Shunyong Yang<shunyong.yang@hxt-semitech.com>
>>
>>
>> Acked-by: Sinan Kaya <okaya@kernel.org>
>>
>> to both patches 1/2 and 2/2.
>>
>>
> Thanks for the ACKs.
> 
> Shunyong.
> 
>
Vinod Koul - Jan. 4, 2019, 5:31 p.m.
On 07-12-18, 12:29, Shunyong Yang wrote:
> When dma_cookie_complete() is called in hidma_process_completed(),
> dma_cookie_status() will return DMA_COMPLETE in hidma_tx_status(). Then,
> hidma_txn_is_success() will be called to use channel cookie
> mchan->last_success to do additional DMA status check. Current code
> assigns mchan->last_success after dma_cookie_complete(). This causes
> a race condition of dma_cookie_status() returns DMA_COMPLETE before
> mchan->last_success is assigned correctly. The race will cause
> hidma_tx_status() return DMA_ERROR but the transaction is actually a
> success. Moreover, in async_tx case, it will cause a timeout panic
> in async_tx_quiesce().
> 
>  Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for
>  transaction
>  ...
>  Call trace:
>  [<ffff000008089994>] dump_backtrace+0x0/0x1f4
>  [<ffff000008089bac>] show_stack+0x24/0x2c
>  [<ffff00000891e198>] dump_stack+0x84/0xa8
>  [<ffff0000080da544>] panic+0x12c/0x29c
>  [<ffff0000045d0334>] async_tx_quiesce+0xa4/0xc8 [async_tx]
>  [<ffff0000045d03c8>] async_trigger_callback+0x70/0x1c0 [async_tx]
>  [<ffff0000048b7d74>] raid_run_ops+0x86c/0x1540 [raid456]
>  [<ffff0000048bd084>] handle_stripe+0x5e8/0x1c7c [raid456]
>  [<ffff0000048be9ec>] handle_active_stripes.isra.45+0x2d4/0x550 [raid456]
>  [<ffff0000048beff4>] raid5d+0x38c/0x5d0 [raid456]
>  [<ffff000008736538>] md_thread+0x108/0x168
>  [<ffff0000080fb1cc>] kthread+0x10c/0x138
>  [<ffff000008084d34>] ret_from_fork+0x10/0x18
> 
> Cc: Joey Zheng <yu.zheng@hxt-semitech.com>
> Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
> ---
>  drivers/dma/qcom/hidma.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/dma/qcom/hidma.c b/drivers/dma/qcom/hidma.c
> index 9d639ed1955a..aa88bcceda20 100644
> --- a/drivers/dma/qcom/hidma.c
> +++ b/drivers/dma/qcom/hidma.c
> @@ -138,24 +138,24 @@ static void hidma_process_completed(struct hidma_chan *mchan)
>  		desc = &mdesc->desc;
>  		last_cookie = desc->cookie;
>  
> +		llstat = hidma_ll_status(mdma->lldev, mdesc->tre_ch);
> +
>  		spin_lock_irqsave(&mchan->lock, irqflags);
> +		if (llstat == DMA_COMPLETE) {
> +			mchan->last_success = last_cookie;
> +			result.result = DMA_TRANS_NOERROR;
> +		} else
> +			result.result = DMA_TRANS_ABORTED;

Coding style mandates that else should also have braces, please fix that
Yang Shunyong - Jan. 9, 2019, 1:35 a.m.
Hi, Vinod and All,
  I found, somtimes, I cannot received some mails from this mailing
list. For example, I sent my v2 series on 7th, Jan,

https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713

It appeared on patchwork.kernel.org but I haven't received it in this
mailing list.

And I found randomly, a month or two months, I am unsubscribed from this
list automatically. Then I have to re-subscribe the list.

Have someone met similar problem as me? Or maybe I missed some necessary
settings?

Shunyong.
Thanks.
Vinod Koul - Jan. 9, 2019, 6:58 a.m.
On 09-01-19, 01:35, Yang, Shunyong wrote:
> Hi, Vinod and All,
>   I found, somtimes, I cannot received some mails from this mailing
> list. For example, I sent my v2 series on 7th, Jan,
> 
> https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713
> 
> It appeared on patchwork.kernel.org but I haven't received it in this
> mailing list.
> 
> And I found randomly, a month or two months, I am unsubscribed from this
> list automatically. Then I have to re-subscribe the list.

I guess your corporate server is to blame. vger mail list is "know" to 
unsubscribe servers which behave badly, so you need to ask your mail
admins why and try to use other email which is known to be friendly with
vger :)

> Have someone met similar problem as me? Or maybe I missed some necessary
> settings?
> 
> Shunyong.
> Thanks.
Yang Shunyong - Jan. 9, 2019, 7:51 a.m.
Hi, Vinod,
  Thanks for your information. I will check with our IT.
Shunyong.
Thanks.

On 2019/1/9 15:00, Vinod Koul wrote:
> On 09-01-19, 01:35, Yang, Shunyong wrote:
>> Hi, Vinod and All,
>>   I found, somtimes, I cannot received some mails from this mailing
>> list. For example, I sent my v2 series on 7th, Jan,
>>
>> https://patchwork.kernel.org/project/linux-dmaengine/list/?series=62713
>>
>> It appeared on patchwork.kernel.org but I haven't received it in this
>> mailing list.
>>
>> And I found randomly, a month or two months, I am unsubscribed from this
>> list automatically. Then I have to re-subscribe the list.
> 
> I guess your corporate server is to blame. vger mail list is "know" to 
> unsubscribe servers which behave badly, so you need to ask your mail
> admins why and try to use other email which is known to be friendly with
> vger :)
> 
>> Have someone met similar problem as me? Or maybe I missed some necessary
>> settings?
>>
>> Shunyong.
>> Thanks.
>
Theodore Ts'o - Jan. 9, 2019, 4:55 p.m.
On Wed, Jan 09, 2019 at 07:51:11AM +0000, Yang, Shunyong wrote:
> Hi, Vinod,
>   Thanks for your information. I will check with our IT.
> Shunyong.

The usual answer is generally over-enthuastic application of anti-SPAM
techniques, especially DMARC, which is fundamentally mailing-list hostile.

Cheers,

					- Ted

Patch

diff --git a/drivers/dma/qcom/hidma.c b/drivers/dma/qcom/hidma.c
index 9d639ed1955a..aa88bcceda20 100644
--- a/drivers/dma/qcom/hidma.c
+++ b/drivers/dma/qcom/hidma.c
@@ -138,24 +138,24 @@  static void hidma_process_completed(struct hidma_chan *mchan)
 		desc = &mdesc->desc;
 		last_cookie = desc->cookie;
 
+		llstat = hidma_ll_status(mdma->lldev, mdesc->tre_ch);
+
 		spin_lock_irqsave(&mchan->lock, irqflags);
+		if (llstat == DMA_COMPLETE) {
+			mchan->last_success = last_cookie;
+			result.result = DMA_TRANS_NOERROR;
+		} else
+			result.result = DMA_TRANS_ABORTED;
+
 		dma_cookie_complete(desc);
 		spin_unlock_irqrestore(&mchan->lock, irqflags);
 
-		llstat = hidma_ll_status(mdma->lldev, mdesc->tre_ch);
 		dmaengine_desc_get_callback(desc, &cb);
 
 		dma_run_dependencies(desc);
 
 		spin_lock_irqsave(&mchan->lock, irqflags);
 		list_move(&mdesc->node, &mchan->free);
-
-		if (llstat == DMA_COMPLETE) {
-			mchan->last_success = last_cookie;
-			result.result = DMA_TRANS_NOERROR;
-		} else
-			result.result = DMA_TRANS_ABORTED;
-
 		spin_unlock_irqrestore(&mchan->lock, irqflags);
 
 		dmaengine_desc_callback_invoke(&cb, &result);