Patchwork [BUG] mt76x0u: Probing issues on Raspberry Pi 3 B+

login
register
mail settings
Submitter Stanislaw Gruszka
Date Feb. 10, 2019, 9:41 a.m.
Message ID <20190210094123.GB2913@redhat.com>
Download mbox | patch
Permalink /patch/722285/
State New
Headers show

Comments

Stanislaw Gruszka - Feb. 10, 2019, 9:41 a.m.
On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> > could you please test the following series:
> > https://patchwork.kernel.org/cover/10764453/
> 
> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.

So this is either dwc2 scatter-gather problem which should be addressed in
this driver or mt76x0u does something wrong when configuring SG.

Disabling SG is just workaround, which do not address actual problem.

I think I found mt76x0u issue that could cause this USB probe error
(and possibly also address AMD IOMMU issue). We seems do not correctly
set URB transfer length smaller than sg buffer length. Attached  patch
should correct that.

Stanislaw
From bc09bc7fa604019a5ef90184390e7c2a3899869d Mon Sep 17 00:00:00 2001
From: Stanislaw Gruszka <sgruszka@redhat.com>
Date: Sun, 10 Feb 2019 08:09:48 +0100
Subject: [PATCH] mt76x02: usb_mcu: limit sg length

When sending fw data we limting urb transfer length by changing buf->len,
while keeping segment length at max_payload value. That may confuse
underlying drivers responsible for DMA.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c | 1 +
 1 file changed, 1 insertion(+)
Lorenzo Bianconi - Feb. 10, 2019, 10:22 a.m.
> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> > > could you please test the following series:
> > > https://patchwork.kernel.org/cover/10764453/
> >
> > yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
>
> So this is either dwc2 scatter-gather problem which should be addressed in
> this driver or mt76x0u does something wrong when configuring SG.
>
> Disabling SG is just workaround, which do not address actual problem.
>
> I think I found mt76x0u issue that could cause this USB probe error
> (and possibly also address AMD IOMMU issue). We seems do not correctly
> set URB transfer length smaller than sg buffer length. Attached  patch
> should correct that.

Hi Stanislaw,

I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
Moreover applying this patch I got the following crash (rpi-5.0.y):

[   38.856623] mt76x0u 1-1.2:1.0: ASIC revision: 76100002 MAC
revision: 76502000
[   38.865999] ------------[ cut here ]------------
[   38.871817] WARNING: CPU: 3 PID: 751 at lib/refcount.c:187
refcount_sub_and_test_checked+0xa4/0xb8
[   38.883277] refcount_t: underflow; use-after-free.
[   38.889358] Modules linked in: mt76x0u(+) mt76x0_common mt76x02_usb
mt76_usb mt76x02_lib mt76 mac80211 bnep hci_uart btbcm serdev
bluetooth ecdh_generic spidev brcmfmac brcmutil sha256_generic
cfg80211 rfkill raspberrypi_hwmon hwmon b
cm2835_v4l2(C) i2c_bcm2835 v4l2_common videobuf2_vmalloc
videobuf2_memops snd_bcm2835(C) spi_bcm2835 videobuf2_v4l2
videobuf2_common snd_pcm videodev snd_timer media snd uio_pdrv_genirq
uio fixed i2c_dev ip_tables x_tables ipv6
[   38.938953] CPU: 3 PID: 751 Comm: systemd-udevd Tainted: G
C        5.0.0-rc5-v7+ #1
[   38.950574] Hardware name: BCM2835
[   38.955503] Backtrace:
[   38.959410] [<8010c91c>] (dump_backtrace) from [<8010cc00>]
(show_stack+0x20/0x24)
[   38.969952]  r6:60000013 r5:ffffffff r4:00000000 r3:a50bade6
[   38.977198] [<8010cbe0>] (show_stack) from [<807ca5f4>]
(dump_stack+0xc8/0x114)
[   38.986183] [<807ca52c>] (dump_stack) from [<8011e65c>]
(__warn+0xf4/0x120)
[   38.994804]  r9:000000bb r8:804d0138 r7:00000009 r6:8099dc84
r5:00000000 r4:b9631b58
[   39.005708] [<8011e568>] (__warn) from [<8011e6d0>]
(warn_slowpath_fmt+0x48/0x50)
[   39.016372]  r9:7f652128 r8:80d1419c r7:80c0bac4 r6:b34a2044
r5:b6096780 r4:00000000
[   39.027368] [<8011e68c>] (warn_slowpath_fmt) from [<804d0138>]
(refcount_sub_and_test_checked+0xa4/0xb8)
[   39.040150]  r3:80c91c25 r2:8099dc94
[   39.045296]  r4:00000000
[   39.049320] [<804d0094>] (refcount_sub_and_test_checked) from
[<804d0164>] (refcount_dec_and_test_checked+0x18/0x1c)
[   39.062966]  r4:b6096780 r3:00000001
[   39.068043] [<804d014c>] (refcount_dec_and_test_checked) from
[<805db49c>] (usb_free_urb+0x20/0x4c)
[   39.080279] [<805db47c>] (usb_free_urb) from [<7f62d804>]
(mt76u_buf_free+0x98/0xac [mt76_usb])
[   39.092224]  r4:00000001 r3:00000001
[   39.097386] [<7f62d76c>] (mt76u_buf_free [mt76_usb]) from
[<7f62def8>] (mt76u_queues_deinit+0x44/0x100 [mt76_usb])
[   39.111016]  r8:b791f600 r7:b3628480 r6:b3628e20 r5:00000001
r4:00000000 r3:00000080
[   39.122039] [<7f62deb4>] (mt76u_queues_deinit [mt76_usb]) from
[<7f650040>] (mt76x0u_cleanup+0x40/0x4c [mt76x0u])
[   39.135636]  r7:b3628480 r6:b791f600 r5:ffffffea r4:b3628e20
[   39.142968] [<7f650000>] (mt76x0u_cleanup [mt76x0u]) from
[<7f650564>] (mt76x0u_probe+0x1f0/0x354 [mt76x0u])
[   39.156063]  r4:b3628e20 r3:00000000
[   39.161202] [<7f650374>] (mt76x0u_probe [mt76x0u]) from
[<805e0b6c>] (usb_probe_interface+0x104/0x240)
[   39.173805]  r7:00000000 r6:7f652034 r5:b6299000 r4:b791f620
[   39.181165] [<805e0a68>] (usb_probe_interface) from [<8056a8bc>]
(really_probe+0x224/0x2f8)
[   39.192856]  r10:b626d800 r9:00000019 r8:7f652034 r7:80d3e124
r6:00000000 r5:80d3e120
[   39.204057]  r4:b791f620 r3:805e0a68
[   39.209269] [<8056a698>] (really_probe) from [<8056ab60>]
(driver_probe_device+0x6c/0x180)
[   39.220854]  r10:b626d800 r9:7f6522c0 r8:b791f620 r7:00000000
r6:7f652034 r5:7f652034
[   39.232057]  r4:b791f620 r3:00000000
[   39.237265] [<8056aaf4>] (driver_probe_device) from [<8056ad54>]
(__driver_attach+0xe0/0xe4)
[   39.248982]  r9:7f6522c0 r8:7f65122c r7:00000000 r6:b791f654
r5:7f652034 r4:b791f620
[   39.260002] [<8056ac74>] (__driver_attach) from [<8056880c>]
(bus_for_each_dev+0x68/0xa0)
[   39.271498]  r6:8056ac74 r5:7f652034 r4:00000000 r3:00000027
[   39.278885] [<805687a4>] (bus_for_each_dev) from [<8056a1cc>]
(driver_attach+0x28/0x30)
[   39.290252]  r6:80c6ddc8 r5:b6096a00 r4:7f652034
[   39.296557] [<8056a1a4>] (driver_attach) from [<80569c24>]
(bus_add_driver+0x194/0x21c)
[   39.307899] [<80569a90>] (bus_add_driver) from [<8056b504>]
(driver_register+0x8c/0x124)
[   39.319328]  r7:80c6ddc8 r6:7f652034 r5:00000000 r4:7f652034
[   39.326730] [<8056b478>] (driver_register) from [<805df510>]
(usb_register_driver+0x74/0x140)
[   39.338655]  r5:00000000 r4:7f652000
[   39.343880] [<805df49c>] (usb_register_driver) from [<7f655024>]
(mt76x0_driver_init+0x24/0x1000 [mt76x0u])
[   39.357002]  r9:00000001 r8:7f652308 r7:00000000 r6:80c08d48
r5:7f655000 r4:7f6522c0
[   39.368143] [<7f655000>] (mt76x0_driver_init [mt76x0u]) from
[<80102f6c>] (do_one_initcall+0x4c/0x210)
[   39.380894] [<80102f20>] (do_one_initcall) from [<801ae63c>]
(do_init_module+0x6c/0x21c)
[   39.392392]  r8:7f652308 r7:80c08d48 r6:b6116880 r5:7f6522c0
r4:7f6522c0
[   39.400876] [<801ae5d0>] (do_init_module) from [<801ad68c>]
(load_module+0x1d10/0x2304)

Moreover for mt76x0u SG is 'already' disabled since we use just one
buffer so from performance point of view I do not see any difference
of using a standard usb buffer.
This patch has been tested in multiple scenarios and seems to fix
reported issues (for usb2.0).
Are you concerned about increasing code complexity?

Regards,
Lorenzo

>
> Stanislaw
Stanislaw Gruszka - Feb. 11, 2019, 7:44 a.m.
On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
> > On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> > > > could you please test the following series:
> > > > https://patchwork.kernel.org/cover/10764453/
> > >
> > > yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
> >
> > So this is either dwc2 scatter-gather problem which should be addressed in
> > this driver or mt76x0u does something wrong when configuring SG.
> >
> > Disabling SG is just workaround, which do not address actual problem.
> >
> > I think I found mt76x0u issue that could cause this USB probe error
> > (and possibly also address AMD IOMMU issue). We seems do not correctly
> > set URB transfer length smaller than sg buffer length. Attached  patch
> > should correct that.
> 
> Hi Stanislaw,
> 
> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().

It is, buf->len and sg[0].length are initialized to the same value for 1
segment. But then buf->len (assigned to urb->buffer_transfer_length) change
to smaller value , but sg[0].length stay the same. What I think can be
problem for usb host driver.

> Moreover applying this patch I got the following crash (rpi-5.0.y):

Ok, so with patch probe fail instantly and trigger yet another bug(s)
on error path. You seems to address that already. 

> Moreover for mt76x0u SG is 'already' disabled since we use just one
> buffer so from performance point of view I do not see any difference
> of using a standard usb buffer.
> This patch has been tested in multiple scenarios and seems to fix
> reported issues (for usb2.0).

Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
does not work for 1 segment ?

> Are you concerned about increasing code complexity?

That's one of my concerns. Another, more important one, is that
changing to urb->transfer_buffer just hide the problems. And they will
pop up when someone will start to use SG (BTW how this can be tested
for more than one fragment, IOW how multiple fragments skb's can
be generated ? ).

And now I think the bugs can be in mt76 driver taking that problems
happened on different platforms (rpi and AMD IOMMU), i.e. we do not
correctly set urb->nub_seg or length or do some other thing wrong.

Stanislaw
Lorenzo Bianconi - Feb. 11, 2019, 10:04 a.m.
> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
> > > On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> > > > > could you please test the following series:
> > > > > https://patchwork.kernel.org/cover/10764453/
> > > >
> > > > yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
> > >
> > > So this is either dwc2 scatter-gather problem which should be addressed in
> > > this driver or mt76x0u does something wrong when configuring SG.
> > >
> > > Disabling SG is just workaround, which do not address actual problem.
> > >
> > > I think I found mt76x0u issue that could cause this USB probe error
> > > (and possibly also address AMD IOMMU issue). We seems do not correctly
> > > set URB transfer length smaller than sg buffer length. Attached  patch
> > > should correct that.
> > 
> > Hi Stanislaw,
> > 
> > I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
> 
> It is, buf->len and sg[0].length are initialized to the same value for 1
> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
> to smaller value , but sg[0].length stay the same. What I think can be
> problem for usb host driver.
> 
> > Moreover applying this patch I got the following crash (rpi-5.0.y):
> 
> Ok, so with patch probe fail instantly and trigger yet another bug(s)
> on error path. You seems to address that already. 
> 
> > Moreover for mt76x0u SG is 'already' disabled since we use just one
> > buffer so from performance point of view I do not see any difference
> > of using a standard usb buffer.
> > This patch has been tested in multiple scenarios and seems to fix
> > reported issues (for usb2.0).
> 
> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
> does not work for 1 segment ?

Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
does not implement SG I/O so probing fails. I guess it is still useful to
implement a 'legacy' mode that enable mt76 on host controllers that do not implement
SG I/O (rpi is a very common device so it will be cool to have mt76 working on
it). Moreover we are not removing functionalities, user experience will remain
the same

> 
> > Are you concerned about increasing code complexity?
> 
> That's one of my concerns. Another, more important one, is that
> changing to urb->transfer_buffer just hide the problems. And they will
> pop up when someone will start to use SG (BTW how this can be tested
> for more than one fragment, IOW how multiple fragments skb's can
> be generated ? ).

You need:
- usb 3.0 controller/device
- A-MSDU capable AP

> 
> And now I think the bugs can be in mt76 driver taking that problems
> happened on different platforms (rpi and AMD IOMMU), i.e. we do not
> correctly set urb->nub_seg or length or do some other thing wrong.

We still need to fix it on usb3.0 with AMD cpu/motherboard :)

Regards,
Lorenzo

> 
> Stanislaw
Stefan Wahren - Feb. 11, 2019, 10:33 a.m.
Hi,

Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
>> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
>>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
>>>>>> could you please test the following series:
>>>>>> https://patchwork.kernel.org/cover/10764453/
>>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
>>>> So this is either dwc2 scatter-gather problem which should be addressed in
>>>> this driver or mt76x0u does something wrong when configuring SG.
>>>>
>>>> Disabling SG is just workaround, which do not address actual problem.
>>>>
>>>> I think I found mt76x0u issue that could cause this USB probe error
>>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
>>>> set URB transfer length smaller than sg buffer length. Attached  patch
>>>> should correct that.
>>> Hi Stanislaw,
>>>
>>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
>> It is, buf->len and sg[0].length are initialized to the same value for 1
>> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
>> to smaller value , but sg[0].length stay the same. What I think can be
>> problem for usb host driver.
>>
>>> Moreover applying this patch I got the following crash (rpi-5.0.y):
>> Ok, so with patch probe fail instantly and trigger yet another bug(s)
>> on error path. You seems to address that already. 
>>
>>> Moreover for mt76x0u SG is 'already' disabled since we use just one
>>> buffer so from performance point of view I do not see any difference
>>> of using a standard usb buffer.
>>> This patch has been tested in multiple scenarios and seems to fix
>>> reported issues (for usb2.0).
>> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
>> does not work for 1 segment ?
> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> does not implement SG I/O so probing fails. I guess it is still useful to
> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> it). Moreover we are not removing functionalities, user experience will remain
> the same
>
i'm not sure that you understand my mail [1] with the summary of my test
results.

In case of using the arm/multi_v7_defconfig (32 bit) the mt76 works like
a charm without your sg avoid patch series, but the arm64/defconfig (64
bit) requires the series to probe at least. So i wouldn't conclude from
the fact that dwc2 doesn't support SG any probing issues on arm64. So we
need to investigate which config option triggers the problem.

Stefan

[1] - https://marc.info/?l=linux-usb&m=154981675724078
Lorenzo Bianconi - Feb. 11, 2019, 11:06 a.m.
> Hi,
> 
> Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
> >> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
> >>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> >>>>>> could you please test the following series:
> >>>>>> https://patchwork.kernel.org/cover/10764453/
> >>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
> >>>> So this is either dwc2 scatter-gather problem which should be addressed in
> >>>> this driver or mt76x0u does something wrong when configuring SG.
> >>>>
> >>>> Disabling SG is just workaround, which do not address actual problem.
> >>>>
> >>>> I think I found mt76x0u issue that could cause this USB probe error
> >>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
> >>>> set URB transfer length smaller than sg buffer length. Attached  patch
> >>>> should correct that.
> >>> Hi Stanislaw,
> >>>
> >>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
> >> It is, buf->len and sg[0].length are initialized to the same value for 1
> >> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
> >> to smaller value , but sg[0].length stay the same. What I think can be
> >> problem for usb host driver.
> >>
> >>> Moreover applying this patch I got the following crash (rpi-5.0.y):
> >> Ok, so with patch probe fail instantly and trigger yet another bug(s)
> >> on error path. You seems to address that already. 
> >>
> >>> Moreover for mt76x0u SG is 'already' disabled since we use just one
> >>> buffer so from performance point of view I do not see any difference
> >>> of using a standard usb buffer.
> >>> This patch has been tested in multiple scenarios and seems to fix
> >>> reported issues (for usb2.0).
> >> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
> >> does not work for 1 segment ?
> > Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> > does not implement SG I/O so probing fails. I guess it is still useful to
> > implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> > SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> > it). Moreover we are not removing functionalities, user experience will remain
> > the same
> >
> i'm not sure that you understand my mail [1] with the summary of my test
> results.
> 

Yes right, I did not get it sorry :)
as indicated here https://www.raspberrypi.org/documentation/linux/kernel/building.md
I am using bcm2709_defconfig config (using it I spotted the mt76 crashes and
probe failure)

Regards,
Lorenzo

> In case of using the arm/multi_v7_defconfig (32 bit) the mt76 works like
> a charm without your sg avoid patch series, but the arm64/defconfig (64
> bit) requires the series to probe at least. So i wouldn't conclude from
> the fact that dwc2 doesn't support SG any probing issues on arm64. So we
> need to investigate which config option triggers the problem.
> 
> Stefan
> 
> [1] - https://marc.info/?l=linux-usb&m=154981675724078
>
Stefan Wahren - Feb. 11, 2019, 2:04 p.m.
Hi Lorenzo,

Am 11.02.19 um 12:06 schrieb Lorenzo Bianconi:
>> Hi,
>>
>> Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
>>>> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
>>>>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
>>>>>>>> could you please test the following series:
>>>>>>>> https://patchwork.kernel.org/cover/10764453/
>>>>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
>>>>>> So this is either dwc2 scatter-gather problem which should be addressed in
>>>>>> this driver or mt76x0u does something wrong when configuring SG.
>>>>>>
>>>>>> Disabling SG is just workaround, which do not address actual problem.
>>>>>>
>>>>>> I think I found mt76x0u issue that could cause this USB probe error
>>>>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
>>>>>> set URB transfer length smaller than sg buffer length. Attached  patch
>>>>>> should correct that.
>>>>> Hi Stanislaw,
>>>>>
>>>>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
>>>> It is, buf->len and sg[0].length are initialized to the same value for 1
>>>> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
>>>> to smaller value , but sg[0].length stay the same. What I think can be
>>>> problem for usb host driver.
>>>>
>>>>> Moreover applying this patch I got the following crash (rpi-5.0.y):
>>>> Ok, so with patch probe fail instantly and trigger yet another bug(s)
>>>> on error path. You seems to address that already. 
>>>>
>>>>> Moreover for mt76x0u SG is 'already' disabled since we use just one
>>>>> buffer so from performance point of view I do not see any difference
>>>>> of using a standard usb buffer.
>>>>> This patch has been tested in multiple scenarios and seems to fix
>>>>> reported issues (for usb2.0).
>>>> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
>>>> does not work for 1 segment ?
>>> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
>>> does not implement SG I/O so probing fails. I guess it is still useful to
>>> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
>>> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
>>> it). Moreover we are not removing functionalities, user experience will remain
>>> the same
>>>
>> i'm not sure that you understand my mail [1] with the summary of my test
>> results.
>>
> Yes right, I did not get it sorry :)
> as indicated here https://www.raspberrypi.org/documentation/linux/kernel/building.md
> I am using bcm2709_defconfig config (using it I spotted the mt76 crashes and
> probe failure)

no problem, at the beginning this could be very confusing. I only want
to clarify that this documentation refers to the vendor kernel (with a
different USB host driver) of the Raspberry Pi Foundation.

All my results refers to the mainline kernel we all should talk about. I
started a gist which try to describe the mainline variant:
https://gist.github.com/lategoodbye/c7317a42bf7f9c07f5a91baed8c68f75

Maybe this could be helpful.

Stefan

> Regards,
> Lorenzo
>
>> In case of using the arm/multi_v7_defconfig (32 bit) the mt76 works like
>> a charm without your sg avoid patch series, but the arm64/defconfig (64
>> bit) requires the series to probe at least. So i wouldn't conclude from
>> the fact that dwc2 doesn't support SG any probing issues on arm64. So we
>> need to investigate which config option triggers the problem.
>>
>> Stefan
>>
>> [1] - https://marc.info/?l=linux-usb&m=154981675724078
>>
Lorenzo Bianconi - Feb. 11, 2019, 3:10 p.m.
> Hi Lorenzo,
> 
> Am 11.02.19 um 12:06 schrieb Lorenzo Bianconi:
> >> Hi,
> >>
> >> Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
> >>>> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
> >>>>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> >>>>>>>> could you please test the following series:
> >>>>>>>> https://patchwork.kernel.org/cover/10764453/
> >>>>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
> >>>>>> So this is either dwc2 scatter-gather problem which should be addressed in
> >>>>>> this driver or mt76x0u does something wrong when configuring SG.
> >>>>>>
> >>>>>> Disabling SG is just workaround, which do not address actual problem.
> >>>>>>
> >>>>>> I think I found mt76x0u issue that could cause this USB probe error
> >>>>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
> >>>>>> set URB transfer length smaller than sg buffer length. Attached  patch
> >>>>>> should correct that.
> >>>>> Hi Stanislaw,
> >>>>>
> >>>>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
> >>>> It is, buf->len and sg[0].length are initialized to the same value for 1
> >>>> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
> >>>> to smaller value , but sg[0].length stay the same. What I think can be
> >>>> problem for usb host driver.
> >>>>
> >>>>> Moreover applying this patch I got the following crash (rpi-5.0.y):
> >>>> Ok, so with patch probe fail instantly and trigger yet another bug(s)
> >>>> on error path. You seems to address that already. 
> >>>>
> >>>>> Moreover for mt76x0u SG is 'already' disabled since we use just one
> >>>>> buffer so from performance point of view I do not see any difference
> >>>>> of using a standard usb buffer.
> >>>>> This patch has been tested in multiple scenarios and seems to fix
> >>>>> reported issues (for usb2.0).
> >>>> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
> >>>> does not work for 1 segment ?
> >>> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> >>> does not implement SG I/O so probing fails. I guess it is still useful to
> >>> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> >>> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> >>> it). Moreover we are not removing functionalities, user experience will remain
> >>> the same
> >>>
> >> i'm not sure that you understand my mail [1] with the summary of my test
> >> results.
> >>
> > Yes right, I did not get it sorry :)
> > as indicated here https://www.raspberrypi.org/documentation/linux/kernel/building.md
> > I am using bcm2709_defconfig config (using it I spotted the mt76 crashes and
> > probe failure)
> 
> no problem, at the beginning this could be very confusing. I only want
> to clarify that this documentation refers to the vendor kernel (with a
> different USB host driver) of the Raspberry Pi Foundation.
> 
> All my results refers to the mainline kernel we all should talk about. I
> started a gist which try to describe the mainline variant:
> https://gist.github.com/lategoodbye/c7317a42bf7f9c07f5a91baed8c68f75

So to summarize:
- Raspberry Pi Foundation kernel works just with RFC series
- mainline kernel works out of the box

is my understanding correct? I am still considering adding a legacy mode since
there will not be any regression using it instead of SG I/O with just one SG
buffer

Regards,
Lorenzo

> 
> Maybe this could be helpful.
> 
> Stefan
> 
> > Regards,
> > Lorenzo
> >
> >> In case of using the arm/multi_v7_defconfig (32 bit) the mt76 works like
> >> a charm without your sg avoid patch series, but the arm64/defconfig (64
> >> bit) requires the series to probe at least. So i wouldn't conclude from
> >> the fact that dwc2 doesn't support SG any probing issues on arm64. So we
> >> need to investigate which config option triggers the problem.
> >>
> >> Stefan
> >>
> >> [1] - https://marc.info/?l=linux-usb&m=154981675724078
> >>
Alan Stern - Feb. 11, 2019, 3:12 p.m.
On Mon, 11 Feb 2019, Lorenzo Bianconi wrote:

> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> does not implement SG I/O so probing fails. I guess it is still useful to
> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> it). Moreover we are not removing functionalities, user experience will remain
> the same

Has anyone considered adding SG support to dwc2?  It shouldn't be very 
difficult.  The corresponding change for ehci-hcd required adding no 
more than about 30 lines of code.

Alan Stern
Stefan Wahren - Feb. 11, 2019, 3:27 p.m.
Hi,

Am 11.02.19 um 16:10 schrieb Lorenzo Bianconi:
>> Hi Lorenzo,
>>
>> Am 11.02.19 um 12:06 schrieb Lorenzo Bianconi:
>>>> Hi,
>>>>
>>>> Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
>>>>>> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
>>>>>>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
>>>>>>>>>> could you please test the following series:
>>>>>>>>>> https://patchwork.kernel.org/cover/10764453/
>>>>>>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
>>>>>>>> So this is either dwc2 scatter-gather problem which should be addressed in
>>>>>>>> this driver or mt76x0u does something wrong when configuring SG.
>>>>>>>>
>>>>>>>> Disabling SG is just workaround, which do not address actual problem.
>>>>>>>>
>>>>>>>> I think I found mt76x0u issue that could cause this USB probe error
>>>>>>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
>>>>>>>> set URB transfer length smaller than sg buffer length. Attached  patch
>>>>>>>> should correct that.
>>>>>>> Hi Stanislaw,
>>>>>>>
>>>>>>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
>>>>>> It is, buf->len and sg[0].length are initialized to the same value for 1
>>>>>> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
>>>>>> to smaller value , but sg[0].length stay the same. What I think can be
>>>>>> problem for usb host driver.
>>>>>>
>>>>>>> Moreover applying this patch I got the following crash (rpi-5.0.y):
>>>>>> Ok, so with patch probe fail instantly and trigger yet another bug(s)
>>>>>> on error path. You seems to address that already. 
>>>>>>
>>>>>>> Moreover for mt76x0u SG is 'already' disabled since we use just one
>>>>>>> buffer so from performance point of view I do not see any difference
>>>>>>> of using a standard usb buffer.
>>>>>>> This patch has been tested in multiple scenarios and seems to fix
>>>>>>> reported issues (for usb2.0).
>>>>>> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
>>>>>> does not work for 1 segment ?
>>>>> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
>>>>> does not implement SG I/O so probing fails. I guess it is still useful to
>>>>> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
>>>>> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
>>>>> it). Moreover we are not removing functionalities, user experience will remain
>>>>> the same
>>>>>
>>>> i'm not sure that you understand my mail [1] with the summary of my test
>>>> results.
>>>>
>>> Yes right, I did not get it sorry :)
>>> as indicated here https://www.raspberrypi.org/documentation/linux/kernel/building.md
>>> I am using bcm2709_defconfig config (using it I spotted the mt76 crashes and
>>> probe failure)
>> no problem, at the beginning this could be very confusing. I only want
>> to clarify that this documentation refers to the vendor kernel (with a
>> different USB host driver) of the Raspberry Pi Foundation.
>>
>> All my results refers to the mainline kernel we all should talk about. I
>> started a gist which try to describe the mainline variant:
>> https://gist.github.com/lategoodbye/c7317a42bf7f9c07f5a91baed8c68f75
> So to summarize:
> - Raspberry Pi Foundation kernel works just with RFC series
> - mainline kernel works out of the box
>
> is my understanding correct? 

not really.

Compiling the mainline kernel with arm/multi_v7_defconfig it works.
Using the same kernel but with arm64/defconfig doesn't work. But i don't
think this is a 32/64 bit issue. The arm64 defconfig is much more
complex (e.g. enables more IOMMU stuff).

Regards
Stefan
Lorenzo Bianconi - Feb. 11, 2019, 3:57 p.m.
> Hi,
> 
> Am 11.02.19 um 16:10 schrieb Lorenzo Bianconi:
> >> Hi Lorenzo,
> >>
> >> Am 11.02.19 um 12:06 schrieb Lorenzo Bianconi:
> >>>> Hi,
> >>>>
> >>>> Am 11.02.19 um 11:04 schrieb Lorenzo Bianconi:
> >>>>>> On Sun, Feb 10, 2019 at 11:22:25AM +0100, Lorenzo Bianconi wrote:
> >>>>>>>> On Sat, Feb 09, 2019 at 09:29:05PM +0100, Stefan Wahren wrote:
> >>>>>>>>>> could you please test the following series:
> >>>>>>>>>> https://patchwork.kernel.org/cover/10764453/
> >>>>>>>>> yeah this fixed the probing timeout and the driver will probe successful. AFAIK the dwc2 host mode doesn't support scatter-gather yet.
> >>>>>>>> So this is either dwc2 scatter-gather problem which should be addressed in
> >>>>>>>> this driver or mt76x0u does something wrong when configuring SG.
> >>>>>>>>
> >>>>>>>> Disabling SG is just workaround, which do not address actual problem.
> >>>>>>>>
> >>>>>>>> I think I found mt76x0u issue that could cause this USB probe error
> >>>>>>>> (and possibly also address AMD IOMMU issue). We seems do not correctly
> >>>>>>>> set URB transfer length smaller than sg buffer length. Attached  patch
> >>>>>>>> should correct that.
> >>>>>>> Hi Stanislaw,
> >>>>>>>
> >>>>>>> I think 'sg[0].length' is already set in mt76u_fill_rx_sg().
> >>>>>> It is, buf->len and sg[0].length are initialized to the same value for 1
> >>>>>> segment. But then buf->len (assigned to urb->buffer_transfer_length) change
> >>>>>> to smaller value , but sg[0].length stay the same. What I think can be
> >>>>>> problem for usb host driver.
> >>>>>>
> >>>>>>> Moreover applying this patch I got the following crash (rpi-5.0.y):
> >>>>>> Ok, so with patch probe fail instantly and trigger yet another bug(s)
> >>>>>> on error path. You seems to address that already. 
> >>>>>>
> >>>>>>> Moreover for mt76x0u SG is 'already' disabled since we use just one
> >>>>>>> buffer so from performance point of view I do not see any difference
> >>>>>>> of using a standard usb buffer.
> >>>>>>> This patch has been tested in multiple scenarios and seems to fix
> >>>>>>> reported issues (for usb2.0).
> >>>>>> Ok, so passing buffer via urb->transfer_buffer works. But why urb->sg
> >>>>>> does not work for 1 segment ?
> >>>>> Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> >>>>> does not implement SG I/O so probing fails. I guess it is still useful to
> >>>>> implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> >>>>> SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> >>>>> it). Moreover we are not removing functionalities, user experience will remain
> >>>>> the same
> >>>>>
> >>>> i'm not sure that you understand my mail [1] with the summary of my test
> >>>> results.
> >>>>
> >>> Yes right, I did not get it sorry :)
> >>> as indicated here https://www.raspberrypi.org/documentation/linux/kernel/building.md
> >>> I am using bcm2709_defconfig config (using it I spotted the mt76 crashes and
> >>> probe failure)
> >> no problem, at the beginning this could be very confusing. I only want
> >> to clarify that this documentation refers to the vendor kernel (with a
> >> different USB host driver) of the Raspberry Pi Foundation.
> >>
> >> All my results refers to the mainline kernel we all should talk about. I
> >> started a gist which try to describe the mainline variant:
> >> https://gist.github.com/lategoodbye/c7317a42bf7f9c07f5a91baed8c68f75
> > So to summarize:
> > - Raspberry Pi Foundation kernel works just with RFC series
> > - mainline kernel works out of the box
> >
> > is my understanding correct? 
> 
> not really.
> 
> Compiling the mainline kernel with arm/multi_v7_defconfig it works.
> Using the same kernel but with arm64/defconfig doesn't work. But i don't
> think this is a 32/64 bit issue. The arm64 defconfig is much more
> complex (e.g. enables more IOMMU stuff).

thx for the clarification :)

Regards,
Lorenzo

> 
> Regards
> Stefan
>
Stanislaw Gruszka - Feb. 11, 2019, 5:22 p.m.
On Mon, Feb 11, 2019 at 04:27:40PM +0100, Stefan Wahren wrote:
> >> All my results refers to the mainline kernel we all should talk about. I
> >> started a gist which try to describe the mainline variant:
> >> https://gist.github.com/lategoodbye/c7317a42bf7f9c07f5a91baed8c68f75
> > So to summarize:
> > - Raspberry Pi Foundation kernel works just with RFC series
> > - mainline kernel works out of the box
> >
> > is my understanding correct? 
> 
> not really.
> 
> Compiling the mainline kernel with arm/multi_v7_defconfig it works.
> Using the same kernel but with arm64/defconfig doesn't work. But i don't
> think this is a 32/64 bit issue. The arm64 defconfig is much more
> complex (e.g. enables more IOMMU stuff).

One possible thing that could be broken with IOMMU is allocating
big buffers via page_fraq_alloc(). Theoretically that should work,
but who knows. You can check my patch posted recently, it make
the driver stop doing big allocations via page_frag_alloc():

https://lore.kernel.org/linux-wireless/1549872974-7268-1-git-send-email-sgruszka@redhat.com/

Stanislaw
Stanislaw Gruszka - Feb. 11, 2019, 5:33 p.m.
On Mon, Feb 11, 2019 at 10:12:57AM -0500, Alan Stern wrote:
> On Mon, 11 Feb 2019, Lorenzo Bianconi wrote:
> 
> > Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> > does not implement SG I/O so probing fails. I guess it is still useful to
> > implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> > SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> > it). Moreover we are not removing functionalities, user experience will remain
> > the same
> 
> Has anyone considered adding SG support to dwc2?  It shouldn't be very 
> difficult.  The corresponding change for ehci-hcd required adding no 
> more than about 30 lines of code.

That would be cool. Perhaps somebody with dwc2 hardware could do this.

However in mt76x02u we possibly would like to support other usb host
drivers with sg_tablesize = 0 . I would like to clarify what is correct
to do with such drivers.

Is ok to pass buffer via urb->sg with urb->num_sgs = 1 ? Or maybe 
urb->num_sgs should be 0 to pass buffer via urb->sg on such drivers ?
Or maybe non of above is correct and the only option that will work
in 100% is pass buffer via urb->transfer_buffer ?

Stanislaw
Alan Stern - Feb. 11, 2019, 5:49 p.m.
On Mon, 11 Feb 2019, Stanislaw Gruszka wrote:

> On Mon, Feb 11, 2019 at 10:12:57AM -0500, Alan Stern wrote:
> > On Mon, 11 Feb 2019, Lorenzo Bianconi wrote:
> > 
> > > Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> > > does not implement SG I/O so probing fails. I guess it is still useful to
> > > implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> > > SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> > > it). Moreover we are not removing functionalities, user experience will remain
> > > the same
> > 
> > Has anyone considered adding SG support to dwc2?  It shouldn't be very 
> > difficult.  The corresponding change for ehci-hcd required adding no 
> > more than about 30 lines of code.
> 
> That would be cool. Perhaps somebody with dwc2 hardware could do this.
> 
> However in mt76x02u we possibly would like to support other usb host
> drivers with sg_tablesize = 0 . I would like to clarify what is correct
> to do with such drivers.
> 
> Is ok to pass buffer via urb->sg with urb->num_sgs = 1 ? Or maybe 
> urb->num_sgs should be 0 to pass buffer via urb->sg on such drivers ?
> Or maybe non of above is correct and the only option that will work
> in 100% is pass buffer via urb->transfer_buffer ?

See the kerneldoc for usb_sg_init(), usb_sg_wait(), and usb_sg_cancel() 
in drivers/usb/core/message.c.  These routines will always do the right 
thing -- however usb_sg_wait() must be called in process context.

Alan Stern
Lorenzo Bianconi - Feb. 12, 2019, 12:06 a.m.
>
> On Mon, 11 Feb 2019, Stanislaw Gruszka wrote:
>
> > On Mon, Feb 11, 2019 at 10:12:57AM -0500, Alan Stern wrote:
> > > On Mon, 11 Feb 2019, Lorenzo Bianconi wrote:
> > >
> > > > Here it is a different issue respect to the AMD IOMMU one, dwc2 host driver
> > > > does not implement SG I/O so probing fails. I guess it is still useful to
> > > > implement a 'legacy' mode that enable mt76 on host controllers that do not implement
> > > > SG I/O (rpi is a very common device so it will be cool to have mt76 working on
> > > > it). Moreover we are not removing functionalities, user experience will remain
> > > > the same
> > >
> > > Has anyone considered adding SG support to dwc2?  It shouldn't be very
> > > difficult.  The corresponding change for ehci-hcd required adding no
> > > more than about 30 lines of code.
> >
> > That would be cool. Perhaps somebody with dwc2 hardware could do this.
> >
> > However in mt76x02u we possibly would like to support other usb host
> > drivers with sg_tablesize = 0 . I would like to clarify what is correct
> > to do with such drivers.
> >
> > Is ok to pass buffer via urb->sg with urb->num_sgs = 1 ? Or maybe
> > urb->num_sgs should be 0 to pass buffer via urb->sg on such drivers ?
> > Or maybe non of above is correct and the only option that will work
> > in 100% is pass buffer via urb->transfer_buffer ?
>
> See the kerneldoc for usb_sg_init(), usb_sg_wait(), and usb_sg_cancel()
> in drivers/usb/core/message.c.  These routines will always do the right
> thing -- however usb_sg_wait() must be called in process context.
>
> Alan Stern
>

Hi Alan,

I actually used usb_sg_init()/usb_sg_wait() as reference to implement
mt76u {tx/rx} datapath, but I will double-check.
I guess we should even consider if there are other usb host drivers
that do not implement SG I/O and it is worth to support.
I am wondering if the right approach is to add SG to the controller
one by one or have legacy I/O in mt76 (not sure what is the 'best'
approach)
What do you think?

Regards,
Lorenzo
Alan Stern - Feb. 12, 2019, 3:27 p.m.
On Tue, 12 Feb 2019, Lorenzo Bianconi wrote:

> Hi Alan,
> 
> I actually used usb_sg_init()/usb_sg_wait() as reference to implement
> mt76u {tx/rx} datapath, but I will double-check.
> I guess we should even consider if there are other usb host drivers
> that do not implement SG I/O and it is worth to support.
> I am wondering if the right approach is to add SG to the controller
> one by one or have legacy I/O in mt76 (not sure what is the 'best'
> approach)
> What do you think?

If mt76u can use usb_sg_init/usb_sg_wait, that would be the simplest.  
It would allow you to remove a lot of code from the driver.  And then
adding SG support to the controller drivers one by one would be fine.

However, if that isn't feasible then you have to keep legacy I/O in 
mt76u as long as any controller drivers don't support SG.

Alan Stern
Stefan Wahren - Feb. 13, 2019, 7:05 a.m.
Hi,

> Lorenzo Bianconi <lorenzo.bianconi@redhat.com> hat am 11. Februar 2019 um 16:57 geschrieben:
> 
> 
> > 
> > Compiling the mainline kernel with arm/multi_v7_defconfig it works.
> > Using the same kernel but with arm64/defconfig doesn't work. But i don't
> > think this is a 32/64 bit issue. The arm64 defconfig is much more
> > complex (e.g. enables more IOMMU stuff).
> 
> thx for the clarification :)
> 

i was busy to enable the mt76xx wifi driver in my build system. Usually i handle this via scripts/config but i didn't manage to enable the driver. Finally i gave up and enabled it via defconfig patch.

Here are my test results for Raspberry Pi 3 B+ (without any additional patches):

multi_v7_defconfig 5.0.0-rc6 -> firmware timeout
multi_v7_defconfig next-2019-02-12 -> firmware timeout
arm64_defconfig 5.0.0-rc6 -> vendor request 07: -110 (timeout)
arm64_defconfig next-2019-02-12 -> vendor request 07: -110 (timeout)

Shame on me for the wrong result before (assume to not properly cleanup the kernel modules on sd card). So there is no config option in arm64_defconfig which breaks mt76xx.

I will try the recent patches later.

Thanks
Stefan

Patch

diff --git a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
index da299b8a1334..cfa14506eca6 100644
--- a/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt76x02_usb_mcu.c
@@ -287,6 +287,7 @@  __mt76x02u_mcu_fw_send_data(struct mt76x02_dev *dev, struct mt76u_buf *buf,
 			MT_FCE_DMA_LEN, len << 16);
 
 	buf->len = MT_CMD_HDR_LEN + len + sizeof(info);
+	buf->urb->sg[0].length = buf->len;
 	err = mt76u_submit_buf(&dev->mt76, USB_DIR_OUT,
 			       MT_EP_OUT_INBAND_CMD,
 			       buf, GFP_KERNEL,