Patchwork dmaengine: ti: omap-dma: Configure LCH_TYPE for OMAP1

login
register
mail settings
Submitter Peter Ujfalusi
Date Nov. 23, 2018, 11:45 a.m.
Message ID <cb1355b5-0d2e-1f83-5ea1-7615350ca0ec@ti.com>
Download mbox | patch
Permalink /patch/663579/
State New
Headers show

Comments

Peter Ujfalusi - Nov. 23, 2018, 11:45 a.m.
On 23/11/2018 0.01, Aaro Koskinen wrote:
> Hi,
> 
> On Thu, Nov 22, 2018 at 10:31:31AM +0200, Peter Ujfalusi wrote:
>> On 20/11/2018 23.04, Aaro Koskinen wrote:
>>> On Tue, Nov 20, 2018 at 09:28:37AM +0200, Peter Ujfalusi wrote:
>>>> On 19/11/2018 20.46, Aaro Koskinen wrote:
>>>>> On Mon, Nov 19, 2018 at 12:40:40PM +0200, Peter Ujfalusi wrote:
>>>>>> When the channel is configured for slave operation the LCH_TYPE needs to be
>>>>>> set to LCh-P. For memcpy channels the LCH_TYPE must be set to LCh-2D.
>>>>>>
>>>>>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>>>>>
>>>>> I don't have the documentation, but based on what omap_udc driver (still
>>>>> using the legacy OMAP DMA API) does this seems to be correct.
>>>>
>>>> They are hard to fine, true. From the omap1710 TRM:
>>>>
>>>> Logical channel types (LCh types) supported are:
>>>> - LCh-2D for nonsynchronized transfers (memory transfers, 1D and 2D)
>>>> - LCh-P for synchronized transfers (mostly peripheral transfers)
>>>> - LCh-PD similar to LCh-P but runs on a dedicated physical channel
>>>> - LCh-G for graphical transfers/operations
>>>> - LCh-D for display transfers
>>>
>>> (I found a public document "OMAP5912 Multimedia Processor Direct
>>> Memory Access (DMA) Support Reference Guide", documenting these; easy
>>> to confuse with "OMAP5910 Dual-Core Processor System DMA Controller
>>> Reference Guide".)
>>>
>>>> Looking at other part it looks like hat LCH-2D channel mode can happily
>>>> service a peripheral. LCH-P supports the same features as LCH-2D, but it
>>>> lacks support for Single/Double-indexed addressing mode on the
>>>> peripheral port side.
>>>>
>>>> So, this patch might not be needed at all. Can you test the omap_udc
>>>> with s/OMAP_DMA_LCH_P/OMAP_DMA_LCH_2D
>>>>
>>>> If USB works, then we can just drop this patch.
>>>
>>> Unfortunately omap_udc does not seem to work at all anymore with DMA on
>>> my 770 setup. :-(
>>>
>>> I had switched to PIO mode in 2015 since the WARNs about legacy DMA
>>> API were too annoying and flooding the console. And now that I tried
>>> using DMA again with g_ether, it doesn't work anymore. The device get's
>>> recognized on host side, but no traffic goes through. Switching back to
>>> PIO makes it to work again.
>>
>> After some tinkering I got omap_udc working with DMA (not DMAengine,
>> yet) on 770 - using nfsroot. Configuring the channels to OMAP_DMA_LCH_2D
>> works as expected.
> 
> Would be interesting to know how you got it working with DMA. Which
> gadget driver were you using?
> 
> I bisected my issue, and got:
> 
> commit 387f869d2579e379ee343f5493dcd360be60f5c6 (refs/bisect/bad)
> Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> Date:   Wed Mar 22 13:25:18 2017 +0200
> 
>     usb: gadget: u_ether: conditionally align transfer size

I just:
commit 0d61d79625202c1c4fcf07fb960e27984a3657a3
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Thu Nov 22 10:36:55 2018 +0200

    usb: gadget: omap_udc: HACK: Make RX dma work
    
    partial packets do work...
    
    Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>


> With that reverted, the DMA works OK (and I can also now confirm that
> OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> quirk in OMAP UDC,

The omap_udc driver is a bit of a mess, need to check it myself, but for
now we can just set the quirk_ep_out_aligned_size and investigate later.

> or if this is related to RMK's findings of potential
> bugs in the driver. Anyway, there is clearly yet another regression.

I'll check Russell's mail.

> 
> A.
> 

- P├ęter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Aaro Koskinen - Nov. 24, 2018, 12:17 a.m.
Hi,

On Fri, Nov 23, 2018 at 01:45:46PM +0200, Peter Ujfalusi wrote:
> On 23/11/2018 0.01, Aaro Koskinen wrote:
> > With that reverted, the DMA works OK (and I can also now confirm that
> > OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> > quirk in OMAP UDC,
> 
> The omap_udc driver is a bit of a mess, need to check it myself, but for
> now we can just set the quirk_ep_out_aligned_size and investigate later.

OK, with quirk_ep_out_aligned_size we get 770/16xx DMA working again,
but on 15xx the omap_udc DMA still doesn't work (tested today for the
first time ever, I have no idea if it has ever worked and if so, when?).

A.
Russell King - ARM Linux - Nov. 24, 2018, 5:48 p.m.
On Sat, Nov 24, 2018 at 02:17:40AM +0200, Aaro Koskinen wrote:
> Hi,
> 
> On Fri, Nov 23, 2018 at 01:45:46PM +0200, Peter Ujfalusi wrote:
> > On 23/11/2018 0.01, Aaro Koskinen wrote:
> > > With that reverted, the DMA works OK (and I can also now confirm that
> > > OMAP_DMA_LCH_2D works). I haven't yet checked if we actually need that
> > > quirk in OMAP UDC,
> > 
> > The omap_udc driver is a bit of a mess, need to check it myself, but for
> > now we can just set the quirk_ep_out_aligned_size and investigate later.
> 
> OK, with quirk_ep_out_aligned_size we get 770/16xx DMA working again,
> but on 15xx the omap_udc DMA still doesn't work (tested today for the
> first time ever, I have no idea if it has ever worked and if so, when?).

Hmm, there's more questionable stuff in this driver, and the gadget
layer.

Fundamental fact of struct device - it's a ref-counted structure and
will only be freed when the last reference is dropped.  dev_unregister()
merely drops the refcount, it doesn't guarantee that it's dropped to
zero (iow, there can be other users).  Only when the refcount drops
to zero is the dev.release function called.  However:

usb_add_gadget_udc_release(..., release)
{
        if (release)
                gadget->dev.release = release;
        else
                gadget->dev.release = usb_udc_nop_release;
        device_initialize(&gadget->dev);
        ret = device_add(&gadget->dev);
}

At this point, that struct device is registered, so its refcount can
be increased by other users.

void usb_del_gadget_udc(struct usb_gadget *gadget)
{
...
        device_unregister(&gadget->dev);
        memset(&gadget->dev, 0x00, sizeof(gadget->dev));
}

That memset() is down-right wrong - the refcount on this struct device
may _not_ be zero at this point, the struct device could well be in
use by another thread.  That memset will trample over the contents of
the structure potentially while someone else is using it, and
_potentially_ before the gadget->dev.release function has been called.

However, that _may_ be a good thing when you read the omap_udc code:

        status = usb_add_gadget_udc_release(&pdev->dev, &udc->gadget,
                        omap_udc_release);

During the omap_udc_remove() function:
{
...
        usb_del_gadget_udc(&udc->gadget);
        if (udc->driver)
                return -EBUSY;

        udc->done = &done;

... more dereferences of udc, which is a _global_ variable ...

        wait_for_completion(&done);
}

Now, omap_udc_release() does this:

        complete(udc->done);
        kfree(udc);
        udc = NULL;

So, when usb_del_gadget_udc() is called, if device_unregister() within
there drops the last reference count, omap_udc_release() will be called
immediately.  Since udc->done hasn't been setup at that point, that
complete() will fail with a NULL pointer dereference.  If that doesn't
happen, then the kfree() and following set of the global 'udc' variable
to NULL means that all future references to 'udc' after the call to
usb_del_gadget_udc() in omap_udc_remove() will be dereferencing a NULL
pointer.  So one way or the other, this leads to a kernel OOPS.

If, on the other hand, omap_udc_release() was not called in
device_unregister(), the function pointer will be zeroed by the
memset(), which will lead to 'udc' never being freed - in other words,
we leak memory.

What's more is that 'done' is never "completed" so we end up hanging
at the wait_for_completion().

Then there's the pointless:

        if (udc->driver)
                return -EBUSY;

in omap_udc_remove().  The effect of returning an error is... what
exactly?  It doesn't prevent the device being removed at all, it
doesn't delay it, in fact the whole "remove returns an int" is
nothing but confusion - the return value from all driver remove
methods is completely ignored.

If udc->driver is still set at this point, it basically means that
we skip the rest of the tear down, but the platform device will
still be unbound from the driver, leaving (eg) the transceiver phy
still claimed, the procfs file still registered, the interrupts
still claimed, the memory region still registered, etc.  If omap_udc
is built as a module, the module could even be removed while all
that is still registered.

So, whatever way I look at this, the code in the removal path both
in omap_udc and the gadget removal code higher up looks very wrong
and broken to me.
Aaro Koskinen - Nov. 24, 2018, 7:06 p.m.
Hello,

On Sat, Nov 24, 2018 at 05:48:23PM +0000, Russell King - ARM Linux wrote:
> Hmm, there's more questionable stuff in this driver, and the gadget
> layer.

[...]

> So, whatever way I look at this, the code in the removal path both
> in omap_udc and the gadget removal code higher up looks very wrong
> and broken to me.

Yes, week ago I saw omap_udc crashing on both probe failure and
module removal and sent some fixes for the most obvious failures (see
https://marc.info/?l=linux-usb&m=154258778316932&w=2).

Is there any good driver that uses usb_add_gadget_udc_release() correctly?
Looking at fsl_qe_udc.c and fsl_udc_core.c they should also crash if
usb_add_gadget_udc_release() fails.

A.
Russell King - ARM Linux - Nov. 24, 2018, 7:29 p.m.
On Sat, Nov 24, 2018 at 09:06:48PM +0200, Aaro Koskinen wrote:
> Hello,
> 
> On Sat, Nov 24, 2018 at 05:48:23PM +0000, Russell King - ARM Linux wrote:
> > Hmm, there's more questionable stuff in this driver, and the gadget
> > layer.
> 
> [...]
> 
> > So, whatever way I look at this, the code in the removal path both
> > in omap_udc and the gadget removal code higher up looks very wrong
> > and broken to me.
> 
> Yes, week ago I saw omap_udc crashing on both probe failure and
> module removal and sent some fixes for the most obvious failures (see
> https://marc.info/?l=linux-usb&m=154258778316932&w=2).

The effect of your patch is basically to replace the release function
with a no-op function.

> Is there any good driver that uses usb_add_gadget_udc_release() correctly?
> Looking at fsl_qe_udc.c and fsl_udc_core.c they should also crash if
> usb_add_gadget_udc_release() fails.

usb_add_gadget_udc_release() itself will call the release function
automatically on error.

The release function should _also_ be called when usb_del_gadget_udc()
is called (and would be guaranteed if the memset() is removed.)

So, moving the cleanup in the remove path into the release function
would solve the problem with omap_udc, and removing the memset()
would solve the problem with the core code.

It does leave a problem if the omap_udc module is removed - the
release function _could_ be called after the module has been removed
which would lead to an oops.  That's presumably why there's a
completion.  One solution to that would be to move the assignment
of udc->done before the call to usb_del_gadget_udc().

However, using a completion for something like this tends to be
frowned upon, but I don't see any other way to ensure correctness
here.

Patch

diff --git a/drivers/usb/gadget/udc/omap_udc.c b/drivers/usb/gadget/udc/omap_udc.c
index 94128eb69d97..0748c3841a25 100644
--- a/drivers/usb/gadget/udc/omap_udc.c
+++ b/drivers/usb/gadget/udc/omap_udc.c
@@ -886,14 +886,14 @@  omap_ep_queue(struct usb_ep *_ep, struct usb_request *_req, gfp_t gfp_flags)
 
        /* this isn't bogus, but OMAP DMA isn't the only hardware to
         * have a hard time with partial packet reads...  reject it.
+        * Wait a minute, it does work :o
         */
        if (use_dma
                        && ep->has_dma
                        && ep->bEndpointAddress != 0
                        && (ep->bEndpointAddress & USB_DIR_IN) == 0
                        && (req->req.length % ep->ep.maxpacket) != 0) {
-               DBG("%s, no partial packet OUT reads\n", __func__);
-               return -EMSGSIZE;
+               DBG("%s, partial packet OUT, might not work?\n", __func__);
        }
 
        udc = ep->udc;