Patchwork [2/3] udc: net2280: Fix overrun of OUT messages

login
register
mail settings
Submitter Guido Kiener
Date Feb. 4, 2019, 6:04 p.m.
Message ID <20190204180422.5095-2-guido@kiener-muenchen.de>
Download mbox | patch
Permalink /patch/717585/
State New
Headers show

Comments

Guido Kiener - Feb. 4, 2019, 6:04 p.m.
From: Guido Kiener <guido.kiener@rohde-schwarz.com>

The OUT endpoint normally blocks (NAK) subsequent packets when a
short packet is received and returns an incomplete queue entry to
the gadget driver. Thereby the gadget driver can detect
a short packet when reading queue entries with a length that is
not equal to a multiple of packet size.

The start_queue() function enables receiving OUT packets regardless
of the content of the OUT FIFO. This results in a problem:
When receiving is enabled more OUT packets are appended to the last
short packet and the gadget driver cannot determine the end of a
short packet anymore. Furthermore the remaining data in the OUT FIFO
is not delivered to the gadget driver immediately and can produce
timeout errors.

This fix only stops OUT naking when all FIFO data is delivered
to the gadget driver and the OUT FIFO is empty.

Signed-off-by: Guido Kiener <guido.kiener@rohde-schwarz.com>
---
 drivers/usb/gadget/udc/net2280.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Alan Stern - Feb. 4, 2019, 6:43 p.m.
On Mon, 4 Feb 2019, Guido Kiener wrote:

> From: Guido Kiener <guido.kiener@rohde-schwarz.com>
> 
> The OUT endpoint normally blocks (NAK) subsequent packets when a
> short packet is received and returns an incomplete queue entry to
> the gadget driver. Thereby the gadget driver can detect
> a short packet when reading queue entries with a length that is
> not equal to a multiple of packet size.
> 
> The start_queue() function enables receiving OUT packets regardless
> of the content of the OUT FIFO. This results in a problem:
> When receiving is enabled more OUT packets are appended to the last
> short packet and the gadget driver cannot determine the end of a
> short packet anymore. Furthermore the remaining data in the OUT FIFO
> is not delivered to the gadget driver immediately and can produce
> timeout errors.

How can that happen?  When the short packet is received, the endpoint 
immediately starts sending NAKs, so no more data can enter the FIFO.  
Furthermore, the incomplete transfer is returned to the gadget driver, 
which removes the short packet from the FIFO.  So the FIFO has to be 
empty when start_queue() is called.

Alan Stern

> This fix only stops OUT naking when all FIFO data is delivered
> to the gadget driver and the OUT FIFO is empty.
> 
> Signed-off-by: Guido Kiener <guido.kiener@rohde-schwarz.com>
> ---
>  drivers/usb/gadget/udc/net2280.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c
> index 7154f00dea40..1cb58fd5d1c6 100644
> --- a/drivers/usb/gadget/udc/net2280.c
> +++ b/drivers/usb/gadget/udc/net2280.c
> @@ -867,7 +867,7 @@ static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
>  
>  	writel(BIT(DMA_START), &dma->dmastat);
>  
> -	if (!ep->is_in)
> +	if (!ep->is_in && readl(&ep->regs->ep_avail) == 0)
>  		stop_out_naking(ep);
>  }
Guido Kiener - Feb. 5, 2019, 3:22 p.m.
Zitat von Alan Stern <stern@rowland.harvard.edu>:

> On Mon, 4 Feb 2019, Guido Kiener wrote:
>
>> From: Guido Kiener <guido.kiener@rohde-schwarz.com>
>>
>> The OUT endpoint normally blocks (NAK) subsequent packets when a
>> short packet is received and returns an incomplete queue entry to
>> the gadget driver. Thereby the gadget driver can detect
>> a short packet when reading queue entries with a length that is
>> not equal to a multiple of packet size.
>>
>> The start_queue() function enables receiving OUT packets regardless
>> of the content of the OUT FIFO. This results in a problem:
>> When receiving is enabled more OUT packets are appended to the last
>> short packet and the gadget driver cannot determine the end of a
>> short packet anymore. Furthermore the remaining data in the OUT FIFO
>> is not delivered to the gadget driver immediately and can produce
>> timeout errors.
>
> How can that happen?  When the short packet is received, the endpoint
> immediately starts sending NAKs, so no more data can enter the FIFO.
> Furthermore, the incomplete transfer is returned to the gadget driver,
> which removes the short packet from the FIFO.  So the FIFO has to be
> empty when start_queue() is called.

I don't remember the exact scenario. It happens something like this:
When you have only one request (e.g. 4kB) left in your endpoint OUT
queue and a host sends 4kB + some extra bytes, then your request (4kB)
is delivered to the gadget driver while the extra bytes (short packet)
are still in the OUT FIFO.
I try to reconstruct the situation (without fix) how the driver can call
(re)start_dma. I do not see a reasonable path in the moment.

Guido

>> This fix only stops OUT naking when all FIFO data is delivered
>> to the gadget driver and the OUT FIFO is empty.
Alan Stern - Feb. 5, 2019, 3:38 p.m.
On Tue, 5 Feb 2019 guido@kiener-muenchen.de wrote:

> 
> Zitat von Alan Stern <stern@rowland.harvard.edu>:
> 
> > On Mon, 4 Feb 2019, Guido Kiener wrote:
> >
> >> From: Guido Kiener <guido.kiener@rohde-schwarz.com>
> >>
> >> The OUT endpoint normally blocks (NAK) subsequent packets when a
> >> short packet is received and returns an incomplete queue entry to
> >> the gadget driver. Thereby the gadget driver can detect
> >> a short packet when reading queue entries with a length that is
> >> not equal to a multiple of packet size.
> >>
> >> The start_queue() function enables receiving OUT packets regardless
> >> of the content of the OUT FIFO. This results in a problem:
> >> When receiving is enabled more OUT packets are appended to the last
> >> short packet and the gadget driver cannot determine the end of a
> >> short packet anymore. Furthermore the remaining data in the OUT FIFO
> >> is not delivered to the gadget driver immediately and can produce
> >> timeout errors.
> >
> > How can that happen?  When the short packet is received, the endpoint
> > immediately starts sending NAKs, so no more data can enter the FIFO.
> > Furthermore, the incomplete transfer is returned to the gadget driver,
> > which removes the short packet from the FIFO.  So the FIFO has to be
> > empty when start_queue() is called.
> 
> I don't remember the exact scenario. It happens something like this:
> When you have only one request (e.g. 4kB) left in your endpoint OUT
> queue and a host sends 4kB + some extra bytes, then your request (4kB)
> is delivered to the gadget driver while the extra bytes (short packet)
> are still in the OUT FIFO.
> I try to reconstruct the situation (without fix) how the driver can call
> (re)start_dma. I do not see a reasonable path in the moment.

Okay, I see.  If there's no outstanding request when the short packet 
arrives, the data will remain in the FIFO.

But won't this situation be detected in start_dma(), which is the only
place where start_queue() is called?  start_dma() has code to test
specifically for a previous short OUT packet.  Should the fix go there
instead?

Alan Stern
Guido Kiener - Feb. 5, 2019, 4:35 p.m.
Zitat von Alan Stern <stern@rowland.harvard.edu>:

> On Tue, 5 Feb 2019 guido@kiener-muenchen.de wrote:
>
>>
>> Zitat von Alan Stern <stern@rowland.harvard.edu>:
>>
>> > On Mon, 4 Feb 2019, Guido Kiener wrote:
>> >
>> >> From: Guido Kiener <guido.kiener@rohde-schwarz.com>
>> >>
>> >> The OUT endpoint normally blocks (NAK) subsequent packets when a
>> >> short packet is received and returns an incomplete queue entry to
>> >> the gadget driver. Thereby the gadget driver can detect
>> >> a short packet when reading queue entries with a length that is
>> >> not equal to a multiple of packet size.
>> >>
>> >> The start_queue() function enables receiving OUT packets regardless
>> >> of the content of the OUT FIFO. This results in a problem:
>> >> When receiving is enabled more OUT packets are appended to the last
>> >> short packet and the gadget driver cannot determine the end of a
>> >> short packet anymore. Furthermore the remaining data in the OUT FIFO
>> >> is not delivered to the gadget driver immediately and can produce
>> >> timeout errors.
>> >
>> > How can that happen?  When the short packet is received, the endpoint
>> > immediately starts sending NAKs, so no more data can enter the FIFO.
>> > Furthermore, the incomplete transfer is returned to the gadget driver,
>> > which removes the short packet from the FIFO.  So the FIFO has to be
>> > empty when start_queue() is called.
>>
>> I don't remember the exact scenario. It happens something like this:
>> When you have only one request (e.g. 4kB) left in your endpoint OUT
>> queue and a host sends 4kB + some extra bytes, then your request (4kB)
>> is delivered to the gadget driver while the extra bytes (short packet)
>> are still in the OUT FIFO.
>> I try to reconstruct the situation (without fix) how the driver can call
>> (re)start_dma. I do not see a reasonable path in the moment.
>
> Okay, I see.  If there's no outstanding request when the short packet
> arrives, the data will remain in the FIFO.
>
> But won't this situation be detected in start_dma(), which is the only
> place where start_queue() is called?  start_dma() has code to test
> specifically for a previous short OUT packet.  Should the fix go there
> instead?

You are right. Unfortunately I did not save a callstack. I guess it was
more a restart_dma(). We do a lot of unusual things like stalling endpoints,
clear_halt, dequeue requests and so on.
The biggest problem is to stop/abort the DMA (I have some more comlexer
patches in the queue). I just remember that it took me a long time to find
this bug and I try to reproduce it this week.

Guido
Guido Kiener - Feb. 6, 2019, 1:56 p.m.
Zitat von Alan Stern <stern@rowland.harvard.edu>:

> On Tue, 5 Feb 2019 guido@kiener-muenchen.de wrote:
>
>>
>> Zitat von Alan Stern <stern@rowland.harvard.edu>:
>>
>> > On Mon, 4 Feb 2019, Guido Kiener wrote:
>> >
>> >> From: Guido Kiener <guido.kiener@rohde-schwarz.com>
>> >>
>> >> The OUT endpoint normally blocks (NAK) subsequent packets when a
>> >> short packet is received and returns an incomplete queue entry to
>> >> the gadget driver. Thereby the gadget driver can detect
>> >> a short packet when reading queue entries with a length that is
>> >> not equal to a multiple of packet size.
>> >>
>> >> The start_queue() function enables receiving OUT packets regardless
>> >> of the content of the OUT FIFO. This results in a problem:
>> >> When receiving is enabled more OUT packets are appended to the last
>> >> short packet and the gadget driver cannot determine the end of a
>> >> short packet anymore. Furthermore the remaining data in the OUT FIFO
>> >> is not delivered to the gadget driver immediately and can produce
>> >> timeout errors.
>> >
>> > How can that happen?  When the short packet is received, the endpoint
>> > immediately starts sending NAKs, so no more data can enter the FIFO.
>> > Furthermore, the incomplete transfer is returned to the gadget driver,
>> > which removes the short packet from the FIFO.  So the FIFO has to be
>> > empty when start_queue() is called.
>>
>> I don't remember the exact scenario. It happens something like this:
>> When you have only one request (e.g. 4kB) left in your endpoint OUT
>> queue and a host sends 4kB + some extra bytes, then your request (4kB)
>> is delivered to the gadget driver while the extra bytes (short packet)
>> are still in the OUT FIFO.
>> I try to reconstruct the situation (without fix) how the driver can call
>> (re)start_dma. I do not see a reasonable path in the moment.
>
> Okay, I see.  If there's no outstanding request when the short packet
> arrives, the data will remain in the FIFO.
>
> But won't this situation be detected in start_dma(), which is the only
> place where start_queue() is called?  start_dma() has code to test
> specifically for a previous short OUT packet.  Should the fix go there
> instead?
>
Alan,

I have inserted a bug detection and can now explain the race:

Ween receiving a long transfer the gadget driver queues a new request with
net2280_queue() and calls start_dma(). In the meantime the OUT FIFO is still
receiving data and the short packet is not detected yet. So the start_dma()
function does not use the shortcut in line:

https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L905

Thus start_queue() is called at the end of the function which calls
stop_out_naking(). This will allow receiving new data, even when a short
packet is arrived in the meantime.
Indeed, I'm not sure now whether my patch only minimizes the race or
whether we need to rethink the logic here and call stop_out_naking() when
the request is really done and delivered to the gadget driver.
See function done().

Here is my code sample how I can detect the race (USB 2.0 with max packet
size of 512):

static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
{
	struct net2280_dma_regs	__iomem *dma = ep->dma;
	unsigned int tmp = BIT(VALID_BIT) | (ep->is_in << DMA_DIRECTION);

	if (!(ep->dev->quirks & PLX_2280))
		tmp |= BIT(END_OF_CHAIN);
	ep_vdbg(ep->dev, "start_queue\n");

	writel(tmp, &dma->dmacount);
	writel(readl(&dma->dmastat), &dma->dmastat);

	writel(td_dma, &dma->dmadesc);
	if (ep->dev->quirks & PLX_PCIE)
		dmactl |= BIT(DMA_REQUEST_OUTSTANDING);
	writel(dmactl, &dma->dmactl);

	/* erratum 0116 workaround part 3:  pci arbiter away from net2280 */
	(void) readl(&ep->dev->pci->pcimstctl);

	// Show bug begin
	unsigned int avail = readl(&ep->regs->ep_avail);
	if (!ep->is_in && (avail % 512) != 0) {
		u32 t1, t2;
		t1 = readl(&ep->cfg->ep_cfg);
		t2 = readl(&ep->regs->ep_rsp) & 0xff;

		ep_err(ep->dev, " stat %08x avail %04x (ep%d%s-%s)%s\n",
			readl(&ep->regs->ep_stat),
			readl(&ep->regs->ep_avail),
			t1 & 0x0f, DIR_STRING(t1),
			type_string(t1 >> 8),
			ep->stopped ? "*" : "");
		WARN_ON(1);
	}
	// Show bug end

	writel(BIT(DMA_START), &dma->dmastat);

	//if (!ep->is_in && readl(&ep->regs->ep_avail) == 0) //<- patch
	if (!ep->is_in)
		stop_out_naking(ep);
}

This is the corresponding callstack.
Flag "NAK packets" (BIT4) is set in EP_STAT and there are now 944 (0x3b0)
bytes in the OUT FIFO.

net2280 0000:05:05.0:  stat 0001103a avail 03b0 (ep1out-bulk)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 32093 at net2280.c:915 start_dma+0x1b2/0x2c0 [net2280]()
Call Trace:
  dump_stack+0x63/0x90
  warn_slowpath_common+0x82/0xc0
  warn_slowpath_null+0x1a/0x20
  start_dma+0x1b2/0x2c0 [net2280]
  net2280_queue+0x247/0x390 [net2280]
  _enqueueBulkoutBuf+0xa8/0x1f0 [rsusbtmc]
  dev_read+0x33e/0x490 [rsusbtmc]
  ? security_file_permission+0xa0/0xc0
  __vfs_read+0x18/0x40
  vfs_read+0x86/0x130
  SyS_read+0x55/0xc0
  ? SyS_ioctl+0x63/0x90

What do you think?

Regards,

Guido
Alan Stern - Feb. 6, 2019, 5:44 p.m.
On Wed, 6 Feb 2019 guido@kiener-muenchen.de wrote:

> Alan,
> 
> I have inserted a bug detection and can now explain the race:
> 
> Ween receiving a long transfer the gadget driver queues a new request with
> net2280_queue() and calls start_dma(). In the meantime the OUT FIFO is still
> receiving data and the short packet is not detected yet. So the start_dma()
> function does not use the shortcut in line:
> 
> https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L905
> 
> Thus start_queue() is called at the end of the function which calls
> stop_out_naking(). This will allow receiving new data, even when a short
> packet is arrived in the meantime.

I'm confused.  A transfer is in progress, let's say for request R1.  
Before R1 finishes, the gadget driver queues a new request, R2.  Since
R1 is still running, ep->queue is not empty, so net2280_queue() will
not call start_dma() -- it will call queue_dma() instead.  But this is 
not what your stack dump shows.

Maybe you meant that R1 completes and then afterward the gadget driver
queues R2.  Then there is a race: start_dma() sees that no short packet
has arrived, but a short packet could arrive before stop_out_naking()  
is called.

> Indeed, I'm not sure now whether my patch only minimizes the race or
> whether we need to rethink the logic here and call stop_out_naking() when
> the request is really done and delivered to the gadget driver.
> See function done().

The right time to call stop_out_naking is immediately after we complete 
a short transfer (that is, a transfer which ended with a short packet).  
start_queue() is definitely the wrong place.

Right at the end of done() is probably the right place.  But do it only
if the last packet of the transfer was a short one.  And note that this
does not necessarily mean the transfer got an error: If the request was
for only 15 bytes then it could complete successfully but the (only)  
packet would still be short.

Alan Stern
Guido Kiener - Feb. 8, 2019, 5:26 p.m.
Zitat von Alan Stern <stern@rowland.harvard.edu>:

> On Wed, 6 Feb 2019 guido@kiener-muenchen.de wrote:
>
>> Alan,
>>
>> I have inserted a bug detection and can now explain the race:
>>
>> Ween receiving a long transfer the gadget driver queues a new request with
>> net2280_queue() and calls start_dma(). In the meantime the OUT FIFO is still
>> receiving data and the short packet is not detected yet. So the start_dma()
>> function does not use the shortcut in line:
>>
>> https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L905
>>
>> Thus start_queue() is called at the end of the function which calls
>> stop_out_naking(). This will allow receiving new data, even when a short
>> packet is arrived in the meantime.
>
> I'm confused.  A transfer is in progress, let's say for request R1.
> Before R1 finishes, the gadget driver queues a new request, R2.  Since
> R1 is still running, ep->queue is not empty, so net2280_queue() will
> not call start_dma() -- it will call queue_dma() instead.  But this is
> not what your stack dump shows.
>
> Maybe you meant that R1 completes and then afterward the gadget driver
> queues R2.  Then there is a race: start_dma() sees that no short packet
> has arrived, but a short packet could arrive before stop_out_naking()
> is called.

Yes, I mean the second case. Currently we put only one request into the
OUT queue. Using 2-3 requests would even improve bandwidth for long transfers,
however it does not help to avoid the bug when the application input is
blocked.

>> Indeed, I'm not sure now whether my patch only minimizes the race or
>> whether we need to rethink the logic here and call stop_out_naking() when
>> the request is really done and delivered to the gadget driver.
>> See function done().
>
> The right time to call stop_out_naking is immediately after we complete
> a short transfer (that is, a transfer which ended with a short packet).
> start_queue() is definitely the wrong place.
>
> Right at the end of done() is probably the right place.  But do it only
> if the last packet of the transfer was a short one.  And note that this
> does not necessarily mean the transfer got an error: If the request was
> for only 15 bytes then it could complete successfully but the (only)
> packet would still be short.

I tried to call stop_out_naking() at the end of done(). However I have
overseen that the author of the driver only wants to receive data when there
is an OUT request available. E.g. see:
https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L304

So I think my patch was not so "bad" and I assume we can prevent the race
with the following patch (or easier see below):

static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
{
	struct net2280_dma_regs	__iomem *dma = ep->dma;
	unsigned int tmp = BIT(VALID_BIT) | (ep->is_in << DMA_DIRECTION);

	if (!(ep->dev->quirks & PLX_2280))
		tmp |= BIT(END_OF_CHAIN);

	writel(tmp, &dma->dmacount);
	writel(readl(&dma->dmastat), &dma->dmastat);

	writel(td_dma, &dma->dmadesc);
	if (ep->dev->quirks & PLX_PCIE)
		dmactl |= BIT(DMA_REQUEST_OUTSTANDING);
	writel(dmactl, &dma->dmactl);

	/* erratum 0116 workaround part 3:  pci arbiter away from net2280 */
	(void) readl(&ep->dev->pci->pcimstctl);

	if (!ep->is_in &&
			(readl(&ep->regs->ep_stat) & BIT(NAK_OUT_PACKETS)) &&
			(readl(&ep->regs->ep_avail) == 0)) {
		stop_out_naking(ep);
	}
	writel(BIT(DMA_START), &dma->dmastat);
}

start_queue() is only called by start_dma(). I see that we can  
simplify this patch
when we just insert the stop_out_nacking(ep) between line 909 and 910:

https://elixir.bootlin.com/linux/v5.0-rc5/source/drivers/usb/gadget/udc/net2280.c#L909

and remove the stop_out_naking() call in start_queue(). I will test  
this over weekend.

Regards,

Guido

Patch

diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c
index 7154f00dea40..1cb58fd5d1c6 100644
--- a/drivers/usb/gadget/udc/net2280.c
+++ b/drivers/usb/gadget/udc/net2280.c
@@ -867,7 +867,7 @@  static void start_queue(struct net2280_ep *ep, u32 dmactl, u32 td_dma)
 
 	writel(BIT(DMA_START), &dma->dmastat);
 
-	if (!ep->is_in)
+	if (!ep->is_in && readl(&ep->regs->ep_avail) == 0)
 		stop_out_naking(ep);
 }