Patchwork xen-netfront: remove warning when unloading module

Submitter Eduardo Otubo
Date Nov. 20, 2017, 10:41 a.m.
Message ID <20171120104109.11585-1-otubo@redhat.com>
Permalink /patch/386685/
State New

Comments

Eduardo Otubo - Nov. 20, 2017, 10:41 a.m.
When unloading module xen_netfront from guest, dmesg would output
warning messages like below:

  [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
  [  105.236839] deferring g.e. 0x903 (pfn 0x35805)

This problem stems from netfront and netback being out of sync. By the time
netfront revokes the g.e.'s, netback has not had enough time to free all of
them, hence the warnings in dmesg.

The trick here is to make netfront wait until netback frees all the g.e.'s
and only then continue the cleanup for module removal. This is done by
manipulating both device states.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
---
 drivers/net/xen-netfront.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
Wei Liu - Nov. 20, 2017, 10:49 a.m.
CC netfront maintainers.

On Mon, Nov 20, 2017 at 11:41:09AM +0100, Eduardo Otubo wrote:
> When unloading module xen_netfront from guest, dmesg would output
> warning messages like below:
> 
>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> 
> This problem relies on netfront and netback being out of sync. By the time
> netfront revokes the g.e.'s netback didn't have enough time to free all of
> them, hence displaying the warnings on dmesg.
> 
> The trick here is to make netfront to wait until netback frees all the g.e.'s
> and only then continue to cleanup for the module removal, and this is done by
> manipulating both device states.
> 
> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
> ---
>  drivers/net/xen-netfront.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8b8689c6d887..b948e2a1ce40 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device *dev)
>  
>  	dev_dbg(&dev->dev, "%s\n", dev->nodename);
>  
> +	xenbus_switch_state(dev, XenbusStateClosing);
> +	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
> +		cpu_relax();
> +		schedule();
> +	}
> +	xenbus_switch_state(dev, XenbusStateClosed);
> +	while (dev->xenbus_state != XenbusStateClosed){
> +		cpu_relax();
> +		schedule();
> +	}
> +
>  	xennet_disconnect_backend(info);
>  
>  	unregister_netdev(info->netdev);
> -- 
> 2.13.6
>
Paul Durrant - Nov. 20, 2017, 10:55 a.m.
> -----Original Message-----
> From: Eduardo Otubo [mailto:otubo@redhat.com]
> Sent: 20 November 2017 10:41
> To: xen-devel@lists.xenproject.org
> Cc: netdev@vger.kernel.org; Paul Durrant <Paul.Durrant@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; linux-kernel@vger.kernel.org;
> vkuznets@redhat.com; cavery@redhat.com; cheshi@redhat.com;
> mgamal@redhat.com; Eduardo Otubo <otubo@redhat.com>
> Subject: [PATCH] xen-netfront: remove warning when unloading module
> 
> When unloading module xen_netfront from guest, dmesg would output
> warning messages like below:
> 
>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> 
> This problem relies on netfront and netback being out of sync. By the time
> netfront revokes the g.e.'s netback didn't have enough time to free all of
> them, hence displaying the warnings on dmesg.
> 
> The trick here is to make netfront to wait until netback frees all the g.e.'s
> and only then continue to cleanup for the module removal, and this is done by
> manipulating both device states.
> 
> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
> ---
>  drivers/net/xen-netfront.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8b8689c6d887..b948e2a1ce40 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device *dev)
> 
>  	dev_dbg(&dev->dev, "%s\n", dev->nodename);
> 
> +	xenbus_switch_state(dev, XenbusStateClosing);
> +	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
> +		cpu_relax();
> +		schedule();
> +	}
> +	xenbus_switch_state(dev, XenbusStateClosed);
> +	while (dev->xenbus_state != XenbusStateClosed){
> +		cpu_relax();
> +		schedule();
> +	}
> +

Waiting for closing should be ok but waiting for closed is risky. As soon as a backend is in the closed state then a toolstack can completely remove the backend xenstore area, resulting in a state of XenbusStateUnknown, which would cause your second loop to spin forever.

  Paul

>  	xennet_disconnect_backend(info);
> 
>  	unregister_netdev(info->netdev);
> --
> 2.13.6
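
The concern about waiting for Closed can be illustrated with a small userspace model. This is only a sketch and not kernel code: the MODEL_ names, the two predicates and polls_until() are hypothetical helpers invented for illustration, and the numeric values are assumed to mirror enum xenbus_state. The scripted backend passes through Closed between two polls and is then removed, so a loop that accepts only Closed never finishes, while one that also accepts Unknown does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace mirror of the first values of enum xenbus_state;
 * the MODEL_ names are hypothetical and exist only in this sketch. */
enum model_state {
	MODEL_UNKNOWN      = 0,
	MODEL_INITIALISING = 1,
	MODEL_INIT_WAIT    = 2,
	MODEL_INITIALISED  = 3,
	MODEL_CONNECTED    = 4,
	MODEL_CLOSING      = 5,
	MODEL_CLOSED       = 6,
};

/* Strict predicate: what a loop that waits for Closed alone would test. */
static bool backend_is_closed(enum model_state s)
{
	return s == MODEL_CLOSED;
}

/* Tolerant predicate: also accepts Unknown, the state seen once the
 * backend xenstore area has been removed. */
static bool backend_is_gone(enum model_state s)
{
	return s == MODEL_CLOSED || s == MODEL_UNKNOWN;
}

/*
 * Poll a scripted sequence of backend states, one entry per poll. Once the
 * script runs out, its last entry repeats. Returns the number of polls
 * needed for done() to hold, or -1 when max_polls is exhausted (the point
 * where an unbounded loop would keep spinning).
 */
static int polls_until(bool (*done)(enum model_state),
		       const enum model_state *script, size_t len,
		       int max_polls)
{
	assert(script != NULL && len > 0);

	for (int i = 0; i < max_polls; i++) {
		size_t idx = (size_t)i < len ? (size_t)i : len - 1;

		if (done(script[idx]))
			return i + 1;
	}
	return -1;
}

/* The backend goes Closing, then Closed and removed between two polls. */
static const enum model_state missed_closed[] = {
	MODEL_CLOSING, MODEL_CLOSING, MODEL_UNKNOWN,
};
```

With this script, polls_until(backend_is_closed, missed_closed, 3, 1000) gives up and returns -1, whereas the same call with backend_is_gone returns after the third poll.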
Juergen Gross - Nov. 20, 2017, 11:17 a.m.
On 20/11/17 11:49, Wei Liu wrote:
> CC netfront maintainers.
> 
> On Mon, Nov 20, 2017 at 11:41:09AM +0100, Eduardo Otubo wrote:
>> When unloading module xen_netfront from guest, dmesg would output
>> warning messages like below:
>>
>>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
>>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
>>
>> This problem relies on netfront and netback being out of sync. By the time
>> netfront revokes the g.e.'s netback didn't have enough time to free all of
>> them, hence displaying the warnings on dmesg.
>>
>> The trick here is to make netfront to wait until netback frees all the g.e.'s
>> and only then continue to cleanup for the module removal, and this is done by
>> manipulating both device states.
>>
>> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
>> ---
>>  drivers/net/xen-netfront.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
>> index 8b8689c6d887..b948e2a1ce40 100644
>> --- a/drivers/net/xen-netfront.c
>> +++ b/drivers/net/xen-netfront.c
>> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device *dev)
>>  
>>  	dev_dbg(&dev->dev, "%s\n", dev->nodename);
>>  
>> +	xenbus_switch_state(dev, XenbusStateClosing);
>> +	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
>> +		cpu_relax();
>> +		schedule();
>> +	}
>> +	xenbus_switch_state(dev, XenbusStateClosed);
>> +	while (dev->xenbus_state != XenbusStateClosed){
>> +		cpu_relax();
>> +		schedule();
>> +	}

I really don't like the busy waits.

Can't you use e.g. a wait queue and wait_event_interruptible() instead?

BTW: what happens if the device is already in closed state if you enter
xennet_remove()? In case this is impossible, please add a comment to
indicate you've thought about that case.

Other than that: you should run ./scripts/checkpatch.pl against your
patch to avoid common style problems.


Juergen
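
As a rough illustration of the wait-queue idea, here is a userspace analogue, assuming a pthread condition variable can stand in for a kernel wait queue. It is a sketch only: struct state_waiter, waiter_publish(), waiter_wait_for() and run_model() are hypothetical names, the state values are assumed to mirror XenbusStateConnected and XenbusStateClosing, and the interruptible part of wait_event_interruptible() is not modelled. The waiter sleeps until the state it wants is published instead of spinning, and because the condition is checked before sleeping, it returns at once when the state is already reached on entry.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical mirrors of two xenbus states, for this sketch only. */
enum { MODEL_CONNECTED = 4, MODEL_CLOSING = 5 };

/* Userspace stand-in for a wait queue plus the state it guards. */
struct state_waiter {
	pthread_mutex_t lock;
	pthread_cond_t  changed;
	int             backend_state;
};

static void waiter_init(struct state_waiter *w, int initial)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->changed, NULL);
	w->backend_state = initial;
}

/* Record a new backend state and wake all sleepers (wake_up() analogue). */
static void waiter_publish(struct state_waiter *w, int state)
{
	pthread_mutex_lock(&w->lock);
	w->backend_state = state;
	pthread_cond_broadcast(&w->changed);
	pthread_mutex_unlock(&w->lock);
}

/* Sleep until the backend reaches 'wanted' (wait_event() analogue).
 * The condition is tested first, so there is no sleep if it already holds. */
static int waiter_wait_for(struct state_waiter *w, int wanted)
{
	int seen;

	pthread_mutex_lock(&w->lock);
	while (w->backend_state != wanted)
		pthread_cond_wait(&w->changed, &w->lock);
	seen = w->backend_state;
	pthread_mutex_unlock(&w->lock);
	return seen;
}

/* Plays the backend: acknowledges the close request by going to Closing. */
static void *backend_thread(void *arg)
{
	waiter_publish(arg, MODEL_CLOSING);
	return NULL;
}

/* Start a backend thread and block until it reports Closing. */
static int run_model(void)
{
	struct state_waiter w;
	pthread_t backend;
	int rc, seen;

	waiter_init(&w, MODEL_CONNECTED);
	rc = pthread_create(&backend, NULL, backend_thread, &w);
	assert(rc == 0);
	(void)rc;

	seen = waiter_wait_for(&w, MODEL_CLOSING);
	pthread_join(backend, NULL);
	pthread_cond_destroy(&w.changed);
	pthread_mutex_destroy(&w.lock);
	return seen;
}
```

In a driver, the publish side would correspond to the point where a backend state change is observed, and the wait side to the remove path.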
Eduardo Otubo - Nov. 20, 2017, 12:56 p.m.
On Mon, Nov 20, 2017 at 10:55:55AM +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: Eduardo Otubo [mailto:otubo@redhat.com]
> > Sent: 20 November 2017 10:41
> > To: xen-devel@lists.xenproject.org
> > Cc: netdev@vger.kernel.org; Paul Durrant <Paul.Durrant@citrix.com>; Wei
> > Liu <wei.liu2@citrix.com>; linux-kernel@vger.kernel.org;
> > vkuznets@redhat.com; cavery@redhat.com; cheshi@redhat.com;
> > mgamal@redhat.com; Eduardo Otubo <otubo@redhat.com>
> > Subject: [PATCH] xen-netfront: remove warning when unloading module
> > 
> > When unloading module xen_netfront from guest, dmesg would output
> > warning messages like below:
> > 
> >   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
> >   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> > 
> > This problem relies on netfront and netback being out of sync. By the time
> > netfront revokes the g.e.'s netback didn't have enough time to free all of
> > them, hence displaying the warnings on dmesg.
> > 
> > The trick here is to make netfront to wait until netback frees all the g.e.'s
> > and only then continue to cleanup for the module removal, and this is done by
> > manipulating both device states.
> > 
> > Signed-off-by: Eduardo Otubo <otubo@redhat.com>
> > ---
> >  drivers/net/xen-netfront.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> > index 8b8689c6d887..b948e2a1ce40 100644
> > --- a/drivers/net/xen-netfront.c
> > +++ b/drivers/net/xen-netfront.c
> > @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device *dev)
> > 
> >  	dev_dbg(&dev->dev, "%s\n", dev->nodename);
> > 
> > +	xenbus_switch_state(dev, XenbusStateClosing);
> > +	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
> > +		cpu_relax();
> > +		schedule();
> > +	}
> > +	xenbus_switch_state(dev, XenbusStateClosed);
> > +	while (dev->xenbus_state != XenbusStateClosed){
> > +		cpu_relax();
> > +		schedule();
> > +	}
> > +
> 
> Waiting for closing should be ok but waiting for closed is risky. As soon as a backend is in the closed state then a toolstack can completely remove the backend xenstore area, resulting in a state of XenbusStateUnknown, which would cause your second loop to spin forever.
> 
>   Paul

Well, that's a scenario I didn't foresee. I'll come up with a solution in order
to avoid this problem. Thanks for the review.

> 
> >  	xennet_disconnect_backend(info);
> > 
> >  	unregister_netdev(info->netdev);
> > --
> > 2.13.6
>
Eduardo Otubo - Nov. 20, 2017, 12:59 p.m.
On Mon, Nov 20, 2017 at 12:17:11PM +0100, Juergen Gross wrote:
> On 20/11/17 11:49, Wei Liu wrote:
> > CC netfront maintainers.
> > 
> > On Mon, Nov 20, 2017 at 11:41:09AM +0100, Eduardo Otubo wrote:
> >> When unloading module xen_netfront from guest, dmesg would output
> >> warning messages like below:
> >>
> >>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
> >>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> >>
> >> This problem relies on netfront and netback being out of sync. By the time
> >> netfront revokes the g.e.'s netback didn't have enough time to free all of
> >> them, hence displaying the warnings on dmesg.
> >>
> >> The trick here is to make netfront to wait until netback frees all the g.e.'s
> >> and only then continue to cleanup for the module removal, and this is done by
> >> manipulating both device states.
> >>
> >> Signed-off-by: Eduardo Otubo <otubo@redhat.com>
> >> ---
> >>  drivers/net/xen-netfront.c | 11 +++++++++++
> >>  1 file changed, 11 insertions(+)
> >>
> >> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> >> index 8b8689c6d887..b948e2a1ce40 100644
> >> --- a/drivers/net/xen-netfront.c
> >> +++ b/drivers/net/xen-netfront.c
> >> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device *dev)
> >>  
> >>  	dev_dbg(&dev->dev, "%s\n", dev->nodename);
> >>  
> >> +	xenbus_switch_state(dev, XenbusStateClosing);
> >> +	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
> >> +		cpu_relax();
> >> +		schedule();
> >> +	}
> >> +	xenbus_switch_state(dev, XenbusStateClosed);
> >> +	while (dev->xenbus_state != XenbusStateClosed){
> >> +		cpu_relax();
> >> +		schedule();
> >> +	}
> 
> I really don't like the busy waits.
> 
> Can't you use e.g. a wait queue and wait_event_interruptible() instead?

I thought about using these, but I don't think the busy waits here are much of a
problem because it's just unloading a kernel module, not a very repetitive
action. But yes I can go for this approach on v2.

> 
> BTW: what happens if the device is already in closed state if you enter
> xennet_remove()? In case this is impossible, please add a comment to
> indicate you've thought about that case.

Looks like this is the same problem Paul Durrant mentioned in his comment. I'll
work on this as well in v2.

Thanks for the review and the help on IRC :-)
kbuild test robot - Nov. 22, 2017, 11:44 a.m.
Hi Eduardo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on xen-tip/linux-next]
[also build test ERROR on v4.14 next-20171121]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Eduardo-Otubo/xen-netfront-remove-warning-when-unloading-module/20171122-163844
base:   https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers//net/xen-netfront.c: In function 'xennet_remove':
>> drivers//net/xen-netfront.c:2139:12: error: 'struct xenbus_device' has no member named 'xenbus_state'
     while (dev->xenbus_state != XenbusStateClosed){
               ^~

vim +2139 drivers//net/xen-netfront.c

  2126	
  2127	static int xennet_remove(struct xenbus_device *dev)
  2128	{
  2129		struct netfront_info *info = dev_get_drvdata(&dev->dev);
  2130	
  2131		dev_dbg(&dev->dev, "%s\n", dev->nodename);
  2132	
  2133		xenbus_switch_state(dev, XenbusStateClosing);
  2134		while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
  2135			cpu_relax();
  2136			schedule();
  2137		}
  2138		xenbus_switch_state(dev, XenbusStateClosed);
> 2139		while (dev->xenbus_state != XenbusStateClosed){
  2140			cpu_relax();
  2141			schedule();
  2142		}
  2143	
  2144		xennet_disconnect_backend(info);
  2145	
  2146		unregister_netdev(info->netdev);
  2147	
  2148		if (info->queues)
  2149			xennet_destroy_queues(info);
  2150		xennet_free_netdev(info->netdev);
  2151	
  2152		return 0;
  2153	}
  2154	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Patch

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8b8689c6d887..b948e2a1ce40 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -2130,6 +2130,17 @@  static int xennet_remove(struct xenbus_device *dev)
 
 	dev_dbg(&dev->dev, "%s\n", dev->nodename);
 
+	xenbus_switch_state(dev, XenbusStateClosing);
+	while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
+		cpu_relax();
+		schedule();
+	}
+	xenbus_switch_state(dev, XenbusStateClosed);
+	while (dev->xenbus_state != XenbusStateClosed){
+		cpu_relax();
+		schedule();
+	}
+
 	xennet_disconnect_backend(info);
 
 	unregister_netdev(info->netdev);