Patchwork x86/hvm: Fix rcu_unlock_domain call bypass

login
register
mail settings
Submitter Adrian Pop
Date Nov. 14, 2017, 3:11 p.m.
Message ID <20171114151133.31478-1-apop@bitdefender.com>
Download mbox | patch
Permalink /patch/384103/
State New
Headers show

Comments

Adrian Pop - Nov. 14, 2017, 3:11 p.m.
rcu_lock_current_domain is called at the beginning of do_altp2m_op, but
the altp2m_vcpu_enable_notify subop handler might skip calling
rcu_unlock_domain, possibly hanging the domain altogether.

Signed-off-by: Adrian Pop <apop@bitdefender.com>
---
 xen/arch/x86/hvm/hvm.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
Andrew Cooper - Nov. 14, 2017, 3:20 p.m.
On 14/11/17 15:11, Adrian Pop wrote:
> rcu_lock_current_domain is called at the beginning of do_altp2m_op, but
> the altp2m_vcpu_enable_notify subop handler might skip calling
> rcu_unlock_domain, possibly hanging the domain altogether.
>
> Signed-off-by: Adrian Pop <apop@bitdefender.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

CC'ing Julien.  This is 4.10 material IMO; it would be a security issue
if rcu_lock_current_domain() wasn't a nop in Xen.  Debug builds are also
liable to hit an assertion pertaining to the preempt_count() (which
again, is only ever read in debug builds).

> ---
>  xen/arch/x86/hvm/hvm.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 205b4cb685..0af498a312 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4534,12 +4534,18 @@ static int do_altp2m_op(
>  
>          if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
>               a.u.enable_notify.vcpu_id != curr->vcpu_id )
> +        {
>              rc = -EINVAL;
> +            break;
> +        }
>  
>          if ( !gfn_eq(vcpu_altp2m(curr).veinfo_gfn, INVALID_GFN) ||
>               mfn_eq(get_gfn_query_unlocked(curr->domain,
>                      a.u.enable_notify.gfn, &p2mt), INVALID_MFN) )
> -            return -EINVAL;
> +        {
> +            rc = -EINVAL;
> +            break;
> +        }
>  
>          vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
>          altp2m_vcpu_update_vmfunc_ve(curr);
Jan Beulich - Nov. 14, 2017, 3:25 p.m.
>>> On 14.11.17 at 16:11, <apop@bitdefender.com> wrote:
> rcu_lock_current_domain is called at the beginning of do_altp2m_op, but
> the altp2m_vcpu_enable_notify subop handler might skip calling
> rcu_unlock_domain, possibly hanging the domain altogether.

I fully agree with the change, but the description needs improvement.
For one, why would the domain be hanging with

static inline struct domain *rcu_lock_current_domain(void)
{
    return /*rcu_lock_domain*/(current->domain);
}

? And even if the lock function invocation wasn't commented
out, all it does is preempt_disable(). That may cause an
assertion to trigger in debug builds, but that's not a domain
hang. Plus ...

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4534,12 +4534,18 @@ static int do_altp2m_op(
>  
>          if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
>               a.u.enable_notify.vcpu_id != curr->vcpu_id )
> +        {
>              rc = -EINVAL;
> +            break;
> +        }

... you also change flow here, which is a second bug you address,
but you fail to mention it.

Jan
Adrian Pop - Nov. 14, 2017, 3:44 p.m.
Hello,

On Tue, Nov 14, 2017 at 08:25:57AM -0700, Jan Beulich wrote:
> >>> On 14.11.17 at 16:11, <apop@bitdefender.com> wrote:
> > rcu_lock_current_domain is called at the beginning of do_altp2m_op, but
> > the altp2m_vcpu_enable_notify subop handler might skip calling
> > rcu_unlock_domain, possibly hanging the domain altogether.
> 
> I fully agree with the change, but the description needs improvement.
> For one, why would the domain be hanging with
> 
> static inline struct domain *rcu_lock_current_domain(void)
> {
>     return /*rcu_lock_domain*/(current->domain);
> }
> 
> ? And even if the lock function invocation wasn't commented
> out, all it does is preempt_disable(). That may cause an
> assertion to trigger in debug builds, but that's not a domain
> hang. Plus ...

Sorry, I was indeed referring to the preempt_count() assertion, only
using poor wording.  I had tested something else using
rcu_lock_domain_by_id() instead of rcu_lock_current_domain() which
triggered the assertion.

> 
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -4534,12 +4534,18 @@ static int do_altp2m_op(
> >  
> >          if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
> >               a.u.enable_notify.vcpu_id != curr->vcpu_id )
> > +        {
> >              rc = -EINVAL;
> > +            break;
> > +        }
> 
> ... you also change flow here, which is a second bug you address,
> but you fail to mention it.

OK.

Patch

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 205b4cb685..0af498a312 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4534,12 +4534,18 @@  static int do_altp2m_op(
 
         if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
              a.u.enable_notify.vcpu_id != curr->vcpu_id )
+        {
             rc = -EINVAL;
+            break;
+        }
 
         if ( !gfn_eq(vcpu_altp2m(curr).veinfo_gfn, INVALID_GFN) ||
              mfn_eq(get_gfn_query_unlocked(curr->domain,
                     a.u.enable_notify.gfn, &p2mt), INVALID_MFN) )
-            return -EINVAL;
+        {
+            rc = -EINVAL;
+            break;
+        }
 
         vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
         altp2m_vcpu_update_vmfunc_ve(curr);