Patchwork kvm/x86/vmx: switch MSR_MISC_FEATURES_ENABLES between host and guest

login
register
mail settings
Submitter Xiaoyao Li
Date March 14, 2019, 6:38 a.m.
Message ID <20190314063858.18292-1-xiaoyao.li@linux.intel.com>
Download mbox | patch
Permalink /patch/748597/
State New
Headers show

Comments

Xiaoyao Li - March 14, 2019, 6:38 a.m.
CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
enabled, all execution of the CPUID instruction outside system-management
mode (SMM) cause a general-protection (#GP) if the CPL > 0.

About this feature, detailed information can be found at
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf

There is an issue that current kvm doesn't switch the value of
MSR_MISC_FEATURES_ENABLES between host and guest. If MSR_MISC_FEATURES_ENABLES
exists on the hardware cpu, and host enables CPUID faulting (setting the bit 0
of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
cpuid faulting is enabled by host and passed to guest.

From my tests, when host enables cpuid faulting, it causes guest boot failure
when guest uses *modprobe* to load modules. Below is the error log:

[    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
[    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
[    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
[    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
[    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]

*modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
At the end, because guest cannot *modprobe* modules, it boots failure.

This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
hardware has this MSR.

This patch doesn't confict with the commit db2336a80489 ("KVM: x86: virtualize
cpuid faulting"), which provides a software emulation of cpuid faulting for
x86 arch. Below analysing how cpuid faulting will work after applying this patch:

1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can just
use the software emulation of cpuid faulting.

2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The same
as case 1, we can just use the software emulation of cpuid faulting.

3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this patch,
it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
If guest enables cpuid faulting and when guest calls CPUID instruction with
CPL > 0, it will cause #GP exception in guest instead of VM exit because of
CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
feature. Also it's a benefit that we needn't use VM exit to inject #GP to
emulate cpuid faulting feature.

Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
and CPUID instruction.

Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
Xiaoyao Li - March 14, 2019, 7:06 a.m.
Besides, Peter's this patch https://patchwork.kernel.org/patch/10850143/
adds the handling of cpuid faulting in #GP handler. 

What's more, it enalbes cpuid fauting once function *clear_cpu_cap()* called
successfully. From my tests, during kernel booting, there always are some
features cleared, thus cpuid faulting will enabled by default after applying
Peter's patch. It will make the problem more obvious.

On Thu, 2019-03-14 at 14:38 +0800, Xiaoyao Li wrote:
> CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> enabled, all execution of the CPUID instruction outside system-management
> mode (SMM) cause a general-protection (#GP) if the CPL > 0.
> 
> About this feature, detailed information can be found at
> 
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> 
> There is an issue that current kvm doesn't switch the value of
> MSR_MISC_FEATURES_ENABLES between host and guest. If MSR_MISC_FEATURES_ENABLES
> exists on the hardware cpu, and host enables CPUID faulting (setting the bit 0
> of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> cpuid faulting is enabled by host and passed to guest.
> 
> From my tests, when host enables cpuid faulting, it causes guest boot failure
> when guest uses *modprobe* to load modules. Below is the error log:
> 
> [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c
> sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c
> sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c
> sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c
> sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c
> sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
> 
> *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> At the end, because guest cannot *modprobe* modules, it boots failure.
> 
> This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> hardware has this MSR.
> 
> This patch doesn't confict with the commit db2336a80489 ("KVM: x86: virtualize
> cpuid faulting"), which provides a software emulation of cpuid faulting for
> x86 arch. Below analysing how cpuid faulting will work after applying this
> patch:
> 
> 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can
> just
> use the software emulation of cpuid faulting.
> 
> 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The
> same
> as case 1, we can just use the software emulation of cpuid faulting.
> 
> 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this patch,
> it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> If guest enables cpuid faulting and when guest calls CPUID instruction with
> CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> emulate cpuid faulting feature.
> 
> Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> and CPUID instruction.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 30a6bcd735ec..90707fae688e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx
> *vmx)
>  					msrs[i].host, false);
>  }
>  
> +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> +{
> +	u64 host_msr;
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	/* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do
> nothing*/
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> +		return;
> +
> +	if (host_msr == vcpu->arch.msr_misc_features_enables)
> +		clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> +	else
> +		add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> +				      vcpu->arch.msr_misc_features_enables,
> +				      host_msr, false);
> +}
> +
>  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
>  {
>  	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	atomic_switch_perf_msrs(vmx);
>  
> +	atomic_switch_msr_misc_features_enables(vcpu);
> +
>  	vmx_update_hv_timer(vcpu);
>  
>  	/*
Kyle Huey - March 14, 2019, 8:43 a.m.
On Thu, Mar 14, 2019 at 7:50 PM Xiaoyao Li <xiaoyao.li@linux.intel.com> wrote:
>
> CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> enabled, all execution of the CPUID instruction outside system-management
> mode (SMM) cause a general-protection (#GP) if the CPL > 0.
>
> About this feature, detailed information can be found at
> https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
>
> There is an issue that current kvm doesn't switch the value of
> MSR_MISC_FEATURES_ENABLES between host and guest. If MSR_MISC_FEATURES_ENABLES
> exists on the hardware cpu, and host enables CPUID faulting (setting the bit 0
> of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> cpuid faulting is enabled by host and passed to guest.

The host doesn't enable CPUID faulting and keep it enabled for
everything though, it only enables it for specified user-space
processes (via arch_prctl). How does CPUID faulting "leak" into the
KVM guest?

- Kyle

> From my tests, when host enables cpuid faulting, it causes guest boot failure
> when guest uses *modprobe* to load modules. Below is the error log:
>
> [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
>
> *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> At the end, because guest cannot *modprobe* modules, it boots failure.
>
> This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> hardware has this MSR.
>
> This patch doesn't confict with the commit db2336a80489 ("KVM: x86: virtualize
> cpuid faulting"), which provides a software emulation of cpuid faulting for
> x86 arch. Below analysing how cpuid faulting will work after applying this patch:
>
> 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can just
> use the software emulation of cpuid faulting.
>
> 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The same
> as case 1, we can just use the software emulation of cpuid faulting.
>
> 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this patch,
> it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> If guest enables cpuid faulting and when guest calls CPUID instruction with
> CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> emulate cpuid faulting feature.
>
> Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> and CPUID instruction.
>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 30a6bcd735ec..90707fae688e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
>                                         msrs[i].host, false);
>  }
>
> +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> +{
> +       u64 host_msr;
> +       struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +       /* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do nothing*/
> +       if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> +               return;
> +
> +       if (host_msr == vcpu->arch.msr_misc_features_enables)
> +               clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> +       else
> +               add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> +                                     vcpu->arch.msr_misc_features_enables,
> +                                     host_msr, false);
> +}
> +
>  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
>  {
>         vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
>         atomic_switch_perf_msrs(vmx);
>
> +       atomic_switch_msr_misc_features_enables(vcpu);
> +
>         vmx_update_hv_timer(vcpu);
>
>         /*
> --
> 2.19.1
Xiaoyao Li - March 14, 2019, 10:43 a.m.
On Thu, 2019-03-14 at 21:43 +1300, Kyle Huey wrote:
> On Thu, Mar 14, 2019 at 7:50 PM Xiaoyao Li <xiaoyao.li@linux.intel.com> wrote:
> > 
> > CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> > enabled, all execution of the CPUID instruction outside system-management
> > mode (SMM) cause a general-protection (#GP) if the CPL > 0.
> > 
> > About this feature, detailed information can be found at
> > 
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> > 
> > There is an issue that current kvm doesn't switch the value of
> > MSR_MISC_FEATURES_ENABLES between host and guest. If
> > MSR_MISC_FEATURES_ENABLES
> > exists on the hardware cpu, and host enables CPUID faulting (setting the bit
> > 0
> > of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> > cpuid faulting is enabled by host and passed to guest.
> 
> The host doesn't enable CPUID faulting and keep it enabled for
> everything though, it only enables it for specified user-space
> processes (via arch_prctl). How does CPUID faulting "leak" into the
> KVM guest?

Yes, you are right. With your patches, only when we enable cpuid faulting for
the QEMU or other VMM userspace processes via arch_prctl, does it "leak" into
the KVM guest. 

But arch_prctl is not the only way to enable it. We can enable cpuid faulting
via module "msr.ko". And I think it's indeed an issue, we should not ignore it
as it needs special cases and purpose to make it happen so far. 

Besides, Peter's this patch https://patchwork.kernel.org/patch/10850143/ is much
likely to cause the cpuid fauting enabled in host by default during kernel
startup. If that one is going to be applied, it's better to fix the cpuid
faulting leak issue before that.

Thanks,
- Xiaoyao

> - Kyle
> 
> > From my tests, when host enables cpuid faulting, it causes guest boot
> > failure
> > when guest uses *modprobe* to load modules. Below is the error log:
> > 
> > [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c
> > sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> > [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c
> > sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> > [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c
> > sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> > [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c
> > sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> > [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c
> > sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
> > 
> > *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> > At the end, because guest cannot *modprobe* modules, it boots failure.
> > 
> > This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> > hardware has this MSR.
> > 
> > This patch doesn't confict with the commit db2336a80489 ("KVM: x86:
> > virtualize
> > cpuid faulting"), which provides a software emulation of cpuid faulting for
> > x86 arch. Below analysing how cpuid faulting will work after applying this
> > patch:
> > 
> > 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can
> > just
> > use the software emulation of cpuid faulting.
> > 
> > 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The
> > same
> > as case 1, we can just use the software emulation of cpuid faulting.
> > 
> > 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this
> > patch,
> > it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> > If guest enables cpuid faulting and when guest calls CPUID instruction with
> > CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> > CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> > feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> > emulate cpuid faulting feature.
> > 
> > Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> > and CPUID instruction.
> > 
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 30a6bcd735ec..90707fae688e 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx
> > *vmx)
> >                                         msrs[i].host, false);
> >  }
> > 
> > +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> > +{
> > +       u64 host_msr;
> > +       struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +       /* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do
> > nothing*/
> > +       if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> > +               return;
> > +
> > +       if (host_msr == vcpu->arch.msr_misc_features_enables)
> > +               clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> > +       else
> > +               add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> > +                                     vcpu->arch.msr_misc_features_enables,
> > +                                     host_msr, false);
> > +}
> > +
> >  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
> >  {
> >         vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> > @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > 
> >         atomic_switch_perf_msrs(vmx);
> > 
> > +       atomic_switch_msr_misc_features_enables(vcpu);
> > +
> >         vmx_update_hv_timer(vcpu);
> > 
> >         /*
> > --
> > 2.19.1
Paolo Bonzini - March 14, 2019, 11:28 a.m.
On 14/03/19 07:38, Xiaoyao Li wrote:
> CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> enabled, all execution of the CPUID instruction outside system-management
> mode (SMM) cause a general-protection (#GP) if the CPL > 0.
> 
> About this feature, detailed information can be found at
> https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> 
> There is an issue that current kvm doesn't switch the value of
> MSR_MISC_FEATURES_ENABLES between host and guest. If MSR_MISC_FEATURES_ENABLES
> exists on the hardware cpu, and host enables CPUID faulting (setting the bit 0
> of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> cpuid faulting is enabled by host and passed to guest.
> 
> From my tests, when host enables cpuid faulting, it causes guest boot failure
> when guest uses *modprobe* to load modules. Below is the error log:
> 
> [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
> 
> *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> At the end, because guest cannot *modprobe* modules, it boots failure.
> 
> This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> hardware has this MSR.
> 
> This patch doesn't confict with the commit db2336a80489 ("KVM: x86: virtualize
> cpuid faulting"), which provides a software emulation of cpuid faulting for
> x86 arch. Below analysing how cpuid faulting will work after applying this patch:
> 
> 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can just
> use the software emulation of cpuid faulting.
> 
> 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The same
> as case 1, we can just use the software emulation of cpuid faulting.
> 
> 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this patch,
> it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> If guest enables cpuid faulting and when guest calls CPUID instruction with
> CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> emulate cpuid faulting feature.
> 
> Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> and CPUID instruction.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 30a6bcd735ec..90707fae688e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
>  					msrs[i].host, false);
>  }
>  
> +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> +{
> +	u64 host_msr;
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	/* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do nothing*/
> +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> +		return;
> +
> +	if (host_msr == vcpu->arch.msr_misc_features_enables)
> +		clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> +	else
> +		add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> +				      vcpu->arch.msr_misc_features_enables,
> +				      host_msr, false);
> +}
> +
>  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
>  {
>  	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  
>  	atomic_switch_perf_msrs(vmx);
>  
> +	atomic_switch_msr_misc_features_enables(vcpu);
> +
>  	vmx_update_hv_timer(vcpu);
>  
>  	/*
> 

Adding a RDMSR for this to each vmentry is too heavy.  Since we emulate
MSR_MISC_FEATURES_ENABLES, you can just clear the MSR on vcpu_load and
restore it on vcpu_put.

Also, this needs a test in tools/testing/selftests/kvm.

Paolo
Paolo Bonzini - March 14, 2019, 11:30 a.m.
On 14/03/19 11:43, Xiaoyao Li wrote:
> Yes, you are right. With your patches, only when we enable cpuid faulting for
> the QEMU or other VMM userspace processes via arch_prctl, does it "leak" into
> the KVM guest. 
> 
> But arch_prctl is not the only way to enable it. We can enable cpuid faulting
> via module "msr.ko". And I think it's indeed an issue, we should not ignore it
> as it needs special cases and purpose to make it happen so far. 

Whatever is done with "msr.ko" is a non-issue.  msr.ko is at best a
debugging tool if you use it for MSRs that the kernel can write.  That
said, even the arch_prctl is an issue, as well as Peter's patch you
quote below.

The patch is a good idea even though it should be implemented in a more
efficient way.

Paolo

> Besides, Peter's this patch https://patchwork.kernel.org/patch/10850143/ is much
> likely to cause the cpuid fauting enabled in host by default during kernel
> startup. If that one is going to be applied, it's better to fix the cpuid
> faulting leak issue before that.
Xiaoyao Li - March 14, 2019, 12:10 p.m.
On Thu, 2019-03-14 at 12:28 +0100, Paolo Bonzini wrote:
> On 14/03/19 07:38, Xiaoyao Li wrote:
> > CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> > enabled, all execution of the CPUID instruction outside system-management
> > mode (SMM) cause a general-protection (#GP) if the CPL > 0.
> > 
> > About this feature, detailed information can be found at
> > 
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> > 
> > There is an issue that current kvm doesn't switch the value of
> > MSR_MISC_FEATURES_ENABLES between host and guest. If
> > MSR_MISC_FEATURES_ENABLES
> > exists on the hardware cpu, and host enables CPUID faulting (setting the bit
> > 0
> > of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> > cpuid faulting is enabled by host and passed to guest.
> > 
> > From my tests, when host enables cpuid faulting, it causes guest boot
> > failure
> > when guest uses *modprobe* to load modules. Below is the error log:
> > 
> > [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c
> > sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> > [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c
> > sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> > [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c
> > sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> > [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c
> > sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> > [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c
> > sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
> > 
> > *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> > At the end, because guest cannot *modprobe* modules, it boots failure.
> > 
> > This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> > hardware has this MSR.
> > 
> > This patch doesn't confict with the commit db2336a80489 ("KVM: x86:
> > virtualize
> > cpuid faulting"), which provides a software emulation of cpuid faulting for
> > x86 arch. Below analysing how cpuid faulting will work after applying this
> > patch:
> > 
> > 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can
> > just
> > use the software emulation of cpuid faulting.
> > 
> > 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The
> > same
> > as case 1, we can just use the software emulation of cpuid faulting.
> > 
> > 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this
> > patch,
> > it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> > If guest enables cpuid faulting and when guest calls CPUID instruction with
> > CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> > CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> > feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> > emulate cpuid faulting feature.
> > 
> > Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> > and CPUID instruction.
> > 
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 30a6bcd735ec..90707fae688e 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx
> > *vmx)
> >  					msrs[i].host, false);
> >  }
> >  
> > +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> > +{
> > +	u64 host_msr;
> > +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +	/* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do
> > nothing*/
> > +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> > +		return;
> > +
> > +	if (host_msr == vcpu->arch.msr_misc_features_enables)
> > +		clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> > +	else
> > +		add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> > +				      vcpu->arch.msr_misc_features_enables,
> > +				      host_msr, false);
> > +}
> > +
> >  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
> >  {
> >  	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> > @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >  
> >  	atomic_switch_perf_msrs(vmx);
> >  
> > +	atomic_switch_msr_misc_features_enables(vcpu);
> > +
> >  	vmx_update_hv_timer(vcpu);
> >  
> >  	/*
> > 
> 
> Adding a RDMSR for this to each vmentry is too heavy.  Since we emulate
> MSR_MISC_FEATURES_ENABLES, you can just clear the MSR on vcpu_load and
> restore it on vcpu_put.

It's much better and more efficient.
I will do it this way in next version.

> Also, this needs a test in tools/testing/selftests/kvm.

Sure, I will add that.

> Paolo
Xiaoyao Li - March 14, 2019, 12:37 p.m.
On Thu, 2019-03-14 at 12:28 +0100, Paolo Bonzini wrote:
> On 14/03/19 07:38, Xiaoyao Li wrote:
> > CPUID Faulting is a feature about CPUID instruction. When CPUID Faulting is
> > enabled, all execution of the CPUID instruction outside system-management
> > mode (SMM) cause a general-protection (#GP) if the CPL > 0.
> > 
> > About this feature, detailed information can be found at
> > 
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> > 
> > There is an issue that current kvm doesn't switch the value of
> > MSR_MISC_FEATURES_ENABLES between host and guest. If
> > MSR_MISC_FEATURES_ENABLES
> > exists on the hardware cpu, and host enables CPUID faulting (setting the bit
> > 0
> > of MSR_MISC_FEATURES_ENABLES), it will impact the guest's behavior because
> > cpuid faulting is enabled by host and passed to guest.
> > 
> > From my tests, when host enables cpuid faulting, it causes guest boot
> > failure
> > when guest uses *modprobe* to load modules. Below is the error log:
> > 
> > [    1.233556] traps: modprobe[71] general protection fault ip:7f0077f6495c
> > sp:7ffda148d808 error:0 in ld-2.17.so[7f0077f4d000+22000]
> > [    1.237780] traps: modprobe[73] general protection fault ip:7fad5aba095c
> > sp:7ffd36067378 error:0 in ld-2.17.so[7fad5ab89000+22000]
> > [    1.241930] traps: modprobe[75] general protection fault ip:7f3edb89495c
> > sp:7fffa1a81308 error:0 in ld-2.17.so[7f3edb87d000+22000]
> > [    1.245998] traps: modprobe[77] general protection fault ip:7f91d670895c
> > sp:7ffc25fa7f38 error:0 in ld-2.17.so[7f91d66f1000+22000]
> > [    1.250016] traps: modprobe[79] general protection fault ip:7f0ddbbdc95c
> > sp:7ffe9c34f8d8 error:0 in ld-2.17.so[7f0ddbbc5000+22000]
> > 
> > *modprobe* calls CPUID instruction thus causing cpuid faulting in guest.
> > At the end, because guest cannot *modprobe* modules, it boots failure.
> > 
> > This patch switches MSR_MISC_FEATURES_ENABLES between host and guest when
> > hardware has this MSR.
> > 
> > This patch doesn't confict with the commit db2336a80489 ("KVM: x86:
> > virtualize
> > cpuid faulting"), which provides a software emulation of cpuid faulting for
> > x86 arch. Below analysing how cpuid faulting will work after applying this
> > patch:
> > 
> > 1. If host cpu is AMD. It doesn't have MSR_MISC_FEATURES_ENABLES, so we can
> > just
> > use the software emulation of cpuid faulting.
> > 
> > 2. If host cpu is Intel and it doesn't have MSR_MISC_FEATURES_ENABLES. The
> > same
> > as case 1, we can just use the software emulation of cpuid faulting.
> > 
> > 3. If host cpu is Intel and it has MSR_MISC_FEATURES_ENABLES. With this
> > patch,
> > it will write guest's value into MSR_MISC_FEATURES_ENABLES when vm entry.
> > If guest enables cpuid faulting and when guest calls CPUID instruction with
> > CPL > 0, it will cause #GP exception in guest instead of VM exit because of
> > CPUID, thus it doesn't go to the kvm emualtion path but ues the hardware
> > feature. Also it's a benefit that we needn't use VM exit to inject #GP to
> > emulate cpuid faulting feature.
> > 
> > Intel SDM vol3.25.1.1 specifies the priority between cpuid faulting
> > and CPUID instruction.
> > 
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 30a6bcd735ec..90707fae688e 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6321,6 +6321,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx
> > *vmx)
> >  					msrs[i].host, false);
> >  }
> >  
> > +static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
> > +{
> > +	u64 host_msr;
> > +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +	/* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do
> > nothing*/
> > +	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
> > +		return;
> > +
> > +	if (host_msr == vcpu->arch.msr_misc_features_enables)
> > +		clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
> > +	else
> > +		add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
> > +				      vcpu->arch.msr_misc_features_enables,
> > +				      host_msr, false);
> > +}
> > +
> >  static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
> >  {
> >  	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
> > @@ -6562,6 +6579,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >  
> >  	atomic_switch_perf_msrs(vmx);
> >  
> > +	atomic_switch_msr_misc_features_enables(vcpu);
> > +
> >  	vmx_update_hv_timer(vcpu);
> >  
> >  	/*
> > 
> 
> Adding a RDMSR for this to each vmentry is too heavy.  Since we emulate
> MSR_MISC_FEATURES_ENABLES, you can just clear the MSR on vcpu_load and
> restore it on vcpu_put.

One question here. Just clear the MSR on vcpu_load instead of writing the
emulated value to MSR? 

I think writing the emulated value to MSR is better. As I mentioned in case 3,
if hardware has cpuid faulting feature. Using hardware capability is more
efficient than emulation that the emulation solution needs VM exit to inject
#GP.   

> Also, this needs a test in tools/testing/selftests/kvm.
> 
> Paolo
Paolo Bonzini - March 14, 2019, 2:44 p.m.
On 14/03/19 13:37, Xiaoyao Li wrote:
>> Adding a RDMSR for this to each vmentry is too heavy.  Since we emulate
>> MSR_MISC_FEATURES_ENABLES, you can just clear the MSR on vcpu_load and
>> restore it on vcpu_put.
> One question here. Just clear the MSR on vcpu_load instead of writing the
> emulated value to MSR? 
> 
> I think writing the emulated value to MSR is better. As I mentioned in case 3,
> if hardware has cpuid faulting feature. Using hardware capability is more
> efficient than emulation that the emulation solution needs VM exit to inject
> #GP.   

You can do that too, yes.  You can add it to vmx_msr_index and it will
be handled like that.

However, that wouldn't work on AMD (which doesn't use the
kvm_define_shared_msr infrastructure) and also on hosts that don't have
MSR_MISC_FEATURES_ENABLES, so my suggestion is to add this optimization
as a separate patch.

Paolo

Patch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 30a6bcd735ec..90707fae688e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6321,6 +6321,23 @@  static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 					msrs[i].host, false);
 }
 
+static void atomic_switch_msr_misc_features_enables(struct kvm_vcpu *vcpu)
+{
+	u64 host_msr;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	/* if MSR MISC_FEATURES_ENABLES doesn't exist on the hardware, do nothing*/
+	if (rdmsrl_safe(MSR_MISC_FEATURES_ENABLES, &host_msr))
+		return;
+
+	if (host_msr == vcpu->arch.msr_misc_features_enables)
+		clear_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES);
+	else
+		add_atomic_switch_msr(vmx, MSR_MISC_FEATURES_ENABLES,
+				      vcpu->arch.msr_misc_features_enables,
+				      host_msr, false);
+}
+
 static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val)
 {
 	vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val);
@@ -6562,6 +6579,8 @@  static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	atomic_switch_perf_msrs(vmx);
 
+	atomic_switch_msr_misc_features_enables(vcpu);
+
 	vmx_update_hv_timer(vcpu);
 
 	/*