Patchwork Kick cpu when WFI in single-threaded kvm integration

Submitter Christoffer Dall
Date March 15, 2019, 8:43 a.m.
Message ID <20190315084311.GA10950@e113682-lin.lund.arm.com>
Permalink /patch/749387/
State New

Comments

Christoffer Dall - March 15, 2019, 8:43 a.m.
Hi Jan,

On Thu, Mar 14, 2019 at 12:19:02PM +0000, Jan Bolke wrote:
> Hi all,
> 
> Currently I am working on a SystemC integration of kvm on arm.
> Therefore, I use the kvm api and of course SystemC (library to simulate hardware platforms with C++).
> 
> As I need the virtual cpu to interrupt its execution loop from time to time to let the rest of the SystemC simulation execute,
> I use a perf_event and let the kernel send a signal on overflow to the simulation thread which kicks the virtual cpu (suggested by this mailing list, thanks again).
> Thus I am able to simulate a quantum mechanism for the virtual cpu.
> 
> As I am running benchmarks (e.g. Coremark) on my virtual platform this works fine.
> 
> I also get to boot Linux until it spawns the terminal and then wait for interrupts from my virtual uart.
> Here comes the problem:
> The perf event counting mechanism does increment its counted instructions very very slowly when the virtual cpu executes wfi.
> Thus my whole simulation starts to hang.
> As my simulation is single threaded I need the signal from the kernel to kick my cpu to let the virtual uart deliver its interrupt to react to my input.
> I tried to use the request_interrupt_window flag but this does not seem to work.
> 
> Is there a way to kick the virtual cpu when it is waiting for interrupts? Or do I have to patch my kvm code?
> 

Let me see if I understand your question properly; you are running a KVM
virtual CPU which executes WFI in the guest, and then you are not
receiving interrupts in the guest nor getting events back from KVM which
you somehow use to run a backend simulation in userspace?

KVM/Arm can do two things for WFI:

 1. Let the guest directly execute it without trapping to the hypervisor
    (the physical CPU will NOT exit the guest until there's a physical
    interrupt on the CPU).

 2. Trap WFI to KVM.  KVM asks Linux to schedule another process until
    there's a virtual interrupt for the VCPU.  This is what mainline
    KVM/Arm does.


I suspect what's happening is that you are using a normal kernel
configured as (2), and therefore you only count cycles for the perf
event while the guest runs the timer ISR which obviously is much much
less than if you had a constant running VCPU.  Am I on the right track?

If so, there are a couple of things you could try.

First, you can try disabling the trap on WFI (note that this changes
pretty fundamental behavior of KVM, and this is not recommended for
production use or for systems level performance investigations where
more than one workload contends for a single physical CPU); the diff is
in the Patch section at the end of this page.

Note that I'm not sure how the performance counter counts on your
particular platform when the CPU is in WFI, so this may not help at all.


Second, and possibly preferred, you can hook up your simulation event to
a timer event in the case of trapping on a WFI.  See kvm_handle_wfx() in
arch/arm64/kvm/handle_exit.c and follow kvm_vcpu_block() from there to
see how KVM/Arm handles this event.
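
As a userspace-only illustration of the timer idea (this is not what
kvm_handle_wfx() or kvm_vcpu_block() do in the kernel), the kick can come
from a wall-clock timer instead of a perf overflow, so it still fires while
the guest sits in WFI. A minimal sketch, assuming a single-threaded
simulator; block_until_kicked() and kick_handler() are hypothetical names,
and sigsuspend() stands in for ioctl(vcpu_fd, KVM_RUN, 0), which the KVM API
documents as returning EINTR when an unmasked signal is pending:

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Empty handler: its only job is to make the blocking call fail with EINTR. */
static void kick_handler(int signo) { (void)signo; }

/*
 * Hypothetical helper: arm a one-shot wall-clock timer of quantum_us
 * microseconds, then block until the timer signal arrives.
 * Returns the errno of the interrupted wait (EINTR expected),
 * or -1 if the setup fails.
 */
int block_until_kicked(long quantum_us)
{
    struct sigaction sa;
    struct itimerval tv;
    sigset_t block, old, wait_mask;
    int rc;

    if (quantum_us <= 0)
        return -1;              /* a zero timer would never fire */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = kick_handler;   /* deliberately no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Keep SIGALRM blocked outside the wait so the kick cannot be lost. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    memset(&tv, 0, sizeof(tv));
    tv.it_value.tv_sec = quantum_us / 1000000;
    tv.it_value.tv_usec = quantum_us % 1000000;
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* Stand-in for ioctl(vcpu_fd, KVM_RUN, 0): unblock SIGALRM and wait. */
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);
    rc = sigsuspend(&wait_mask);
    assert(rc == -1);           /* sigsuspend() only returns on a signal */
    rc = errno;

    sigprocmask(SIG_SETMASK, &old, NULL);
    return rc;
}
```

In a real integration the signal would stay blocked in the simulation thread
and be unblocked only while the VCPU runs, for example with
KVM_SET_SIGNAL_MASK, which gives KVM_RUN the same race-free behaviour that
sigsuspend() provides here. After the EINTR return the simulation can advance
and deliver any pending UART or timer interrupt before re-entering the guest.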


Hope this helps,

    Christoffer

Jan Bolke - March 16, 2019, 9:05 a.m.
> Let me see if I understand your question properly; you are running a KVM
> virtual CPU which executes WFI in the guest, and then you are not
> receiving interrupts in the guest nor getting events back from KVM which
> you somehow use to run a backend simulation in userspace?

You are right!

> KVM/Arm can do two things for WFI:
>
>  2. Trap WFI to KVM.  KVM asks Linux to schedule another process until
>     there's a virtual interrupt for the VCPU.  This is what mainline
>     KVM/Arm does.
>
> I suspect what's happening is that you are using a normal kernel
> configured as (2), and therefore you only count cycles for the perf
> event while the guest runs the timer ISR which obviously is much much
> less than if you had a constant running VCPU.  Am I on the right track?

Yes. The only thing that is somewhat strange is that I am using a userspace interrupt controller, so my timer interrupt travels from the KVM virtual timer into userspace via KVM_ARM_DEV_EL1_VTIMER, through the userspace interrupt controller, and back into KVM via KVM_IRQ_LINE.
This of course means that as long as the vcpu is executing and has not exited to userspace, there is no interrupt delivery and thus no timer interrupt service routine runs. Nevertheless, the perf counter increases slowly and eventually the vcpu exits to userspace, enabling me to deliver timer/uart interrupts. This leads to a very slow simulation once Linux is completely booted and interrupt-driven. But that is just a side note.
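
For the last hop of that path, the KVM API documents the irq field passed to KVM_IRQ_LINE on ARM as irq_type in bits 31-24, vcpu_index in bits 23-16 and irq_id in bits 15-0, with irq_type 0 addressing the CPU's IRQ (irq_id 0) or FIQ (irq_id 1) line when the GIC lives outside the kernel. A minimal sketch of that encoding; the EX_ macros and encode_cpu_irq() are hypothetical names that only mirror the constants in <linux/kvm.h>, which real code should use instead:

```c
#include <stdint.h>

/* Illustrative copies of the documented field layout (assumed names). */
#define EX_IRQ_TYPE_SHIFT 24
#define EX_IRQ_VCPU_SHIFT 16
#define EX_IRQ_TYPE_CPU   0u   /* out-of-kernel GIC: signal the CPU directly */
#define EX_IRQ_CPU_IRQ    0u   /* irq_id 0 is the IRQ line */
#define EX_IRQ_CPU_FIQ    1u   /* irq_id 1 is the FIQ line */

/*
 * Build the 32-bit irq value for a CPU-level interrupt line.
 * The result would be stored in struct kvm_irq_level.irq, with .level set
 * to 1 to assert or 0 to deassert, and passed to ioctl(vm_fd, KVM_IRQ_LINE).
 */
uint32_t encode_cpu_irq(uint8_t vcpu_index, uint16_t line)
{
    return (EX_IRQ_TYPE_CPU << EX_IRQ_TYPE_SHIFT) |
           ((uint32_t)vcpu_index << EX_IRQ_VCPU_SHIFT) |
           (uint32_t)line;
}
```

Because the ioctl only takes effect once the simulation thread gets to run, it can only be issued after the vcpu has exited to userspace, which is the delay described above.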

> Second, and possibly preferred, you can hook up your simulation event to
> a timer event in the case of trapping on a WFI.  See kvm_handle_wfx() in
> arch/arm64/kvm/handle_exit.c and follow kvm_vcpu_block() from there to
> see how KVM/Arm handles this event.

That sounds like a great plan. I basically want to achieve the following: instead of waiting inside kvm_vcpu_block() for a virtual interrupt, I want to exit to userspace and signal that the virtual cpu is waiting for an interrupt. Then I can react inside my simulation via a uart/timer interrupt and enter the guest again. Thanks for your ideas, they gave me a really good starting point.

There is still one thing I do not understand: in the documentation of the kvm api there is the 'request_interrupt_window' flag in the kvm_run structure. It requests that KVM_RUN return when it becomes possible to inject interrupts.
This looked like what I want to use, but it did not give the expected result. Is this flag disabled on kvm/arm?

Thanks again,
Jan

Patch

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 7f9d2bfcf82e..b38a5a134fef 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -84,7 +84,7 @@ 
  * FMO:		Override CPSR.F and enable signaling with VF
  * SWIO:	Turn set/way invalidates into set/way clean+invalidate
  */
-#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
+#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_VM | \
 			 HCR_TVM | HCR_BSU_IS | HCR_FB | HCR_TAC | \
 			 HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
 			 HCR_FMO | HCR_IMO)
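
The effect of this one-line change can be checked in isolation. A small sketch, assuming the bit positions architected for HCR_EL2 (TWI is bit 13, TWE is bit 14); the EX_ macros and helper names are hypothetical and only mirror the kernel's HCR_TWI and HCR_TWE:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies of two HCR_EL2 trap bits (assumed names). */
#define EX_HCR_TWI (UINT64_C(1) << 13)   /* trap guest WFI to EL2 */
#define EX_HCR_TWE (UINT64_C(1) << 14)   /* trap guest WFE to EL2 */

/* True if a guest WFI would trap to the hypervisor under this HCR value. */
bool traps_wfi(uint64_t hcr)
{
    return (hcr & EX_HCR_TWI) != 0;
}

/* True if a guest WFE would trap to the hypervisor under this HCR value. */
bool traps_wfe(uint64_t hcr)
{
    return (hcr & EX_HCR_TWE) != 0;
}

/* Model of the patch: the same flags with only the WFI trap removed. */
uint64_t without_wfi_trap(uint64_t hcr)
{
    return hcr & ~EX_HCR_TWI;
}
```

With the patched flags WFE still traps while WFI executes natively, so the physical CPU stays in the guest until a physical interrupt arrives, as described in option 1 above.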