Patchwork kvm: arm: Skip stage2 huge mappings for unaligned ipa backed by THP

Submitter Suzuki K Poulose
Date April 8, 2019, 6:40 p.m.
Message ID <730c25b4-dbc5-8d8b-514c-4ed8641701ce@arm.com>
Permalink /patch/767943/
State New

Comments

Suzuki K Poulose - April 8, 2019, 6:40 p.m.
Hi Zenghui,

On 04/08/2019 04:11 PM, Zenghui Yu wrote:
> Hi Suzuki,
> 
> Thanks for the reply.
> 

...

>>> Hi Suzuki,
>>>
>>> Why not make use of fault_supports_stage2_huge_mapping()?  Let it do
>>> some checks for us.
>>>
>>> fault_supports_stage2_huge_mapping() was intended to do a *two-step*
>>> check to tell us whether we can create stage2 huge block mappings; this
>>> check applies to both hugetlbfs and THP.  With commit a80868f398554842b14,
>>> we pass PAGE_SIZE as "map_size" for normal-size pages (which turned out
>>> to be almost meaningless), and unfortunately the THP check no longer
>>> works.
>>
>> That's correct.
>>
>>>
>>> So we want to rework the *THP* check process.  Your patch fixes the first
>>> checking step, but the second is still missing, am I wrong?
>>
>> It fixes that step explicitly for THP by making sure that the GPA and
>> the HVA are aligned to the map size.
> 
> Yes, I understand how your patch had fixed the issue.  But what I'm
> really concerned about here is the *second* checking-step in
> fault_supports_stage2_huge_mapping().
> 
> We have to check whether we are mapping a non-block-aligned or non-block-
> sized memslot; if so, we cannot create block mappings for the beginning
> and end of this memslot.  This is what the second part of
> fault_supports_stage2_huge_mapping() does.
> 
> I haven't seen this checking-step in your patch, did I miss something?
> 

I see.

>> I don't think this calls for a VM_BUG_ON(). It is simply a case where
>> the GPA is not aligned to the HVA, but for a normal VMA that could be made THP.
>>
>> We had this VM_BUG_ON(), which would never have hit because we would
>> have set force_pte if they were not aligned.
> 
> Yes, I agree.
> 
>>>> +        /* Skip memslots with unaligned IPA and user address */
>>>> +        if ((gfn & mask) != (pfn & mask))
>>>> +            return false;
>>>>           if (pfn & mask) {
>>>>               *ipap &= PMD_MASK;
>>>>               kvm_release_pfn_clean(pfn);
>>>>
>>>
>>> ---8>---
>>>
>>> Rework fault_supports_stage2_huge_mapping(), let it check THP again.
>>>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>> ---
>>>   virt/kvm/arm/mmu.c | 11 ++++++++++-
>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 27c9583..5e1b258 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1632,6 +1632,15 @@ static bool 
>>> fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>>       uaddr_end = uaddr_start + size;
>>>
>>>       /*
>>> +     * If the memslot is _not_ backed by hugetlbfs, then check if it
>>> +     * can be backed by transparent hugepages.
>>> +     *
>>> +     * Currently only PMD_SIZE THPs are supported, revisit it later.
>>> +     */
>>> +    if (map_size == PAGE_SIZE)
>>> +        map_size = PMD_SIZE;
>>> +
>>
>> This looks hackish. What if we support PUD_SIZE huge pages in the future?
> 
> Yes, this might make the code a little difficult to understand. But by
> doing so, we follow the same logic as before commit a80868f398554842b14;
> that is, we do the two-step checking for normal-size pages in
> fault_supports_stage2_huge_mapping(), to decide if we can create THP
> mappings for these pages.
> 
> As for PUD_SIZE THPs, to be honest, I have no idea now :(

How about the following diff?
Zenghui Yu - April 9, 2019, 8:05 a.m.
On 2019/4/9 2:40, Suzuki K Poulose wrote:
> ...
> 
> How about the following diff ?
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 97b5417..98e5cec 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1791,7 +1791,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>            * currently supported. This code will need to be
>            * updated to support other THP sizes.
>            */
> -        if (transparent_hugepage_adjust(&pfn, &fault_ipa))
> +        if (fault_supports_stage2_huge_mappings(memslot, hva, PMD_SIZE) &&
> +            transparent_hugepage_adjust(&pfn, &fault_ipa))
>               vma_pagesize = PMD_SIZE;
>       }

I think this is good enough for the issue.

(One minor concern: With this change, it seems that we no longer need
"force_pte" and can just use "logging_active" instead. But this is not
much related to what we're fixing.)


thanks.
Suzuki K Poulose - April 9, 2019, 2:59 p.m.
Hi Zenghui

On 04/09/2019 09:05 AM, Zenghui Yu wrote:
> ...
> 
> (One minor concern: With this change, it seems that we no longer need
> "force_pte" and can just use "logging_active" instead. But this is not
> much related to what we're fixing.)

I would still leave force_pte there to avoid checking for the THP case
in a situation where we are forced to a PTE-level mapping on a
hugepage-backed VMA. It serves to avoid another check.

Cheers
Suzuki


Zenghui Yu - April 10, 2019, 2:20 a.m.
On 2019/4/9 22:59, Suzuki K Poulose wrote:
> ...
> 
> I would still leave the force_pte there to avoid checking for a THP case
> in a situation where we forced to PTE level mapping on a hugepage backed
> VMA. It would serve to avoid another check.

Hi Suzuki,

Yes, I agree, thanks.


zenghui
Suzuki K Poulose - April 10, 2019, 8:39 a.m.
On 10/04/2019 03:20, Zenghui Yu wrote:
> ...
> 
> 
> Hi Suzuki,
> 
> Yes, I agree, thanks.

Cool, I have a patch to fix this properly and two other patches to clean up
and unify the way we handle THP-backed hugepages. Will send them out after
a bit of testing, later today.

Cheers
Suzuki

Patch

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 97b5417..98e5cec 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1791,7 +1791,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  		 * currently supported. This code will need to be
  		 * updated to support other THP sizes.
  		 */
-		if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+		if (fault_supports_stage2_huge_mappings(memslot, hva, PMD_SIZE) &&
+		    transparent_hugepage_adjust(&pfn, &fault_ipa))
  			vma_pagesize = PMD_SIZE;
  	}