Patchwork [3/3] x86/mm: Consolidate all Xen L4 slot writing into init_xen_l4_slots()

login
register
mail settings
Submitter Andrew Cooper
Date Oct. 12, 2017, 1:54 p.m.
Message ID <1507816462-15881-4-git-send-email-andrew.cooper3@citrix.com>
Download mbox | patch
Permalink /patch/360531/
State New
Headers show

Comments

Andrew Cooper - Oct. 12, 2017, 1:54 p.m.
There are currently three functions which write L4 pagetables for Xen, but
they all behave subtly differently.  sh_install_xen_entries_in_l4() in
particular is catering for two different usecases, which makes the safety of
the linear mappings hard to follow.

By consolidating the L4 pagetable writing in a single function, the resulting
setup of Xen's virtual layout is easier to understand.

No practical changes to the resulting L4, although the logic has been
rearranged to avoid rewriting some slots.  This changes the zap_ro_mpt
parameter to simply ro_mpt.

Both {hap,sh}_install_xen_entries_in_l4() get folded into their callers.  The
hap side only a single caller, while the shadow side has two.  The shadow
split helps highlight the correctness of the linear slots.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Julien Grall <julien.grall@arm.com>
---
 xen/arch/x86/mm.c              |  87 +++++++++++++++++++++++++--------
 xen/arch/x86/mm/hap/hap.c      |  31 +++---------
 xen/arch/x86/mm/shadow/multi.c | 106 ++++++++++-------------------------------
 xen/arch/x86/pv/dom0_build.c   |   3 +-
 xen/arch/x86/pv/domain.c       |   2 +-
 xen/include/asm-x86/mm.h       |   4 +-
 6 files changed, 105 insertions(+), 128 deletions(-)
Jan Beulich - Oct. 12, 2017, 3:08 p.m.
>>> On 12.10.17 at 15:54, <andrew.cooper3@citrix.com> wrote:
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -391,41 +391,24 @@ int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
>      return 0;
>  }
>  
> -static void hap_install_xen_entries_in_l4(struct vcpu *v, mfn_t l4mfn)
> -{
> -    struct domain *d = v->domain;
> -    l4_pgentry_t *l4e;
> -
> -    l4e = map_domain_page(l4mfn);
> -
> -    /* Copy the common Xen mappings from the idle domain */
> -    memcpy(&l4e[ROOT_PAGETABLE_FIRST_XEN_SLOT],
> -           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
> -           ROOT_PAGETABLE_XEN_SLOTS * sizeof(l4_pgentry_t));
> -
> -    /* Install the per-domain mappings for this domain */
> -    l4e[l4_table_offset(PERDOMAIN_VIRT_START)] =
> -        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
> -
> -    /* Install a linear mapping */
> -    l4e[l4_table_offset(LINEAR_PT_VIRT_START)] =
> -        l4e_from_mfn(l4mfn, __PAGE_HYPERVISOR_RW);
> -
> -    unmap_domain_page(l4e);
> -}
> -
>  static mfn_t hap_make_monitor_table(struct vcpu *v)
>  {
>      struct domain *d = v->domain;
>      struct page_info *pg;
> +    l4_pgentry_t *l4e;
>      mfn_t m4mfn;
>  
>      ASSERT(pagetable_get_pfn(v->arch.monitor_table) == 0);
>  
>      if ( (pg = hap_alloc(d)) == NULL )
>          goto oom;
> +
>      m4mfn = page_to_mfn(pg);
> -    hap_install_xen_entries_in_l4(v, m4mfn);
> +    l4e = __map_domain_page(pg);

If you obtain the MFN anyway, map_domain_page() is cheaper
generated code wise.

> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -35,7 +35,7 @@ static int setup_compat_l4(struct vcpu *v)
>  
>      l4tab = __map_domain_page(pg);
>      clear_page(l4tab);
> -    init_guest_l4_table(l4tab, v->domain, 1);
> +    init_xen_l4_slots(l4tab, page_to_mfn(pg), v->domain, INVALID_MFN, false);

Perhaps worth avoiding the double translation here too.

In any event
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
Wei Liu - Oct. 12, 2017, 3:13 p.m.
On Thu, Oct 12, 2017 at 02:54:22PM +0100, Andrew Cooper wrote:
> There are currently three functions which write L4 pagetables for Xen, but
> they all behave subtly differently.  sh_install_xen_entries_in_l4() in
> particular is catering for two different usecases, which makes the safety of
> the linear mappings hard to follow.
> 
> By consolidating the L4 pagetable writing in a single function, the resulting
> setup of Xen's virtual layout is easier to understand.
> 
> No practical changes to the resulting L4, although the logic has been
> rearranged to avoid rewriting some slots.  This changes the zap_ro_mpt
> parameter to simply ro_mpt.
> 
> Both {hap,sh}_install_xen_entries_in_l4() get folded into their callers.  The
> hap side only a single caller, while the shadow side has two.  The shadow
> split helps highlight the correctness of the linear slots.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
George Dunlap - Oct. 18, 2017, 11:35 a.m.
On 10/12/2017 02:54 PM, Andrew Cooper wrote:
> There are currently three functions which write L4 pagetables for Xen, but
> they all behave subtly differently.  sh_install_xen_entries_in_l4() in
> particular is catering for two different usecases, which makes the safety of
> the linear mappings hard to follow.
> 
> By consolidating the L4 pagetable writing in a single function, the resulting
> setup of Xen's virtual layout is easier to understand.
> 
> No practical changes to the resulting L4, although the logic has been
> rearranged to avoid rewriting some slots.  This changes the zap_ro_mpt
> parameter to simply ro_mpt.
> 
> Both {hap,sh}_install_xen_entries_in_l4() get folded into their callers.  The
> hap side only a single caller, while the shadow side has two.  The shadow
> split helps highlight the correctness of the linear slots.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: George Dunlap <george.dunlap@citrix.com>

Patch

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ea4af16..5097958 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1521,37 +1521,85 @@  void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
 }
 
 /*
+ * Fill an L4 with Xen entries.
+ *
  * This function must write all ROOT_PAGETABLE_PV_XEN_SLOTS, to clobber any
  * values a guest may have left there from alloc_l4_table().
+ *
+ * l4t and l4mfn are mandatory, but l4mfn doesn't need to be the mfn under
+ * *l4t.  All other parameters are optional and will either fill or zero the
+ * appropriate slots.  Pagetables not shared with guests will gain the
+ * extended directmap.
  */
-void init_guest_l4_table(l4_pgentry_t l4tab[], const struct domain *d,
-                         bool zap_ro_mpt)
+void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
+                       const struct domain *d, mfn_t sl4mfn, bool ro_mpt)
 {
-    /* Xen private mappings. */
-    memcpy(&l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           root_pgt_pv_xen_slots * sizeof(l4_pgentry_t));
+    /*
+     * PV vcpus need a shortened directmap.  HVM and Idle vcpus get the full
+     * directmap.
+     */
+    bool short_directmap = d && !paging_mode_external(d);
+
+    /* Slot 256: RO M2P (if applicable). */
+    l4t[l4_table_offset(RO_MPT_VIRT_START)] =
+        ro_mpt ? idle_pg_table[l4_table_offset(RO_MPT_VIRT_START)]
+               : l4e_empty();
+
+    /* Slot 257: PCI MMCFG. */
+    l4t[l4_table_offset(PCI_MCFG_VIRT_START)] =
+        idle_pg_table[l4_table_offset(PCI_MCFG_VIRT_START)];
+
+    /* Slot 258: Self linear mappings. */
+    ASSERT(!mfn_eq(l4mfn, INVALID_MFN));
+    l4t[l4_table_offset(LINEAR_PT_VIRT_START)] =
+        l4e_from_mfn(l4mfn, __PAGE_HYPERVISOR_RW);
+
+    /* Slot 259: Shadow linear mappings (if applicable) .*/
+    l4t[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
+        mfn_eq(sl4mfn, INVALID_MFN) ? l4e_empty() :
+        l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR_RW);
+
+    /* Slot 260: Per-domain mappings (if applicable). */
+    l4t[l4_table_offset(PERDOMAIN_VIRT_START)] =
+        d ? l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW)
+          : l4e_empty();
+
+    /* Slot 261-: text/data/bss, RW M2P, vmap, frametable, directmap. */
 #ifndef NDEBUG
-    if ( unlikely(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS) )
+    if ( short_directmap &&
+         unlikely(root_pgt_pv_xen_slots < ROOT_PAGETABLE_PV_XEN_SLOTS) )
     {
-        l4_pgentry_t *next = &l4tab[ROOT_PAGETABLE_FIRST_XEN_SLOT +
-                                    root_pgt_pv_xen_slots];
+        /*
+         * If using highmem-start=, artificially shorten the directmap to
+         * simulate very large machines.
+         */
+        l4_pgentry_t *next;
+
+        memcpy(&l4t[l4_table_offset(XEN_VIRT_START)],
+               &idle_pg_table[l4_table_offset(XEN_VIRT_START)],
+               (ROOT_PAGETABLE_FIRST_XEN_SLOT + root_pgt_pv_xen_slots -
+                l4_table_offset(XEN_VIRT_START)) * sizeof(*l4t));
+
+        next = &l4t[ROOT_PAGETABLE_FIRST_XEN_SLOT + root_pgt_pv_xen_slots];
 
         if ( l4e_get_intpte(split_l4e) )
             *next++ = split_l4e;
 
         memset(next, 0,
-               _p(&l4tab[ROOT_PAGETABLE_LAST_XEN_SLOT + 1]) - _p(next));
+               _p(&l4t[ROOT_PAGETABLE_LAST_XEN_SLOT + 1]) - _p(next));
     }
-#else
-    BUILD_BUG_ON(root_pgt_pv_xen_slots != ROOT_PAGETABLE_PV_XEN_SLOTS);
+    else
 #endif
-    l4tab[l4_table_offset(LINEAR_PT_VIRT_START)] =
-        l4e_from_pfn(domain_page_map_to_mfn(l4tab), __PAGE_HYPERVISOR_RW);
-    l4tab[l4_table_offset(PERDOMAIN_VIRT_START)] =
-        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
-    if ( zap_ro_mpt || is_pv_32bit_domain(d) )
-        l4tab[l4_table_offset(RO_MPT_VIRT_START)] = l4e_empty();
+    {
+        unsigned int slots = (short_directmap
+                              ? ROOT_PAGETABLE_PV_XEN_SLOTS
+                              : ROOT_PAGETABLE_XEN_SLOTS);
+
+        memcpy(&l4t[l4_table_offset(XEN_VIRT_START)],
+               &idle_pg_table[l4_table_offset(XEN_VIRT_START)],
+               (ROOT_PAGETABLE_FIRST_XEN_SLOT + slots -
+                l4_table_offset(XEN_VIRT_START)) * sizeof(*l4t));
+    }
 }
 
 bool fill_ro_mpt(mfn_t mfn)
@@ -1629,7 +1677,8 @@  static int alloc_l4_table(struct page_info *page)
 
     if ( rc >= 0 )
     {
-        init_guest_l4_table(pl4e, d, !VM_ASSIST(d, m2p_strict));
+        init_xen_l4_slots(pl4e, _mfn(pfn),
+                          d, INVALID_MFN, VM_ASSIST(d, m2p_strict));
         atomic_inc(&d->arch.pv_domain.nr_l4_pages);
         rc = 0;
     }
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index dc85e82..1e7e8d0 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -391,41 +391,24 @@  int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted)
     return 0;
 }
 
-static void hap_install_xen_entries_in_l4(struct vcpu *v, mfn_t l4mfn)
-{
-    struct domain *d = v->domain;
-    l4_pgentry_t *l4e;
-
-    l4e = map_domain_page(l4mfn);
-
-    /* Copy the common Xen mappings from the idle domain */
-    memcpy(&l4e[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           ROOT_PAGETABLE_XEN_SLOTS * sizeof(l4_pgentry_t));
-
-    /* Install the per-domain mappings for this domain */
-    l4e[l4_table_offset(PERDOMAIN_VIRT_START)] =
-        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
-
-    /* Install a linear mapping */
-    l4e[l4_table_offset(LINEAR_PT_VIRT_START)] =
-        l4e_from_mfn(l4mfn, __PAGE_HYPERVISOR_RW);
-
-    unmap_domain_page(l4e);
-}
-
 static mfn_t hap_make_monitor_table(struct vcpu *v)
 {
     struct domain *d = v->domain;
     struct page_info *pg;
+    l4_pgentry_t *l4e;
     mfn_t m4mfn;
 
     ASSERT(pagetable_get_pfn(v->arch.monitor_table) == 0);
 
     if ( (pg = hap_alloc(d)) == NULL )
         goto oom;
+
     m4mfn = page_to_mfn(pg);
-    hap_install_xen_entries_in_l4(v, m4mfn);
+    l4e = __map_domain_page(pg);
+
+    init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
+    unmap_domain_page(l4e);
+
     return m4mfn;
 
  oom:
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 1b76e0c..b139d9f 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1448,80 +1448,6 @@  do {                                                                    \
 #endif
 
 
-
-/**************************************************************************/
-/* Functions to install Xen mappings and linear mappings in shadow pages */
-
-// XXX -- this function should probably be moved to shadow-common.c, but that
-//        probably wants to wait until the shadow types have been moved from
-//        shadow-types.h to shadow-private.h
-//
-#if GUEST_PAGING_LEVELS == 4
-void sh_install_xen_entries_in_l4(struct domain *d, mfn_t gl4mfn, mfn_t sl4mfn)
-{
-    shadow_l4e_t *sl4e;
-    unsigned int slots;
-
-    sl4e = map_domain_page(sl4mfn);
-    BUILD_BUG_ON(sizeof (l4_pgentry_t) != sizeof (shadow_l4e_t));
-
-    /* Copy the common Xen mappings from the idle domain */
-    slots = (shadow_mode_external(d)
-             ? ROOT_PAGETABLE_XEN_SLOTS
-             : ROOT_PAGETABLE_PV_XEN_SLOTS);
-    memcpy(&sl4e[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           &idle_pg_table[ROOT_PAGETABLE_FIRST_XEN_SLOT],
-           slots * sizeof(l4_pgentry_t));
-
-    /* Install the per-domain mappings for this domain */
-    sl4e[shadow_l4_table_offset(PERDOMAIN_VIRT_START)] =
-        shadow_l4e_from_mfn(page_to_mfn(d->arch.perdomain_l3_pg),
-                            __PAGE_HYPERVISOR_RW);
-
-    if ( !shadow_mode_external(d) && !is_pv_32bit_domain(d) &&
-         !VM_ASSIST(d, m2p_strict) )
-    {
-        /* open coded zap_ro_mpt(mfn_x(sl4mfn)): */
-        sl4e[shadow_l4_table_offset(RO_MPT_VIRT_START)] = shadow_l4e_empty();
-    }
-
-    /*
-     * Linear mapping slots:
-     *
-     * Calling this function with gl4mfn == sl4mfn is used to construct a
-     * monitor table for translated domains.  In this case, gl4mfn forms the
-     * self-linear mapping (i.e. not pointing into the translated domain), and
-     * the shadow-linear slot is skipped.  The shadow-linear slot is either
-     * filled when constructing lower level monitor tables, or via
-     * sh_update_cr3() for 4-level guests.
-     *
-     * Calling this function with gl4mfn != sl4mfn is used for non-translated
-     * guests, where the shadow-linear slot is actually self-linear, and the
-     * guest-linear slot points into the guests view of its pagetables.
-     */
-    if ( shadow_mode_translate(d) )
-    {
-        ASSERT(mfn_eq(gl4mfn, sl4mfn));
-
-        sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-            shadow_l4e_empty();
-    }
-    else
-    {
-        ASSERT(!mfn_eq(gl4mfn, sl4mfn));
-
-        sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-            shadow_l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR_RW);
-    }
-
-    sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
-        shadow_l4e_from_mfn(gl4mfn, __PAGE_HYPERVISOR_RW);
-
-    unmap_domain_page(sl4e);
-}
-#endif
-
-
 /**************************************************************************/
 /* Create a shadow of a given guest page.
  */
@@ -1580,8 +1506,16 @@  sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 shadow_type)
         {
 #if GUEST_PAGING_LEVELS == 4
         case SH_type_l4_shadow:
-            sh_install_xen_entries_in_l4(v->domain, gmfn, smfn);
-            break;
+        {
+            shadow_l4e_t *l4t = map_domain_page(smfn);
+
+            BUILD_BUG_ON(sizeof(l4_pgentry_t) != sizeof(shadow_l4e_t));
+
+            init_xen_l4_slots(l4t, gmfn, d, smfn, (!is_pv_32bit_domain(d) &&
+                                                   VM_ASSIST(d, m2p_strict)));
+            unmap_domain_page(l4t);
+        }
+        break;
 #endif
 #if GUEST_PAGING_LEVELS >= 3
         case SH_type_l2h_shadow:
@@ -1632,19 +1566,27 @@  sh_make_monitor_table(struct vcpu *v)
 
     {
         mfn_t m4mfn;
+        l4_pgentry_t *l4e;
+
         m4mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-        sh_install_xen_entries_in_l4(d, m4mfn, m4mfn);
-        /* Remember the level of this table */
         mfn_to_page(m4mfn)->shadow_flags = 4;
+
+        l4e = map_domain_page(m4mfn);
+
+        /*
+         * Create a self-linear mapping, but no shadow-linear mapping.  A
+         * shadow-linear mapping will either be inserted below when creating
+         * lower level monitor tables, or later in sh_update_cr3().
+         */
+        init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
+
 #if SHADOW_PAGING_LEVELS < 4
         {
             mfn_t m3mfn, m2mfn;
-            l4_pgentry_t *l4e;
             l3_pgentry_t *l3e;
             /* Install an l3 table and an l2 table that will hold the shadow
              * linear map entries.  This overrides the linear map entry that
              * was installed by sh_install_xen_entries_in_l4. */
-            l4e = map_domain_page(m4mfn);
 
             m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
             mfn_to_page(m3mfn)->shadow_flags = 3;
@@ -1679,9 +1621,11 @@  sh_make_monitor_table(struct vcpu *v)
                 unmap_domain_page(l3e);
             }
 
-            unmap_domain_page(l4e);
         }
 #endif /* SHADOW_PAGING_LEVELS < 4 */
+
+        unmap_domain_page(l4e);
+
         return m4mfn;
     }
 }
diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index ec7f96d..ecce175 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -588,7 +588,8 @@  int __init dom0_construct_pv(struct domain *d,
         l3start = __va(mpt_alloc); mpt_alloc += PAGE_SIZE;
     }
     clear_page(l4tab);
-    init_guest_l4_table(l4tab, d, 0);
+    init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)),
+                      d, INVALID_MFN, true);
     v->arch.guest_table = pagetable_from_paddr(__pa(l4start));
     if ( is_pv_32bit_domain(d) )
         v->arch.guest_table_user = v->arch.guest_table;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index c8b9cb6..a242037 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -35,7 +35,7 @@  static int setup_compat_l4(struct vcpu *v)
 
     l4tab = __map_domain_page(pg);
     clear_page(l4tab);
-    init_guest_l4_table(l4tab, v->domain, 1);
+    init_xen_l4_slots(l4tab, page_to_mfn(pg), v->domain, INVALID_MFN, false);
     unmap_domain_page(l4tab);
 
     /* This page needs to look like a pagetable so that it can be shadowed */
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index da3c5e2..8362608 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -341,8 +341,8 @@  int free_page_type(struct page_info *page, unsigned long type,
                    int preemptible);
 
 void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d);
-void init_guest_l4_table(l4_pgentry_t[], const struct domain *,
-                         bool_t zap_ro_mpt);
+void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
+                       const struct domain *d, mfn_t sl4mfn, bool ro_mpt);
 bool fill_ro_mpt(mfn_t mfn);
 void zap_ro_mpt(mfn_t mfn);