Patchwork [for-4.10] tools/libxl: mark hvm mmio area as reserved in e820 map

Submitter Juergen Gross
Date Nov. 17, 2017, 11:47 a.m.
Message ID <20171117114733.21486-1-jgross@suse.com>
Permalink /patch/385669/
State New

Comments

Juergen Gross - Nov. 17, 2017, 11:47 a.m.
Make sure the HVM mmio area (especially console and Xenstore pages) is
marked as "reserved" in the guest's E820 map, as otherwise conflicts
might arise later, e.g. when hotplugging memory into the guest.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
This is a bugfix for PVH and HVM guests. Please consider for 4.10.
---
 tools/libxl/libxl_x86.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
Wei Liu - Nov. 17, 2017, 11:49 a.m.
On Fri, Nov 17, 2017 at 12:47:33PM +0100, Juergen Gross wrote:
> Make sure the HVM mmio area (especially console and Xenstore pages) is
> marked as "reserved" in the guest's E820 map, as otherwise conflicts
> might arise later, e.g. when hotplugging memory into the guest.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
> This is a bugfix for PVH and HVM guests. Please consider for 4.10.

I agree this is 4.10 material.
Jan Beulich - Nov. 17, 2017, 12:26 p.m.
>>> On 17.11.17 at 12:47, <jgross@suse.com> wrote:
> Make sure the HVM mmio area (especially console and Xenstore pages) is
> marked as "reserved" in the guest's E820 map, as otherwise conflicts
> might arise later, e.g. when hotplugging memory into the guest.

This is very certainly wrong. Have you looked at a couple of physical
machines? Have you found an E820_RESERVED area on any of them for
the MMIO hole? Two examples I can present right away:

<6>BIOS-e820: [mem 0x00000000c93f0000-0x00000000c9f8cfff] reserved
<6>BIOS-e820: [mem 0x00000000c9f8d000-0x00000000c9fdefff] ACPI data
<6>BIOS-e820: [mem 0x00000000c9fdf000-0x00000000cac82fff] ACPI NVS
<6>BIOS-e820: [mem 0x00000000cac83000-0x00000000cb172fff] reserved
<6>BIOS-e820: [mem 0x00000000cb173000-0x00000000cb173fff] usable
<6>BIOS-e820: [mem 0x00000000cb174000-0x00000000cb181fff] reserved
<6>BIOS-e820: [mem 0x00000000cb182000-0x00000000ccffffff] usable
<6>BIOS-e820: [mem 0x00000000cd000000-0x00000000cdffffff] reserved
<6>BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
<6>BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
<6>BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved

and

(XEN)  00000000cf4bd000 - 00000000cf4bf000 (reserved)
(XEN)  00000000cf4bf000 - 00000000cf636000 (usable)
(XEN)  00000000cf636000 - 00000000cf7bf000 (ACPI NVS)
(XEN)  00000000cf7bf000 - 00000000cf7df000 (usable)
(XEN)  00000000cf7df000 - 00000000cf7ff000 (ACPI data)
(XEN)  00000000cf7ff000 - 00000000cf800000 (usable)
(XEN)  00000000cf800000 - 00000000d0000000 (reserved)
(XEN)  00000000f8000000 - 00000000fd000000 (reserved)
(XEN)  00000000ffe00000 - 0000000100000000 (reserved)

Things covered by E820_RESERVED include the MCFG area, yes, but
not most other parts. The OS has to either be careful or consult
ACPI for further resource usage details. In particular, the ACPI spec
says

"The platform boot firmware does not return a range description for
 the memory mapping of PCI devices, ISA Option ROMs, and ISA Plug
 and Play cards because the OS has mechanisms available to detect
 them."

See the section "E820 Assumptions and Limitations" for further details.

Jan
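
Editorial illustration (not part of the thread): a minimal sketch of how an OS-side lookup would see the second (XEN) map quoted above. The struct layout, the function name e820_type_at and the E820_NONE value are assumptions made for this example; only the address ranges and the standard E820 type numbers come from the quoted map. The point it shows is that an address such as 0xe0000000, which lies in the gap between the entries ending at 0xd0000000 and starting at 0xf8000000, is described by no entry at all rather than by a reserved one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard E820 type numbers; E820_NONE is an assumption of this sketch
 * and means "no entry describes this address". */
#define E820_NONE     0
#define E820_RAM      1
#define E820_RESERVED 2
#define E820_ACPI     3
#define E820_NVS      4

struct e820entry {
    uint64_t addr;  /* start of the range */
    uint64_t size;  /* length in bytes */
    uint32_t type;
};

/* Build an entry from a start and an exclusive end address. */
#define RANGE(start, end, type) { (start), (end) - (start), (type) }

/* The (XEN) map from the message above. */
const struct e820entry example_map[] = {
    RANGE(0xcf4bd000ULL, 0xcf4bf000ULL,  E820_RESERVED),
    RANGE(0xcf4bf000ULL, 0xcf636000ULL,  E820_RAM),
    RANGE(0xcf636000ULL, 0xcf7bf000ULL,  E820_NVS),
    RANGE(0xcf7bf000ULL, 0xcf7df000ULL,  E820_RAM),
    RANGE(0xcf7df000ULL, 0xcf7ff000ULL,  E820_ACPI),
    RANGE(0xcf7ff000ULL, 0xcf800000ULL,  E820_RAM),
    RANGE(0xcf800000ULL, 0xd0000000ULL,  E820_RESERVED),
    RANGE(0xf8000000ULL, 0xfd000000ULL,  E820_RESERVED),
    RANGE(0xffe00000ULL, 0x100000000ULL, E820_RESERVED),
};
#define EXAMPLE_MAP_LEN (sizeof(example_map) / sizeof(example_map[0]))

/* Return the type of the entry containing addr, or E820_NONE if the
 * map does not describe that address. */
uint32_t e820_type_at(const struct e820entry *map, size_t n, uint64_t addr)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr >= map[i].addr && addr - map[i].addr < map[i].size)
            return map[i].type;
    }
    return E820_NONE;
}
```

Calling e820_type_at(example_map, EXAMPLE_MAP_LEN, 0xe0000000ULL) yields E820_NONE, while 0xf8000000ULL yields E820_RESERVED.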
Juergen Gross - Nov. 17, 2017, 1:27 p.m.
On 17/11/17 13:26, Jan Beulich wrote:
>>>> On 17.11.17 at 12:47, <jgross@suse.com> wrote:
>> Make sure the HVM mmio area (especially console and Xenstore pages) is
>> marked as "reserved" in the guest's E820 map, as otherwise conflicts
>> might arise later, e.g. when hotplugging memory into the guest.
> 
> This is very certainly wrong. Have you looked at a couple of physical
> machines? Have you found an E820_RESERVED area on any of them for
> the MMIO hole? Two examples I can present right away:
> 
> <6>BIOS-e820: [mem 0x00000000c93f0000-0x00000000c9f8cfff] reserved
> <6>BIOS-e820: [mem 0x00000000c9f8d000-0x00000000c9fdefff] ACPI data
> <6>BIOS-e820: [mem 0x00000000c9fdf000-0x00000000cac82fff] ACPI NVS
> <6>BIOS-e820: [mem 0x00000000cac83000-0x00000000cb172fff] reserved
> <6>BIOS-e820: [mem 0x00000000cb173000-0x00000000cb173fff] usable
> <6>BIOS-e820: [mem 0x00000000cb174000-0x00000000cb181fff] reserved
> <6>BIOS-e820: [mem 0x00000000cb182000-0x00000000ccffffff] usable
> <6>BIOS-e820: [mem 0x00000000cd000000-0x00000000cdffffff] reserved
> <6>BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
> <6>BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> <6>BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> 
> and
> 
> (XEN)  00000000cf4bd000 - 00000000cf4bf000 (reserved)
> (XEN)  00000000cf4bf000 - 00000000cf636000 (usable)
> (XEN)  00000000cf636000 - 00000000cf7bf000 (ACPI NVS)
> (XEN)  00000000cf7bf000 - 00000000cf7df000 (usable)
> (XEN)  00000000cf7df000 - 00000000cf7ff000 (ACPI data)
> (XEN)  00000000cf7ff000 - 00000000cf800000 (usable)
> (XEN)  00000000cf800000 - 00000000d0000000 (reserved)
> (XEN)  00000000f8000000 - 00000000fd000000 (reserved)
> (XEN)  00000000ffe00000 - 0000000100000000 (reserved)
> 
> Things covered by E820_RESERVED include the MCFG area, yes, but
> not most other parts. The OS has to either be careful or consult
> ACPI for further resource usage details. In particular, the ACPI spec
> says
> 
> "The platform boot firmware does not return a range description for
>  the memory mapping of PCI devices, ISA Option ROMs, and ISA Plug
>  and Play cards because the OS has mechanisms available to detect
>  them."
> 
> See the section "E820 Assumptions and Limitations" for further details.

So is it _wrong_ to return the mmio area as reserved? We at least want
the shared console and Xenstore page to be marked as reserved, and those
are part of the mmio area.

We could, of course, just report the HVM special pages as reserved, but
this would IMO be more hacky than reporting just the mmio area.

Oh yes, and the LAPIC, of course. Again part of mmio area.


Juergen
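
Editorial illustration (not part of the thread): a minimal sketch of the kind of conflict check the patch description has in mind, where a candidate range for hotplugged memory is compared against the reserved entries of the guest's E820 map. The function name range_hits_reserved, the struct layout and every address in hypothetical_guest_map are assumptions chosen for illustration only; they are not taken from libxl or from any real guest layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM      1
#define E820_RESERVED 2

struct e820entry {
    uint64_t addr;  /* start of the range */
    uint64_t size;  /* length in bytes */
    uint32_t type;
};

/* Hypothetical guest map: low RAM plus one small reserved block standing
 * in for the special pages. The addresses are illustrative only. */
const struct e820entry hypothetical_guest_map[] = {
    { 0x00000000ULL, 0x000a0000ULL, E820_RAM },
    { 0xfeff8000ULL, 0x00008000ULL, E820_RESERVED },
};
#define GUEST_MAP_LEN \
    (sizeof(hypothetical_guest_map) / sizeof(hypothetical_guest_map[0]))

/* True if [start, start + len) intersects any E820_RESERVED entry.
 * Assumes no range wraps around the end of the 64-bit address space. */
bool range_hits_reserved(const struct e820entry *map, size_t n,
                         uint64_t start, uint64_t len)
{
    assert(map != NULL || n == 0);
    if (len == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != E820_RESERVED || map[i].size == 0)
            continue;
        /* Half-open ranges overlap iff each starts before the other ends. */
        if (start < map[i].addr + map[i].size && map[i].addr < start + len)
            return true;
    }
    return false;
}
```

With this map, a candidate range starting at 0xfe000000 with length 0x02000000 is rejected because it covers the reserved block, whereas a range starting at 0x100000000 is accepted. If the block were not marked reserved, the same check would have nothing to compare against, which is the situation the patch description is concerned about.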
Juergen Gross - Nov. 17, 2017, 4:50 p.m.
On 17/11/17 12:47, Juergen Gross wrote:
> Make sure the HVM mmio area (especially console and Xenstore pages) is
> marked as "reserved" in the guest's E820 map, as otherwise conflicts
> might arise later, e.g. when hotplugging memory into the guest.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> This is a bugfix for PVH and HVM guests. Please consider for 4.10.

Please ignore this patch, it upsets HVMloader.


Juergen

Patch

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 5f91fe4f92..664bf8bd64 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -530,6 +530,9 @@  int libxl__arch_domain_construct_memmap(libxl__gc *gc,
         if (d_config->rdms[i].policy != LIBXL_RDM_RESERVE_POLICY_INVALID)
             e820_entries++;
 
+    /* Add mmio entry. */
+    if (dom->mmio_size)
+        e820_entries++;
 
     /* If we should have a highmem range. */
     if (highmem_size)
@@ -564,6 +567,14 @@  int libxl__arch_domain_construct_memmap(libxl__gc *gc,
         nr++;
     }
 
+    /* mmio area */
+    if (dom->mmio_size) {
+        e820[nr].addr = dom->mmio_start;
+        e820[nr].size = dom->mmio_size;
+        e820[nr].type = E820_RESERVED;
+        nr++;
+    }
+
     for (i = 0; i < MAX_ACPI_MODULES; i++) {
         if (dom->acpi_modules[i].length) {
             e820[nr].addr = dom->acpi_modules[i].guest_addr_out & ~(page_size - 1);