Patchwork [RFC,v2] x86/emul: Fix the handling of unimplemented Grp7 instructions

Submitter Andrew Cooper
Date Sept. 4, 2017, 5:34 p.m.
Message ID <1504546461-6809-1-git-send-email-andrew.cooper3@citrix.com>
Permalink /patch/331653/
State New

Comments

Andrew Cooper - Sept. 4, 2017, 5:34 p.m.
Grp7 is abnormally complicated to decode, even by x86's standards, with
{s,l}msw being the problematic cases.

Previously, any value which fell through the first switch statement (looking
for instructions with entirely implicit operands) would be interpreted by the
second switch statement (handling instructions with memory operands).

Unimplemented instructions would then hit the #UD case for having a non-memory
operand, rather than taking the cannot_emulate path.

Place a big if/else around the two switch statements (accounting for {s,l}msw
which need handling in the else clause), so both switch statements can have a
default goto cannot_emulate path.

This fixes the emulation of xend, which would hit the #UD path when it should
complete with no side effects.

Reported-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Petre Pircalabu <ppircalabu@bitdefender.com>

v2:
 * Use break rather than goto complete_insn for implicit instructions.
 * Note that we actually fix the behaviour of xend.

RFC as I've only done light testing so far.
---
 xen/arch/x86/x86_emulate/x86_emulate.c | 356 +++++++++++++++++----------------
 1 file changed, 188 insertions(+), 168 deletions(-)
Jan Beulich - Sept. 5, 2017, 6:57 a.m.
>>> Andrew Cooper <andrew.cooper3@citrix.com> 09/04/17 7:35 PM >>>
>Grp7 is abnormally complicated to decode, even by x86's standards, with
>{s,l}msw being the problematic cases.
>
>Previously, any value which fell through the first switch statement (looking
>for instructions with entirely implicit operands) would be interpreted by the
>second switch statement (handling instructions with memory operands).
>
>Unimplemented instructions would then hit the #UD case for having a non-memory
>operand, rather than taking the cannot_emulate path.
>
>Place a big if/else around the two switch statements (accounting for {s,l}msw
>which need handling in the else clause), so both switch statments can have a
>default goto cannot_emulate path.
>
>This fixes the emulation of xend, which would hit the #UD path when it should
>complete with no side effects.

This could be had with a single line change. And while I can see this mistake
of mine alone as justification for the restructuring, it's still rather a big change
due to all the re-indentation. Did you instead consider simply combining the
two switch() statements (retaining present indentation), by using range case
labels for the opcodes permitting operands? That would have the added benefit
of no longer producing #UD for things like VMCALL, but instead having those
go to cannot_emulate too.

>+        if ( (modrm & 0xc0) == 0xc0 &&
>+             (modrm_reg & 7) != 4 /* smsw */ &&
>+             (modrm_reg & 7) != 6 /* lmsw */ )

(modrm & 5) == 4 would be the more compact variant; I'm not sure if all
compilers we support would be able to fold this.

Jan
Andrew Cooper - Sept. 5, 2017, 7:34 a.m.
On 05/09/2017 07:57, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 09/04/17 7:35 PM >>>
>> Grp7 is abnormally complicated to decode, even by x86's standards, with
>> {s,l}msw being the problematic cases.
>>
>> Previously, any value which fell through the first switch statement (looking
>> for instructions with entirely implicit operands) would be interpreted by the
>> second switch statement (handling instructions with memory operands).
>>
>> Unimplemented instructions would then hit the #UD case for having a non-memory
>> operand, rather than taking the cannot_emulate path.
>>
>> Place a big if/else around the two switch statements (accounting for {s,l}msw
>> which need handling in the else clause), so both switch statments can have a
>> default goto cannot_emulate path.
>>
>> This fixes the emulation of xend, which would hit the #UD path when it should
>> complete with no side effects.
> This could be had with a single line change. And while I can see this mistake
> of mine alone to be justification for the restructuring, it's still rather big a change
> due to all the re-indentation. Did you instead consider simply combining the
> two switch() statements (retaining present indentation), by using range case
> labels for the opcodes permitting operands?

That was my first idea, but the cases are not adjacent.  You need 3
ranges for the mod != 11 instructions, and 4 for {s,l}msw, and there was
no clean way I could find to express that.

>  That would have the added benefit
> of no longer producing #UD for things like VMCALL, but instead having those
> go to cannot_emulate too.

This is the behaviour the patch is intended to introduce.  What's broken
with the logic?

~Andrew
Jan Beulich - Sept. 5, 2017, 9:43 a.m.
>>> On 05.09.17 at 09:34, <andrew.cooper3@citrix.com> wrote:
> On 05/09/2017 07:57, Jan Beulich wrote:
>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 09/04/17 7:35 PM >>>
>>> Grp7 is abnormally complicated to decode, even by x86's standards, with
>>> {s,l}msw being the problematic cases.
>>>
>>> Previously, any value which fell through the first switch statement (looking
>>> for instructions with entirely implicit operands) would be interpreted by 
> the
>>> second switch statement (handling instructions with memory operands).
>>>
>>> Unimplemented instructions would then hit the #UD case for having a 
> non-memory
>>> operand, rather than taking the cannot_emulate path.
>>>
>>> Place a big if/else around the two switch statements (accounting for 
> {s,l}msw
>>> which need handling in the else clause), so both switch statments can have a
>>> default goto cannot_emulate path.
>>>
>>> This fixes the emulation of xend, which would hit the #UD path when it 
> should
>>> complete with no side effects.
>> This could be had with a single line change. And while I can see this 
> mistake
>> of mine alone to be justification for the restructuring, it's still rather 
> big a change
>> due to all the re-indentation. Did you instead consider simply combining the
>> two switch() statements (retaining present indentation), by using range case
>> labels for the opcodes permitting operands?
> 
> That was my first idea, but the cases are not adjacent.  You need 3
> ranges for the mod != 11 instructions, and 4 for {s,l}msw, and there was
> no clean way I could find to express that.

I see you've found one (which is largely what I was going to suggest).

>>  That would have the added benefit
>> of no longer producing #UD for things like VMCALL, but instead having those
>> go to cannot_emulate too.
> 
> This is the behaviour the patch is intended to introduce.  What's broken
> with the logic?

I guess you've realized meanwhile that it was the

                generate_exception_if(ea.type != OP_MEM, EXC_UD);

that were left in place, which were causing the sub-optimal
behavior. Speaking of which - do we want to go farther and
convert further similar #UD raising into cannot_emulate (or
with Petre's unimplemented_insn) goto-s?

Jan
Andrew Cooper - Sept. 5, 2017, 9:53 a.m.
On 05/09/2017 10:43, Jan Beulich wrote:
>>>> On 05.09.17 at 09:34, <andrew.cooper3@citrix.com> wrote:
>> On 05/09/2017 07:57, Jan Beulich wrote:
>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 09/04/17 7:35 PM >>>
>>>> Grp7 is abnormally complicated to decode, even by x86's standards, with
>>>> {s,l}msw being the problematic cases.
>>>>
>>>> Previously, any value which fell through the first switch statement (looking
>>>> for instructions with entirely implicit operands) would be interpreted by 
>> the
>>>> second switch statement (handling instructions with memory operands).
>>>>
>>>> Unimplemented instructions would then hit the #UD case for having a 
>> non-memory
>>>> operand, rather than taking the cannot_emulate path.
>>>>
>>>> Place a big if/else around the two switch statements (accounting for 
>> {s,l}msw
>>>> which need handling in the else clause), so both switch statments can have a
>>>> default goto cannot_emulate path.
>>>>
>>>> This fixes the emulation of xend, which would hit the #UD path when it 
>> should
>>>> complete with no side effects.
>>> This could be had with a single line change. And while I can see this 
>> mistake
>>> of mine alone to be justification for the restructuring, it's still rather 
>> big a change
>>> due to all the re-indentation. Did you instead consider simply combining the
>>> two switch() statements (retaining present indentation), by using range case
>>> labels for the opcodes permitting operands?
>> That was my first idea, but the cases are not adjacent.  You need 3
>> ranges for the mod != 11 instructions, and 4 for {s,l}msw, and there was
>> no clean way I could find to express that.
> I see you've found one (which is largely what I was going to suggest).
>
>>>  That would have the added benefit
>>> of no longer producing #UD for things like VMCALL, but instead having those
>>> go to cannot_emulate too.
>> This is the behaviour the patch is intended to introduce.  What's broken
>> with the logic?
> I guess you've realized meanwhile that it was the
>
>                 generate_exception_if(ea.type != OP_MEM, EXC_UD);
>
> that were left in place, which were causing the sub-optimal
> behavior.

VMCALL is encoded with mod == 11, so now doesn't fall into the sgdt case.

> Speaking of which - do we want to go farther and
> convert further similar #UD raising into cannot_emulate (or
> with Petre's unimplemented_insn) goto-s?

In this specific case, I think the generate_exception_if(ea.type !=
OP_MEM, EXC_UD); can be converted to asserts, because the only way to
violate that condition is with an earlier error calculating ea or modrm.

In the general case, I think we should prefer the cannot_emulate path. 
Are there any uses of this label which aren't due to having no
implementation?  If so, we probably want to introduce a new label so the
two cases can be distinguished.

~Andrew
Jan Beulich - Sept. 5, 2017, 10:07 a.m.
>>> On 05.09.17 at 11:53, <andrew.cooper3@citrix.com> wrote:
> On 05/09/2017 10:43, Jan Beulich wrote:
>>>>> On 05.09.17 at 09:34, <andrew.cooper3@citrix.com> wrote:
>>> On 05/09/2017 07:57, Jan Beulich wrote:
>>>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 09/04/17 7:35 PM >>>
>>>>> Grp7 is abnormally complicated to decode, even by x86's standards, with
>>>>> {s,l}msw being the problematic cases.
>>>>>
>>>>> Previously, any value which fell through the first switch statement (looking
>>>>> for instructions with entirely implicit operands) would be interpreted by 
>>> the
>>>>> second switch statement (handling instructions with memory operands).
>>>>>
>>>>> Unimplemented instructions would then hit the #UD case for having a 
>>> non-memory
>>>>> operand, rather than taking the cannot_emulate path.
>>>>>
>>>>> Place a big if/else around the two switch statements (accounting for 
>>> {s,l}msw
>>>>> which need handling in the else clause), so both switch statments can have a
>>>>> default goto cannot_emulate path.
>>>>>
>>>>> This fixes the emulation of xend, which would hit the #UD path when it 
>>> should
>>>>> complete with no side effects.
>>>> This could be had with a single line change. And while I can see this 
>>> mistake
>>>> of mine alone to be justification for the restructuring, it's still rather 
>>> big a change
>>>> due to all the re-indentation. Did you instead consider simply combining the
>>>> two switch() statements (retaining present indentation), by using range case
>>>> labels for the opcodes permitting operands?
>>> That was my first idea, but the cases are not adjacent.  You need 3
>>> ranges for the mod != 11 instructions, and 4 for {s,l}msw, and there was
>>> no clean way I could find to express that.
>> I see you've found one (which is largely what I was going to suggest).
>>
>>>>  That would have the added benefit
>>>> of no longer producing #UD for things like VMCALL, but instead having those
>>>> go to cannot_emulate too.
>>> This is the behaviour the patch is intended to introduce.  What's broken
>>> with the logic?
>> I guess you've realized meanwhile that it was the
>>
>>                 generate_exception_if(ea.type != OP_MEM, EXC_UD);
>>
>> that were left in place, which were causing the sub-optimal
>> behavior.
> 
> VMCALL is encoded with mod == 11, so now doesn't fall into the sgdt case.

Oh, right.

>> Speaking of which - do we want to go farther and
>> convert further similar #UD raising into cannot_emulate (or
>> with Petre's unimplemented_insn) goto-s?
> 
> In this specific case, I think the generate_exception_if(ea.type !=
> OP_MEM, EXC_UD); can be converted to asserts, because the only way to
> violate that condition is with an earlier error calculating ea or modrm.

Yes, I was about to ask you to do that (in reply to v3).

> In the general case, I think we should prefer the cannot_emulate path. 
> Are there any uses of this label which aren't due to having no
> implementation?  If so, we probably want to introduce a new label so the
> two cases can be distinguished.

That's what Petre's patch does. But my question was about
questionable uses of generate_exception_if(..., EXC_UD).

Jan

Patch

diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
index 2201852..27c7ead 100644
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -4987,197 +4987,217 @@  x86_emulate(
         }
         break;
 
-    case X86EMUL_OPC(0x0f, 0x01): /* Grp7 */ {
-        unsigned long base, limit, cr0, cr0w;
+    case X86EMUL_OPC(0x0f, 0x01): /* Grp7 */
+    {
+        unsigned long base, limit;
 
-        switch( modrm )
+        if ( (modrm & 0xc0) == 0xc0 &&
+             (modrm_reg & 7) != 4 /* smsw */ &&
+             (modrm_reg & 7) != 6 /* lmsw */ )
         {
-        case 0xca: /* clac */
-        case 0xcb: /* stac */
-            vcpu_must_have(smap);
-            generate_exception_if(vex.pfx || !mode_ring0(), EXC_UD);
-
-            _regs.eflags &= ~X86_EFLAGS_AC;
-            if ( modrm == 0xcb )
-                _regs.eflags |= X86_EFLAGS_AC;
-            goto complete_insn;
+            switch ( modrm )
+            {
+            case 0xca: /* clac */
+            case 0xcb: /* stac */
+                vcpu_must_have(smap);
+                generate_exception_if(vex.pfx || !mode_ring0(), EXC_UD);
+
+                _regs.eflags &= ~X86_EFLAGS_AC;
+                if ( modrm == 0xcb )
+                    _regs.eflags |= X86_EFLAGS_AC;
+                break;
 
 #ifdef __XEN__
-        case 0xd1: /* xsetbv */
-            generate_exception_if(vex.pfx, EXC_UD);
-            if ( !ops->read_cr || ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
-                cr4 = 0;
-            generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD);
-            generate_exception_if(!mode_ring0() ||
-                                  handle_xsetbv(_regs.ecx,
-                                                _regs.eax | (_regs.rdx << 32)),
-                                  EXC_GP, 0);
-            goto complete_insn;
+            case 0xd1: /* xsetbv */
+                generate_exception_if(vex.pfx, EXC_UD);
+                if ( !ops->read_cr ||
+                     ops->read_cr(4, &cr4, ctxt) != X86EMUL_OKAY )
+                    cr4 = 0;
+                generate_exception_if(!(cr4 & X86_CR4_OSXSAVE), EXC_UD);
+                generate_exception_if(!mode_ring0() ||
+                                      handle_xsetbv(_regs.ecx, _regs.eax |
+                                                    (_regs.rdx << 32)),
+                                      EXC_GP, 0);
+                break;
 #endif
 
-        case 0xd4: /* vmfunc */
-            generate_exception_if(vex.pfx, EXC_UD);
-            fail_if(!ops->vmfunc);
-            if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
-                goto done;
-            goto complete_insn;
+            case 0xd4: /* vmfunc */
+                generate_exception_if(vex.pfx, EXC_UD);
+                fail_if(!ops->vmfunc);
+                if ( (rc = ops->vmfunc(ctxt)) != X86EMUL_OKAY )
+                    goto done;
+                break;
 
-        case 0xd5: /* xend */
-            generate_exception_if(vex.pfx, EXC_UD);
-            generate_exception_if(!vcpu_has_rtm(), EXC_UD);
-            generate_exception_if(vcpu_has_rtm(), EXC_GP, 0);
-            break;
+            case 0xd5: /* xend */
+                generate_exception_if(vex.pfx, EXC_UD);
+                generate_exception_if(!vcpu_has_rtm(), EXC_UD);
+                generate_exception_if(vcpu_has_rtm(), EXC_GP, 0);
+                break;
 
-        case 0xd6: /* xtest */
-            generate_exception_if(vex.pfx, EXC_UD);
-            generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(),
-                                  EXC_UD);
-            /* Neither HLE nor RTM can be active when we get here. */
-            _regs.eflags |= X86_EFLAGS_ZF;
-            goto complete_insn;
+            case 0xd6: /* xtest */
+                generate_exception_if(vex.pfx, EXC_UD);
+                generate_exception_if(!vcpu_has_rtm() && !vcpu_has_hle(),
+                                      EXC_UD);
+                /* Neither HLE nor RTM can be active when we get here. */
+                _regs.eflags |= X86_EFLAGS_ZF;
+                break;
 
-        case 0xdf: /* invlpga */
-            generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
-            generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            fail_if(ops->invlpg == NULL);
-            if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.r(ax)),
-                                   ctxt)) )
-                goto done;
-            goto complete_insn;
+            case 0xdf: /* invlpga */
+                generate_exception_if(!in_protmode(ctxt, ops), EXC_UD);
+                generate_exception_if(!mode_ring0(), EXC_GP, 0);
+                fail_if(ops->invlpg == NULL);
+                if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.r(ax)),
+                                       ctxt)) )
+                    goto done;
+                break;
 
-        case 0xf9: /* rdtscp */
-            fail_if(ops->read_msr == NULL);
-            if ( (rc = ops->read_msr(MSR_TSC_AUX,
-                                     &msr_val, ctxt)) != X86EMUL_OKAY )
-                goto done;
-            _regs.r(cx) = (uint32_t)msr_val;
-            goto rdtsc;
+            case 0xf9: /* rdtscp */
+                fail_if(ops->read_msr == NULL);
+                if ( (rc = ops->read_msr(MSR_TSC_AUX,
+                                         &msr_val, ctxt)) != X86EMUL_OKAY )
+                    goto done;
+                _regs.r(cx) = (uint32_t)msr_val;
+                goto rdtsc;
+
+            case 0xfc: /* clzero */
+            {
+                unsigned long zero = 0;
+
+                vcpu_must_have(clzero);
+
+                base = ad_bytes == 8 ? _regs.r(ax) :
+                    ad_bytes == 4 ? _regs.eax : _regs.ax;
+                limit = 0;
+                if ( vcpu_has_clflush() &&
+                     ops->cpuid(1, 0, &cpuid_leaf, ctxt) == X86EMUL_OKAY )
+                    limit = ((cpuid_leaf.b >> 8) & 0xff) * 8;
+                generate_exception_if(limit < sizeof(long) ||
+                                      (limit & (limit - 1)), EXC_UD);
+                base &= ~(limit - 1);
+                if ( ops->rep_stos )
+                {
+                    unsigned long nr_reps = limit / sizeof(zero);
+
+                    rc = ops->rep_stos(&zero, ea.mem.seg, base, sizeof(zero),
+                                       &nr_reps, ctxt);
+                    if ( rc == X86EMUL_OKAY )
+                    {
+                        base += nr_reps * sizeof(zero);
+                        limit -= nr_reps * sizeof(zero);
+                    }
+                    else if ( rc != X86EMUL_UNHANDLEABLE )
+                        goto done;
+                }
+                fail_if(limit && !ops->write);
+                while ( limit )
+                {
+                    rc = ops->write(ea.mem.seg, base, &zero,
+                                    sizeof(zero), ctxt);
+                    if ( rc != X86EMUL_OKAY )
+                        goto done;
+                    base += sizeof(zero);
+                    limit -= sizeof(zero);
+                }
+                break;
+            }
 
-        case 0xfc: /* clzero */
+            default:
+                goto cannot_emulate;
+            }
+        }
+        else
         {
-            unsigned long zero = 0;
+            unsigned long cr0, cr0w;
 
-            vcpu_must_have(clzero);
+            seg = (modrm_reg & 1) ? x86_seg_idtr : x86_seg_gdtr;
 
-            base = ad_bytes == 8 ? _regs.r(ax) :
-                   ad_bytes == 4 ? _regs.eax : _regs.ax;
-            limit = 0;
-            if ( vcpu_has_clflush() &&
-                 ops->cpuid(1, 0, &cpuid_leaf, ctxt) == X86EMUL_OKAY )
-                limit = ((cpuid_leaf.b >> 8) & 0xff) * 8;
-            generate_exception_if(limit < sizeof(long) ||
-                                  (limit & (limit - 1)), EXC_UD);
-            base &= ~(limit - 1);
-            if ( ops->rep_stos )
+            switch ( modrm_reg & 7 )
             {
-                unsigned long nr_reps = limit / sizeof(zero);
+            case 0: /* sgdt */
+            case 1: /* sidt */
+                generate_exception_if(ea.type != OP_MEM, EXC_UD);
+                generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
+                fail_if(!ops->read_segment || !ops->write);
+                if ( (rc = ops->read_segment(seg, &sreg, ctxt)) )
+                    goto done;
+                if ( mode_64bit() )
+                    op_bytes = 8;
+                else if ( op_bytes == 2 )
+                {
+                    sreg.base &= 0xffffff;
+                    op_bytes = 4;
+                }
+                if ( (rc = ops->write(ea.mem.seg, ea.mem.off, &sreg.limit,
+                                      2, ctxt)) != X86EMUL_OKAY ||
+                     (rc = ops->write(ea.mem.seg, ea.mem.off + 2, &sreg.base,
+                                      op_bytes, ctxt)) != X86EMUL_OKAY )
+                    goto done;
+                break;
+
+            case 2: /* lgdt */
+            case 3: /* lidt */
+                generate_exception_if(!mode_ring0(), EXC_GP, 0);
+                generate_exception_if(ea.type != OP_MEM, EXC_UD);
+                fail_if(ops->write_segment == NULL);
+                memset(&sreg, 0, sizeof(sreg));
+                if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
+                                      &limit, 2, ctxt, ops)) ||
+                     (rc = read_ulong(ea.mem.seg, ea.mem.off+2,
+                                      &base, mode_64bit() ? 8 : 4, ctxt, ops)) )
+                    goto done;
+                generate_exception_if(!is_canonical_address(base), EXC_GP, 0);
+                sreg.base = base;
+                sreg.limit = limit;
+                if ( !mode_64bit() && op_bytes == 2 )
+                    sreg.base &= 0xffffff;
+                if ( (rc = ops->write_segment(seg, &sreg, ctxt)) )
+                    goto done;
+                break;
 
-                rc = ops->rep_stos(&zero, ea.mem.seg, base, sizeof(zero),
-                                   &nr_reps, ctxt);
-                if ( rc == X86EMUL_OKAY )
+            case 4: /* smsw */
+                generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
+                if ( ea.type == OP_MEM )
                 {
-                    base += nr_reps * sizeof(zero);
-                    limit -= nr_reps * sizeof(zero);
+                    fail_if(!ops->write);
+                    d |= Mov; /* force writeback */
+                    ea.bytes = 2;
                 }
-                else if ( rc != X86EMUL_UNHANDLEABLE )
+                else
+                    ea.bytes = op_bytes;
+                dst = ea;
+                fail_if(ops->read_cr == NULL);
+                if ( (rc = ops->read_cr(0, &dst.val, ctxt)) )
                     goto done;
-            }
-            fail_if(limit && !ops->write);
-            while ( limit )
-            {
-                rc = ops->write(ea.mem.seg, base, &zero, sizeof(zero), ctxt);
-                if ( rc != X86EMUL_OKAY )
+                break;
+
+            case 6: /* lmsw */
+                fail_if(ops->read_cr == NULL);
+                fail_if(ops->write_cr == NULL);
+                generate_exception_if(!mode_ring0(), EXC_GP, 0);
+                if ( (rc = ops->read_cr(0, &cr0, ctxt)) )
                     goto done;
-                base += sizeof(zero);
-                limit -= sizeof(zero);
-            }
-            goto complete_insn;
-        }
-        }
+                if ( ea.type == OP_REG )
+                    cr0w = *ea.reg;
+                else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
+                                           &cr0w, 2, ctxt, ops)) )
+                    goto done;
+                /* LMSW can: (1) set bits 0-3; (2) clear bits 1-3. */
+                cr0 = (cr0 & ~0xe) | (cr0w & 0xf);
+                if ( (rc = ops->write_cr(0, cr0, ctxt)) )
+                    goto done;
+                break;
 
-        seg = (modrm_reg & 1) ? x86_seg_idtr : x86_seg_gdtr;
+            case 7: /* invlpg */
+                generate_exception_if(!mode_ring0(), EXC_GP, 0);
+                generate_exception_if(ea.type != OP_MEM, EXC_UD);
+                fail_if(ops->invlpg == NULL);
+                if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) )
+                    goto done;
+                break;
 
-        switch ( modrm_reg & 7 )
-        {
-        case 0: /* sgdt */
-        case 1: /* sidt */
-            generate_exception_if(ea.type != OP_MEM, EXC_UD);
-            generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
-            fail_if(!ops->read_segment || !ops->write);
-            if ( (rc = ops->read_segment(seg, &sreg, ctxt)) )
-                goto done;
-            if ( mode_64bit() )
-                op_bytes = 8;
-            else if ( op_bytes == 2 )
-            {
-                sreg.base &= 0xffffff;
-                op_bytes = 4;
-            }
-            if ( (rc = ops->write(ea.mem.seg, ea.mem.off, &sreg.limit,
-                                  2, ctxt)) != X86EMUL_OKAY ||
-                 (rc = ops->write(ea.mem.seg, ea.mem.off + 2, &sreg.base,
-                                  op_bytes, ctxt)) != X86EMUL_OKAY )
-                goto done;
-            break;
-        case 2: /* lgdt */
-        case 3: /* lidt */
-            generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD);
-            fail_if(ops->write_segment == NULL);
-            memset(&sreg, 0, sizeof(sreg));
-            if ( (rc = read_ulong(ea.mem.seg, ea.mem.off+0,
-                                  &limit, 2, ctxt, ops)) ||
-                 (rc = read_ulong(ea.mem.seg, ea.mem.off+2,
-                                  &base, mode_64bit() ? 8 : 4, ctxt, ops)) )
-                goto done;
-            generate_exception_if(!is_canonical_address(base), EXC_GP, 0);
-            sreg.base = base;
-            sreg.limit = limit;
-            if ( !mode_64bit() && op_bytes == 2 )
-                sreg.base &= 0xffffff;
-            if ( (rc = ops->write_segment(seg, &sreg, ctxt)) )
-                goto done;
-            break;
-        case 4: /* smsw */
-            generate_exception_if(umip_active(ctxt, ops), EXC_GP, 0);
-            if ( ea.type == OP_MEM )
-            {
-                fail_if(!ops->write);
-                d |= Mov; /* force writeback */
-                ea.bytes = 2;
+            default:
+                goto cannot_emulate;
             }
-            else
-                ea.bytes = op_bytes;
-            dst = ea;
-            fail_if(ops->read_cr == NULL);
-            if ( (rc = ops->read_cr(0, &dst.val, ctxt)) )
-                goto done;
-            break;
-        case 6: /* lmsw */
-            fail_if(ops->read_cr == NULL);
-            fail_if(ops->write_cr == NULL);
-            generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            if ( (rc = ops->read_cr(0, &cr0, ctxt)) )
-                goto done;
-            if ( ea.type == OP_REG )
-                cr0w = *ea.reg;
-            else if ( (rc = read_ulong(ea.mem.seg, ea.mem.off,
-                                       &cr0w, 2, ctxt, ops)) )
-                goto done;
-            /* LMSW can: (1) set bits 0-3; (2) clear bits 1-3. */
-            cr0 = (cr0 & ~0xe) | (cr0w & 0xf);
-            if ( (rc = ops->write_cr(0, cr0, ctxt)) )
-                goto done;
-            break;
-        case 7: /* invlpg */
-            generate_exception_if(!mode_ring0(), EXC_GP, 0);
-            generate_exception_if(ea.type != OP_MEM, EXC_UD);
-            fail_if(ops->invlpg == NULL);
-            if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) )
-                goto done;
-            break;
-        default:
-            goto cannot_emulate;
         }
         break;
     }