Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb] Improve mid-function epilogue scanning for x86 #110965

Conversation

jasonmolenda
Copy link
Collaborator

The x86 assembly instruction scanner creates incorrect UnwindPlans when a mid-function epilogue has a non-epilogue instruction in it.

The x86 instruction analysis which creates an UnwindPlan handles mid-function epilogues by tracking "epilogue instructions" (register loads from stack, stack pointer increasing, etc) and any UnwindPlan updates which are NOT epilogue instructions update the "prologue UnwindPlan" saved row. It detects a LEAVE/RET/unconditional JMP out of the function and after that instruction, re-instates the "prologue Row".

There's a parallel piece of data tracked across the duration of the function, current_sp_bytes_offset_from_fa, and we reflect the "value after prologue instructions" in prologue_completed_sp_bytes_offset_from_cfa. When the CFA is calculated in terms of the frame pointer ($ebp/$rbp), we don't add changes to the stack pointer to the UnwindPlan, so this separate mechanism is used for the "current value" and the "last value at prologue setup".

(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule which is formally correct, but it could also track the stack pointer value as sp=$rsp+ and update this register rule every time $rsp is modified.)

This leads to a bug when there is an instruction in an epilogue which isn't recognzied as an epilogue instruction. prologue_completed_sp_bytes_offset_from_cfa is always set to the value of current_sp_bytes_offset_from_fa unless the current instruction is an epilogue instruction. With a non-epilogue instruction in the middle of the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa value from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa. Once the epilogue is finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa. But now $rsp has a very incorrect value in it.

This patch tracks when we've updated current_sp_bytes_offset_from_fa in the current instruction analysis. If it was updated looking at an epilogue instruction, is_epilogue will be set correctly. Otherwise it's a "prologue" instruction and we should update prologue_completed_sp_bytes_offset_from_cfa. Any instruction that is unrecognized will leave prologue_completed_sp_bytes_offset_from_cfa unmodified.

The actual instruction we hit this with was a BTRQ but I added a NOP to the unit test which is only 1 byte and made the update to the unit test a little simpler. This bug is hit with a NOP just as well.

UnwindAssemblyInstEmulation has a much better algorithm for handling mid-function epilogues, which "forward" the current unwind state Row when it sees branches within the function, to the target instruction offset. This avoids detecting prologue/epilogue instructions altogether.

rdar://137153323

The x86 assembly instruction scanner creates incorrect UnwindPlans
when a mid-function epilogue has a non-epilogue instruction in it.

The x86 instruction analysis which creates an UnwindPlan handles
mid-function epilogues by tracking "epilogue instructions" (register
loads from stack, stack pointer increasing, etc) and any UnwindPlan
updates which are NOT epilogue instructions update the "prologue
UnwindPlan" saved row.  It detects a LEAVE/RET/unconditional JMP
out of the function and after that instruction, re-instates the
"prologue Row".

There's a parallel piece of data tracked across the duration of the
function, current_sp_bytes_offset_from_fa, and we reflect the "value
after prologue instructions" in prologue_completed_sp_bytes_offset_from_cfa.
When the CFA is calculated in terms of the frame pointer ($ebp/$rbp),
we don't add changes to the stack pointer to the UnwindPlan, so
this separate mechanism is used for the "current value" and the
"last value at prologue setup".

(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule
which is formally correct, but it could also track the stack pointer
value as sp=$rsp+<x> and update this register rule every time $rsp
is modified.)

This leads to a bug when there is an instruction in an epilogue
which isn't recognzied as an epilogue instruction.
prologue_completed_sp_bytes_offset_from_cfa is always set to the
value of current_sp_bytes_offset_from_fa unless the current instruction
is an epilogue instruction.  With a non-epilogue instruction in the
middle of the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa
value from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa.  Once the epilogue is
finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa.  But now $rsp has a
very incorrect value in it.

The actual instruction we hit this with was a BTRQ but I added a NOP
to the unit test which is only 1 byte and made the update to the unit
test a little simpler.  This bug is hit with a NOP just as well.

UnwindAssemblyInstEmulation has a much better algorithm for handling
mid-function epilogues, which "forward" the current unwind state
Row when it sees branches within the function, to the target
instruction offset.  This avoids detecting prologue/epilogue
instructions altogether.

rdar://137153323
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 3, 2024

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

The x86 assembly instruction scanner creates incorrect UnwindPlans when a mid-function epilogue has a non-epilogue instruction in it.

The x86 instruction analysis which creates an UnwindPlan handles mid-function epilogues by tracking "epilogue instructions" (register loads from stack, stack pointer increasing, etc) and any UnwindPlan updates which are NOT epilogue instructions update the "prologue UnwindPlan" saved row. It detects a LEAVE/RET/unconditional JMP out of the function and after that instruction, re-instates the "prologue Row".

There's a parallel piece of data tracked across the duration of the function, current_sp_bytes_offset_from_fa, and we reflect the "value after prologue instructions" in prologue_completed_sp_bytes_offset_from_cfa. When the CFA is calculated in terms of the frame pointer ($ebp/$rbp), we don't add changes to the stack pointer to the UnwindPlan, so this separate mechanism is used for the "current value" and the "last value at prologue setup".

(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule which is formally correct, but it could also track the stack pointer value as sp=$rsp+<x> and update this register rule every time $rsp is modified.)

This leads to a bug when there is an instruction in an epilogue which isn't recognzied as an epilogue instruction. prologue_completed_sp_bytes_offset_from_cfa is always set to the value of current_sp_bytes_offset_from_fa unless the current instruction is an epilogue instruction. With a non-epilogue instruction in the middle of the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa value from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa. Once the epilogue is finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa. But now $rsp has a very incorrect value in it.

This patch tracks when we've updated current_sp_bytes_offset_from_fa in the current instruction analysis. If it was updated looking at an epilogue instruction, is_epilogue will be set correctly. Otherwise it's a "prologue" instruction and we should update prologue_completed_sp_bytes_offset_from_cfa. Any instruction that is unrecognized will leave prologue_completed_sp_bytes_offset_from_cfa unmodified.

The actual instruction we hit this with was a BTRQ but I added a NOP to the unit test which is only 1 byte and made the update to the unit test a little simpler. This bug is hit with a NOP just as well.

UnwindAssemblyInstEmulation has a much better algorithm for handling mid-function epilogues, which "forward" the current unwind state Row when it sees branches within the function, to the target instruction offset. This avoids detecting prologue/epilogue instructions altogether.

rdar://137153323


Full diff: https://github.com/llvm/llvm-project/pull/110965.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp (+22-4)
  • (modified) lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp (+15-14)
diff --git a/lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp b/lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp
index 81b7f138fe7caa..45d2f95433d5bb 100644
--- a/lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp
+++ b/lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp
@@ -809,7 +809,7 @@ bool x86AssemblyInspectionEngine::local_branch_p (
       // Branch target is before the start of this function
       return false;
     }
-    if (offset + next_pc_value > func_range.GetByteSize()) {
+    if (offset + next_pc_value >= func_range.GetByteSize()) {
       // Branch targets outside this function's bounds
       return false;
     }
@@ -967,6 +967,8 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     bool in_epilogue = false; // we're in the middle of an epilogue sequence
     bool row_updated = false; // The UnwindPlan::Row 'row' has been updated
+    bool current_sp_offset_updated =
+        false; // current_sp_bytes_offset_from_fa has been updated this insn
 
     m_cur_insn = data + current_func_text_offset;
     if (!instruction_length(m_cur_insn, insn_len, size - current_func_text_offset)
@@ -1013,8 +1015,10 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
         afa_value.SetUnspecified();
         row_updated = true;
       }
-      if (fa_value_ptr->GetRegisterNumber() == m_lldb_fp_regnum)
+      if (fa_value_ptr->GetRegisterNumber() == m_lldb_fp_regnum) {
         current_sp_bytes_offset_from_fa = fa_value_ptr->GetOffset();
+        current_sp_offset_updated = true;
+      }
     }
 
     else if (mov_rbx_rsp_pattern_p()) {
@@ -1025,8 +1029,10 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
         afa_value.SetUnspecified();
         row_updated = true;
       }
-      if (fa_value_ptr->GetRegisterNumber() == m_lldb_alt_fp_regnum)
+      if (fa_value_ptr->GetRegisterNumber() == m_lldb_alt_fp_regnum) {
         current_sp_bytes_offset_from_fa = fa_value_ptr->GetOffset();
+        current_sp_offset_updated = true;
+      }
     }
 
     // This is the start() function (or a pthread equivalent), it starts with a
@@ -1039,6 +1045,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (push_reg_p(machine_regno)) {
       current_sp_bytes_offset_from_fa += m_wordsize;
+      current_sp_offset_updated = true;
       // the PUSH instruction has moved the stack pointer - if the FA is set
       // in terms of the stack pointer, we need to add a new row of
       // instructions.
@@ -1064,6 +1071,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (pop_reg_p(machine_regno)) {
       current_sp_bytes_offset_from_fa -= m_wordsize;
+      current_sp_offset_updated = true;
 
       if (nonvolatile_reg_p(machine_regno) &&
           machine_regno_to_lldb_regno(machine_regno, lldb_regno) &&
@@ -1091,6 +1099,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (pop_misc_reg_p()) {
       current_sp_bytes_offset_from_fa -= m_wordsize;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetIsRegisterPlusOffset(
             m_lldb_sp_regnum, current_sp_bytes_offset_from_fa);
@@ -1126,6 +1135,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
       }
 
       current_sp_bytes_offset_from_fa -= m_wordsize;
+      current_sp_offset_updated = true;
 
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetIsRegisterPlusOffset(
@@ -1161,6 +1171,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (sub_rsp_pattern_p(stack_offset)) {
       current_sp_bytes_offset_from_fa += stack_offset;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetOffset(current_sp_bytes_offset_from_fa);
         row_updated = true;
@@ -1169,6 +1180,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (add_rsp_pattern_p(stack_offset)) {
       current_sp_bytes_offset_from_fa -= stack_offset;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetOffset(current_sp_bytes_offset_from_fa);
         row_updated = true;
@@ -1179,6 +1191,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
     else if (push_extended_pattern_p() || push_imm_pattern_p() ||
              push_misc_reg_p()) {
       current_sp_bytes_offset_from_fa += m_wordsize;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetOffset(current_sp_bytes_offset_from_fa);
         row_updated = true;
@@ -1187,6 +1200,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     else if (lea_rsp_pattern_p(stack_offset)) {
       current_sp_bytes_offset_from_fa -= stack_offset;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetOffset(current_sp_bytes_offset_from_fa);
         row_updated = true;
@@ -1206,6 +1220,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_fp_regnum) {
         current_sp_bytes_offset_from_fa =
           fa_value_ptr->GetOffset() - stack_offset;
+        current_sp_offset_updated = true;
       }
     }
 
@@ -1219,6 +1234,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
       }
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_alt_fp_regnum) {
         current_sp_bytes_offset_from_fa = fa_value_ptr->GetOffset() - stack_offset;
+        current_sp_offset_updated = true;
       }
     }
 
@@ -1251,6 +1267,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
           row.reset(newrow);
           current_sp_bytes_offset_from_fa =
               prologue_completed_sp_bytes_offset_from_cfa;
+          current_sp_offset_updated = true;
           is_aligned = prologue_completed_is_aligned;
 
           saved_registers.clear();
@@ -1272,6 +1289,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
     // global data
     else if (call_next_insn_pattern_p()) {
       current_sp_bytes_offset_from_fa += m_wordsize;
+      current_sp_offset_updated = true;
       if (fa_value_ptr->GetRegisterNumber() == m_lldb_sp_regnum) {
         fa_value_ptr->SetOffset(current_sp_bytes_offset_from_fa);
         row_updated = true;
@@ -1304,7 +1322,7 @@ bool x86AssemblyInspectionEngine::GetNonCallSiteUnwindPlanFromAssembly(
 
     // We may change the sp value without adding a new Row necessarily -- keep
     // track of it either way.
-    if (!in_epilogue) {
+    if (!in_epilogue && current_sp_offset_updated) {
       prologue_completed_sp_bytes_offset_from_cfa =
           current_sp_bytes_offset_from_fa;
       prologue_completed_is_aligned = is_aligned;
diff --git a/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp b/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
index 3ff57b4f97f178..f5bb3b6d642362 100644
--- a/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
+++ b/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
@@ -2859,16 +2859,17 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
 
     0x90,                   // <+18>: nop             // prologue setup back
 
-    0x74, 0x7,              // <+19>: je 6 <+27>
+    0x74, 0x8,              // <+19>: je 7 <+28>
     0x48, 0x83, 0xc4, 0x70, // <+21>: addq $0x70, %rsp
     0x5d,                   // <+25>: popq %rbp
-    0xc3,                   // <+26>: retq            // epilogue completed
+    0x90,                   // <+26>: nop           // mid-epilogue non-epilogue
+    0xc3,                   // <+27>: retq            // epilogue completed
 
-    0x90,                   // <+27>: nop             // prologue setup back
+    0x90,                   // <+28>: nop             // prologue setup back
 
-    0x48, 0x83, 0xc4, 0x70, // <+28>: addq $0x70, %rsp
-    0x5d,                   // <+32>: popq %rbp
-    0xc3,                   // <+33>: retq            // epilogue completed
+    0x48, 0x83, 0xc4, 0x70, // <+29>: addq $0x70, %rsp
+    0x5d,                   // <+33>: popq %rbp
+    0xc3,                   // <+34>: retq            // epilogue completed
 
   };
 
@@ -2898,8 +2899,8 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
   // Check that we've reinstated the stack frame setup 
   // unwind instructions after a mid-function retq
   // row:   CFA=ebp +8 => esp=CFA+0 eip=[CFA-8]
-  row_sp = unwind_plan.GetRowForFunctionOffset(27);
-  EXPECT_EQ(27ull, row_sp->GetOffset());
+  row_sp = unwind_plan.GetRowForFunctionOffset(28);
+  EXPECT_EQ(28ull, row_sp->GetOffset());
   EXPECT_TRUE(row_sp->GetCFAValue().GetRegisterNumber() == k_ebp);
   EXPECT_TRUE(row_sp->GetCFAValue().IsRegisterPlusOffset() == true);
   EXPECT_EQ(wordsize * 2, row_sp->GetCFAValue().GetOffset());
@@ -2907,8 +2908,8 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
   // After last instruction in the function, verify that
   // the stack frame has been unwound
   // row:   CFA=esp +4 => esp=CFA+0 eip=[CFA-4]
-  row_sp = unwind_plan.GetRowForFunctionOffset(33);
-  EXPECT_EQ(33ull, row_sp->GetOffset());
+  row_sp = unwind_plan.GetRowForFunctionOffset(34);
+  EXPECT_EQ(34ull, row_sp->GetOffset());
   EXPECT_TRUE(row_sp->GetCFAValue().GetRegisterNumber() == k_esp);
   EXPECT_TRUE(row_sp->GetCFAValue().IsRegisterPlusOffset() == true);
   EXPECT_EQ(wordsize, row_sp->GetCFAValue().GetOffset());
@@ -2940,8 +2941,8 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
   // Check that we've reinstated the stack frame setup 
   // unwind instructions after a mid-function retq
   // row:   CFA=rbp+16 => rsp=CFA+0 rip=[CFA-16]
-  row_sp = unwind_plan.GetRowForFunctionOffset(27);
-  EXPECT_EQ(27ull, row_sp->GetOffset());
+  row_sp = unwind_plan.GetRowForFunctionOffset(28);
+  EXPECT_EQ(28ull, row_sp->GetOffset());
   EXPECT_TRUE(row_sp->GetCFAValue().GetRegisterNumber() == k_rbp);
   EXPECT_TRUE(row_sp->GetCFAValue().IsRegisterPlusOffset() == true);
   EXPECT_EQ(wordsize * 2, row_sp->GetCFAValue().GetOffset());
@@ -2949,8 +2950,8 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
   // After last instruction in the function, verify that
   // the stack frame has been unwound
   // row:   CFA=rsp +8 => esp=CFA+0 rip=[CFA-8]
-  row_sp = unwind_plan.GetRowForFunctionOffset(33);
-  EXPECT_EQ(33ull, row_sp->GetOffset());
+  row_sp = unwind_plan.GetRowForFunctionOffset(34);
+  EXPECT_EQ(34ull, row_sp->GetOffset());
   EXPECT_TRUE(row_sp->GetCFAValue().GetRegisterNumber() == k_rsp);
   EXPECT_TRUE(row_sp->GetCFAValue().IsRegisterPlusOffset() == true);
   EXPECT_EQ(wordsize, row_sp->GetCFAValue().GetOffset());

Copy link

github-actions bot commented Oct 3, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff c1343a29216f08ec762b3e58572e5c3e41f6f285 8deef52bd49a54c8ea23d97460641d01d0daf898 --extensions cpp -- lldb/source/Plugins/UnwindAssembly/x86/x86AssemblyInspectionEngine.cpp lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
View the diff from clang-format here.
diff --git a/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp b/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
index f5bb3b6d64..f737d0db24 100644
--- a/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
+++ b/lldb/unittests/UnwindAssembly/x86/Testx86AssemblyInspectionEngine.cpp
@@ -2847,29 +2847,29 @@ TEST_F(Testx86AssemblyInspectionEngine, TestDisassemblyMidFunctionEpilogues) {
   std::unique_ptr<x86AssemblyInspectionEngine> engine64 = Getx86_64Inspector();
 
   uint8_t data[] = {
-    0x55,                   // <+0>: pushq %rbp
-    0x48, 0x89, 0xe5,       // <+1>: movq %rsp, %rbp
-    0x48, 0x83, 0xec, 0x70, // <+4>: subq $0x70, %rsp
-    0x90,                   // <+8>: nop               // prologue set up
+      0x55,                   // <+0>: pushq %rbp
+      0x48, 0x89, 0xe5,       // <+1>: movq %rsp, %rbp
+      0x48, 0x83, 0xec, 0x70, // <+4>: subq $0x70, %rsp
+      0x90,                   // <+8>: nop               // prologue set up
 
-    0x74, 0x7,              // <+9>: je 7 <+18>
-    0x48, 0x83, 0xc4, 0x70, // <+11>: addq $0x70, %rsp
-    0x5d,                   // <+15>: popq %rbp
-    0xff, 0xe0,             // <+16>: jmpq *%rax      // epilogue completed
+      0x74, 0x7,              // <+9>: je 7 <+18>
+      0x48, 0x83, 0xc4, 0x70, // <+11>: addq $0x70, %rsp
+      0x5d,                   // <+15>: popq %rbp
+      0xff, 0xe0,             // <+16>: jmpq *%rax      // epilogue completed
 
-    0x90,                   // <+18>: nop             // prologue setup back
+      0x90, // <+18>: nop             // prologue setup back
 
-    0x74, 0x8,              // <+19>: je 7 <+28>
-    0x48, 0x83, 0xc4, 0x70, // <+21>: addq $0x70, %rsp
-    0x5d,                   // <+25>: popq %rbp
-    0x90,                   // <+26>: nop           // mid-epilogue non-epilogue
-    0xc3,                   // <+27>: retq            // epilogue completed
+      0x74, 0x8,              // <+19>: je 7 <+28>
+      0x48, 0x83, 0xc4, 0x70, // <+21>: addq $0x70, %rsp
+      0x5d,                   // <+25>: popq %rbp
+      0x90, // <+26>: nop           // mid-epilogue non-epilogue
+      0xc3, // <+27>: retq            // epilogue completed
 
-    0x90,                   // <+28>: nop             // prologue setup back
+      0x90, // <+28>: nop             // prologue setup back
 
-    0x48, 0x83, 0xc4, 0x70, // <+29>: addq $0x70, %rsp
-    0x5d,                   // <+33>: popq %rbp
-    0xc3,                   // <+34>: retq            // epilogue completed
+      0x48, 0x83, 0xc4, 0x70, // <+29>: addq $0x70, %rsp
+      0x5d,                   // <+33>: popq %rbp
+      0xc3,                   // <+34>: retq            // epilogue completed
 
   };
 

Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! Thanks a lot of fixing it.
Your comments match my observations and I tested this patch with a test case where we stop on every instruction of a few functions and assert we can unwind correctly.

@jasonmolenda jasonmolenda merged commit 00c1989 into llvm:main Oct 3, 2024
8 of 9 checks passed
@jasonmolenda jasonmolenda deleted the non-epilogue-x86-instruction-mid-epilogue branch October 3, 2024 16:59
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Oct 3, 2024
The x86 assembly instruction scanner creates incorrect UnwindPlans when
a mid-function epilogue has a non-epilogue instruction in it.

The x86 instruction analysis which creates an UnwindPlan handles
mid-function epilogues by tracking "epilogue instructions" (register
loads from stack, stack pointer increasing, etc) and any UnwindPlan
updates which are NOT epilogue instructions update the "prologue
UnwindPlan" saved row. It detects a LEAVE/RET/unconditional JMP out of
the function and after that instruction, re-instates the "prologue Row".

There's a parallel piece of data tracked across the duration of the
function, current_sp_bytes_offset_from_fa, and we reflect the "value
after prologue instructions" in
prologue_completed_sp_bytes_offset_from_cfa. When the CFA is calculated
in terms of the frame pointer ($ebp/$rbp), we don't add changes to the
stack pointer to the UnwindPlan, so this separate mechanism is used for
the "current value" and the "last value at prologue setup".

(the x86 UnwindPlan generated writes "sp=CFA+0" as a register rule which
is formally correct, but it could also track the stack pointer value as
sp=$rsp+<x> and update this register rule every time $rsp is modified.)

This leads to a bug when there is an instruction in an epilogue which
isn't recognzied as an epilogue instruction.
prologue_completed_sp_bytes_offset_from_cfa is always set to the value
of current_sp_bytes_offset_from_fa unless the current instruction is an
epilogue instruction. With a non-epilogue instruction in the middle of
the epilogue, we suddenly copy a current_sp_bytes_offset_from_fa value
from the middle of the epilogue into this
prologue_completed_sp_bytes_offset_from_cfa. Once the epilogue is
finished, we restore the "prologue Row" and
prologue_completed_sp_bytes_offset_from_cfa. But now $rsp has a very
incorrect value in it.

This patch tracks when we've updated current_sp_bytes_offset_from_fa in
the current instruction analysis. If it was updated looking at an
epilogue instruction, `is_epilogue` will be set correctly. Otherwise
it's a "prologue" instruction and we should update
prologue_completed_sp_bytes_offset_from_cfa. Any instruction that is
unrecognized will leave prologue_completed_sp_bytes_offset_from_cfa
unmodified.

The actual instruction we hit this with was a BTRQ but I added a NOP to
the unit test which is only 1 byte and made the update to the unit test
a little simpler. This bug is hit with a NOP just as well.

UnwindAssemblyInstEmulation has a much better algorithm for handling
mid-function epilogues, which "forward" the current unwind state Row
when it sees branches within the function, to the target instruction
offset. This avoids detecting prologue/epilogue instructions altogether.

rdar://137153323
(cherry picked from commit 00c1989)
jasonmolenda added a commit to swiftlang/llvm-project that referenced this pull request Oct 3, 2024
…pilogue-instruction-mid-epilogue

[lldb] Improve mid-function epilogue scanning for x86 (llvm#110965)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants