[llvm] Win x64 Unwind V2 1/n: Mark beginning and end of epilogs #110024

Open
wants to merge 1 commit into
base: main

dpaoliello
Contributor

Windows x64 Unwind V2 adds epilog information to unwind data: specifically, the length of the epilog and the offset of each epilog.

The first step is to mark the beginning and end of each epilog when generating Windows x64 code. I've modelled this on how LLVM already marks ARM and AArch64 epilogs on Windows, and unified the code across the three targets.
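
As a rough illustration of the bookkeeping these markers imply, here is a standalone sketch. It is not code from the patch: `EpilogRange`, `EpilogTracker`, and `epilogLength` are hypothetical names, and code offsets stand in for the `MCSymbol` labels the streamer actually records. The nested-begin and stray-end checks are assumptions of the sketch, not something the shared streamer code in this patch performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the marker bookkeeping; these names are not LLVM APIs.
struct EpilogRange {
  std::size_t Begin; // code offset where the begin marker was seen
  std::size_t End;   // code offset where the end marker was seen
};

class EpilogTracker {
  bool InEpilog = false; // plays the role of InEpilogCFI
  std::size_t Start = 0; // plays the role of CurrentEpilog
  std::vector<EpilogRange> Ranges;

public:
  bool inEpilog() const { return InEpilog; }

  // Begin marker: rejects a nested begin (an assumption of this sketch).
  bool begin(std::size_t Offset) {
    if (InEpilog)
      return false;
    InEpilog = true;
    Start = Offset;
    return true;
  }

  // End marker: rejects a stray end, then records the finished range.
  bool end(std::size_t Offset) {
    if (!InEpilog || Offset < Start)
      return false;
    Ranges.push_back(EpilogRange{Start, Offset});
    InEpilog = false;
    return true;
  }

  const std::vector<EpilogRange> &ranges() const { return Ranges; }
};

// Length of one epilog in bytes, one of the two values the description
// says Unwind V2 needs (the other being each epilog's offset).
inline std::size_t epilogLength(const EpilogRange &R) {
  assert(R.End >= R.Begin && "end marker must not precede begin marker");
  return R.End - R.Begin;
}
```

In this model, `begin` would be called at the offset of the first epilog instruction and `end` at the offset of the `retq`, matching where `.seh_beginepilogue` and `.seh_endepilogue` appear in the updated test expectations.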

@llvmbot
Collaborator

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-mc

Author: Daniel Paoliello (dpaoliello)

Changes

Windows x64 Unwind V2 adds epilog information to unwind data: specifically, the length of the epilog and the offset of each epilog.

The first step is to mark the beginning and end of each epilog when generating Windows x64 code. I've modelled this on how LLVM already marks ARM and AArch64 epilogs on Windows, and unified the code across the three targets.


Patch is 79.87 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110024.diff

61 Files Affected:

  • (modified) llvm/include/llvm/MC/MCStreamer.h (+12)
  • (modified) llvm/lib/MC/MCAsmStreamer.cpp (+16)
  • (modified) llvm/lib/MC/MCParser/COFFAsmParser.cpp (+18)
  • (modified) llvm/lib/MC/MCStreamer.cpp (+20)
  • (modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h (-6)
  • (modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp (+5-14)
  • (modified) llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp (+8-19)
  • (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+4-7)
  • (modified) llvm/lib/Target/X86/X86InstrCompiler.td (+5-3)
  • (modified) llvm/lib/Target/X86/X86MCInstLower.cpp (+17-1)
  • (modified) llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll (+6)
  • (modified) llvm/test/CodeGen/X86/avx512-intel-ocl.ll (+4)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-Mask.ll (+22)
  • (modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+26)
  • (modified) llvm/test/CodeGen/X86/break-false-dep.ll (+22)
  • (modified) llvm/test/CodeGen/X86/catchpad-realign-savexmm.ll (+4)
  • (modified) llvm/test/CodeGen/X86/cfguard-x86-64-vectorcall.ll (+2)
  • (modified) llvm/test/CodeGen/X86/cleanuppad-realign.ll (+2)
  • (modified) llvm/test/CodeGen/X86/conditional-tailcall-pgso.ll (+4)
  • (modified) llvm/test/CodeGen/X86/conditional-tailcall.ll (+4)
  • (modified) llvm/test/CodeGen/X86/ldexp.ll (+12)
  • (modified) llvm/test/CodeGen/X86/localescape.ll (+2)
  • (modified) llvm/test/CodeGen/X86/mixed-ptr-sizes.ll (+2)
  • (modified) llvm/test/CodeGen/X86/musttail-varargs.ll (+2)
  • (modified) llvm/test/CodeGen/X86/no-sse-win64.ll (+8)
  • (modified) llvm/test/CodeGen/X86/preserve_nonecc_call_win.ll (+2)
  • (modified) llvm/test/CodeGen/X86/segmented-stacks.ll (+14)
  • (modified) llvm/test/CodeGen/X86/seh-catchpad.ll (+2)
  • (modified) llvm/test/CodeGen/X86/sse-regcall.ll (+2)
  • (modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+2)
  • (modified) llvm/test/CodeGen/X86/stack-coloring-wineh.ll (+6)
  • (modified) llvm/test/CodeGen/X86/swift-async-win64.ll (+2)
  • (modified) llvm/test/CodeGen/X86/tailcc-ssp.ll (+4)
  • (modified) llvm/test/CodeGen/X86/taildup-callsiteinfo.mir (+2-1)
  • (modified) llvm/test/CodeGen/X86/win-catchpad-csrs.ll (+12)
  • (modified) llvm/test/CodeGen/X86/win-catchpad.ll (+10)
  • (modified) llvm/test/CodeGen/X86/win-funclet-cfi.ll (+2)
  • (modified) llvm/test/CodeGen/X86/win-smallparams.ll (+2)
  • (modified) llvm/test/CodeGen/X86/win64-byval.ll (+6)
  • (modified) llvm/test/CodeGen/X86/win64-eh-empty-block-2.mir (+4-2)
  • (modified) llvm/test/CodeGen/X86/win64-funclet-savexmm.ll (+2)
  • (modified) llvm/test/CodeGen/X86/win64-seh-epilogue-statepoint.ll (+2)
  • (modified) llvm/test/CodeGen/X86/win64_eh.ll (+4)
  • (modified) llvm/test/CodeGen/X86/win64_frame.ll (+22)
  • (modified) llvm/test/CodeGen/X86/x86-64-flags-intrinsics.ll (+4)
  • (modified) llvm/test/CodeGen/X86/x86-win64-shrink-wrapping.ll (+26-12)
  • (modified) llvm/test/DebugInfo/COFF/trailing-inlined-function.s (+2)
  • (modified) llvm/test/DebugInfo/MIR/X86/instr-ref-join-def-vphi.mir (+2-1)
  • (modified) llvm/test/ExecutionEngine/JITLink/x86-64/COFF_pdata_no_strip.s (+2)
  • (modified) llvm/test/ExecutionEngine/JITLink/x86-64/COFF_pdata_strip.s (+2)
  • (modified) llvm/test/MC/AsmParser/directive_seh.s (+4)
  • (modified) llvm/test/MC/AsmParser/seh-directive-errors.s (+6)
  • (modified) llvm/test/MC/COFF/cv-def-range-align.s (+2)
  • (modified) llvm/test/MC/COFF/cv-inline-linetable-unlikely.s (+2)
  • (modified) llvm/test/MC/COFF/seh-align2.s (+2)
  • (modified) llvm/test/MC/COFF/seh-align3.s (+2)
  • (modified) llvm/test/MC/COFF/seh-linkonce.s (+2)
  • (modified) llvm/test/MC/COFF/seh-section-2.s (+2)
  • (modified) llvm/test/MC/COFF/seh-section.s (+6)
  • (modified) llvm/test/MC/COFF/seh.s (+2)
  • (modified) llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86-basic.ll.expected (+2)
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 707aecc5dc578e..c3770ffa6a5dc8 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -252,6 +252,12 @@ class MCStreamer {
   bool AllowAutoPadding = false;
 
 protected:
+  // True if we are processing SEH directives in an epilogue.
+  bool InEpilogCFI = false;
+
+  // Symbol of the current epilog for which we are processing SEH directives.
+  MCSymbol *CurrentEpilog = nullptr;
+
   MCFragment *CurFrag = nullptr;
 
   MCStreamer(MCContext &Ctx);
@@ -331,6 +337,10 @@ class MCStreamer {
     return WinFrameInfos;
   }
 
+  MCSymbol *getCurrentEpilog() const { return CurrentEpilog; }
+
+  bool isInEpilogCFI() const { return InEpilogCFI; }
+
   void generateCompactUnwindEncodings(MCAsmBackend *MAB);
 
   /// \name Assembly File Formatting.
@@ -1043,6 +1053,8 @@ class MCStreamer {
                                  SMLoc Loc = SMLoc());
   virtual void emitWinCFIPushFrame(bool Code, SMLoc Loc = SMLoc());
   virtual void emitWinCFIEndProlog(SMLoc Loc = SMLoc());
+  virtual void emitWinCFIBeginEpilogue(SMLoc Loc = SMLoc());
+  virtual void emitWinCFIEndEpilogue(SMLoc Loc = SMLoc());
   virtual void emitWinEHHandler(const MCSymbol *Sym, bool Unwind, bool Except,
                                 SMLoc Loc = SMLoc());
   virtual void emitWinEHHandlerData(SMLoc Loc = SMLoc());
diff --git a/llvm/lib/MC/MCAsmStreamer.cpp b/llvm/lib/MC/MCAsmStreamer.cpp
index 31b519a3e5c56a..34bfa139cea290 100644
--- a/llvm/lib/MC/MCAsmStreamer.cpp
+++ b/llvm/lib/MC/MCAsmStreamer.cpp
@@ -391,6 +391,8 @@ class MCAsmStreamer final : public MCStreamer {
                          SMLoc Loc) override;
   void emitWinCFIPushFrame(bool Code, SMLoc Loc) override;
   void emitWinCFIEndProlog(SMLoc Loc) override;
+  void emitWinCFIBeginEpilogue(SMLoc Loc) override;
+  void emitWinCFIEndEpilogue(SMLoc Loc) override;
 
   void emitWinEHHandler(const MCSymbol *Sym, bool Unwind, bool Except,
                         SMLoc Loc) override;
@@ -2306,6 +2308,20 @@ void MCAsmStreamer::emitWinCFIEndProlog(SMLoc Loc) {
   EmitEOL();
 }
 
+void MCAsmStreamer::emitWinCFIBeginEpilogue(SMLoc Loc) {
+  MCStreamer::emitWinCFIBeginEpilogue(Loc);
+
+  OS << "\t.seh_beginepilogue";
+  EmitEOL();
+}
+
+void MCAsmStreamer::emitWinCFIEndEpilogue(SMLoc Loc) {
+  MCStreamer::emitWinCFIEndEpilogue(Loc);
+
+  OS << "\t.seh_endepilogue";
+  EmitEOL();
+}
+
 void MCAsmStreamer::emitCGProfileEntry(const MCSymbolRefExpr *From,
                                        const MCSymbolRefExpr *To,
                                        uint64_t Count) {
diff --git a/llvm/lib/MC/MCParser/COFFAsmParser.cpp b/llvm/lib/MC/MCParser/COFFAsmParser.cpp
index a69276c36c56b3..22e72292966f46 100644
--- a/llvm/lib/MC/MCParser/COFFAsmParser.cpp
+++ b/llvm/lib/MC/MCParser/COFFAsmParser.cpp
@@ -90,6 +90,10 @@ class COFFAsmParser : public MCAsmParserExtension {
                                                              ".seh_stackalloc");
     addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveEndProlog>(
                                                             ".seh_endprologue");
+    addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveBeginEpilog>(
+        ".seh_beginepilogue");
+    addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveEndEpilog>(
+        ".seh_endepilogue");
   }
 
   bool ParseSectionDirectiveText(StringRef, SMLoc) {
@@ -137,6 +141,8 @@ class COFFAsmParser : public MCAsmParserExtension {
   bool ParseSEHDirectiveHandlerData(StringRef, SMLoc);
   bool ParseSEHDirectiveAllocStack(StringRef, SMLoc);
   bool ParseSEHDirectiveEndProlog(StringRef, SMLoc);
+  bool ParseSEHDirectiveBeginEpilog(StringRef, SMLoc);
+  bool ParseSEHDirectiveEndEpilog(StringRef, SMLoc);
 
   bool ParseAtUnwindOrAtExcept(bool &unwind, bool &except);
   bool ParseDirectiveSymbolAttribute(StringRef Directive, SMLoc);
@@ -715,6 +721,18 @@ bool COFFAsmParser::ParseSEHDirectiveEndProlog(StringRef, SMLoc Loc) {
   return false;
 }
 
+bool COFFAsmParser::ParseSEHDirectiveBeginEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIBeginEpilogue(Loc);
+  return false;
+}
+
+bool COFFAsmParser::ParseSEHDirectiveEndEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIEndEpilogue(Loc);
+  return false;
+}
+
 bool COFFAsmParser::ParseAtUnwindOrAtExcept(bool &unwind, bool &except) {
   StringRef identifier;
   if (getLexer().isNot(AsmToken::At) && getLexer().isNot(AsmToken::Percent))
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index 13b162768578c5..b179aa1cef39c9 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -979,6 +979,26 @@ void MCStreamer::emitWinCFIEndProlog(SMLoc Loc) {
   CurFrame->PrologEnd = Label;
 }
 
+void MCStreamer::emitWinCFIBeginEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = true;
+  CurrentEpilog = emitCFILabel();
+}
+
+void MCStreamer::emitWinCFIEndEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = false;
+  MCSymbol *Label = emitCFILabel();
+  CurFrame->EpilogMap[CurrentEpilog].End = Label;
+  CurrentEpilog = nullptr;
+}
+
 void MCStreamer::emitCOFFSafeSEH(MCSymbol const *Symbol) {}
 
 void MCStreamer::emitCOFFSymbolIndex(MCSymbol const *Symbol) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
index ac441ae3b603ff..119dcc38edbfcd 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
@@ -100,12 +100,6 @@ class AArch64TargetELFStreamer : public AArch64TargetStreamer {
 };
 
 class AArch64TargetWinCOFFStreamer : public llvm::AArch64TargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
 public:
   AArch64TargetWinCOFFStreamer(llvm::MCStreamer &S)
     : AArch64TargetStreamer(S) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
index 208d43502cb88a..160768350a6b86 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
@@ -73,8 +73,8 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinUnwindCode(unsigned UnwindCode,
   if (!CurFrame)
     return;
   auto Inst = WinEH::Instruction(UnwindCode, /*Label=*/nullptr, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -183,13 +183,7 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIPrologEnd() {
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogStart() {
-  auto &S = getStreamer();
-  WinEH::FrameInfo *CurFrame = S.EnsureValidWinFrameInfo(SMLoc());
-  if (!CurFrame)
-    return;
-
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
+  getStreamer().emitWinCFIBeginEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
@@ -198,13 +192,10 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst =
       WinEH::Instruction(Win64EH::UOP_End, /*Label=*/nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFITrapFrame() {
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
index e66059c2a0e096..b541755cf6b621 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
@@ -77,13 +77,6 @@ llvm::createARMWinCOFFStreamer(MCContext &Context,
 
 namespace {
 class ARMTargetWinCOFFStreamer : public llvm::ARMTargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
-
 public:
   ARMTargetWinCOFFStreamer(llvm::MCStreamer &S) : ARMTargetStreamer(S) {}
 
@@ -114,8 +107,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinUnwindCode(unsigned UnwindCode,
     return;
   MCSymbol *Label = S.emitCFILabel();
   auto Inst = WinEH::Instruction(UnwindCode, Label, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -224,9 +217,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogStart(unsigned Condition) {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].Condition = Condition;
+  S.emitWinCFIBeginEpilogue();
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Condition = Condition;
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
@@ -235,14 +227,14 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  if (!CurrentEpilog) {
+  if (!S.getCurrentEpilog()) {
     S.getContext().reportError(SMLoc(), "Stray .seh_endepilogue in " +
                                             CurFrame->Function->getName());
     return;
   }
 
   std::vector<WinEH::Instruction> &Epilog =
-      CurFrame->EpilogMap[CurrentEpilog].Instructions;
+      CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions;
 
   unsigned UnwindCode = Win64EH::UOP_End;
   if (!Epilog.empty()) {
@@ -256,12 +248,9 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
     }
   }
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst = WinEH::Instruction(UnwindCode, nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFICustom(unsigned Opcode) {
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 4f83267c999e4a..f49c6c1125d613 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -2549,14 +2549,8 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
     --MBBI;
   }
 
-  // Windows unwinder will not invoke function's exception handler if IP is
-  // either in prologue or in epilogue.  This behavior causes a problem when a
-  // call immediately precedes an epilogue, because the return address points
-  // into the epilogue.  To cope with that, we insert an epilogue marker here,
-  // then replace it with a 'nop' if it ends up immediately after a CALL in the
-  // final emitted code.
   if (NeedsWin64CFI && MF.hasWinCFI())
-    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_Epilogue));
+    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_BeginEpilogue));
 
   if (!HasFP && NeedsDwarfCFI) {
     MBBI = FirstCSPop;
@@ -2601,6 +2595,9 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
   // Emit tilerelease for AMX kernel.
   if (X86FI->getAMXProgModel() == AMXProgModelEnum::ManagedRA)
     BuildMI(MBB, Terminator, DL, TII.get(X86::TILERELEASE));
+
+  if (NeedsWin64CFI && MF.hasWinCFI())
+    BuildMI(MBB, Terminator, DL, TII.get(X86::SEH_EndEpilogue));
 }
 
 StackOffset X86FrameLowering::getFrameIndexReference(const MachineFunction &MF,
diff --git a/llvm/lib/Target/X86/X86InstrCompiler.td b/llvm/lib/Target/X86/X86InstrCompiler.td
index 5a8177e2b3607b..f8f572d662aa1a 100644
--- a/llvm/lib/Target/X86/X86InstrCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrCompiler.td
@@ -235,7 +235,7 @@ let isBranch = 1, isTerminator = 1, isCodeGenOnly = 1 in {
 //===----------------------------------------------------------------------===//
 // Pseudo instructions used by unwind info.
 //
-let isPseudo = 1, SchedRW = [WriteSystem] in {
+let isPseudo = 1, isMeta = 1, SchedRW = [WriteSystem] in {
   def SEH_PushReg : I<0, Pseudo, (outs), (ins i32imm:$reg),
                             "#SEH_PushReg $reg", []>;
   def SEH_SaveReg : I<0, Pseudo, (outs), (ins i32imm:$reg, i32imm:$dst),
@@ -252,8 +252,10 @@ let isPseudo = 1, SchedRW = [WriteSystem] in {
                             "#SEH_PushFrame $mode", []>;
   def SEH_EndPrologue : I<0, Pseudo, (outs), (ins),
                             "#SEH_EndPrologue", []>;
-  def SEH_Epilogue : I<0, Pseudo, (outs), (ins),
-                            "#SEH_Epilogue", []>;
+  def SEH_BeginEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_BeginEpilogue", []>;
+  def SEH_EndEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_EndEpilogue", []>;
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 24db39c4e98b96..83c7ac6562b854 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1781,6 +1781,14 @@ void X86AsmPrinter::EmitSEHInstruction(const MachineInstr *MI) {
     OutStreamer->emitWinCFIEndProlog();
     break;
 
+  case X86::SEH_BeginEpilogue:
+    OutStreamer->emitWinCFIBeginEpilogue();
+    break;
+
+  case X86::SEH_EndEpilogue:
+    OutStreamer->emitWinCFIEndEpilogue();
+    break;
+
   default:
     llvm_unreachable("expected SEH_ instruction");
   }
@@ -2422,11 +2430,17 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
   case X86::SEH_SetFrame:
   case X86::SEH_PushFrame:
   case X86::SEH_EndPrologue:
+  case X86::SEH_EndEpilogue:
     EmitSEHInstruction(MI);
     return;
 
-  case X86::SEH_Epilogue: {
+  case X86::SEH_BeginEpilogue: {
     assert(MF->hasWinCFI() && "SEH_ instruction in function without WinCFI?");
+    // Windows unwinder will not invoke function's exception handler if IP is
+    // either in prologue or in epilogue.  This behavior causes a problem when a
+    // call immediately precedes an epilogue, because the return address points
+    // into the epilogue.  To cope with that, we insert a 'nop' if it ends up
+    // immediately after a CALL in the final emitted code.
     MachineBasicBlock::const_iterator MBBI(MI);
     // Check if preceded by a call and emit nop if so.
     for (MBBI = PrevCrossBBInst(MBBI);
@@ -2441,6 +2455,8 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
         break;
       }
     }
+
+    EmitSEHInstruction(MI);
     return;
   }
   case X86::UBSAN_UD1:
diff --git a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
index 6c9fdc2adce2ff..224e4c1cc09f8d 100644
--- a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
+++ b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
@@ -142,6 +142,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    xorl %eax, %eax
 ; WIN-REF-NEXT:    callq *%rax
 ; WIN-REF-NEXT:    nop
+; WIN-REF-NEXT:    .seh_beginepilogue
 ; WIN-REF-NEXT:    addq $56, %rsp
 ; WIN-REF-NEXT:    popq %rbx
 ; WIN-REF-NEXT:    popq %rbp
@@ -149,6 +150,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    popq %r13
 ; WIN-REF-NEXT:    popq %r14
 ; WIN-REF-NEXT:    popq %r15
+; WIN-REF-NEXT:    .seh_endepilogue
 ; WIN-REF-NEXT:    retq
 ; WIN-REF-NEXT:    .seh_endproc
 ;
@@ -173,11 +175,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-NEXT:    xorl %eax, %eax
 ; WIN-NEXT:    callq *%rax
 ; WIN-NEXT:    nop
+; WIN-NEXT:    .seh_beginepilogue
 ; WIN-NEXT:    addq $64, %rsp
 ; WIN-NEXT:    pop2 %rbp, %rbx
 ; WIN-NEXT:    pop2 %r13, %r12
 ; WIN-NEXT:    pop2 %r15, %r14
 ; WIN-NEXT:    popq %rcx
+; WIN-NEXT:    .seh_endepilogue
 ; WIN-NEXT:    retq
 ; WIN-NEXT:    .seh_endproc
 ;
@@ -202,11 +206,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-PPX-NEXT:    xorl %eax, %eax
 ; WIN-PPX-NEXT:    callq *%rax
 ; WIN-PPX-NEXT:    nop
+; WIN-PPX-NEXT:    .seh_beginepilogue
 ; WIN-PPX-NEXT:    addq $64, %rsp
 ; WIN-PPX-NEXT:    pop2p %rbp, %rbx
 ; WIN-PPX-NEXT:    pop2p %r13, %r12
 ; WIN-PPX-NEXT:    pop2p %r15, %r14
 ; WIN-PPX-NEXT:    popq %rcx
+; WIN-PPX-NEXT:    .seh_endepilogue
 ; WIN-PPX-NEXT:    retq
 ; WIN-PPX-NEXT:    .seh_endproc
 entry:
diff --git a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
index 6c68279b8d04ae..941bf0d63778f0 100644
--- a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
+++ b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
@@ -429,7 +429,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-KNL-NEXT:    kmovw %edx, %k1
 ; WIN64-KNL-NEXT:    callq func_float16_mask
 ; WIN64-KNL-NEXT:    nop
+; WIN64-KNL-NEXT:    .seh_beginepilogue
 ; WIN64-KNL-NEXT:    addq $40, %rsp
+; WIN64-KNL-NEXT:    .seh_endepilogue
 ; WIN64-KNL-NEXT:    retq
 ; WIN64-KNL-NEXT:    .seh_endproc
 ;
@@ -443,7 +445,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-SKX-NEXT:    kmovd %edx, %k1
 ; WIN64-SKX-NEXT:    callq func_float16_mask
 ; WIN64-SKX-NEXT:    nop
+; WIN64-SKX-NEXT:    .seh_beginepilogue
 ; WIN64-SKX-NEXT:    addq $40, %rsp
+; WIN64-SKX-NEXT:    .seh_endepilogue
 ; WIN64-SKX-NEXT:    retq
 ; WIN64-SKX-NEXT:    .seh_endproc
 ;
diff --git a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
index b3a0c7dffae117..d9efc35e6893b8 100644
--- a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
+++ b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
@@ -149,12 +149,14 @@ define dso_local i64 @caller_argv64i1() #0 {
 ; WIN64-NEXT:    callq test_argv64i1
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $48, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
 ; WIN64-NEXT:    popq %r12
 ; WIN64-NEXT:    popq %r14
 ; WIN64-NEXT:    popq %r15
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -256,9 +258,11 @@ define dso_local <64 x i1> @caller_retv64i1() #0 {
 ; WIN64-NEXT:    vpmovm2b %k0, %zmm0
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -334,10 +338,12 @@ define dso_local x86_regcallcc i32 @test_argv32i1(<32 x i1> %x0, <32 x i1> %x1,
 ; WIN64-NEXT:    vzeroupper
 ; WIN64-NEXT:    callq test_argv32i1helper
 ; WIN64-NEXT:    nop
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    movq %rbp, %rsp
 ; WIN64-NEXT:    popq %r10
 ; WIN64-NEXT:    popq %r11
 ; WIN64-NEXT:    popq %rbp
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -417,9 +423,11 @@ define dso_local i32 @caller_argv32i1() #0 {
 ; WIN64-NEXT:    callq test_argv32i1
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -480,9 +488,11 @@ define dso_local i32 @caller_retv32i1() #0 {
 ; WIN64-NEXT:    incl %eax
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-N...
[truncated]
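
The comment moved into `X86AsmPrinter::emitInstruction` in the patch above explains why a `nop` is emitted when a call immediately precedes the epilogue. The rule can be modelled in isolation as below. This is a simplified sketch, not the patch's code: `Kind` and `needsNopBeforeEpilog` are hypothetical names, and it assumes that pseudo instructions emit no bytes and can therefore be skipped when looking backwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction classification for this sketch only.
enum class Kind { Call, Other, Pseudo };

// Returns true when the nearest preceding instruction that emits bytes is a
// call, i.e. the call's return address would point at the first epilogue
// instruction. EpilogIdx is the index where the epilogue begins.
inline bool needsNopBeforeEpilog(const std::vector<Kind> &Insts,
                                 std::size_t EpilogIdx) {
  assert(EpilogIdx <= Insts.size() && "epilogue index out of range");
  for (std::size_t I = EpilogIdx; I > 0; --I) {
    Kind K = Insts[I - 1];
    if (K == Kind::Pseudo)
      continue; // assumed to emit no code, so keep looking backwards
    return K == Kind::Call;
  }
  return false; // nothing that emits code precedes the epilogue
}
```

This matches the shape of the updated test expectations, where `callq` is followed by `nop` and only then by `.seh_beginepilogue`.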

@llvmbot
Collaborator

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Daniel Paoliello (dpaoliello)

@@ -137,6 +141,8 @@ class COFFAsmParser : public MCAsmParserExtension {
   bool ParseSEHDirectiveHandlerData(StringRef, SMLoc);
   bool ParseSEHDirectiveAllocStack(StringRef, SMLoc);
   bool ParseSEHDirectiveEndProlog(StringRef, SMLoc);
+  bool ParseSEHDirectiveBeginEpilog(StringRef, SMLoc);
+  bool ParseSEHDirectiveEndEpilog(StringRef, SMLoc);
 
   bool ParseAtUnwindOrAtExcept(bool &unwind, bool &except);
   bool ParseDirectiveSymbolAttribute(StringRef Directive, SMLoc);
@@ -715,6 +721,18 @@ bool COFFAsmParser::ParseSEHDirectiveEndProlog(StringRef, SMLoc Loc) {
   return false;
 }
 
+bool COFFAsmParser::ParseSEHDirectiveBeginEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIBeginEpilogue(Loc);
+  return false;
+}
+
+bool COFFAsmParser::ParseSEHDirectiveEndEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIEndEpilogue(Loc);
+  return false;
+}
+
 bool COFFAsmParser::ParseAtUnwindOrAtExcept(bool &unwind, bool &except) {
   StringRef identifier;
   if (getLexer().isNot(AsmToken::At) && getLexer().isNot(AsmToken::Percent))
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index 13b162768578c5..b179aa1cef39c9 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -979,6 +979,26 @@ void MCStreamer::emitWinCFIEndProlog(SMLoc Loc) {
   CurFrame->PrologEnd = Label;
 }
 
+void MCStreamer::emitWinCFIBeginEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = true;
+  CurrentEpilog = emitCFILabel();
+}
+
+void MCStreamer::emitWinCFIEndEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = false;
+  MCSymbol *Label = emitCFILabel();
+  CurFrame->EpilogMap[CurrentEpilog].End = Label;
+  CurrentEpilog = nullptr;
+}
+
 void MCStreamer::emitCOFFSafeSEH(MCSymbol const *Symbol) {}
 
 void MCStreamer::emitCOFFSymbolIndex(MCSymbol const *Symbol) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
index ac441ae3b603ff..119dcc38edbfcd 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
@@ -100,12 +100,6 @@ class AArch64TargetELFStreamer : public AArch64TargetStreamer {
 };
 
 class AArch64TargetWinCOFFStreamer : public llvm::AArch64TargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
 public:
   AArch64TargetWinCOFFStreamer(llvm::MCStreamer &S)
     : AArch64TargetStreamer(S) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
index 208d43502cb88a..160768350a6b86 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
@@ -73,8 +73,8 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinUnwindCode(unsigned UnwindCode,
   if (!CurFrame)
     return;
   auto Inst = WinEH::Instruction(UnwindCode, /*Label=*/nullptr, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -183,13 +183,7 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIPrologEnd() {
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogStart() {
-  auto &S = getStreamer();
-  WinEH::FrameInfo *CurFrame = S.EnsureValidWinFrameInfo(SMLoc());
-  if (!CurFrame)
-    return;
-
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
+  getStreamer().emitWinCFIBeginEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
@@ -198,13 +192,10 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst =
       WinEH::Instruction(Win64EH::UOP_End, /*Label=*/nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFITrapFrame() {
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
index e66059c2a0e096..b541755cf6b621 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
@@ -77,13 +77,6 @@ llvm::createARMWinCOFFStreamer(MCContext &Context,
 
 namespace {
 class ARMTargetWinCOFFStreamer : public llvm::ARMTargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
-
 public:
   ARMTargetWinCOFFStreamer(llvm::MCStreamer &S) : ARMTargetStreamer(S) {}
 
@@ -114,8 +107,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinUnwindCode(unsigned UnwindCode,
     return;
   MCSymbol *Label = S.emitCFILabel();
   auto Inst = WinEH::Instruction(UnwindCode, Label, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -224,9 +217,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogStart(unsigned Condition) {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].Condition = Condition;
+  S.emitWinCFIBeginEpilogue();
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Condition = Condition;
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
@@ -235,14 +227,14 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  if (!CurrentEpilog) {
+  if (!S.getCurrentEpilog()) {
     S.getContext().reportError(SMLoc(), "Stray .seh_endepilogue in " +
                                             CurFrame->Function->getName());
     return;
   }
 
   std::vector<WinEH::Instruction> &Epilog =
-      CurFrame->EpilogMap[CurrentEpilog].Instructions;
+      CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions;
 
   unsigned UnwindCode = Win64EH::UOP_End;
   if (!Epilog.empty()) {
@@ -256,12 +248,9 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
     }
   }
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst = WinEH::Instruction(UnwindCode, nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFICustom(unsigned Opcode) {
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 4f83267c999e4a..f49c6c1125d613 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -2549,14 +2549,8 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
     --MBBI;
   }
 
-  // Windows unwinder will not invoke function's exception handler if IP is
-  // either in prologue or in epilogue.  This behavior causes a problem when a
-  // call immediately precedes an epilogue, because the return address points
-  // into the epilogue.  To cope with that, we insert an epilogue marker here,
-  // then replace it with a 'nop' if it ends up immediately after a CALL in the
-  // final emitted code.
   if (NeedsWin64CFI && MF.hasWinCFI())
-    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_Epilogue));
+    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_BeginEpilogue));
 
   if (!HasFP && NeedsDwarfCFI) {
     MBBI = FirstCSPop;
@@ -2601,6 +2595,9 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
   // Emit tilerelease for AMX kernel.
   if (X86FI->getAMXProgModel() == AMXProgModelEnum::ManagedRA)
     BuildMI(MBB, Terminator, DL, TII.get(X86::TILERELEASE));
+
+  if (NeedsWin64CFI && MF.hasWinCFI())
+    BuildMI(MBB, Terminator, DL, TII.get(X86::SEH_EndEpilogue));
 }
 
 StackOffset X86FrameLowering::getFrameIndexReference(const MachineFunction &MF,
diff --git a/llvm/lib/Target/X86/X86InstrCompiler.td b/llvm/lib/Target/X86/X86InstrCompiler.td
index 5a8177e2b3607b..f8f572d662aa1a 100644
--- a/llvm/lib/Target/X86/X86InstrCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrCompiler.td
@@ -235,7 +235,7 @@ let isBranch = 1, isTerminator = 1, isCodeGenOnly = 1 in {
 //===----------------------------------------------------------------------===//
 // Pseudo instructions used by unwind info.
 //
-let isPseudo = 1, SchedRW = [WriteSystem] in {
+let isPseudo = 1, isMeta = 1, SchedRW = [WriteSystem] in {
   def SEH_PushReg : I<0, Pseudo, (outs), (ins i32imm:$reg),
                             "#SEH_PushReg $reg", []>;
   def SEH_SaveReg : I<0, Pseudo, (outs), (ins i32imm:$reg, i32imm:$dst),
@@ -252,8 +252,10 @@ let isPseudo = 1, SchedRW = [WriteSystem] in {
                             "#SEH_PushFrame $mode", []>;
   def SEH_EndPrologue : I<0, Pseudo, (outs), (ins),
                             "#SEH_EndPrologue", []>;
-  def SEH_Epilogue : I<0, Pseudo, (outs), (ins),
-                            "#SEH_Epilogue", []>;
+  def SEH_BeginEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_BeginEpilogue", []>;
+  def SEH_EndEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_EndEpilogue", []>;
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 24db39c4e98b96..83c7ac6562b854 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1781,6 +1781,14 @@ void X86AsmPrinter::EmitSEHInstruction(const MachineInstr *MI) {
     OutStreamer->emitWinCFIEndProlog();
     break;
 
+  case X86::SEH_BeginEpilogue:
+    OutStreamer->emitWinCFIBeginEpilogue();
+    break;
+
+  case X86::SEH_EndEpilogue:
+    OutStreamer->emitWinCFIEndEpilogue();
+    break;
+
   default:
     llvm_unreachable("expected SEH_ instruction");
   }
@@ -2422,11 +2430,17 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
   case X86::SEH_SetFrame:
   case X86::SEH_PushFrame:
   case X86::SEH_EndPrologue:
+  case X86::SEH_EndEpilogue:
     EmitSEHInstruction(MI);
     return;
 
-  case X86::SEH_Epilogue: {
+  case X86::SEH_BeginEpilogue: {
     assert(MF->hasWinCFI() && "SEH_ instruction in function without WinCFI?");
+    // Windows unwinder will not invoke function's exception handler if IP is
+    // either in prologue or in epilogue.  This behavior causes a problem when a
+    // call immediately precedes an epilogue, because the return address points
+    // into the epilogue.  To cope with that, we insert a 'nop' if it ends up
+    // immediately after a CALL in the final emitted code.
     MachineBasicBlock::const_iterator MBBI(MI);
     // Check if preceded by a call and emit nop if so.
     for (MBBI = PrevCrossBBInst(MBBI);
@@ -2441,6 +2455,8 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
         break;
       }
     }
+
+    EmitSEHInstruction(MI);
     return;
   }
   case X86::UBSAN_UD1:
diff --git a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
index 6c9fdc2adce2ff..224e4c1cc09f8d 100644
--- a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
+++ b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
@@ -142,6 +142,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    xorl %eax, %eax
 ; WIN-REF-NEXT:    callq *%rax
 ; WIN-REF-NEXT:    nop
+; WIN-REF-NEXT:    .seh_beginepilogue
 ; WIN-REF-NEXT:    addq $56, %rsp
 ; WIN-REF-NEXT:    popq %rbx
 ; WIN-REF-NEXT:    popq %rbp
@@ -149,6 +150,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    popq %r13
 ; WIN-REF-NEXT:    popq %r14
 ; WIN-REF-NEXT:    popq %r15
+; WIN-REF-NEXT:    .seh_endepilogue
 ; WIN-REF-NEXT:    retq
 ; WIN-REF-NEXT:    .seh_endproc
 ;
@@ -173,11 +175,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-NEXT:    xorl %eax, %eax
 ; WIN-NEXT:    callq *%rax
 ; WIN-NEXT:    nop
+; WIN-NEXT:    .seh_beginepilogue
 ; WIN-NEXT:    addq $64, %rsp
 ; WIN-NEXT:    pop2 %rbp, %rbx
 ; WIN-NEXT:    pop2 %r13, %r12
 ; WIN-NEXT:    pop2 %r15, %r14
 ; WIN-NEXT:    popq %rcx
+; WIN-NEXT:    .seh_endepilogue
 ; WIN-NEXT:    retq
 ; WIN-NEXT:    .seh_endproc
 ;
@@ -202,11 +206,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-PPX-NEXT:    xorl %eax, %eax
 ; WIN-PPX-NEXT:    callq *%rax
 ; WIN-PPX-NEXT:    nop
+; WIN-PPX-NEXT:    .seh_beginepilogue
 ; WIN-PPX-NEXT:    addq $64, %rsp
 ; WIN-PPX-NEXT:    pop2p %rbp, %rbx
 ; WIN-PPX-NEXT:    pop2p %r13, %r12
 ; WIN-PPX-NEXT:    pop2p %r15, %r14
 ; WIN-PPX-NEXT:    popq %rcx
+; WIN-PPX-NEXT:    .seh_endepilogue
 ; WIN-PPX-NEXT:    retq
 ; WIN-PPX-NEXT:    .seh_endproc
 entry:
diff --git a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
index 6c68279b8d04ae..941bf0d63778f0 100644
--- a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
+++ b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
@@ -429,7 +429,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-KNL-NEXT:    kmovw %edx, %k1
 ; WIN64-KNL-NEXT:    callq func_float16_mask
 ; WIN64-KNL-NEXT:    nop
+; WIN64-KNL-NEXT:    .seh_beginepilogue
 ; WIN64-KNL-NEXT:    addq $40, %rsp
+; WIN64-KNL-NEXT:    .seh_endepilogue
 ; WIN64-KNL-NEXT:    retq
 ; WIN64-KNL-NEXT:    .seh_endproc
 ;
@@ -443,7 +445,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-SKX-NEXT:    kmovd %edx, %k1
 ; WIN64-SKX-NEXT:    callq func_float16_mask
 ; WIN64-SKX-NEXT:    nop
+; WIN64-SKX-NEXT:    .seh_beginepilogue
 ; WIN64-SKX-NEXT:    addq $40, %rsp
+; WIN64-SKX-NEXT:    .seh_endepilogue
 ; WIN64-SKX-NEXT:    retq
 ; WIN64-SKX-NEXT:    .seh_endproc
 ;
diff --git a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
index b3a0c7dffae117..d9efc35e6893b8 100644
--- a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
+++ b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
@@ -149,12 +149,14 @@ define dso_local i64 @caller_argv64i1() #0 {
 ; WIN64-NEXT:    callq test_argv64i1
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $48, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
 ; WIN64-NEXT:    popq %r12
 ; WIN64-NEXT:    popq %r14
 ; WIN64-NEXT:    popq %r15
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -256,9 +258,11 @@ define dso_local <64 x i1> @caller_retv64i1() #0 {
 ; WIN64-NEXT:    vpmovm2b %k0, %zmm0
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -334,10 +338,12 @@ define dso_local x86_regcallcc i32 @test_argv32i1(<32 x i1> %x0, <32 x i1> %x1,
 ; WIN64-NEXT:    vzeroupper
 ; WIN64-NEXT:    callq test_argv32i1helper
 ; WIN64-NEXT:    nop
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    movq %rbp, %rsp
 ; WIN64-NEXT:    popq %r10
 ; WIN64-NEXT:    popq %r11
 ; WIN64-NEXT:    popq %rbp
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -417,9 +423,11 @@ define dso_local i32 @caller_argv32i1() #0 {
 ; WIN64-NEXT:    callq test_argv32i1
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -480,9 +488,11 @@ define dso_local i32 @caller_retv32i1() #0 {
 ; WIN64-NEXT:    incl %eax
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-N...
[truncated]
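
The `X86MCInstLower.cpp` hunk above keeps the existing rule that a `nop` is emitted when a call immediately precedes the epilogue, and now emits the begin marker after it (the updated tests show `callq`, `nop`, `.seh_beginepilogue` in that order). A minimal standalone sketch of that decision follows. It is not the LLVM code: `Kind`, `needsNopBeforeEpilogue`, and `emitEpilogueStart` are hypothetical names, and skipping pseudo instructions while walking backwards is an assumption, since the loop body is elided in the hunk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical classification of the instructions that precede the epilogue.
enum class Kind { Call, Pseudo, Other };

// Walk backwards over the preceding instructions. Pseudo instructions are
// assumed to emit no bytes, so they are skipped (assumption, see lead-in).
// Returns true when the nearest byte-emitting instruction is a call.
bool needsNopBeforeEpilogue(const std::vector<Kind> &Preceding) {
  for (auto It = Preceding.rbegin(); It != Preceding.rend(); ++It) {
    if (*It == Kind::Pseudo)
      continue;
    return *It == Kind::Call;
  }
  return false; // Nothing precedes the epilogue.
}

// Produce the text emitted at the start of an epilogue: an optional nop,
// then the begin marker, matching the order seen in the updated tests.
std::vector<std::string> emitEpilogueStart(const std::vector<Kind> &Preceding) {
  std::vector<std::string> Out;
  if (needsNopBeforeEpilogue(Preceding))
    Out.push_back("nop");
  Out.push_back(".seh_beginepilogue");
  return Out;
}
```

Placing the `nop` before the marker means the return address of the call falls outside the marked epilogue range, which is the situation the comment in the patch describes.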

@llvmbot
Collaborator

llvmbot commented Sep 25, 2024

@llvm/pr-subscribers-debuginfo

Author: Daniel Paoliello (dpaoliello)

Changes

Windows x64 Unwind V2 adds epilog information to unwind data: specifically, the length of the epilog and the offset of each epilog.

The first step to do this is to add markers to the beginning and end of each epilog when generating Windows x64 code. I've modelled this after how LLVM was marking ARM and AArch64 epilogs in Windows (and unified the code between the three).
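
As a rough illustration of what the begin/end markers make possible, the sketch below tracks an open epilog and records its range, from which a length can be derived. This is a simplified model, not the patch itself: `EpilogRange` and `EpilogTracker` are hypothetical names, and byte offsets stand in for the `MCSymbol` labels that the real `MCStreamer` stores in `WinEH::FrameInfo::EpilogMap`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one epilog; offsets replace the MCSymbol labels
// used by the real implementation.
struct EpilogRange {
  std::size_t Begin = 0;
  std::size_t End = 0;
  std::size_t length() const { return End - Begin; }
};

class EpilogTracker {
  bool InEpilog = false;        // Plays the role of InEpilogCFI.
  std::size_t CurrentBegin = 0; // Plays the role of CurrentEpilog.
  std::vector<EpilogRange> Epilogs;

public:
  // Corresponds to seeing .seh_beginepilogue at the given offset.
  void beginEpilogue(std::size_t Offset) {
    InEpilog = true;
    CurrentBegin = Offset;
  }

  // Corresponds to seeing .seh_endepilogue at the given offset.
  // Returns false for a stray end marker with no matching begin.
  bool endEpilogue(std::size_t Offset) {
    if (!InEpilog)
      return false;
    EpilogRange R;
    R.Begin = CurrentBegin;
    R.End = Offset;
    Epilogs.push_back(R);
    InEpilog = false;
    return true;
  }

  bool inEpilog() const { return InEpilog; }
  const std::vector<EpilogRange> &epilogs() const { return Epilogs; }
};
```

The stray-marker check here is an addition for illustration; in the patch, only the ARM target streamer reports a stray `.seh_endepilogue`.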


Patch is 79.87 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110024.diff

diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 707aecc5dc578e..c3770ffa6a5dc8 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -252,6 +252,12 @@ class MCStreamer {
   bool AllowAutoPadding = false;
 
 protected:
+  // True if we are processing SEH directives in an epilogue.
+  bool InEpilogCFI = false;
+
+  // Symbol of the current epilog for which we are processing SEH directives.
+  MCSymbol *CurrentEpilog = nullptr;
+
   MCFragment *CurFrag = nullptr;
 
   MCStreamer(MCContext &Ctx);
@@ -331,6 +337,10 @@ class MCStreamer {
     return WinFrameInfos;
   }
 
+  MCSymbol *getCurrentEpilog() const { return CurrentEpilog; }
+
+  bool isInEpilogCFI() const { return InEpilogCFI; }
+
   void generateCompactUnwindEncodings(MCAsmBackend *MAB);
 
   /// \name Assembly File Formatting.
@@ -1043,6 +1053,8 @@ class MCStreamer {
                                  SMLoc Loc = SMLoc());
   virtual void emitWinCFIPushFrame(bool Code, SMLoc Loc = SMLoc());
   virtual void emitWinCFIEndProlog(SMLoc Loc = SMLoc());
+  virtual void emitWinCFIBeginEpilogue(SMLoc Loc = SMLoc());
+  virtual void emitWinCFIEndEpilogue(SMLoc Loc = SMLoc());
   virtual void emitWinEHHandler(const MCSymbol *Sym, bool Unwind, bool Except,
                                 SMLoc Loc = SMLoc());
   virtual void emitWinEHHandlerData(SMLoc Loc = SMLoc());
diff --git a/llvm/lib/MC/MCAsmStreamer.cpp b/llvm/lib/MC/MCAsmStreamer.cpp
index 31b519a3e5c56a..34bfa139cea290 100644
--- a/llvm/lib/MC/MCAsmStreamer.cpp
+++ b/llvm/lib/MC/MCAsmStreamer.cpp
@@ -391,6 +391,8 @@ class MCAsmStreamer final : public MCStreamer {
                          SMLoc Loc) override;
   void emitWinCFIPushFrame(bool Code, SMLoc Loc) override;
   void emitWinCFIEndProlog(SMLoc Loc) override;
+  void emitWinCFIBeginEpilogue(SMLoc Loc) override;
+  void emitWinCFIEndEpilogue(SMLoc Loc) override;
 
   void emitWinEHHandler(const MCSymbol *Sym, bool Unwind, bool Except,
                         SMLoc Loc) override;
@@ -2306,6 +2308,20 @@ void MCAsmStreamer::emitWinCFIEndProlog(SMLoc Loc) {
   EmitEOL();
 }
 
+void MCAsmStreamer::emitWinCFIBeginEpilogue(SMLoc Loc) {
+  MCStreamer::emitWinCFIBeginEpilogue(Loc);
+
+  OS << "\t.seh_beginepilogue";
+  EmitEOL();
+}
+
+void MCAsmStreamer::emitWinCFIEndEpilogue(SMLoc Loc) {
+  MCStreamer::emitWinCFIEndEpilogue(Loc);
+
+  OS << "\t.seh_endepilogue";
+  EmitEOL();
+}
+
 void MCAsmStreamer::emitCGProfileEntry(const MCSymbolRefExpr *From,
                                        const MCSymbolRefExpr *To,
                                        uint64_t Count) {
diff --git a/llvm/lib/MC/MCParser/COFFAsmParser.cpp b/llvm/lib/MC/MCParser/COFFAsmParser.cpp
index a69276c36c56b3..22e72292966f46 100644
--- a/llvm/lib/MC/MCParser/COFFAsmParser.cpp
+++ b/llvm/lib/MC/MCParser/COFFAsmParser.cpp
@@ -90,6 +90,10 @@ class COFFAsmParser : public MCAsmParserExtension {
                                                              ".seh_stackalloc");
     addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveEndProlog>(
                                                             ".seh_endprologue");
+    addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveBeginEpilog>(
+        ".seh_beginepilogue");
+    addDirectiveHandler<&COFFAsmParser::ParseSEHDirectiveEndEpilog>(
+        ".seh_endepilogue");
   }
 
   bool ParseSectionDirectiveText(StringRef, SMLoc) {
@@ -137,6 +141,8 @@ class COFFAsmParser : public MCAsmParserExtension {
   bool ParseSEHDirectiveHandlerData(StringRef, SMLoc);
   bool ParseSEHDirectiveAllocStack(StringRef, SMLoc);
   bool ParseSEHDirectiveEndProlog(StringRef, SMLoc);
+  bool ParseSEHDirectiveBeginEpilog(StringRef, SMLoc);
+  bool ParseSEHDirectiveEndEpilog(StringRef, SMLoc);
 
   bool ParseAtUnwindOrAtExcept(bool &unwind, bool &except);
   bool ParseDirectiveSymbolAttribute(StringRef Directive, SMLoc);
@@ -715,6 +721,18 @@ bool COFFAsmParser::ParseSEHDirectiveEndProlog(StringRef, SMLoc Loc) {
   return false;
 }
 
+bool COFFAsmParser::ParseSEHDirectiveBeginEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIBeginEpilogue(Loc);
+  return false;
+}
+
+bool COFFAsmParser::ParseSEHDirectiveEndEpilog(StringRef, SMLoc Loc) {
+  Lex();
+  getStreamer().emitWinCFIEndEpilogue(Loc);
+  return false;
+}
+
 bool COFFAsmParser::ParseAtUnwindOrAtExcept(bool &unwind, bool &except) {
   StringRef identifier;
   if (getLexer().isNot(AsmToken::At) && getLexer().isNot(AsmToken::Percent))
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index 13b162768578c5..b179aa1cef39c9 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -979,6 +979,26 @@ void MCStreamer::emitWinCFIEndProlog(SMLoc Loc) {
   CurFrame->PrologEnd = Label;
 }
 
+void MCStreamer::emitWinCFIBeginEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = true;
+  CurrentEpilog = emitCFILabel();
+}
+
+void MCStreamer::emitWinCFIEndEpilogue(SMLoc Loc) {
+  WinEH::FrameInfo *CurFrame = EnsureValidWinFrameInfo(Loc);
+  if (!CurFrame)
+    return;
+
+  InEpilogCFI = false;
+  MCSymbol *Label = emitCFILabel();
+  CurFrame->EpilogMap[CurrentEpilog].End = Label;
+  CurrentEpilog = nullptr;
+}
+
 void MCStreamer::emitCOFFSafeSEH(MCSymbol const *Symbol) {}
 
 void MCStreamer::emitCOFFSymbolIndex(MCSymbol const *Symbol) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
index ac441ae3b603ff..119dcc38edbfcd 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.h
@@ -100,12 +100,6 @@ class AArch64TargetELFStreamer : public AArch64TargetStreamer {
 };
 
 class AArch64TargetWinCOFFStreamer : public llvm::AArch64TargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
 public:
   AArch64TargetWinCOFFStreamer(llvm::MCStreamer &S)
     : AArch64TargetStreamer(S) {}
diff --git a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
index 208d43502cb88a..160768350a6b86 100644
--- a/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
+++ b/llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
@@ -73,8 +73,8 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinUnwindCode(unsigned UnwindCode,
   if (!CurFrame)
     return;
   auto Inst = WinEH::Instruction(UnwindCode, /*Label=*/nullptr, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -183,13 +183,7 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIPrologEnd() {
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogStart() {
-  auto &S = getStreamer();
-  WinEH::FrameInfo *CurFrame = S.EnsureValidWinFrameInfo(SMLoc());
-  if (!CurFrame)
-    return;
-
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
+  getStreamer().emitWinCFIBeginEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
@@ -198,13 +192,10 @@ void AArch64TargetWinCOFFStreamer::emitARM64WinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst =
       WinEH::Instruction(Win64EH::UOP_End, /*Label=*/nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void AArch64TargetWinCOFFStreamer::emitARM64WinCFITrapFrame() {
diff --git a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
index e66059c2a0e096..b541755cf6b621 100644
--- a/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
+++ b/llvm/lib/Target/ARM/MCTargetDesc/ARMWinCOFFStreamer.cpp
@@ -77,13 +77,6 @@ llvm::createARMWinCOFFStreamer(MCContext &Context,
 
 namespace {
 class ARMTargetWinCOFFStreamer : public llvm::ARMTargetStreamer {
-private:
-  // True if we are processing SEH directives in an epilogue.
-  bool InEpilogCFI = false;
-
-  // Symbol of the current epilog for which we are processing SEH directives.
-  MCSymbol *CurrentEpilog = nullptr;
-
 public:
   ARMTargetWinCOFFStreamer(llvm::MCStreamer &S) : ARMTargetStreamer(S) {}
 
@@ -114,8 +107,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinUnwindCode(unsigned UnwindCode,
     return;
   MCSymbol *Label = S.emitCFILabel();
   auto Inst = WinEH::Instruction(UnwindCode, Label, Reg, Offset);
-  if (InEpilogCFI)
-    CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
+  if (S.isInEpilogCFI())
+    CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
   else
     CurFrame->Instructions.push_back(Inst);
 }
@@ -224,9 +217,8 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogStart(unsigned Condition) {
   if (!CurFrame)
     return;
 
-  InEpilogCFI = true;
-  CurrentEpilog = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].Condition = Condition;
+  S.emitWinCFIBeginEpilogue();
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Condition = Condition;
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
@@ -235,14 +227,14 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
   if (!CurFrame)
     return;
 
-  if (!CurrentEpilog) {
+  if (!S.getCurrentEpilog()) {
     S.getContext().reportError(SMLoc(), "Stray .seh_endepilogue in " +
                                             CurFrame->Function->getName());
     return;
   }
 
   std::vector<WinEH::Instruction> &Epilog =
-      CurFrame->EpilogMap[CurrentEpilog].Instructions;
+      CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions;
 
   unsigned UnwindCode = Win64EH::UOP_End;
   if (!Epilog.empty()) {
@@ -256,12 +248,9 @@ void ARMTargetWinCOFFStreamer::emitARMWinCFIEpilogEnd() {
     }
   }
 
-  InEpilogCFI = false;
   WinEH::Instruction Inst = WinEH::Instruction(UnwindCode, nullptr, -1, 0);
-  CurFrame->EpilogMap[CurrentEpilog].Instructions.push_back(Inst);
-  MCSymbol *Label = S.emitCFILabel();
-  CurFrame->EpilogMap[CurrentEpilog].End = Label;
-  CurrentEpilog = nullptr;
+  CurFrame->EpilogMap[S.getCurrentEpilog()].Instructions.push_back(Inst);
+  S.emitWinCFIEndEpilogue();
 }
 
 void ARMTargetWinCOFFStreamer::emitARMWinCFICustom(unsigned Opcode) {
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 4f83267c999e4a..f49c6c1125d613 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -2549,14 +2549,8 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
     --MBBI;
   }
 
-  // Windows unwinder will not invoke function's exception handler if IP is
-  // either in prologue or in epilogue.  This behavior causes a problem when a
-  // call immediately precedes an epilogue, because the return address points
-  // into the epilogue.  To cope with that, we insert an epilogue marker here,
-  // then replace it with a 'nop' if it ends up immediately after a CALL in the
-  // final emitted code.
   if (NeedsWin64CFI && MF.hasWinCFI())
-    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_Epilogue));
+    BuildMI(MBB, MBBI, DL, TII.get(X86::SEH_BeginEpilogue));
 
   if (!HasFP && NeedsDwarfCFI) {
     MBBI = FirstCSPop;
@@ -2601,6 +2595,9 @@ void X86FrameLowering::emitEpilogue(MachineFunction &MF,
   // Emit tilerelease for AMX kernel.
   if (X86FI->getAMXProgModel() == AMXProgModelEnum::ManagedRA)
     BuildMI(MBB, Terminator, DL, TII.get(X86::TILERELEASE));
+
+  if (NeedsWin64CFI && MF.hasWinCFI())
+    BuildMI(MBB, Terminator, DL, TII.get(X86::SEH_EndEpilogue));
 }
 
 StackOffset X86FrameLowering::getFrameIndexReference(const MachineFunction &MF,
diff --git a/llvm/lib/Target/X86/X86InstrCompiler.td b/llvm/lib/Target/X86/X86InstrCompiler.td
index 5a8177e2b3607b..f8f572d662aa1a 100644
--- a/llvm/lib/Target/X86/X86InstrCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrCompiler.td
@@ -235,7 +235,7 @@ let isBranch = 1, isTerminator = 1, isCodeGenOnly = 1 in {
 //===----------------------------------------------------------------------===//
 // Pseudo instructions used by unwind info.
 //
-let isPseudo = 1, SchedRW = [WriteSystem] in {
+let isPseudo = 1, isMeta = 1, SchedRW = [WriteSystem] in {
   def SEH_PushReg : I<0, Pseudo, (outs), (ins i32imm:$reg),
                             "#SEH_PushReg $reg", []>;
   def SEH_SaveReg : I<0, Pseudo, (outs), (ins i32imm:$reg, i32imm:$dst),
@@ -252,8 +252,10 @@ let isPseudo = 1, SchedRW = [WriteSystem] in {
                             "#SEH_PushFrame $mode", []>;
   def SEH_EndPrologue : I<0, Pseudo, (outs), (ins),
                             "#SEH_EndPrologue", []>;
-  def SEH_Epilogue : I<0, Pseudo, (outs), (ins),
-                            "#SEH_Epilogue", []>;
+  def SEH_BeginEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_BeginEpilogue", []>;
+  def SEH_EndEpilogue : I<0, Pseudo, (outs), (ins),
+                            "#SEH_EndEpilogue", []>;
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 24db39c4e98b96..83c7ac6562b854 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1781,6 +1781,14 @@ void X86AsmPrinter::EmitSEHInstruction(const MachineInstr *MI) {
     OutStreamer->emitWinCFIEndProlog();
     break;
 
+  case X86::SEH_BeginEpilogue:
+    OutStreamer->emitWinCFIBeginEpilogue();
+    break;
+
+  case X86::SEH_EndEpilogue:
+    OutStreamer->emitWinCFIEndEpilogue();
+    break;
+
   default:
     llvm_unreachable("expected SEH_ instruction");
   }
@@ -2422,11 +2430,17 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
   case X86::SEH_SetFrame:
   case X86::SEH_PushFrame:
   case X86::SEH_EndPrologue:
+  case X86::SEH_EndEpilogue:
     EmitSEHInstruction(MI);
     return;
 
-  case X86::SEH_Epilogue: {
+  case X86::SEH_BeginEpilogue: {
     assert(MF->hasWinCFI() && "SEH_ instruction in function without WinCFI?");
+    // Windows unwinder will not invoke function's exception handler if IP is
+    // either in prologue or in epilogue.  This behavior causes a problem when a
+    // call immediately precedes an epilogue, because the return address points
+    // into the epilogue.  To cope with that, we insert a 'nop' if it ends up
+    // immediately after a CALL in the final emitted code.
     MachineBasicBlock::const_iterator MBBI(MI);
     // Check if preceded by a call and emit nop if so.
     for (MBBI = PrevCrossBBInst(MBBI);
@@ -2441,6 +2455,8 @@ void X86AsmPrinter::emitInstruction(const MachineInstr *MI) {
         break;
       }
     }
+
+    EmitSEHInstruction(MI);
     return;
   }
   case X86::UBSAN_UD1:
diff --git a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
index 6c9fdc2adce2ff..224e4c1cc09f8d 100644
--- a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
+++ b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
@@ -142,6 +142,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    xorl %eax, %eax
 ; WIN-REF-NEXT:    callq *%rax
 ; WIN-REF-NEXT:    nop
+; WIN-REF-NEXT:    .seh_beginepilogue
 ; WIN-REF-NEXT:    addq $56, %rsp
 ; WIN-REF-NEXT:    popq %rbx
 ; WIN-REF-NEXT:    popq %rbp
@@ -149,6 +150,7 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-REF-NEXT:    popq %r13
 ; WIN-REF-NEXT:    popq %r14
 ; WIN-REF-NEXT:    popq %r15
+; WIN-REF-NEXT:    .seh_endepilogue
 ; WIN-REF-NEXT:    retq
 ; WIN-REF-NEXT:    .seh_endproc
 ;
@@ -173,11 +175,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-NEXT:    xorl %eax, %eax
 ; WIN-NEXT:    callq *%rax
 ; WIN-NEXT:    nop
+; WIN-NEXT:    .seh_beginepilogue
 ; WIN-NEXT:    addq $64, %rsp
 ; WIN-NEXT:    pop2 %rbp, %rbx
 ; WIN-NEXT:    pop2 %r13, %r12
 ; WIN-NEXT:    pop2 %r15, %r14
 ; WIN-NEXT:    popq %rcx
+; WIN-NEXT:    .seh_endepilogue
 ; WIN-NEXT:    retq
 ; WIN-NEXT:    .seh_endproc
 ;
@@ -202,11 +206,13 @@ define i32 @csr6_alloc16(ptr %argv) {
 ; WIN-PPX-NEXT:    xorl %eax, %eax
 ; WIN-PPX-NEXT:    callq *%rax
 ; WIN-PPX-NEXT:    nop
+; WIN-PPX-NEXT:    .seh_beginepilogue
 ; WIN-PPX-NEXT:    addq $64, %rsp
 ; WIN-PPX-NEXT:    pop2p %rbp, %rbx
 ; WIN-PPX-NEXT:    pop2p %r13, %r12
 ; WIN-PPX-NEXT:    pop2p %r15, %r14
 ; WIN-PPX-NEXT:    popq %rcx
+; WIN-PPX-NEXT:    .seh_endepilogue
 ; WIN-PPX-NEXT:    retq
 ; WIN-PPX-NEXT:    .seh_endproc
 entry:
diff --git a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
index 6c68279b8d04ae..941bf0d63778f0 100644
--- a/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
+++ b/llvm/test/CodeGen/X86/avx512-intel-ocl.ll
@@ -429,7 +429,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-KNL-NEXT:    kmovw %edx, %k1
 ; WIN64-KNL-NEXT:    callq func_float16_mask
 ; WIN64-KNL-NEXT:    nop
+; WIN64-KNL-NEXT:    .seh_beginepilogue
 ; WIN64-KNL-NEXT:    addq $40, %rsp
+; WIN64-KNL-NEXT:    .seh_endepilogue
 ; WIN64-KNL-NEXT:    retq
 ; WIN64-KNL-NEXT:    .seh_endproc
 ;
@@ -443,7 +445,9 @@ define <16 x float> @testf16_inp_mask(<16 x float> %a, i16 %mask)  {
 ; WIN64-SKX-NEXT:    kmovd %edx, %k1
 ; WIN64-SKX-NEXT:    callq func_float16_mask
 ; WIN64-SKX-NEXT:    nop
+; WIN64-SKX-NEXT:    .seh_beginepilogue
 ; WIN64-SKX-NEXT:    addq $40, %rsp
+; WIN64-SKX-NEXT:    .seh_endepilogue
 ; WIN64-SKX-NEXT:    retq
 ; WIN64-SKX-NEXT:    .seh_endproc
 ;
diff --git a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
index b3a0c7dffae117..d9efc35e6893b8 100644
--- a/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
+++ b/llvm/test/CodeGen/X86/avx512-regcall-Mask.ll
@@ -149,12 +149,14 @@ define dso_local i64 @caller_argv64i1() #0 {
 ; WIN64-NEXT:    callq test_argv64i1
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $48, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
 ; WIN64-NEXT:    popq %r12
 ; WIN64-NEXT:    popq %r14
 ; WIN64-NEXT:    popq %r15
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -256,9 +258,11 @@ define dso_local <64 x i1> @caller_retv64i1() #0 {
 ; WIN64-NEXT:    vpmovm2b %k0, %zmm0
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -334,10 +338,12 @@ define dso_local x86_regcallcc i32 @test_argv32i1(<32 x i1> %x0, <32 x i1> %x1,
 ; WIN64-NEXT:    vzeroupper
 ; WIN64-NEXT:    callq test_argv32i1helper
 ; WIN64-NEXT:    nop
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    movq %rbp, %rsp
 ; WIN64-NEXT:    popq %r10
 ; WIN64-NEXT:    popq %r11
 ; WIN64-NEXT:    popq %rbp
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -417,9 +423,11 @@ define dso_local i32 @caller_argv32i1() #0 {
 ; WIN64-NEXT:    callq test_argv32i1
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-NEXT:    addq $40, %rsp
 ; WIN64-NEXT:    popq %rdi
 ; WIN64-NEXT:    popq %rsi
+; WIN64-NEXT:    .seh_endepilogue
 ; WIN64-NEXT:    retq
 ; WIN64-NEXT:    .seh_endproc
 ;
@@ -480,9 +488,11 @@ define dso_local i32 @caller_retv32i1() #0 {
 ; WIN64-NEXT:    incl %eax
 ; WIN64-NEXT:    vmovaps (%rsp), %xmm6 # 16-byte Reload
 ; WIN64-NEXT:    vmovaps {{[-0-9]+}}(%r{{[sb]}}p), %xmm7 # 16-byte Reload
+; WIN64-NEXT:    .seh_beginepilogue
 ; WIN64-N...
[truncated]
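
The `SEH_BeginEpilogue` case in the `X86AsmPrinter::emitInstruction` hunk above keeps the existing "nop after call" behaviour: it walks backwards from the epilogue marker and emits a `nop` if the previous instruction is a call. A minimal, self-contained sketch of that check follows. It uses a toy `Inst` type rather than LLVM's `MachineInstr`, and the skipping rule (ignore anything that emits no bytes) is an assumption, since the hunk is truncated before the loop body. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction; this is not LLVM's MachineInstr.
struct Inst {
  bool IsCall; // a real call instruction
  bool IsMeta; // emits no bytes (for example an SEH_* pseudo)
};

// Returns true if a 'nop' is needed before the epilogue marker at EpilogIdx,
// i.e. if the nearest preceding instruction that emits bytes is a call.
// Assumption: instructions that emit no bytes are skipped while searching.
bool needsNopBeforeEpilogue(const std::vector<Inst> &Insts,
                            std::size_t EpilogIdx) {
  assert(EpilogIdx <= Insts.size() && "marker index out of range");
  for (std::size_t I = EpilogIdx; I > 0; --I) {
    const Inst &Prev = Insts[I - 1];
    if (Prev.IsMeta)
      continue; // produces no code, keep looking further back
    return Prev.IsCall;
  }
  return false; // nothing that emits code precedes the epilogue
}
```

Without the `nop`, the return address of the call would point at the first epilogue instruction, which is the situation the comment in the hunk describes.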

@mstorsjo
Member

Can you share some high level summary of what x64 unwind v2 is? I presume it is some extension to the unwind info format used currently; is there some public documentation for the additions? Are the additions backwards compatible, or can they be enabled only if the target is known to be new enough?

@pmsjt

pmsjt commented Sep 25, 2024

EpilogV2 isn't publicly documented (yet) but there are a couple of high-level characteristics that I'll share to address some of the questions:

With the original ABI (V1) definition, when unwinding the stack, the original unwinder was forced to disassemble the instruction stream, conclude whether execution was in the epilog and, if yes, essentially emulate the instructions forward. UWOPs were only used for prolog and body. As a result of this, epilogs must follow some canonical format rules which the unwinder can detect and emulate.

EpilogV2's goal is to enable stack unwinding without having to access the instruction stream, but in a backward compatible way. This means that functions compiled with EpilogV2 must still abide by the epilog canonical rules of V1, so an older unwinder (not V2 aware) can still do the right thing, by employing the old strategy to detect and emulate x86 instructions when in the epilog.

So, when the EpilogV2-aware unwinder encounters an EpilogV2-compiled function, it will be aware of the instruction offset of where the epilog starts. Then it will use the Prolog's UWOPs to unwind the epilog a) assuming symmetry and b) assuming that each UWOP is associated with a given instruction (length).

This means that, technically, EpilogV2 adds to the V1 list of the canonical rules for epilogs, making it even more restrictive. In practice, V1 was already so restrictive that V2 doesn't change the practical result much. In return, unwinding can be performed without accessing the code stream (which might have security or runtime implications) and operates exclusively on UWOPs instead for prolog, body and epilog. This is beneficial especially for code which runs in kernel.

@efriedma-quic
Collaborator

efriedma-quic commented Sep 25, 2024

The existing ARM/AArch64 backends use .seh_startepilogue; is there some reason you chose to use .seh_beginepilogue here instead?

It looks like this doesn't touch the existing parsing codepaths for .seh_startepilogue/.seh_endepilogue; can we share those across targets?

@dpaoliello
Contributor Author

The existing ARM/AArch64 backends use .seh_startepilogue; is there some reason you chose to use .seh_beginepilogue here instead?

I was matching the name of the pseudo-instruction. I'll change it to .seh_startepilogue.

It looks like this doesn't touch the existing parsing codepaths for .seh_startepilogue/.seh_endepilogue; can we share those across targets?

I don't really know how we'd share it - the ARM implementation is trivial, but its use of TargetStreamer instead of deriving from MCStreamer makes it difficult to share code.

Collaborator

@efriedma-quic efriedma-quic left a comment

Overall, this seems fine, but I'd like to see some description of the binary layout before we start landing patches.

@@ -33,10 +33,12 @@ f: # @f
.seh_stackalloc 32
.seh_endprologue
nop
.seh_startepilogue
Collaborator

Please add a test somewhere for the ".seh_ directive must appear within an active frame" diagnostic.

Contributor Author

Added to the MC/AsmParser/seh-directives-errors.s test, and added checks and tests for the ordering between .seh_endprologue, .seh_startepilogue and .seh_endepilogue.
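
The ordering constraint between these directives can be pictured as a small per-function state machine. The sketch below is only an illustration of that idea: the state names, the `SehOrderChecker` class, and the exact set of accepted transitions are assumptions, and it does not reproduce the checks or diagnostics actually added in this PR.

```cpp
// Hypothetical model of per-function SEH directive ordering; not LLVM's API.
enum class FrameState { NoFrame, InPrologue, InBody, InEpilogue };
enum class Directive { Proc, EndPrologue, StartEpilogue, EndEpilogue, EndProc };

class SehOrderChecker {
  FrameState State = FrameState::NoFrame;

public:
  // Returns true and advances the state if directive D is legal here;
  // returns false and leaves the state unchanged otherwise.
  bool accept(Directive D) {
    switch (D) {
    case Directive::Proc:
      if (State != FrameState::NoFrame)
        return false;
      State = FrameState::InPrologue;
      return true;
    case Directive::EndPrologue:
      if (State != FrameState::InPrologue)
        return false;
      State = FrameState::InBody;
      return true;
    case Directive::StartEpilogue:
      // Assumed rule: an epilogue may only start after the prologue ended.
      if (State != FrameState::InBody)
        return false;
      State = FrameState::InEpilogue;
      return true;
    case Directive::EndEpilogue:
      if (State != FrameState::InEpilogue)
        return false;
      State = FrameState::InBody; // a function may have several epilogues
      return true;
    case Directive::EndProc:
      // Assumed rule: a function cannot end inside an open epilogue.
      if (State == FrameState::NoFrame || State == FrameState::InEpilogue)
        return false;
      State = FrameState::NoFrame;
      return true;
    }
    return false;
  }
};
```

In this model a stray `.seh_endepilogue`, or a `.seh_startepilogue` outside any frame, is simply rejected by `accept`; a real assembler would report a diagnostic at that point.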

; DISABLE-NEXT: addl %edx, %edx
; DISABLE-NEXT: movl %edx, %eax
; DISABLE-NEXT: .LBB1_5: # %if.end
Collaborator

Adding an SEH directive shouldn't affect block layout/merging this way. Can we fix the heuristic here so it isn't sensitive to this?

Contributor Author

I did - the SEH instructions weren't marked as being meta-instructions, so when I first added the new instructions it broke some of the "taildup" tests.

After I marked them as meta-instructions, this test changed - presumably because it now sees the jmp is no longer profitable because it knows the SEH instructions won't produce real instructions (before my change .seh_epilogue was still being emitted where .seh_startepilogue now is, but it wasn't being streamed into the ASM listing).

Collaborator

If it's just the addition of isMeta to seh_epilogue that's causing this, please separate that out into its own patch.

Contributor Author

Done: #110889

@dpaoliello
Contributor Author

Overall, this seems fine, but I'd like to see some description of the binary layout before we start landing patches.

@efriedma-quic did you want the full documentation on learn.microsoft.com, or would updating readobj/objdump to handle the new unwind codes be sufficient?

@dpaoliello dpaoliello force-pushed the epilogmarkers branch 2 times, most recently from bbaad3a to 0568e37 Compare September 26, 2024 19:49
@mstorsjo
Member

With the original ABI (V1) definition, when unwinding the stack, the original unwinder was forced to disassemble the instruction stream, conclude whether execution was in the epilog and, if yes, essentially emulate the instructions forward. UWOPs were only used for prolog and body. As a result of this, epilogs must follow some canonical format rules which the unwinder can detect and emulate.

Ah, I wasn't aware of this (although I haven't spent much time around the x64 unwinding formats either, I've mostly looked at the ARM and ARM64 formats). Does code generated by LLVM fulfill these criteria at the moment, and are those rules documented anywhere?

As a side note - when dealing with functions that don't abide by these rules (either compiler-generated code from a compiler that doesn't know about these rules, or custom assembly), one can't expect to get correct unwinding from within the epilog, of course. But I wonder if there's a risk if the heuristic for guessing whether we're in the body or the epilog guesses incorrectly? (And I guess that whole issue is fixed by this V2 format.)

So, when the EpilogV2-aware unwinder encounters an EpilogV2-compiled function, it will be aware of the instruction offset of where the epilog starts. Then it will use the Prolog's UWOPs to unwind the epilog a) assuming symmetry and b) assuming that each UWOP is associated with a given instruction (length).

So, iirc for the x64 unwind format, each unwind opcode also contains an offset - so you can have a prolog that is intermixed with other instructions that don't have any opcode. (This requires NOP unwind opcodes on ARM/ARM64, but on x64, due to the offsets, they aren't needed.) How does this work for assuming the epilog is a symmetrical mirror of the prolog? Is the epilog assumed to be tightly packed, given normative instruction lengths for each unwind opcode?

So, the only extra data that epilog V2 needs to be signaled, is the start offset of the epilog (or epilogs)?

@efriedma-quic
Collaborator

@efriedma-quic did you want the full documentation on learn.microsoft.com, or would updating readobj/objdump to handle the new unwind codes be sufficient?

I'd prefer real documentation if you have it... but given I'm pretty familiar with unwinding on Windows, probably I can figure out what's going on just from reading llvm-readobj dumps.

Does code generated by LLVM fulfill these criteria at the moment, and are those rules documented anywhere?

https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-170 has the rules... I think LLVM follows them? Maybe not in some edge cases; I don't think there's a verifier or anything like that.

@pmsjt

pmsjt commented Sep 27, 2024

Ah, I wasn't aware of this (although I haven't spent much time around the x64 unwinding formats either, I've mostly looked at the ARM and ARM64 formats). Does code generated by LLVM fulfill these criteria at the moment, and are those rules documented anywhere?

Canonical epilog rules, which also apply to V2:
https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-170#epilog-code

As a side note - when dealing with functions that don't abide by these rules (either compiler-generated code from a compiler that doesn't know about these rules, or custom assembly), one can't expect to get correct unwinding from within the epilog, of course. But I wonder if there's a risk if the heuristic for guessing whether we're in the body or the epilog guesses incorrectly? (And I guess that whole issue is fixed by this V2 format.)

It is unlikely that V2 will solve anything that is already broken in V1. The V2 rules are at least as strict as the V1 rules (technically stricter).

So, iirc for the x64 unwind format, each unwind opcode also contains an offset - so you can have a prolog that is intermixed with other instructions that don't have any opcode. (This requires NOP unwind opcodes on ARM/ARM64, but on x64, due to the offsets, they aren't needed.) How does this work for assuming the epilog is a symmetrical mirror of the prolog? Is the epilog assumed to be tightly packed, given normative instruction lengths for each unwind opcode?

The offsets are not used for epilog processing. The epilog is assumed to be compact, without any unrelated instructions in the middle. So it is not strictly symmetrical; only the order of the UWOPs is symmetrical. There are also other details. For example, if there are any SAVE_NON_VOLATILE and ALLOC in the prolog, these are assumed to have been undone already at the start of the epilog. In other words, the instruction prior to the start of the epilog is expected to be the (single) instruction truncating the non-push/pop portion of the stack.
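
To make the "compact epilog, reversed UWOP order, fixed instruction length per UWOP" idea concrete, here is a toy model. It is only a sketch under stated assumptions: it is not the V2 binary layout or the Windows unwinder algorithm (neither is publicly documented in this thread), it models pushes only, and `popLength` and `pendingPops` are hypothetical helpers. The one hard fact it relies on is the x86-64 encoding length of `pop r64`: 1 byte for rax..rdi and 2 bytes for r8..r15 (REX prefix).

```cpp
#include <cstddef>
#include <vector>

// Register numbers 0-15 in x64 encoding order; 8-15 are r8-r15.
// Encoded length of 'pop r64': 1 byte, or 2 bytes when a REX prefix is needed.
unsigned popLength(unsigned Reg) { return Reg >= 8 ? 2 : 1; }

// PushedRegs lists the registers pushed by the prolog, in prolog order.
// Assumption: the epilog pops them in reverse order with nothing in between.
// Given a byte offset into the epilog, return the registers that have not
// been popped yet, in the order the remaining pops would restore them.
std::vector<unsigned> pendingPops(const std::vector<unsigned> &PushedRegs,
                                  unsigned OffsetInEpilog) {
  std::vector<unsigned> Pending;
  unsigned Pos = 0; // byte offset of the next pop within the epilog
  for (std::size_t I = PushedRegs.size(); I > 0; --I) {
    unsigned Reg = PushedRegs[I - 1];
    if (Pos >= OffsetInEpilog)
      Pending.push_back(Reg); // this pop starts at or after the current IP
    Pos += popLength(Reg);
  }
  return Pending;
}
```

For a prolog that pushes rbp (5), rbx (3) and r12 (12), the modelled epilog is `pop r12` at offset 0, `pop rbx` at offset 2 and `pop rbp` at offset 3. An unwinder in this model that finds the IP at offset 2 would still need to restore rbx and rbp, and it gets there from the prolog's push list alone, without reading the instruction bytes.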

@dpaoliello
Contributor Author

I can figure out what's going on just from reading llvm-readobj dumps.

PR to add support for dumping unwind v2: #110338

@namazso
Contributor

namazso commented Sep 28, 2024

@efriedma-quic

Reference for the behavior, this is the best "public documentation" one could get: https://github.com/dotnet/coreclr/blob/master/src/unwinder/amd64/unwinder_amd64.cpp#L1543

The code originates from Windows and was published as part of CLR by Microsoft:

https://github.com/dotnet/coreclr/blob/master/src/unwinder/amd64/unwinder_amd64.cpp#L358-L362

@pmsjt

pmsjt commented Sep 28, 2024

This fork is a bit dated (missing a few rules around pushfq) but it is not a bad start at all.

dpaoliello added a commit that referenced this pull request Oct 2, 2024
)

When adding new SEH pseudo instructions in #110024 I noticed that some
of the tests were changing their output since these new instructions
were counting towards thresholds for branching versus folding decisions.

These instructions do not result in real machine instructions being
emitted, so they should be marked as meta instructions.
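
The effect described in this commit message can be illustrated with a toy size heuristic. The sketch below is not LLVM code: `ToyInst`, `countRealInstructions` and `smallEnoughToDuplicate` are hypothetical names, and the threshold rule is an assumed stand-in for whatever the branch folding and tail duplication passes actually compute.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction record; IsMeta mirrors the idea behind the isMeta flag.
struct ToyInst {
  bool IsMeta; // true if the instruction emits no machine code
};

// Counts only the instructions that will occupy space in the emitted code.
std::size_t countRealInstructions(const std::vector<ToyInst> &Block) {
  std::size_t N = 0;
  for (const ToyInst &I : Block)
    if (!I.IsMeta)
      ++N;
  return N;
}

// Hypothetical heuristic: a block is small enough to duplicate when its
// real instruction count does not exceed Threshold.
bool smallEnoughToDuplicate(const std::vector<ToyInst> &Block,
                            std::size_t Threshold) {
  return countRealInstructions(Block) <= Threshold;
}
```

With two real instructions and two SEH pseudos in a block, a threshold of 2 is met only if the pseudos are excluded from the count, which is why marking them as meta instructions changes some test outputs.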