diff options
Diffstat (limited to 'poky/meta/recipes-devtools/gcc/gcc-9.3/0003-aarch64-Mitigate-SLS-for-BLR-instruction.patch')
-rw-r--r-- | poky/meta/recipes-devtools/gcc/gcc-9.3/0003-aarch64-Mitigate-SLS-for-BLR-instruction.patch | 659 |
1 files changed, 659 insertions, 0 deletions
diff --git a/poky/meta/recipes-devtools/gcc/gcc-9.3/0003-aarch64-Mitigate-SLS-for-BLR-instruction.patch b/poky/meta/recipes-devtools/gcc/gcc-9.3/0003-aarch64-Mitigate-SLS-for-BLR-instruction.patch new file mode 100644 index 0000000000..6dffef0a34 --- /dev/null +++ b/poky/meta/recipes-devtools/gcc/gcc-9.3/0003-aarch64-Mitigate-SLS-for-BLR-instruction.patch @@ -0,0 +1,659 @@ +CVE: CVE-2020-13844 +Upstream-Status: Backport +Signed-off-by: Ross Burton <ross.burton@arm.com> + +From 2155170525f93093b90a1a065e7ed71a925566e9 Mon Sep 17 00:00:00 2001 +From: Matthew Malcomson <matthew.malcomson@arm.com> +Date: Thu, 9 Jul 2020 09:11:59 +0100 +Subject: [PATCH 3/3] aarch64: Mitigate SLS for BLR instruction + +This patch introduces the mitigation for Straight Line Speculation past +the BLR instruction. + +This mitigation replaces BLR instructions with a BL to a stub which uses +a BR to jump to the original value. These function stubs are then +appended with a speculation barrier to ensure no straight line +speculation happens after these jumps. + +When optimising for speed we use a set of stubs for each function since +this should help the branch predictor make more accurate predictions +about where a stub should branch. + +When optimising for size we use one set of stubs for all functions. +This set of stubs can have human readable names, and we are using +`__call_indirect_x<N>` for register x<N>. + +When BTI branch protection is enabled the BLR instruction can jump to a +`BTI c` instruction using any register, while the BR instruction can +only jump to a `BTI c` instruction using the x16 or x17 registers. +Hence, in order to ensure this transformation is safe we mov the value +of the original register into x16 and use x16 for the BR. + +As an example when optimising for size: +a + BLR x0 +instruction would get transformed to something like + BL __call_indirect_x0 +where __call_indirect_x0 labels a thunk that contains +__call_indirect_x0: + MOV X16, X0 + BR X16 + <speculation barrier> + +The first version of this patch used local symbols specific to a +compilation unit to try and avoid relocations. +This was mistaken since functions coming from the same compilation unit +can still be in different sections, and the assembler will insert +relocations at jumps between sections. + +On any relocation the linker is permitted to emit a veneer to handle +jumps between symbols that are very far apart. The registers x16 and +x17 may be clobbered by these veneers. +Hence the function stubs cannot rely on the values of x16 and x17 being +the same as just before the function stub is called. + +Similar can be said for the hot/cold partitioning of single functions, +so function-local stubs have the same restriction. + +This updated version of the patch never emits function stubs for x16 and +x17, and instead forces other registers to be used. + +Given the above, there is now no benefit to local symbols (since they +are not enough to avoid dealing with linker intricacies). This patch +now uses global symbols with hidden visibility each stored in their own +COMDAT section. This means stubs can be shared between compilation +units while still avoiding the PLT indirection. + +This patch also removes the `__call_indirect_x30` stub (and +function-local equivalent) which would simply jump back to the original +location. + +The function-local stubs are emitted to the assembly output file in one +chunk, which means we need not add the speculation barrier directly +after each one. +This is because we know for certain that the instructions directly after +the BR in all but the last function stub will be from another one of +these stubs and hence will not contain a speculation gadget. +Instead we add a speculation barrier at the end of the sequence of +stubs. + +The global stubs are emitted in COMDAT/.linkonce sections by +themselves so that the linker can remove duplicates from multiple object +files. This means they are not emitted in one chunk, and each one must +include the speculation barrier. + +Another difference is that since the global stubs are shared across +compilation units we do not know that all functions will be targeting an +architecture supporting the SB instruction. +Rather than provide multiple stubs for each architecture, we provide a +stub that will work for all architectures -- using the DSB+ISB barrier. + +This mitigation does not apply for BLR instructions in the following +places: +- Some accesses to thread-local variables use a code sequence with a BLR + instruction. This code sequence is part of the binary interface between + compiler and linker. If this BLR instruction needs to be mitigated, it'd + probably be best to do so in the linker. It seems that the code sequence + for thread-local variable access is unlikely to lead to a Spectre Revalation + Gadget. +- PLT stubs are produced by the linker and each contain a BLR instruction. + It seems that at most only after the last PLT stub a Spectre Revalation + Gadget might appear. + +Testing: + Bootstrap and regtest on AArch64 + (with BOOT_CFLAGS="-mharden-sls=retbr,blr") + Used a temporary hack(1) in gcc-dg.exp to use these options on every + test in the testsuite, a slight modification to emit the speculation + barrier after every function stub, and a script to check that the + output never emitted a BLR, or unmitigated BR or RET instruction. + Similar on an aarch64-none-elf cross-compiler. + +1) Temporary hack emitted a speculation barrier at the end of every stub +function, and used a script to ensure that: + a) Every RET or BR is immediately followed by a speculation barrier. + b) No BLR instruction is emitted by compiler. + +(cherry picked from 96b7f495f9269d5448822e4fc28882edb35a58d7) + +gcc/ChangeLog: + + * config/aarch64/aarch64-protos.h (aarch64_indirect_call_asm): + New declaration. + * config/aarch64/aarch64.c (aarch64_regno_regclass): Handle new + stub registers class. + (aarch64_class_max_nregs): Likewise. + (aarch64_register_move_cost): Likewise. + (aarch64_sls_shared_thunks): Global array to store stub labels. + (aarch64_sls_emit_function_stub): New. + (aarch64_create_blr_label): New. + (aarch64_sls_emit_blr_function_thunks): New. + (aarch64_sls_emit_shared_blr_thunks): New. + (aarch64_asm_file_end): New. + (aarch64_indirect_call_asm): New. + (TARGET_ASM_FILE_END): Use aarch64_asm_file_end. + (TARGET_ASM_FUNCTION_EPILOGUE): Use + aarch64_sls_emit_blr_function_thunks. + * config/aarch64/aarch64.h (STB_REGNUM_P): New. + (enum reg_class): Add STUB_REGS class. + (machine_function): Introduce `call_via` array for + function-local stub labels. + * config/aarch64/aarch64.md (*call_insn, *call_value_insn): Use + aarch64_indirect_call_asm to emit code when hardening BLR + instructions. + * config/aarch64/constraints.md (Ucr): New constraint + representing registers for indirect calls. Is GENERAL_REGS + usually, and STUB_REGS when hardening BLR instruction against + SLS. + * config/aarch64/predicates.md (aarch64_general_reg): STUB_REGS class + is also a general register. + +gcc/testsuite/ChangeLog: + + * gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c: New test. + * gcc.target/aarch64/sls-mitigation/sls-miti-blr.c: New test. +--- + gcc/config/aarch64/aarch64-protos.h | 1 + + gcc/config/aarch64/aarch64.c | 225 +++++++++++++++++- + gcc/config/aarch64/aarch64.h | 15 ++ + gcc/config/aarch64/aarch64.md | 11 +- + gcc/config/aarch64/constraints.md | 9 + + gcc/config/aarch64/predicates.md | 3 +- + .../aarch64/sls-mitigation/sls-miti-blr-bti.c | 40 ++++ + .../aarch64/sls-mitigation/sls-miti-blr.c | 33 +++ + 8 files changed, 328 insertions(+), 9 deletions(-) + create mode 100644 gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c + create mode 100644 gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c + +diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h +index 885eae893..2676e43ae 100644 +--- a/gcc/config/aarch64/aarch64-protos.h ++++ b/gcc/config/aarch64/aarch64-protos.h +@@ -645,6 +645,7 @@ poly_uint64 aarch64_regmode_natural_size (machine_mode); + bool aarch64_high_bits_all_ones_p (HOST_WIDE_INT); + + const char *aarch64_sls_barrier (int); ++const char *aarch64_indirect_call_asm (rtx); + extern bool aarch64_harden_sls_retbr_p (void); + extern bool aarch64_harden_sls_blr_p (void); + +diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c +index dff61105c..bc6c02c3a 100644 +--- a/gcc/config/aarch64/aarch64.c ++++ b/gcc/config/aarch64/aarch64.c +@@ -8190,6 +8190,9 @@ aarch64_label_mentioned_p (rtx x) + enum reg_class + aarch64_regno_regclass (unsigned regno) + { ++ if (STUB_REGNUM_P (regno)) ++ return STUB_REGS; ++ + if (GP_REGNUM_P (regno)) + return GENERAL_REGS; + +@@ -8499,6 +8502,7 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode) + unsigned int nregs; + switch (regclass) + { ++ case STUB_REGS: + case TAILCALL_ADDR_REGS: + case POINTER_REGS: + case GENERAL_REGS: +@@ -10693,10 +10697,12 @@ aarch64_register_move_cost (machine_mode mode, + = aarch64_tune_params.regmove_cost; + + /* Caller save and pointer regs are equivalent to GENERAL_REGS. */ +- if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS) ++ if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS ++ || to == STUB_REGS) + to = GENERAL_REGS; + +- if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS) ++ if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS ++ || from == STUB_REGS) + from = GENERAL_REGS; + + /* Moving between GPR and stack cost is the same as GP2GP. */ +@@ -19009,6 +19015,215 @@ aarch64_sls_barrier (int mitigation_required) + : ""; + } + ++static GTY (()) tree aarch64_sls_shared_thunks[30]; ++static GTY (()) bool aarch64_sls_shared_thunks_needed = false; ++const char *indirect_symbol_names[30] = { ++ "__call_indirect_x0", ++ "__call_indirect_x1", ++ "__call_indirect_x2", ++ "__call_indirect_x3", ++ "__call_indirect_x4", ++ "__call_indirect_x5", ++ "__call_indirect_x6", ++ "__call_indirect_x7", ++ "__call_indirect_x8", ++ "__call_indirect_x9", ++ "__call_indirect_x10", ++ "__call_indirect_x11", ++ "__call_indirect_x12", ++ "__call_indirect_x13", ++ "__call_indirect_x14", ++ "__call_indirect_x15", ++ "", /* "__call_indirect_x16", */ ++ "", /* "__call_indirect_x17", */ ++ "__call_indirect_x18", ++ "__call_indirect_x19", ++ "__call_indirect_x20", ++ "__call_indirect_x21", ++ "__call_indirect_x22", ++ "__call_indirect_x23", ++ "__call_indirect_x24", ++ "__call_indirect_x25", ++ "__call_indirect_x26", ++ "__call_indirect_x27", ++ "__call_indirect_x28", ++ "__call_indirect_x29", ++}; ++ ++/* Function to create a BLR thunk. This thunk is used to mitigate straight ++ line speculation. Instead of a simple BLR that can be speculated past, ++ we emit a BL to this thunk, and this thunk contains a BR to the relevant ++ register. These thunks have the relevant speculation barries put after ++ their indirect branch so that speculation is blocked. ++ ++ We use such a thunk so the speculation barriers are kept off the ++ architecturally executed path in order to reduce the performance overhead. ++ ++ When optimizing for size we use stubs shared by the linked object. ++ When optimizing for performance we emit stubs for each function in the hope ++ that the branch predictor can better train on jumps specific for a given ++ function. */ ++rtx ++aarch64_sls_create_blr_label (int regnum) ++{ ++ gcc_assert (STUB_REGNUM_P (regnum)); ++ if (optimize_function_for_size_p (cfun)) ++ { ++ /* For the thunks shared between different functions in this compilation ++ unit we use a named symbol -- this is just for users to more easily ++ understand the generated assembly. */ ++ aarch64_sls_shared_thunks_needed = true; ++ const char *thunk_name = indirect_symbol_names[regnum]; ++ if (aarch64_sls_shared_thunks[regnum] == NULL) ++ { ++ /* Build a decl representing this function stub and record it for ++ later. We build a decl here so we can use the GCC machinery for ++ handling sections automatically (through `get_named_section` and ++ `make_decl_one_only`). That saves us a lot of trouble handling ++ the specifics of different output file formats. */ ++ tree decl = build_decl (BUILTINS_LOCATION, FUNCTION_DECL, ++ get_identifier (thunk_name), ++ build_function_type_list (void_type_node, ++ NULL_TREE)); ++ DECL_RESULT (decl) = build_decl (BUILTINS_LOCATION, RESULT_DECL, ++ NULL_TREE, void_type_node); ++ TREE_PUBLIC (decl) = 1; ++ TREE_STATIC (decl) = 1; ++ DECL_IGNORED_P (decl) = 1; ++ DECL_ARTIFICIAL (decl) = 1; ++ make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); ++ resolve_unique_section (decl, 0, false); ++ aarch64_sls_shared_thunks[regnum] = decl; ++ } ++ ++ return gen_rtx_SYMBOL_REF (Pmode, thunk_name); ++ } ++ ++ if (cfun->machine->call_via[regnum] == NULL) ++ cfun->machine->call_via[regnum] ++ = gen_rtx_LABEL_REF (Pmode, gen_label_rtx ()); ++ return cfun->machine->call_via[regnum]; ++} ++ ++/* Helper function for aarch64_sls_emit_blr_function_thunks and ++ aarch64_sls_emit_shared_blr_thunks below. */ ++static void ++aarch64_sls_emit_function_stub (FILE *out_file, int regnum) ++{ ++ /* Save in x16 and branch to that function so this transformation does ++ not prevent jumping to `BTI c` instructions. */ ++ asm_fprintf (out_file, "\tmov\tx16, x%d\n", regnum); ++ asm_fprintf (out_file, "\tbr\tx16\n"); ++} ++ ++/* Emit all BLR stubs for this particular function. ++ Here we emit all the BLR stubs needed for the current function. Since we ++ emit these stubs in a consecutive block we know there will be no speculation ++ gadgets between each stub, and hence we only emit a speculation barrier at ++ the end of the stub sequences. ++ ++ This is called in the TARGET_ASM_FUNCTION_EPILOGUE hook. */ ++void ++aarch64_sls_emit_blr_function_thunks (FILE *out_file) ++{ ++ if (! aarch64_harden_sls_blr_p ()) ++ return; ++ ++ bool any_functions_emitted = false; ++ /* We must save and restore the current function section since this assembly ++ is emitted at the end of the function. This means it can be emitted *just ++ after* the cold section of a function. That cold part would be emitted in ++ a different section. That switch would trigger a `.cfi_endproc` directive ++ to be emitted in the original section and a `.cfi_startproc` directive to ++ be emitted in the new section. Switching to the original section without ++ restoring would mean that the `.cfi_endproc` emitted as a function ends ++ would happen in a different section -- leaving an unmatched ++ `.cfi_startproc` in the cold text section and an unmatched `.cfi_endproc` ++ in the standard text section. */ ++ section *save_text_section = in_section; ++ switch_to_section (function_section (current_function_decl)); ++ for (int regnum = 0; regnum < 30; ++regnum) ++ { ++ rtx specu_label = cfun->machine->call_via[regnum]; ++ if (specu_label == NULL) ++ continue; ++ ++ targetm.asm_out.print_operand (out_file, specu_label, 0); ++ asm_fprintf (out_file, ":\n"); ++ aarch64_sls_emit_function_stub (out_file, regnum); ++ any_functions_emitted = true; ++ } ++ if (any_functions_emitted) ++ /* Can use the SB if needs be here, since this stub will only be used ++ by the current function, and hence for the current target. */ ++ asm_fprintf (out_file, "\t%s\n", aarch64_sls_barrier (true)); ++ switch_to_section (save_text_section); ++} ++ ++/* Emit shared BLR stubs for the current compilation unit. ++ Over the course of compiling this unit we may have converted some BLR ++ instructions to a BL to a shared stub function. This is where we emit those ++ stub functions. ++ This function is for the stubs shared between different functions in this ++ compilation unit. We share when optimizing for size instead of speed. ++ ++ This function is called through the TARGET_ASM_FILE_END hook. */ ++void ++aarch64_sls_emit_shared_blr_thunks (FILE *out_file) ++{ ++ if (! aarch64_sls_shared_thunks_needed) ++ return; ++ ++ for (int regnum = 0; regnum < 30; ++regnum) ++ { ++ tree decl = aarch64_sls_shared_thunks[regnum]; ++ if (!decl) ++ continue; ++ ++ const char *name = indirect_symbol_names[regnum]; ++ switch_to_section (get_named_section (decl, NULL, 0)); ++ ASM_OUTPUT_ALIGN (out_file, 2); ++ targetm.asm_out.globalize_label (out_file, name); ++ /* Only emits if the compiler is configured for an assembler that can ++ handle visibility directives. */ ++ targetm.asm_out.assemble_visibility (decl, VISIBILITY_HIDDEN); ++ ASM_OUTPUT_TYPE_DIRECTIVE (out_file, name, "function"); ++ ASM_OUTPUT_LABEL (out_file, name); ++ aarch64_sls_emit_function_stub (out_file, regnum); ++ /* Use the most conservative target to ensure it can always be used by any ++ function in the translation unit. */ ++ asm_fprintf (out_file, "\tdsb\tsy\n\tisb\n"); ++ ASM_DECLARE_FUNCTION_SIZE (out_file, name, decl); ++ } ++} ++ ++/* Implement TARGET_ASM_FILE_END. */ ++void ++aarch64_asm_file_end () ++{ ++ aarch64_sls_emit_shared_blr_thunks (asm_out_file); ++ /* Since this function will be called for the ASM_FILE_END hook, we ensure ++ that what would be called otherwise (e.g. `file_end_indicate_exec_stack` ++ for FreeBSD) still gets called. */ ++#ifdef TARGET_ASM_FILE_END ++ TARGET_ASM_FILE_END (); ++#endif ++} ++ ++const char * ++aarch64_indirect_call_asm (rtx addr) ++{ ++ gcc_assert (REG_P (addr)); ++ if (aarch64_harden_sls_blr_p ()) ++ { ++ rtx stub_label = aarch64_sls_create_blr_label (REGNO (addr)); ++ output_asm_insn ("bl\t%0", &stub_label); ++ } ++ else ++ output_asm_insn ("blr\t%0", &addr); ++ return ""; ++} ++ + /* Target-specific selftests. */ + + #if CHECKING_P +@@ -19529,6 +19744,12 @@ aarch64_libgcc_floating_mode_supported_p + #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests + #endif /* #if CHECKING_P */ + ++#undef TARGET_ASM_FILE_END ++#define TARGET_ASM_FILE_END aarch64_asm_file_end ++ ++#undef TARGET_ASM_FUNCTION_EPILOGUE ++#define TARGET_ASM_FUNCTION_EPILOGUE aarch64_sls_emit_blr_function_thunks ++ + struct gcc_target targetm = TARGET_INITIALIZER; + + #include "gt-aarch64.h" +diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h +index 72ddc6fd9..60682a100 100644 +--- a/gcc/config/aarch64/aarch64.h ++++ b/gcc/config/aarch64/aarch64.h +@@ -540,6 +540,16 @@ extern unsigned aarch64_architecture_version; + #define GP_REGNUM_P(REGNO) \ + (((unsigned) (REGNO - R0_REGNUM)) <= (R30_REGNUM - R0_REGNUM)) + ++/* Registers known to be preserved over a BL instruction. This consists of the ++ GENERAL_REGS without x16, x17, and x30. The x30 register is changed by the ++ BL instruction itself, while the x16 and x17 registers may be used by ++ veneers which can be inserted by the linker. */ ++#define STUB_REGNUM_P(REGNO) \ ++ (GP_REGNUM_P (REGNO) \ ++ && (REGNO) != R16_REGNUM \ ++ && (REGNO) != R17_REGNUM \ ++ && (REGNO) != R30_REGNUM) \ ++ + #define FP_REGNUM_P(REGNO) \ + (((unsigned) (REGNO - V0_REGNUM)) <= (V31_REGNUM - V0_REGNUM)) + +@@ -561,6 +571,7 @@ enum reg_class + { + NO_REGS, + TAILCALL_ADDR_REGS, ++ STUB_REGS, + GENERAL_REGS, + STACK_REG, + POINTER_REGS, +@@ -580,6 +591,7 @@ enum reg_class + { \ + "NO_REGS", \ + "TAILCALL_ADDR_REGS", \ ++ "STUB_REGS", \ + "GENERAL_REGS", \ + "STACK_REG", \ + "POINTER_REGS", \ +@@ -596,6 +608,7 @@ enum reg_class + { \ + { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \ + { 0x00030000, 0x00000000, 0x00000000 }, /* TAILCALL_ADDR_REGS */\ ++ { 0x3ffcffff, 0x00000000, 0x00000000 }, /* STUB_REGS */ \ + { 0x7fffffff, 0x00000000, 0x00000003 }, /* GENERAL_REGS */ \ + { 0x80000000, 0x00000000, 0x00000000 }, /* STACK_REG */ \ + { 0xffffffff, 0x00000000, 0x00000003 }, /* POINTER_REGS */ \ +@@ -735,6 +748,8 @@ typedef struct GTY (()) machine_function + struct aarch64_frame frame; + /* One entry for each hard register. */ + bool reg_is_wrapped_separately[LAST_SAVED_REGNUM]; ++ /* One entry for each general purpose register. */ ++ rtx call_via[SP_REGNUM]; + bool label_is_assembled; + } machine_function; + #endif +diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md +index 494aee964..ed8cf8ece 100644 +--- a/gcc/config/aarch64/aarch64.md ++++ b/gcc/config/aarch64/aarch64.md +@@ -908,15 +908,14 @@ + ) + + (define_insn "*call_insn" +- [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "r, Usf")) ++ [(call (mem:DI (match_operand:DI 0 "aarch64_call_insn_operand" "Ucr, Usf")) + (match_operand 1 "" "")) + (clobber (reg:DI LR_REGNUM))] + "" + "@ +- blr\\t%0 ++ * return aarch64_indirect_call_asm (operands[0]); + bl\\t%c0" +- [(set_attr "type" "call, call")] +-) ++ [(set_attr "type" "call, call")]) + + (define_expand "call_value" + [(parallel [(set (match_operand 0 "" "") +@@ -934,12 +933,12 @@ + + (define_insn "*call_value_insn" + [(set (match_operand 0 "" "") +- (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "r, Usf")) ++ (call (mem:DI (match_operand:DI 1 "aarch64_call_insn_operand" "Ucr, Usf")) + (match_operand 2 "" ""))) + (clobber (reg:DI LR_REGNUM))] + "" + "@ +- blr\\t%1 ++ * return aarch64_indirect_call_asm (operands[1]); + bl\\t%c1" + [(set_attr "type" "call, call")] + ) +diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md +index 21f9549e6..7756dbe83 100644 +--- a/gcc/config/aarch64/constraints.md ++++ b/gcc/config/aarch64/constraints.md +@@ -24,6 +24,15 @@ + (define_register_constraint "Ucs" "TAILCALL_ADDR_REGS" + "@internal Registers suitable for an indirect tail call") + ++(define_register_constraint "Ucr" ++ "aarch64_harden_sls_blr_p () ? STUB_REGS : GENERAL_REGS" ++ "@internal Registers to be used for an indirect call. ++ This is usually the general registers, but when we are hardening against ++ Straight Line Speculation we disallow x16, x17, and x30 so we can use ++ indirection stubs. These indirection stubs cannot use the above registers ++ since they will be reached by a BL that may have to go through a linker ++ veneer.") ++ + (define_register_constraint "w" "FP_REGS" + "Floating point and SIMD vector registers.") + +diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md +index 8e1b78421..4250aecb3 100644 +--- a/gcc/config/aarch64/predicates.md ++++ b/gcc/config/aarch64/predicates.md +@@ -32,7 +32,8 @@ + + (define_predicate "aarch64_general_reg" + (and (match_operand 0 "register_operand") +- (match_test "REGNO_REG_CLASS (REGNO (op)) == GENERAL_REGS"))) ++ (match_test "REGNO_REG_CLASS (REGNO (op)) == STUB_REGS ++ || REGNO_REG_CLASS (REGNO (op)) == GENERAL_REGS"))) + + ;; Return true if OP a (const_int 0) operand. + (define_predicate "const0_operand" +diff --git a/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c +new file mode 100644 +index 000000000..b1fb754c7 +--- /dev/null ++++ b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr-bti.c +@@ -0,0 +1,40 @@ ++/* { dg-do compile } */ ++/* { dg-additional-options "-mharden-sls=blr -mbranch-protection=bti" } */ ++/* ++ Ensure that the SLS hardening of BLR leaves no BLR instructions. ++ Here we also check that there are no BR instructions with anything except an ++ x16 or x17 register. This is because a `BTI c` instruction can be branched ++ to using a BLR instruction using any register, but can only be branched to ++ with a BR using an x16 or x17 register. ++ */ ++typedef int (foo) (int, int); ++typedef void (bar) (int, int); ++struct sls_testclass { ++ foo *x; ++ bar *y; ++ int left; ++ int right; ++}; ++ ++/* We test both RTL patterns for a call which returns a value and a call which ++ does not. */ ++int blr_call_value (struct sls_testclass x) ++{ ++ int retval = x.x(x.left, x.right); ++ if (retval % 10) ++ return 100; ++ return 9; ++} ++ ++int blr_call (struct sls_testclass x) ++{ ++ x.y(x.left, x.right); ++ if (x.left % 10) ++ return 100; ++ return 9; ++} ++ ++/* { dg-final { scan-assembler-not {\tblr\t} } } */ ++/* { dg-final { scan-assembler-not {\tbr\tx(?!16|17)} } } */ ++/* { dg-final { scan-assembler {\tbr\tx(16|17)} } } */ ++ +diff --git a/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c +new file mode 100644 +index 000000000..88baffffe +--- /dev/null ++++ b/gcc/testsuite/gcc.target/aarch64/sls-mitigation/sls-miti-blr.c +@@ -0,0 +1,33 @@ ++/* { dg-additional-options "-mharden-sls=blr -save-temps" } */ ++/* Ensure that the SLS hardening of BLR leaves no BLR instructions. ++ We only test that all BLR instructions have been removed, not that the ++ resulting code makes sense. */ ++typedef int (foo) (int, int); ++typedef void (bar) (int, int); ++struct sls_testclass { ++ foo *x; ++ bar *y; ++ int left; ++ int right; ++}; ++ ++/* We test both RTL patterns for a call which returns a value and a call which ++ does not. */ ++int blr_call_value (struct sls_testclass x) ++{ ++ int retval = x.x(x.left, x.right); ++ if (retval % 10) ++ return 100; ++ return 9; ++} ++ ++int blr_call (struct sls_testclass x) ++{ ++ x.y(x.left, x.right); ++ if (x.left % 10) ++ return 100; ++ return 9; ++} ++ ++/* { dg-final { scan-assembler-not {\tblr\t} } } */ ++/* { dg-final { scan-assembler {\tbr\tx[0-9][0-9]?} } } */ +-- +2.25.1 + |