
MCParser: Move LCurly/RCurly testing into tokenIsStartOfStatement #140101



MaskRay
Member

@MaskRay MaskRay commented May 15, 2025

Commit 8a0453e (2015) added `LCurly` and `RCurly` cases for Hexagon
instruction bundles. While gas x86 also adopted `{` in 2017 for pseudo
prefixes (see `tc_symbol_chars`), `{` remains uncommon among targets.
Move `{` and `}` parsing into the newly introduced
`tokenIsStartOfStatement` hook (#137997).
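
For readers unfamiliar with the hook, the dispatch this change relies on can be sketched in isolation. The following is a simplified model, not LLVM code: `TokenKind`, `Token`, `TargetParserModel`, `HexagonModel`, `X86Model`, and `acceptsStatementStart` are hypothetical stand-ins for `AsmToken::TokenKind`, `MCTargetAsmParser`, the two target parsers, and the relevant branch of `AsmParser::parseStatement`. It assumes the base hook returns `false` by default, which is consistent with the AArch64 test in the diff expecting an error for `{`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for AsmToken::TokenKind (only the kinds needed here).
enum class TokenKind { Identifier, Dot, LCurly, RCurly };

struct Token {
  TokenKind Kind;
  std::string Text;
};

// Hypothetical stand-in for MCTargetAsmParser.
struct TargetParserModel {
  virtual ~TargetParserModel() = default;
  // Assumed default: no extra token kinds may start a statement.
  virtual bool tokenIsStartOfStatement(TokenKind) { return false; }
};

// Mirrors the Hexagon override in the diff: both braces start a statement.
struct HexagonModel : TargetParserModel {
  bool tokenIsStartOfStatement(TokenKind K) override {
    return K == TokenKind::LCurly || K == TokenKind::RCurly;
  }
};

// Mirrors the X86 override in the diff: only '{' starts a statement.
struct X86Model : TargetParserModel {
  bool tokenIsStartOfStatement(TokenKind K) override {
    return K == TokenKind::LCurly;
  }
};

// Models the branch chain after the patch: the generic parser no longer
// special-cases braces and defers to the target hook instead.
// Returns true and sets IDVal when the token may start a statement.
bool acceptsStatementStart(TargetParserModel &TP, const Token &Tok,
                           std::string &IDVal) {
  if (Tok.Kind == TokenKind::Identifier) {
    IDVal = Tok.Text;
    return true;
  }
  if (Tok.Kind == TokenKind::Dot) {
    IDVal = ".";
    return true;
  }
  if (TP.tokenIsStartOfStatement(Tok.Kind)) {
    IDVal = Tok.Text;
    return true;
  }
  return false; // corresponds to "unexpected token at start of statement"
}

// Small self-check exercising the three models.
void demo() {
  TargetParserModel Generic;
  HexagonModel Hexagon;
  X86Model X86;
  std::string ID;
  assert(!acceptsStatementStart(Generic, {TokenKind::LCurly, "{"}, ID));
  assert(acceptsStatementStart(Hexagon, {TokenKind::RCurly, "}"}, ID));
  assert(acceptsStatementStart(X86, {TokenKind::LCurly, "{"}, ID));
  assert(!acceptsStatementStart(X86, {TokenKind::RCurly, "}"}, ID));
}
```

In this model, a target that does not override the hook rejects both braces, while the Hexagon and X86 models opt in to exactly the token kinds listed in their overrides.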

Created using spr 1.3.5-bogner
@MaskRay MaskRay requested review from s-barannikov and nvjle May 15, 2025 16:42
@llvmbot
Member

llvmbot commented May 15, 2025

@llvm/pr-subscribers-mc
@llvm/pr-subscribers-backend-hexagon

@llvm/pr-subscribers-backend-x86

Author: Fangrui Song (MaskRay)

Full diff: https://github.com/llvm/llvm-project/pull/140101.diff

4 Files Affected:

  • (modified) llvm/lib/MC/MCParser/AsmParser.cpp (-9)
  • (modified) llvm/lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp (+5)
  • (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+4)
  • (added) llvm/test/MC/AsmParser/token.s (+7)
diff --git a/llvm/lib/MC/MCParser/AsmParser.cpp b/llvm/lib/MC/MCParser/AsmParser.cpp
index f27a27833858a..857985199cc48 100644
--- a/llvm/lib/MC/MCParser/AsmParser.cpp
+++ b/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -1760,15 +1760,6 @@ bool AsmParser::parseStatement(ParseStatementInfo &Info,
     // Treat '.' as a valid identifier in this context.
     Lex();
     IDVal = ".";
-  } else if (Lexer.is(AsmToken::LCurly)) {
-    // Treat '{' as a valid identifier in this context.
-    Lex();
-    IDVal = "{";
-
-  } else if (Lexer.is(AsmToken::RCurly)) {
-    // Treat '}' as a valid identifier in this context.
-    Lex();
-    IDVal = "}";
   } else if (getTargetParser().tokenIsStartOfStatement(ID.getKind())) {
     Lex();
     IDVal = ID.getString();
diff --git a/llvm/lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp b/llvm/lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp
index 686e1609c376d..1c9fb8f0a42ae 100644
--- a/llvm/lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp
+++ b/llvm/lib/Target/Hexagon/AsmParser/HexagonAsmParser.cpp
@@ -110,6 +110,7 @@ class HexagonAsmParser : public MCTargetAsmParser {
 
   bool equalIsAsmAssignment() override { return false; }
   bool isLabel(AsmToken &Token) override;
+  bool tokenIsStartOfStatement(AsmToken::TokenKind Token) override;
 
   void Warning(SMLoc L, const Twine &Msg) { Parser.Warning(L, Msg); }
   bool Error(SMLoc L, const Twine &Msg) { return Parser.Error(L, Msg); }
@@ -1007,6 +1008,10 @@ bool HexagonAsmParser::isLabel(AsmToken &Token) {
   return false;
 }
 
+bool HexagonAsmParser::tokenIsStartOfStatement(AsmToken::TokenKind Token) {
+  return Token == AsmToken::LCurly || Token == AsmToken::RCurly;
+}
+
 bool HexagonAsmParser::handleNoncontigiousRegister(bool Contigious,
                                                    SMLoc &Loc) {
   if (!Contigious && ErrorNoncontigiousRegister) {
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 642a9cff4853c..11193304a785d 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -124,6 +124,10 @@ class X86AsmParser : public MCTargetAsmParser {
     return Result;
   }
 
+  bool tokenIsStartOfStatement(AsmToken::TokenKind Token) override {
+    return Token == AsmToken::LCurly;
+  }
+
   X86TargetStreamer &getTargetStreamer() {
     assert(getParser().getStreamer().getTargetStreamer() &&
            "do not have a target streamer");
diff --git a/llvm/test/MC/AsmParser/token.s b/llvm/test/MC/AsmParser/token.s
new file mode 100644
index 0000000000000..c162e8336a2d7
--- /dev/null
+++ b/llvm/test/MC/AsmParser/token.s
@@ -0,0 +1,7 @@
+## Tested invalid statement start tokens. X86 supports "{". Use a different target.
+# REQUIRES: aarch64-registered-target
+
+# RUN: not llvm-mc -triple=aarch64 %s 2>&1 | FileCheck %s
+
+# CHECK: [[#@LINE+1]]:2: error: unexpected token at start of statement
+ {insn}

@nvjle
Contributor

nvjle commented May 15, 2025

LGTM.

Contributor

@s-barannikov s-barannikov left a comment

Thanks

@MaskRay MaskRay merged commit 97ad399 into main May 16, 2025
15 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/mcparser-move-lcurlyrcurly-testing-into-tokenisstartofstatement branch May 16, 2025 01:28
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 16, 2025
…atement

Pull Request: llvm/llvm-project#140101