[clang][NFC] Regroup declarations in `Sema` #82217

Endilll · 2024-02-19T06:55:21Z

This patch regroups declarations in Sema based on the file they are implemented in (e.g. SemaChecking.cpp). This allows to logically split Sema in 42 groups. No physical separation is done (e.g. splitting Sema into multiple classes). Table of contents added at the very beginning of Sema. Grouping is reflected in Doxygen commands, so structure of API reference of Sema is also significantly improved (example from official documentation, comparison of Sema API reference).

While grouping is intentional, as well as each group consisting of public declarations followed by private ones (without changing access in-between), exact contents and order of declarations of each group is partially carried over from old structure, partially accidental due to time constrains to do the regrouping over the weekend (Sema is just enormously big). Data members and inline function definitions in Sema.h complicate the matter, since it's not obvious which group they belong to. Further work is expected to refine contents and order of declarations.

What is also intentional is some kind of layering, where Concepts group follows template groups, and ObjC, code completion, CUDA, HLSL, OpenACC, OpenMP, and SYCL are all placed at the end of the file, after C and C++ parts of Sema.

I used clang-query to verify that access specifiers were preserved during the process (https://gcc.godbolt.org/z/9johffY9T, thank you @ilya-biryukov). Only the following 3 member types were converted from private to public because of limitations of the new grouping: DeclareTargetContextInfo, TypoExprState, SatisfactionStackEntryTy.

Member initializer list of Sema in Sema.cpp is rewritten to reflect new order of data members in order to avoid -Wreorder-ctor.

Since this patch touches almost every line in Sema.h, it was considered appropriate to run clang-format on the whole file, and not just on changed lines.

This patch regroups declarations in `Sema` based on the file they are implemented in (e.g. `SemaChecking.cpp`). This allows to logically split `Sema` in 42 groups. Table of contents added at the very beginning of `Sema`. While grouping is intentional, as well as each group consisting of `public` declarations followed by `private` ones (without changing access in-between), exact contents and order of declarations of each group is partially carried over from old structure, partially accidental due to time constrains to do the regrouping over the weekend (`Sema` is just enormously big). Further work is expected to refine contents and order of declarations. What is also intentional is some kind of layering, where Concepts group follows template groups, and ObjC, code completion, CUDA, HLSL, OpenACC, OpenMP, and SYCL are all placed at the end of the file, after C and C++ parts of `Sema`. Member initializer list of `Sema` in `Sema.cpp` is rewritten to reflect new order of data members in order to avoid `-Wreorder-ctor`.

llvmbot · 2024-02-19T06:55:55Z

@llvm/pr-subscribers-clang

Author: Vlad Serebrennikov (Endilll)

Changes

This patch regroups declarations in Sema based on the file they are implemented in (e.g. SemaChecking.cpp). This allows to logically split Sema in 42 groups. Table of contents added at the very beginning of Sema.

While grouping is intentional, as well as each group consisting of public declarations followed by private ones (without changing access in-between), exact contents and order of declarations of each group is partially carried over from old structure, partially accidental due to time constrains to do the regrouping over the weekend (Sema is just enormously big). Further work is expected to refine contents and order of declarations.

What is also intentional is some kind of layering, where Concepts group follows template groups, and ObjC, code completion, CUDA, HLSL, OpenACC, OpenMP, and SYCL are all placed at the end of the file, after C and C++ parts of Sema.

Member initializer list of Sema in Sema.cpp is rewritten to reflect new order of data members in order to avoid -Wreorder-ctor.

Patch is 1.19 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82217.diff

2 Files Affected:

(modified) clang/include/clang/Sema/Sema.h (+11441-11193)
(modified) clang/lib/Sema/Sema.cpp (+19-19)

diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e9cd42ae777df5..e039f9719826d3 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -423,590 +423,650 @@ enum class TemplateDeductionResult {
 
 /// Sema - This implements semantic analysis and AST building for C.
 class Sema final {
-  Sema(const Sema &) = delete;
-  void operator=(const Sema &) = delete;
-
-  ///Source of additional semantic information.
-  IntrusiveRefCntPtr<ExternalSemaSource> ExternalSource;
-
-  static bool mightHaveNonExternalLinkage(const DeclaratorDecl *FD);
-
-  /// Determine whether two declarations should be linked together, given that
-  /// the old declaration might not be visible and the new declaration might
-  /// not have external linkage.
-  bool shouldLinkPossiblyHiddenDecl(const NamedDecl *Old,
-                                    const NamedDecl *New) {
-    if (isVisible(Old))
-     return true;
-    // See comment in below overload for why it's safe to compute the linkage
-    // of the new declaration here.
-    if (New->isExternallyDeclarable()) {
-      assert(Old->isExternallyDeclarable() &&
-             "should not have found a non-externally-declarable previous decl");
-      return true;
-    }
-    return false;
-  }
-  bool shouldLinkPossiblyHiddenDecl(LookupResult &Old, const NamedDecl *New);
-
-  void setupImplicitSpecialMemberType(CXXMethodDecl *SpecialMem,
-                                      QualType ResultTy,
-                                      ArrayRef<QualType> Args);
+  // Table of Contents
+  // -----------------
+  // 1. Semantic Analysis (Sema.cpp)
+  // 2. C++ Access Control (SemaAccess.cpp)
+  // 3. Attributes (SemaAttr.cpp)
+  // 4. Availability Attribute Handling (SemaAvailability.cpp)
+  // 5. Casts (SemaCast.cpp)
+  // 6. Extra Semantic Checking (SemaChecking.cpp)
+  // 7. C++ Coroutines (SemaCoroutine.cpp)
+  // 8. C++ Scope Specifiers (SemaCXXScopeSpec.cpp)
+  // 9. Declarations (SemaDecl.cpp)
+  // 10. Declaration Attribute Handling (SemaDeclAttr.cpp)
+  // 11. C++ Declarations (SemaDeclCXX.cpp)
+  // 12. C++ Exception Specifications (SemaExceptionSpec.cpp)
+  // 13. Expressions (SemaExpr.cpp)
+  // 14. C++ Expressions (SemaExprCXX.cpp)
+  // 15. Member Access Expressions (SemaExprMember.cpp)
+  // 16. Initializers (SemaInit.cpp)
+  // 17. C++ Lambda Expressions (SemaLambda.cpp)
+  // 18. Name Lookup (SemaLookup.cpp)
+  // 19. Modules (SemaModule.cpp)
+  // 20. C++ Overloading (SemaOverload.cpp)
+  // 21. Pseudo-Object (SemaPseudoObject.cpp)
+  // 22. Statements (SemaStmt.cpp)
+  // 23. `inline asm` Statement (SemaStmtAsm.cpp)
+  // 24. Statement Attribute Handling (SemaStmtAttr.cpp)
+  // 25. C++ Templates (SemaTemplate.cpp)
+  // 26. C++ Template Argument Deduction (SemaTemplateDeduction.cpp)
+  // 27. C++ Template Instantiation (SemaTemplateInstantiate.cpp)
+  // 28. C++ Template Declaration Instantiation
+  // (SemaTemplateInstantiateDecl.cpp)
+  // 29. C++ Variadic Templates (SemaTemplateVariadic.cpp)
+  // 30. Constraints and Concepts (SemaConcept.cpp)
+  // 31. Types (SemaType.cpp)
+  // 32. ObjC Declarations (SemaDeclObjC.cpp)
+  // 33. ObjC Expressions (SemaExprObjC.cpp)
+  // 34. ObjC @property and @synthesize (SemaObjCProperty.cpp)
+  // 35. Code Completion (SemaCodeComplete.cpp)
+  // 36. FixIt Helpers (SemaFixItUtils.cpp)
+  // 37. Name Lookup for RISC-V Vector Intrinsic (SemaRISCVVectorLookup.cpp)
+  // 38. CUDA (SemaCUDA.cpp)
+  // 39. HLSL Constructs (SemaHLSL.cpp)
+  // 40. OpenACC Constructs (SemaOpenACC.cpp)
+  // 41. OpenMP Directives and Clauses (SemaOpenMP.cpp)
+  // 42. SYCL Constructs (SemaSYCL.cpp)
+
+  /// \name Semantic Analysis
+  /// Implementations are in Sema.cpp
+  /// @{
 
 public:
-  /// The maximum alignment, same as in llvm::Value. We duplicate them here
-  /// because that allows us not to duplicate the constants in clang code,
-  /// which we must to since we can't directly use the llvm constants.
-  /// The value is verified against llvm here: lib/CodeGen/CGDecl.cpp
-  ///
-  /// This is the greatest alignment value supported by load, store, and alloca
-  /// instructions, and global values.
-  static const unsigned MaxAlignmentExponent = 32;
-  static const uint64_t MaximumAlignment = 1ull << MaxAlignmentExponent;
-
-  typedef OpaquePtr<DeclGroupRef> DeclGroupPtrTy;
-  typedef OpaquePtr<TemplateName> TemplateTy;
-  typedef OpaquePtr<QualType> TypeTy;
-
-  OpenCLOptions OpenCLFeatures;
-  FPOptions CurFPFeatures;
-
-  const LangOptions &LangOpts;
-  Preprocessor &PP;
-  ASTContext &Context;
-  ASTConsumer &Consumer;
-  DiagnosticsEngine &Diags;
-  SourceManager &SourceMgr;
-  api_notes::APINotesManager APINotes;
-
-  /// Flag indicating whether or not to collect detailed statistics.
-  bool CollectStats;
-
-  /// Code-completion consumer.
-  CodeCompleteConsumer *CodeCompleter;
+  Sema(Preprocessor &pp, ASTContext &ctxt, ASTConsumer &consumer,
+       TranslationUnitKind TUKind = TU_Complete,
+       CodeCompleteConsumer *CompletionConsumer = nullptr);
+  ~Sema();
 
-  /// CurContext - This is the current declaration context of parsing.
-  DeclContext *CurContext;
+  /// Perform initialization that occurs after the parser has been
+  /// initialized but before it parses anything.
+  void Initialize();
 
-  /// Generally null except when we temporarily switch decl contexts,
-  /// like in \see ActOnObjCTemporaryExitContainerContext.
-  DeclContext *OriginalLexicalContext;
+  /// This virtual key function only exists to limit the emission of debug info
+  /// describing the Sema class. GCC and Clang only emit debug info for a class
+  /// with a vtable when the vtable is emitted. Sema is final and not
+  /// polymorphic, but the debug info size savings are so significant that it is
+  /// worth adding a vtable just to take advantage of this optimization.
+  virtual void anchor();
 
-  /// VAListTagName - The declaration name corresponding to __va_list_tag.
-  /// This is used as part of a hack to omit that class from ADL results.
-  DeclarationName VAListTagName;
+  const LangOptions &getLangOpts() const { return LangOpts; }
+  OpenCLOptions &getOpenCLOptions() { return OpenCLFeatures; }
+  FPOptions &getCurFPFeatures() { return CurFPFeatures; }
 
-  bool MSStructPragmaOn; // True when \#pragma ms_struct on
+  DiagnosticsEngine &getDiagnostics() const { return Diags; }
+  SourceManager &getSourceManager() const { return SourceMgr; }
+  Preprocessor &getPreprocessor() const { return PP; }
+  ASTContext &getASTContext() const { return Context; }
+  ASTConsumer &getASTConsumer() const { return Consumer; }
+  ASTMutationListener *getASTMutationListener() const;
+  ExternalSemaSource *getExternalSource() const { return ExternalSource.get(); }
 
-  /// Controls member pointer representation format under the MS ABI.
-  LangOptions::PragmaMSPointersToMembersKind
-      MSPointerToMemberRepresentationMethod;
+  DarwinSDKInfo *getDarwinSDKInfoForAvailabilityChecking(SourceLocation Loc,
+                                                         StringRef Platform);
+  DarwinSDKInfo *getDarwinSDKInfoForAvailabilityChecking();
 
-  /// Stack of active SEH __finally scopes.  Can be empty.
-  SmallVector<Scope*, 2> CurrentSEHFinally;
+  /// Registers an external source. If an external source already exists,
+  ///  creates a multiplex external source and appends to it.
+  ///
+  ///\param[in] E - A non-null external sema source.
+  ///
+  void addExternalSource(ExternalSemaSource *E);
 
-  /// Source location for newly created implicit MSInheritanceAttrs
-  SourceLocation ImplicitMSInheritanceAttrLoc;
+  /// Helper class that creates diagnostics with optional
+  /// template instantiation stacks.
+  ///
+  /// This class provides a wrapper around the basic DiagnosticBuilder
+  /// class that emits diagnostics. ImmediateDiagBuilder is
+  /// responsible for emitting the diagnostic (as DiagnosticBuilder
+  /// does) and, if the diagnostic comes from inside a template
+  /// instantiation, printing the template instantiation stack as
+  /// well.
+  class ImmediateDiagBuilder : public DiagnosticBuilder {
+    Sema &SemaRef;
+    unsigned DiagID;
 
-  /// Holds TypoExprs that are created from `createDelayedTypo`. This is used by
-  /// `TransformTypos` in order to keep track of any TypoExprs that are created
-  /// recursively during typo correction and wipe them away if the correction
-  /// fails.
-  llvm::SmallVector<TypoExpr *, 2> TypoExprs;
+  public:
+    ImmediateDiagBuilder(DiagnosticBuilder &DB, Sema &SemaRef, unsigned DiagID)
+        : DiagnosticBuilder(DB), SemaRef(SemaRef), DiagID(DiagID) {}
+    ImmediateDiagBuilder(DiagnosticBuilder &&DB, Sema &SemaRef, unsigned DiagID)
+        : DiagnosticBuilder(DB), SemaRef(SemaRef), DiagID(DiagID) {}
 
-  /// pragma clang section kind
-  enum PragmaClangSectionKind {
-    PCSK_Invalid      = 0,
-    PCSK_BSS          = 1,
-    PCSK_Data         = 2,
-    PCSK_Rodata       = 3,
-    PCSK_Text         = 4,
-    PCSK_Relro        = 5
-   };
+    // This is a cunning lie. DiagnosticBuilder actually performs move
+    // construction in its copy constructor (but due to varied uses, it's not
+    // possible to conveniently express this as actual move construction). So
+    // the default copy ctor here is fine, because the base class disables the
+    // source anyway, so the user-defined ~ImmediateDiagBuilder is a safe no-op
+    // in that case anwyay.
+    ImmediateDiagBuilder(const ImmediateDiagBuilder &) = default;
 
-  enum PragmaClangSectionAction {
-    PCSA_Set     = 0,
-    PCSA_Clear   = 1
-  };
+    ~ImmediateDiagBuilder() {
+      // If we aren't active, there is nothing to do.
+      if (!isActive())
+        return;
 
-  struct PragmaClangSection {
-    std::string SectionName;
-    bool Valid = false;
-    SourceLocation PragmaLocation;
-  };
+      // Otherwise, we need to emit the diagnostic. First clear the diagnostic
+      // builder itself so it won't emit the diagnostic in its own destructor.
+      //
+      // This seems wasteful, in that as written the DiagnosticBuilder dtor will
+      // do its own needless checks to see if the diagnostic needs to be
+      // emitted. However, because we take care to ensure that the builder
+      // objects never escape, a sufficiently smart compiler will be able to
+      // eliminate that code.
+      Clear();
 
-   PragmaClangSection PragmaClangBSSSection;
-   PragmaClangSection PragmaClangDataSection;
-   PragmaClangSection PragmaClangRodataSection;
-   PragmaClangSection PragmaClangRelroSection;
-   PragmaClangSection PragmaClangTextSection;
+      // Dispatch to Sema to emit the diagnostic.
+      SemaRef.EmitCurrentDiagnostic(DiagID);
+    }
 
-  enum PragmaMsStackAction {
-    PSK_Reset     = 0x0,                // #pragma ()
-    PSK_Set       = 0x1,                // #pragma (value)
-    PSK_Push      = 0x2,                // #pragma (push[, id])
-    PSK_Pop       = 0x4,                // #pragma (pop[, id])
-    PSK_Show      = 0x8,                // #pragma (show) -- only for "pack"!
-    PSK_Push_Set  = PSK_Push | PSK_Set, // #pragma (push[, id], value)
-    PSK_Pop_Set   = PSK_Pop | PSK_Set,  // #pragma (pop[, id], value)
-  };
+    /// Teach operator<< to produce an object of the correct type.
+    template <typename T>
+    friend const ImmediateDiagBuilder &
+    operator<<(const ImmediateDiagBuilder &Diag, const T &Value) {
+      const DiagnosticBuilder &BaseDiag = Diag;
+      BaseDiag << Value;
+      return Diag;
+    }
 
-  struct PragmaPackInfo {
-    PragmaMsStackAction Action;
-    StringRef SlotLabel;
-    Token Alignment;
+    // It is necessary to limit this to rvalue reference to avoid calling this
+    // function with a bitfield lvalue argument since non-const reference to
+    // bitfield is not allowed.
+    template <typename T,
+              typename = std::enable_if_t<!std::is_lvalue_reference<T>::value>>
+    const ImmediateDiagBuilder &operator<<(T &&V) const {
+      const DiagnosticBuilder &BaseDiag = *this;
+      BaseDiag << std::move(V);
+      return *this;
+    }
   };
 
-  // #pragma pack and align.
-  class AlignPackInfo {
+  /// A generic diagnostic builder for errors which may or may not be deferred.
+  ///
+  /// In CUDA, there exist constructs (e.g. variable-length arrays, try/catch)
+  /// which are not allowed to appear inside __device__ functions and are
+  /// allowed to appear in __host__ __device__ functions only if the host+device
+  /// function is never codegen'ed.
+  ///
+  /// To handle this, we use the notion of "deferred diagnostics", where we
+  /// attach a diagnostic to a FunctionDecl that's emitted iff it's codegen'ed.
+  ///
+  /// This class lets you emit either a regular diagnostic, a deferred
+  /// diagnostic, or no diagnostic at all, according to an argument you pass to
+  /// its constructor, thus simplifying the process of creating these "maybe
+  /// deferred" diagnostics.
+  class SemaDiagnosticBuilder {
   public:
-    // `Native` represents default align mode, which may vary based on the
-    // platform.
-    enum Mode : unsigned char { Native, Natural, Packed, Mac68k };
-
-    // #pragma pack info constructor
-    AlignPackInfo(AlignPackInfo::Mode M, unsigned Num, bool IsXL)
-        : PackAttr(true), AlignMode(M), PackNumber(Num), XLStack(IsXL) {
-      assert(Num == PackNumber && "The pack number has been truncated.");
-    }
-
-    // #pragma align info constructor
-    AlignPackInfo(AlignPackInfo::Mode M, bool IsXL)
-        : PackAttr(false), AlignMode(M),
-          PackNumber(M == Packed ? 1 : UninitPackVal), XLStack(IsXL) {}
+    enum Kind {
+      /// Emit no diagnostics.
+      K_Nop,
+      /// Emit the diagnostic immediately (i.e., behave like Sema::Diag()).
+      K_Immediate,
+      /// Emit the diagnostic immediately, and, if it's a warning or error, also
+      /// emit a call stack showing how this function can be reached by an a
+      /// priori known-emitted function.
+      K_ImmediateWithCallStack,
+      /// Create a deferred diagnostic, which is emitted only if the function
+      /// it's attached to is codegen'ed.  Also emit a call stack as with
+      /// K_ImmediateWithCallStack.
+      K_Deferred
+    };
 
-    explicit AlignPackInfo(bool IsXL) : AlignPackInfo(Native, IsXL) {}
+    SemaDiagnosticBuilder(Kind K, SourceLocation Loc, unsigned DiagID,
+                          const FunctionDecl *Fn, Sema &S);
+    SemaDiagnosticBuilder(SemaDiagnosticBuilder &&D);
+    SemaDiagnosticBuilder(const SemaDiagnosticBuilder &) = default;
 
-    AlignPackInfo() : AlignPackInfo(Native, false) {}
+    // The copy and move assignment operator is defined as deleted pending
+    // further motivation.
+    SemaDiagnosticBuilder &operator=(const SemaDiagnosticBuilder &) = delete;
+    SemaDiagnosticBuilder &operator=(SemaDiagnosticBuilder &&) = delete;
 
-    // When a AlignPackInfo itself cannot be used, this returns an 32-bit
-    // integer encoding for it. This should only be passed to
-    // AlignPackInfo::getFromRawEncoding, it should not be inspected directly.
-    static uint32_t getRawEncoding(const AlignPackInfo &Info) {
-      std::uint32_t Encoding{};
-      if (Info.IsXLStack())
-        Encoding |= IsXLMask;
+    ~SemaDiagnosticBuilder();
 
-      Encoding |= static_cast<uint32_t>(Info.getAlignMode()) << 1;
+    bool isImmediate() const { return ImmediateDiag.has_value(); }
 
-      if (Info.IsPackAttr())
-        Encoding |= PackAttrMask;
+    /// Convertible to bool: True if we immediately emitted an error, false if
+    /// we didn't emit an error or we created a deferred error.
+    ///
+    /// Example usage:
+    ///
+    ///   if (SemaDiagnosticBuilder(...) << foo << bar)
+    ///     return ExprError();
+    ///
+    /// But see CUDADiagIfDeviceCode() and CUDADiagIfHostCode() -- you probably
+    /// want to use these instead of creating a SemaDiagnosticBuilder yourself.
+    operator bool() const { return isImmediate(); }
 
-      Encoding |= static_cast<uint32_t>(Info.getPackNumber()) << 4;
-
-      return Encoding;
+    template <typename T>
+    friend const SemaDiagnosticBuilder &
+    operator<<(const SemaDiagnosticBuilder &Diag, const T &Value) {
+      if (Diag.ImmediateDiag)
+        *Diag.ImmediateDiag << Value;
+      else if (Diag.PartialDiagId)
+        Diag.S.DeviceDeferredDiags[Diag.Fn][*Diag.PartialDiagId].second
+            << Value;
+      return Diag;
     }
 
-    static AlignPackInfo getFromRawEncoding(unsigned Encoding) {
-      bool IsXL = static_cast<bool>(Encoding & IsXLMask);
-      AlignPackInfo::Mode M =
-          static_cast<AlignPackInfo::Mode>((Encoding & AlignModeMask) >> 1);
-      int PackNumber = (Encoding & PackNumMask) >> 4;
+    // It is necessary to limit this to rvalue reference to avoid calling this
+    // function with a bitfield lvalue argument since non-const reference to
+    // bitfield is not allowed.
+    template <typename T,
+              typename = std::enable_if_t<!std::is_lvalue_reference<T>::value>>
+    const SemaDiagnosticBuilder &operator<<(T &&V) const {
+      if (ImmediateDiag)
+        *ImmediateDiag << std::move(V);
+      else if (PartialDiagId)
+        S.DeviceDeferredDiags[Fn][*PartialDiagId].second << std::move(V);
+      return *this;
+    }
 
-      if (Encoding & PackAttrMask)
-        return AlignPackInfo(M, PackNumber, IsXL);
+    friend const SemaDiagnosticBuilder &
+    operator<<(const SemaDiagnosticBuilder &Diag, const PartialDiagnostic &PD) {
+      if (Diag.ImmediateDiag)
+        PD.Emit(*Diag.ImmediateDiag);
+      else if (Diag.PartialDiagId)
+        Diag.S.DeviceDeferredDiags[Diag.Fn][*Diag.PartialDiagId].second = PD;
+      return Diag;
+    }
 
-      return AlignPackInfo(M, IsXL);
+    void AddFixItHint(const FixItHint &Hint) const {
+      if (ImmediateDiag)
+        ImmediateDiag->AddFixItHint(Hint);
+      else if (PartialDiagId)
+        S.DeviceDeferredDiags[Fn][*PartialDiagId].second.AddFixItHint(Hint);
     }
 
-    bool IsPackAttr() const { return PackAttr; }
+    friend ExprResult ExprError(const SemaDiagnosticBuilder &) {
+      return ExprError();
+    }
+    friend StmtResult StmtError(const SemaDiagnosticBuilder &) {
+      return StmtError();
+    }
+    operator ExprResult() const { return ExprError(); }
+    operator StmtResult() const { return StmtError(); }
+    operator TypeResult() const { return TypeError(); }
+    operator DeclResult() const { return DeclResult(true); }
+    operator MemInitResult() const { return MemInitResult(true); }
 
-    bool IsAlignAttr() const { return !PackAttr; }
+  private:
+    Sema &S;
+    SourceLocation Loc;
+    unsigned DiagID;
+    const FunctionDecl *Fn;
+    bool ShowCallStack;
 
-    Mode getAlignMode() const { return AlignMode; }
+    // Invariant: At most one of these Optionals has a value.
+    // FIXME: Switch these to a Variant once that exists.
+    std::optional<ImmediateDiagBuilder> ImmediateDiag;
+    std::optional<unsigned> PartialDiagId;
+  };
 
-    unsigned getPackNumber() const { return PackNumber; }
+  void PrintStats() const;
 
-    bool IsPackSet() const {
-      // #pragma align, #pragma pack(), and #pragma pack(0) do not set the pack
-      // attriute on a decl.
-      return PackNumber != UninitPackVal && PackNumber != 0;
-    }
+  /// Warn that the stack is nearly exhausted.
+  void warnStackExhausted(SourceLocation Loc);
 
-    bool IsXLStack() const { return XLStack; }
+  /// Run some code with "sufficient" stack space. (Currently, at least 256K is
+  /// guaranteed). Produces a warning if we're low on stack space and allocates
+  /// more in that case. Use this in code that may recurse deeply (for example,
+  /// in template instantiation) to avoid stack overflow.
+  void runWithSufficientStackSpace(SourceLocation Loc,
+                                   llvm::function_ref<void()> Fn);
 
-    bool operator==(const AlignPackInfo &Info) const {
-      return std::tie(AlignMode, PackNumber, PackAttr, XLStack) ==
-             std::tie(Info.AlignMode, Info.PackNumber, Info.PackAttr,
-                      Info.XLStack);
-    }
+  /// Returns default addr space for method qualifiers.
+  LangAS getDefaultCXXMethodAddrSpace() const;
 
-    bool operator!=(const AlignPackInfo &Info) const {
-      return !(*this == Info);
-    }
+  /// Load weak undeclared identifiers from the exter...
[truncated]

github-actions · 2024-02-19T06:57:59Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Endilll · 2024-02-19T07:56:29Z

I've built doxygen documentation locally. Now I see what I expected to see (proper member groups).

tbaederr · 2024-02-19T09:04:55Z

11,427 additions, 11,178 deletions not shown because the diff is too large. Please use a local Git client to view these changes.

Never seen that before, haha.

cor3ntin · 2024-02-19T09:19:49Z

This is going to be rather disruptive on downstream projects.
At least we should wait until after the release of clang 18 to merge it, to avoid endless merge conflicts

erichkeane · 2024-02-20T15:39:10Z

This is going to be rather disruptive on downstream projects. At least we should wait until after the release of clang 18 to merge it, to avoid endless merge conflicts

For the most part, git will handle these pretty well on downstreams I think, and as they are declarations, I suspect this isn't going to be super tough of a merge conflict. That said, waiting until after 18 is perhaps a good diea.

Table of contents added at the very beginning of Sema. Grouping is reflected in Doxygen commands, so structure of API reference of Sema is also significantly improved (example from official documentation).

Neat!

While grouping is intentional, as well as each group consisting of public declarations followed by private ones (without changing access in-between),

I MIGHT suggest private followed by public? It is a not-uncommon pattern I've seen to have a private 'helper' class(or data member) defined inline, that is then used by inline-defined public functions.

Data members and inline function definitions in Sema.h complicate the matter, since it's not obvious which group they belong to. Further work is expected to refine contents and order of declarations.

This IS going to be complicated unfortunately. Many of the function definitions at least are going to be 1-liners that are just calling into a different definition. Data members you'll probably have to hunt down where they are used to see the relationship.

What is also intentional is some kind of layering, where Concepts group follows template groups, and ObjC, code completion, CUDA, HLSL, OpenACC, OpenMP, and SYCL are all placed at the end of the file, after C and C++ parts of Sema.

Member initializer list of Sema in Sema.cpp is rewritten to reflect new order of data members in order to avoid -Wreorder-ctor.

I wouldn't mind some sort of 'static_assert' to ensure that this doesn't accidentally increase the size of Sema or cause some sort of pessimization for layout. I realize we're not particularly concerned about the size, but I could imagine goofy things going on.

Endilll · 2024-02-20T16:17:53Z

That's how API reference looks with this patch. I specifically picked small groups to showcase member grouping.

Existing API reference can be found here: https://clang.llvm.org/doxygen/classclang_1_1Sema.html

Endilll · 2024-02-20T16:43:35Z

@erichkeane

That said, waiting until after 18 is perhaps a good diea.

Resolving merge conflicts that will arise in the meantime is not going to be trivial, but should be doable in a reasonable time. So I'm willing to wait.

I MIGHT suggest private followed by public? It is a not-uncommon pattern I've seen to have a private 'helper' class(or data member) defined inline, that is then used by inline-defined public functions.

I don't mind either way, as long as we don't inserting a couple of private members in the middle of public section.

I wouldn't mind some sort of 'static_assert' to ensure that this doesn't accidentally increase the size of Sema

Checking on Linux, sizeof(Sema) is 18824 bytes at moment. This patch increases that by 24 bytes, to 18848. I don't deem it significant enough to care for an object this big, but I can claw it back if this is considered important.

or cause some sort of pessimization for layout. I realize we're not particularly concerned about the size, but I could imagine goofy things going on.

If there was an intent to put some data members first to improve cache hits, I think it has been lost by this point. Our first non-static data members are OpenCLOptions OpenCLFeatures and FPOptions CurFPFeatures at the moment. Since this patch puts generic Sema stuff (Sema.cpp) first, opportunity to put widely used data members first is still there. As I mentioned in the description, follow-up patches that improve ordering in each particular section are expected.

erichkeane · 2024-02-20T16:47:16Z

@erichkeane

That said, waiting until after 18 is perhaps a good diea.

Resolving merge conflicts that will arise in the meantime is not going to be trivial, but should be doable in a reasonable time. So I'm willing to wait.

I'm glad to hear that! I'm hopeful it isn't TOO much of a delay and not too much work. Sema.h changes are at least pretty innocuous/small.

I MIGHT suggest private followed by public? It is a not-uncommon pattern I've seen to have a private 'helper' class(or data member) defined inline, that is then used by inline-defined public functions.

I don't mind either way, as long as we don't inserting a couple of private members in the middle of public section.

I agree with the goal as well!

I wouldn't mind some sort of 'static_assert' to ensure that this doesn't accidentally increase the size of Sema

Checking on Linux, sizeof(Sema) is 18824 bytes at moment. This patch increases that by 24 bytes, to 18848. I don't deem it significant enough to care for an object this big, but I can claw it back if this is considered important.

Urgh, and double-urgh. Yeah, not big enough/important enough, we only ever instantiate a handful of Sema objects anyway at most.

or cause some sort of pessimization for layout. I realize we're not particularly concerned about the size, but I could imagine goofy things going on.

If there was an intent to put some data members first to improve cache hits, I think it has been lost by this point. Our first non-static data members are OpenCLOptions OpenCLFeatures and FPOptions CurFPFeatures at the moment. Since this patch puts generic Sema stuff (Sema.cpp) first, opportunity to put widely used data members first is still there. As I mentioned in the description, follow-up patches that improve ordering in each particular section are expected.

"Commonly used" stuff together perhaps has some value for cache-locality reasons (as well as any other "closely related" stuff with itself), but I don't have a good way to measure that so 'best effort' here is probably good enough/the changes you're intending a good enough attempt.

to268 · 2024-02-21T17:13:33Z

FYI: There is currently 26 PRs that impacts Sema.h, bellow is the bash one liner with GitHub CLI to count the numbers of impacted PRs and another one to list links to impacted PRs.

Count all impacted PRs:

gh pr list --json files --jq '.[] | {paths: [.files[].path | select(. == "clang/include/clang/Sema/Sema.h")]} | select(.paths | length > 0)' -L 30000 | wc -l

List the link to all impacted PRs:

gh pr list --json number,files --jq '.[] | select(.files[].path == "clang/include/clang/Sema/Sema.h") | {number}' -L 30000 | cut -d ':' -f2 | tr -d '}' | sed 's|^|https://github.com/llvm/llvm-project/pull/|'

Endilll · 2024-02-21T17:16:45Z

Funny enough, search in the web version shows only 23 PRs: https://github.com/llvm/llvm-project/pulls?q=is%3Apr+is%3Aopen+%22clang%2Finclude%2Fclang%2FSema%2FSema.h%22

to268 · 2024-02-21T17:23:16Z

That's because you are filtering only opened PRs, not including drafts PRs.
For example there is this draft PR: https://github.com/llvm/llvm-project/pull/81095/files

tstellar · 2024-02-21T18:55:40Z

This will make backports more difficult, but I think it's OK. If we wanted to avoid impacting backports, we'd have to wait ~10 weeks to merge this, but that doesn't seem practical.

zyn0217 · 2024-02-22T04:48:25Z

That's because you are filtering only opened PRs, not including drafts PRs. For example there is this draft PR: https://github.com/llvm/llvm-project/pull/81095/files

(I think my recent PR #82310 now supersedes my two draft PRs, and they can be closed now.)

cor3ntin · 2024-03-05T08:58:42Z

Please run clang-format as a separate PR (before this one) - this will help folks deal with merge conflicts. Thanks

Endilll · 2024-03-05T09:44:30Z

Sorry, but that's not possible without redoing this PR from scratch. On the other hand, it should be easy for downstreams to format Sema.h on their end before merging, if they think that would help (explained below).

On top of that, I don't see how it's going to help with merge conflicts. Even with formatting changes factored out, it's quite hard to figure out why line NNN is now line MMM. Either way, downstreams have to embrace the new structure of the file, and redo their changes in a way that fits into it.

AaronBallman · 2024-03-05T13:16:30Z

We talked about this at the Clang Language WG meeting and there were a lot of good arguments for either formatting/reorganizing as separate commits or for formatting/reorganizing in the same commit. If the folks doing a downstream pulldown are experienced with the code base, a single-step process is likely less disruptive. If the folks doing the pulldown are not experienced with the code base, a two-step process is likely less disruptive. (Nobody felt they had enough changes in Sema.h to block making sweeping changes to the file.)

I don't think we had strong consensus, but my read of the room was leaning more towards doing everything in one step.

This patch introduces `SemaHLSL` class, and moves some HLSL-related functions there. No functional changes intended. Removing "HLSL" from function names inside `SemaHLSL` is left for a subsequent PR by HLSL contributors, if they deem that desirable. This is a part of the effort to split `Sema` into smaller manageable parts, and follows the example of OpenACC. See #82217, #84184, #87634 for additional context.

This patch moves SYCL-related `Sema` functions into new `SemaSYCL` class, following the recent example of OpenACC and HLSL. This is a part of the effort to split `Sema`. Additional context can be found in #82217, #84184, #87634.

This patch moves CUDA-related `Sema` function into new `SemaCUDA` class, following the recent example of SYCL, OpenACC, and HLSL. This is a part of the effort to split Sema. Additional context can be found in #82217, #84184, #87634.

This patch moves CUDA-related `Sema` function into new `SemaCUDA` class, following the recent example of SYCL, OpenACC, and HLSL. This is a part of the effort to split Sema. Additional context can be found in llvm#82217, llvm#84184, llvm#87634.

This patch moves OpenMP-related entities out of `Sema` to a newly created `SemaOpenMP` class. This is a part of the effort to split `Sema` up, and follows the recent example of CUDA, OpenACC, SYCL, HLSL. Additional context can be found in #82217, #84184, #87634.

This is continuation of efforts to split `Sema` up, following the example of OpenMP, OpenACC, etc. Context can be found in #82217 and #84184. I split formatting changes into a separate commit to help reviewing the actual changes.

Following the steps of #82217, this patch reorganizes declarations in `Parse.h`. Highlights are: 1) Declarations are grouped in the same fashion as in `Sema.h`. Table of contents is provided at the beginning of `Parser` class. `public` declaration go first, then `private` ones, but unlike `Sema`, most of the stuff in `Parser` is private. 2) Documentation has been moved from `.cpp` files to the header. Grammar was consistently put in `\verbatim` blocks to render nicely in Doxygen. 3) File has been formatted with clang-format, except for the grammar, because clang-format butchers it.

Endilll added 3 commits February 19, 2024 09:51

Run clang-format

c801950

Merge remote-tracking branch 'upstream/main' into sema-refactoring

681cecc

Endilll requested review from AaronBallman and erichkeane February 19, 2024 06:55

llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Feb 19, 2024

Endilll added 3 commits February 19, 2024 10:00

Commit clang-format suggestion

c0ee64f

Fix formatting in table of contents

d8ee5da

Fix doxygen syntax

c917707

Endilll added 4 commits February 22, 2024 11:14

Fix access specifiers for some declarations

90df977

Run clang-format on whole Sema.h

772ed08

Merge remote-tracking branch 'upstream/main' into sema-refactoring

8546774

Manually re-apply some of the changes from main branch

2c4ddd7

Merge remote-tracking branch 'upstream/main' into sema-refactoring

a34675f

Endilll merged commit ac74d9e into llvm:main Mar 6, 2024

Endilll deleted the sema-refactoring branch March 6, 2024 10:40

Endilll mentioned this pull request Mar 6, 2024

[clang-repl] Names declared in if conditions and for-init statements are local to the inner context (C++ 3.3.2p4) #84150

Merged

Endilll mentioned this pull request Apr 7, 2024

[clang] Introduce SemaHLSL #87912

Merged

Endilll mentioned this pull request Apr 9, 2024

[clang] Introduce SemaSYCL #88086

Merged

Endilll mentioned this pull request Apr 12, 2024

[clang] Introduce SemaCUDA #88559

Merged

Endilll mentioned this pull request Apr 14, 2024

[clang] Introduce SemaOpenMP #88642

Merged

Endilll mentioned this pull request Apr 17, 2024

[clang] Introduce SemaObjC #89086

Merged

Endilll mentioned this pull request May 19, 2024

[clang] Introduce SemaCoroutine #92645

Open

Endilll mentioned this pull request May 5, 2025

[clang][NFC] Regroup declarations in Parser #138511

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[clang][NFC] Regroup declarations in `Sema` #82217

[clang][NFC] Regroup declarations in `Sema` #82217

Endilll commented Feb 19, 2024 •

edited

Loading

llvmbot commented Feb 19, 2024

github-actions bot commented Feb 19, 2024 •

edited

Loading

Endilll commented Feb 19, 2024 •

edited

Loading

tbaederr commented Feb 19, 2024

cor3ntin commented Feb 19, 2024

erichkeane commented Feb 20, 2024

Endilll commented Feb 20, 2024

Endilll commented Feb 20, 2024 •

edited

Loading

erichkeane commented Feb 20, 2024

to268 commented Feb 21, 2024

Endilll commented Feb 21, 2024 •

edited

Loading

to268 commented Feb 21, 2024 •

edited

Loading

tstellar commented Feb 21, 2024

zyn0217 commented Feb 22, 2024

cor3ntin commented Mar 5, 2024

Endilll commented Mar 5, 2024

AaronBallman commented Mar 5, 2024

[clang][NFC] Regroup declarations in Sema #82217

[clang][NFC] Regroup declarations in Sema #82217

Conversation

Endilll commented Feb 19, 2024 • edited Loading

llvmbot commented Feb 19, 2024

github-actions bot commented Feb 19, 2024 • edited Loading

Endilll commented Feb 19, 2024 • edited Loading

tbaederr commented Feb 19, 2024

cor3ntin commented Feb 19, 2024

erichkeane commented Feb 20, 2024

Endilll commented Feb 20, 2024

Endilll commented Feb 20, 2024 • edited Loading

erichkeane commented Feb 20, 2024

to268 commented Feb 21, 2024

Endilll commented Feb 21, 2024 • edited Loading

to268 commented Feb 21, 2024 • edited Loading

tstellar commented Feb 21, 2024

zyn0217 commented Feb 22, 2024

cor3ntin commented Mar 5, 2024

Endilll commented Mar 5, 2024

AaronBallman commented Mar 5, 2024

[clang][NFC] Regroup declarations in `Sema` #82217

[clang][NFC] Regroup declarations in `Sema` #82217

Endilll commented Feb 19, 2024 •

edited

Loading

github-actions bot commented Feb 19, 2024 •

edited

Loading

Endilll commented Feb 19, 2024 •

edited

Loading

Endilll commented Feb 20, 2024 •

edited

Loading

Endilll commented Feb 21, 2024 •

edited

Loading

to268 commented Feb 21, 2024 •

edited

Loading