Skip to content

[clang] Introduce SemaCoroutine #92645

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

Endilll
Copy link
Contributor

@Endilll Endilll commented May 18, 2024

This patch moves coroutines-specific Sema functions into the new SemaCoroutine class. This continues previous efforts to split Sema up. Additional context can be found in #84184.
As usual, in order to help reviewing this, formatting changes are split into a separate commit.

@Endilll Endilll added clang:frontend Language frontend issues, e.g. anything involving "Sema" coroutines C++20 coroutines labels May 18, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label May 18, 2024
@llvmbot
Copy link
Member

llvmbot commented May 18, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-coroutines

Author: Vlad Serebrennikov (Endilll)

Changes

This patch moves coroutines-specific Sema functions into the new SemaCoroutine class. This continues previous efforts to split Sema up. Additional context can be found in #84184.
As usual, in order to help reviewing this, formatting changes are split into a separate commit.


Patch is 48.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92645.diff

10 Files Affected:

  • (modified) clang/include/clang/Sema/Sema.h (+7-51)
  • (added) clang/include/clang/Sema/SemaCoroutine.h (+73)
  • (modified) clang/lib/Parse/ParseExpr.cpp (+3-1)
  • (modified) clang/lib/Parse/ParseExprCXX.cpp (+2-1)
  • (modified) clang/lib/Parse/ParseStmt.cpp (+3-1)
  • (modified) clang/lib/Sema/Sema.cpp (+4-2)
  • (modified) clang/lib/Sema/SemaCoroutine.cpp (+153-120)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+3-13)
  • (modified) clang/lib/Sema/SemaStmt.cpp (+6-5)
  • (modified) clang/lib/Sema/TreeTransform.h (+17-14)
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index d4d4a82525a02..1d0fbeacfe061 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -168,6 +168,7 @@ class PseudoDestructorTypeStorage;
 class PseudoObjectExpr;
 class QualType;
 class SemaCodeCompletion;
+class SemaCoroutine;
 class SemaCUDA;
 class SemaHLSL;
 class SemaObjC;
@@ -989,6 +990,11 @@ class Sema final : public SemaBase {
     return *CodeCompletionPtr;
   }
 
+  SemaCoroutine &Coroutine() {
+    assert(CoroutinePtr);
+    return *CoroutinePtr;
+  }
+
   SemaCUDA &CUDA() {
     assert(CUDAPtr);
     return *CUDAPtr;
@@ -1050,6 +1056,7 @@ class Sema final : public SemaBase {
   mutable IdentifierInfo *Ident_super;
 
   std::unique_ptr<SemaCodeCompletion> CodeCompletionPtr;
+  std::unique_ptr<SemaCoroutine> CoroutinePtr;
   std::unique_ptr<SemaCUDA> CUDAPtr;
   std::unique_ptr<SemaHLSL> HLSLPtr;
   std::unique_ptr<SemaObjC> ObjCPtr;
@@ -2267,57 +2274,6 @@ class Sema final : public SemaBase {
   //
   //
 
-  /// \name C++ Coroutines
-  /// Implementations are in SemaCoroutine.cpp
-  ///@{
-
-public:
-  /// The C++ "std::coroutine_traits" template, which is defined in
-  /// \<coroutine_traits>
-  ClassTemplateDecl *StdCoroutineTraitsCache;
-
-  bool ActOnCoroutineBodyStart(Scope *S, SourceLocation KwLoc,
-                               StringRef Keyword);
-  ExprResult ActOnCoawaitExpr(Scope *S, SourceLocation KwLoc, Expr *E);
-  ExprResult ActOnCoyieldExpr(Scope *S, SourceLocation KwLoc, Expr *E);
-  StmtResult ActOnCoreturnStmt(Scope *S, SourceLocation KwLoc, Expr *E);
-
-  ExprResult BuildOperatorCoawaitLookupExpr(Scope *S, SourceLocation Loc);
-  ExprResult BuildOperatorCoawaitCall(SourceLocation Loc, Expr *E,
-                                      UnresolvedLookupExpr *Lookup);
-  ExprResult BuildResolvedCoawaitExpr(SourceLocation KwLoc, Expr *Operand,
-                                      Expr *Awaiter, bool IsImplicit = false);
-  ExprResult BuildUnresolvedCoawaitExpr(SourceLocation KwLoc, Expr *Operand,
-                                        UnresolvedLookupExpr *Lookup);
-  ExprResult BuildCoyieldExpr(SourceLocation KwLoc, Expr *E);
-  StmtResult BuildCoreturnStmt(SourceLocation KwLoc, Expr *E,
-                               bool IsImplicit = false);
-  StmtResult BuildCoroutineBodyStmt(CoroutineBodyStmt::CtorArgs);
-  bool buildCoroutineParameterMoves(SourceLocation Loc);
-  VarDecl *buildCoroutinePromise(SourceLocation Loc);
-  void CheckCompletedCoroutineBody(FunctionDecl *FD, Stmt *&Body);
-
-  // As a clang extension, enforces that a non-coroutine function must be marked
-  // with [[clang::coro_wrapper]] if it returns a type marked with
-  // [[clang::coro_return_type]].
-  // Expects that FD is not a coroutine.
-  void CheckCoroutineWrapper(FunctionDecl *FD);
-  /// Lookup 'coroutine_traits' in std namespace and std::experimental
-  /// namespace. The namespace found is recorded in Namespace.
-  ClassTemplateDecl *lookupCoroutineTraits(SourceLocation KwLoc,
-                                           SourceLocation FuncLoc);
-  /// Check that the expression co_await promise.final_suspend() shall not be
-  /// potentially-throwing.
-  bool checkFinalSuspendNoThrow(const Stmt *FinalSuspend);
-
-  ///@}
-
-  //
-  //
-  // -------------------------------------------------------------------------
-  //
-  //
-
   /// \name C++ Scope Specifiers
   /// Implementations are in SemaCXXScopeSpec.cpp
   ///@{
diff --git a/clang/include/clang/Sema/SemaCoroutine.h b/clang/include/clang/Sema/SemaCoroutine.h
new file mode 100644
index 0000000000000..d4ca6cb0d860a
--- /dev/null
+++ b/clang/include/clang/Sema/SemaCoroutine.h
@@ -0,0 +1,73 @@
+//===----- SemaCUDA.h ----- Semantic Analysis for C++20 coroutines --------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+/// \file
+/// This file declares semantic analysis for C++20 coroutines.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_SEMA_SEMACOROUTINE_H
+#define LLVM_CLANG_SEMA_SEMACOROUTINE_H
+
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/AST/Expr.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/AST/Stmt.h"
+#include "clang/AST/StmtCXX.h"
+#include "clang/Basic/LLVM.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Sema/Ownership.h"
+#include "clang/Sema/Scope.h"
+#include "clang/Sema/SemaBase.h"
+
+namespace clang {
+class SemaCoroutine : public SemaBase {
+public:
+  SemaCoroutine(Sema &S);
+
+  /// The C++ "std::coroutine_traits" template, which is defined in
+  /// \<coroutine_traits>
+  ClassTemplateDecl *StdCoroutineTraitsCache;
+
+  bool ActOnCoroutineBodyStart(Scope *S, SourceLocation KwLoc,
+                               StringRef Keyword);
+  ExprResult ActOnCoawaitExpr(Scope *S, SourceLocation KwLoc, Expr *E);
+  ExprResult ActOnCoyieldExpr(Scope *S, SourceLocation KwLoc, Expr *E);
+  StmtResult ActOnCoreturnStmt(Scope *S, SourceLocation KwLoc, Expr *E);
+
+  ExprResult BuildOperatorCoawaitLookupExpr(Scope *S, SourceLocation Loc);
+  ExprResult BuildOperatorCoawaitCall(SourceLocation Loc, Expr *E,
+                                      UnresolvedLookupExpr *Lookup);
+  ExprResult BuildResolvedCoawaitExpr(SourceLocation KwLoc, Expr *Operand,
+                                      Expr *Awaiter, bool IsImplicit = false);
+  ExprResult BuildUnresolvedCoawaitExpr(SourceLocation KwLoc, Expr *Operand,
+                                        UnresolvedLookupExpr *Lookup);
+  ExprResult BuildCoyieldExpr(SourceLocation KwLoc, Expr *E);
+  StmtResult BuildCoreturnStmt(SourceLocation KwLoc, Expr *E,
+                               bool IsImplicit = false);
+  StmtResult BuildCoroutineBodyStmt(CoroutineBodyStmt::CtorArgs);
+  bool buildCoroutineParameterMoves(SourceLocation Loc);
+  VarDecl *buildCoroutinePromise(SourceLocation Loc);
+  void CheckCompletedCoroutineBody(FunctionDecl *FD, Stmt *&Body);
+
+  // As a clang extension, enforces that a non-coroutine function must be marked
+  // with [[clang::coro_wrapper]] if it returns a type marked with
+  // [[clang::coro_return_type]].
+  // Expects that FD is not a coroutine.
+  void CheckCoroutineWrapper(FunctionDecl *FD);
+  /// Lookup 'coroutine_traits' in std namespace and std::experimental
+  /// namespace. The namespace found is recorded in Namespace.
+  ClassTemplateDecl *lookupCoroutineTraits(SourceLocation KwLoc,
+                                           SourceLocation FuncLoc);
+  /// Check that the expression co_await promise.final_suspend() shall not be
+  /// potentially-throwing.
+  bool checkFinalSuspendNoThrow(const Stmt *FinalSuspend);
+};
+} // namespace clang
+
+#endif // LLVM_CLANG_SEMA_SEMACOROUTINE_H
diff --git a/clang/lib/Parse/ParseExpr.cpp b/clang/lib/Parse/ParseExpr.cpp
index eb7447fa038e4..98471ea0ea1a7 100644
--- a/clang/lib/Parse/ParseExpr.cpp
+++ b/clang/lib/Parse/ParseExpr.cpp
@@ -32,6 +32,7 @@
 #include "clang/Sema/Scope.h"
 #include "clang/Sema/SemaCUDA.h"
 #include "clang/Sema/SemaCodeCompletion.h"
+#include "clang/Sema/SemaCoroutine.h"
 #include "clang/Sema/SemaObjC.h"
 #include "clang/Sema/SemaOpenACC.h"
 #include "clang/Sema/SemaOpenMP.h"
@@ -1465,7 +1466,8 @@ ExprResult Parser::ParseCastExpression(CastParseKind ParseKind,
     SourceLocation CoawaitLoc = ConsumeToken();
     Res = ParseCastExpression(AnyCastExpr);
     if (!Res.isInvalid())
-      Res = Actions.ActOnCoawaitExpr(getCurScope(), CoawaitLoc, Res.get());
+      Res = Actions.Coroutine().ActOnCoawaitExpr(getCurScope(), CoawaitLoc,
+                                                 Res.get());
     return Res;
   }
 
diff --git a/clang/lib/Parse/ParseExprCXX.cpp b/clang/lib/Parse/ParseExprCXX.cpp
index e149b1a0fb5ef..78db3a7deeaac 100644
--- a/clang/lib/Parse/ParseExprCXX.cpp
+++ b/clang/lib/Parse/ParseExprCXX.cpp
@@ -24,6 +24,7 @@
 #include "clang/Sema/ParsedTemplate.h"
 #include "clang/Sema/Scope.h"
 #include "clang/Sema/SemaCodeCompletion.h"
+#include "clang/Sema/SemaCoroutine.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/ErrorHandling.h"
 #include <numeric>
@@ -1977,7 +1978,7 @@ ExprResult Parser::ParseCoyieldExpression() {
   ExprResult Expr = Tok.is(tok::l_brace) ? ParseBraceInitializer()
                                          : ParseAssignmentExpression();
   if (!Expr.isInvalid())
-    Expr = Actions.ActOnCoyieldExpr(getCurScope(), Loc, Expr.get());
+    Expr = Actions.Coroutine().ActOnCoyieldExpr(getCurScope(), Loc, Expr.get());
   return Expr;
 }
 
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index b0af04451166c..f07bc2690fa35 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -23,6 +23,7 @@
 #include "clang/Sema/EnterExpressionEvaluationContext.h"
 #include "clang/Sema/Scope.h"
 #include "clang/Sema/SemaCodeCompletion.h"
+#include "clang/Sema/SemaCoroutine.h"
 #include "clang/Sema/SemaObjC.h"
 #include "clang/Sema/SemaOpenMP.h"
 #include "clang/Sema/TypoCorrection.h"
@@ -2456,7 +2457,8 @@ StmtResult Parser::ParseReturnStatement() {
     }
   }
   if (IsCoreturn)
-    return Actions.ActOnCoreturnStmt(getCurScope(), ReturnLoc, R.get());
+    return Actions.Coroutine().ActOnCoreturnStmt(getCurScope(), ReturnLoc,
+                                                 R.get());
   return Actions.ActOnReturnStmt(ReturnLoc, R.get(), getCurScope());
 }
 
diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index f847c49920cf3..df71673548fb9 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -44,6 +44,7 @@
 #include "clang/Sema/SemaCUDA.h"
 #include "clang/Sema/SemaCodeCompletion.h"
 #include "clang/Sema/SemaConsumer.h"
+#include "clang/Sema/SemaCoroutine.h"
 #include "clang/Sema/SemaHLSL.h"
 #include "clang/Sema/SemaInternal.h"
 #include "clang/Sema/SemaObjC.h"
@@ -205,6 +206,7 @@ Sema::Sema(Preprocessor &pp, ASTContext &ctxt, ASTConsumer &consumer,
       CurScope(nullptr), Ident_super(nullptr),
       CodeCompletionPtr(
           std::make_unique<SemaCodeCompletion>(*this, CodeCompleter)),
+      CoroutinePtr(std::make_unique<SemaCoroutine>(*this)),
       CUDAPtr(std::make_unique<SemaCUDA>(*this)),
       HLSLPtr(std::make_unique<SemaHLSL>(*this)),
       ObjCPtr(std::make_unique<SemaObjC>(*this)),
@@ -219,8 +221,8 @@ Sema::Sema(Preprocessor &pp, ASTContext &ctxt, ASTConsumer &consumer,
       CodeSegStack(nullptr), StrictGuardStackCheckStack(false),
       FpPragmaStack(FPOptionsOverride()), CurInitSeg(nullptr),
       VisContext(nullptr), PragmaAttributeCurrentTargetDecl(nullptr),
-      StdCoroutineTraitsCache(nullptr), IdResolver(pp),
-      OriginalLexicalContext(nullptr), StdInitializerList(nullptr),
+      IdResolver(pp), OriginalLexicalContext(nullptr),
+      StdInitializerList(nullptr),
       FullyCheckedComparisonCategories(
           static_cast<unsigned>(ComparisonCategoryType::Last) + 1),
       StdSourceLocationImplDecl(nullptr), CXXTypeInfoDecl(nullptr),
diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index 81334c817b2af..f59e6a7c2b6a8 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -13,6 +13,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "clang/Sema/SemaCoroutine.h"
 #include "CoroutineStmtBuilder.h"
 #include "clang/AST/ASTLambda.h"
 #include "clang/AST/Decl.h"
@@ -25,6 +26,7 @@
 #include "clang/Sema/Initialization.h"
 #include "clang/Sema/Overload.h"
 #include "clang/Sema/ScopeInfo.h"
+#include "clang/Sema/Sema.h"
 #include "clang/Sema/SemaInternal.h"
 #include "llvm/ADT/SmallSet.h"
 
@@ -57,7 +59,7 @@ static QualType lookupPromiseType(Sema &S, const FunctionDecl *FD,
   const SourceLocation FuncLoc = FD->getLocation();
 
   ClassTemplateDecl *CoroTraits =
-      S.lookupCoroutineTraits(KwLoc, FuncLoc);
+      S.Coroutine().lookupCoroutineTraits(KwLoc, FuncLoc);
   if (!CoroTraits)
     return QualType();
 
@@ -249,20 +251,21 @@ static bool isValidCoroutineContext(Sema &S, SourceLocation Loc,
 
 /// Build a call to 'operator co_await' if there is a suitable operator for
 /// the given expression.
-ExprResult Sema::BuildOperatorCoawaitCall(SourceLocation Loc, Expr *E,
-                                          UnresolvedLookupExpr *Lookup) {
+ExprResult
+SemaCoroutine::BuildOperatorCoawaitCall(SourceLocation Loc, Expr *E,
+                                        UnresolvedLookupExpr *Lookup) {
   UnresolvedSet<16> Functions;
   Functions.append(Lookup->decls_begin(), Lookup->decls_end());
-  return CreateOverloadedUnaryOp(Loc, UO_Coawait, Functions, E);
+  return SemaRef.CreateOverloadedUnaryOp(Loc, UO_Coawait, Functions, E);
 }
 
 static ExprResult buildOperatorCoawaitCall(Sema &SemaRef, Scope *S,
                                            SourceLocation Loc, Expr *E) {
-  ExprResult R = SemaRef.BuildOperatorCoawaitLookupExpr(S, Loc);
+  ExprResult R = SemaRef.Coroutine().BuildOperatorCoawaitLookupExpr(S, Loc);
   if (R.isInvalid())
     return ExprError();
-  return SemaRef.BuildOperatorCoawaitCall(Loc, E,
-                                          cast<UnresolvedLookupExpr>(R.get()));
+  return SemaRef.Coroutine().BuildOperatorCoawaitCall(
+      Loc, E, cast<UnresolvedLookupExpr>(R.get()));
 }
 
 static ExprResult buildCoroutineHandle(Sema &S, QualType PromiseType,
@@ -475,9 +478,10 @@ static ExprResult buildPromiseCall(Sema &S, VarDecl *Promise,
   return buildMemberCall(S, PromiseRef.get(), Loc, Name, Args);
 }
 
-VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
-  assert(isa<FunctionDecl>(CurContext) && "not in a function scope");
-  auto *FD = cast<FunctionDecl>(CurContext);
+VarDecl *SemaCoroutine::buildCoroutinePromise(SourceLocation Loc) {
+  assert(isa<FunctionDecl>(SemaRef.CurContext) && "not in a function scope");
+  ASTContext &Context = getASTContext();
+  auto *FD = cast<FunctionDecl>(SemaRef.CurContext);
   bool IsThisDependentType = [&] {
     if (const auto *MD = dyn_cast_if_present<CXXMethodDecl>(FD))
       return MD->isImplicitObjectMemberFunction() &&
@@ -487,19 +491,20 @@ VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
 
   QualType T = FD->getType()->isDependentType() || IsThisDependentType
                    ? Context.DependentTy
-                   : lookupPromiseType(*this, FD, Loc);
+                   : lookupPromiseType(SemaRef, FD, Loc);
   if (T.isNull())
     return nullptr;
 
-  auto *VD = VarDecl::Create(Context, FD, FD->getLocation(), FD->getLocation(),
-                             &PP.getIdentifierTable().get("__promise"), T,
-                             Context.getTrivialTypeSourceInfo(T, Loc), SC_None);
+  auto *VD =
+      VarDecl::Create(Context, FD, FD->getLocation(), FD->getLocation(),
+                      &SemaRef.PP.getIdentifierTable().get("__promise"), T,
+                      Context.getTrivialTypeSourceInfo(T, Loc), SC_None);
   VD->setImplicit();
-  CheckVariableDeclarationType(VD);
+  SemaRef.CheckVariableDeclarationType(VD);
   if (VD->isInvalidDecl())
     return nullptr;
 
-  auto *ScopeInfo = getCurFunction();
+  auto *ScopeInfo = SemaRef.getCurFunction();
 
   // Build a list of arguments, based on the coroutine function's arguments,
   // that if present will be passed to the promise type's constructor.
@@ -508,10 +513,10 @@ VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
   // Add implicit object parameter.
   if (auto *MD = dyn_cast<CXXMethodDecl>(FD)) {
     if (MD->isImplicitObjectMemberFunction() && !isLambdaCallOperator(MD)) {
-      ExprResult ThisExpr = ActOnCXXThis(Loc);
+      ExprResult ThisExpr = SemaRef.ActOnCXXThis(Loc);
       if (ThisExpr.isInvalid())
         return nullptr;
-      ThisExpr = CreateBuiltinUnaryOp(Loc, UO_Deref, ThisExpr.get());
+      ThisExpr = SemaRef.CreateBuiltinUnaryOp(Loc, UO_Deref, ThisExpr.get());
       if (ThisExpr.isInvalid())
         return nullptr;
       CtorArgExprs.push_back(ThisExpr.get());
@@ -532,9 +537,9 @@ VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
     // frame, use that reference.
     auto *MoveDecl =
         cast<VarDecl>(cast<DeclStmt>(Move->second)->getSingleDecl());
-    RefExpr =
-        BuildDeclRefExpr(MoveDecl, MoveDecl->getType().getNonReferenceType(),
-                         ExprValueKind::VK_LValue, FD->getLocation());
+    RefExpr = SemaRef.BuildDeclRefExpr(
+        MoveDecl, MoveDecl->getType().getNonReferenceType(),
+        ExprValueKind::VK_LValue, FD->getLocation());
     if (RefExpr.isInvalid())
       return nullptr;
     CtorArgExprs.push_back(RefExpr.get());
@@ -550,7 +555,7 @@ VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
     InitializedEntity Entity = InitializedEntity::InitializeVariable(VD);
     InitializationKind Kind = InitializationKind::CreateForInit(
         VD->getLocation(), /*DirectInit=*/true, PLE);
-    InitializationSequence InitSeq(*this, Entity, Kind, CtorArgExprs,
+    InitializationSequence InitSeq(SemaRef, Entity, Kind, CtorArgExprs,
                                    /*TopLevelOfInitList=*/false,
                                    /*TreatUnavailableAsInvalid=*/false);
 
@@ -561,18 +566,18 @@ VarDecl *Sema::buildCoroutinePromise(SourceLocation Loc) {
     // found ([over.match.viable]), then promise-constructor-arguments is ( q_1
     // , ...,  q_n ), otherwise promise-constructor-arguments is empty.
     if (InitSeq) {
-      ExprResult Result = InitSeq.Perform(*this, Entity, Kind, CtorArgExprs);
+      ExprResult Result = InitSeq.Perform(SemaRef, Entity, Kind, CtorArgExprs);
       if (Result.isInvalid()) {
         VD->setInvalidDecl();
       } else if (Result.get()) {
-        VD->setInit(MaybeCreateExprWithCleanups(Result.get()));
+        VD->setInit(SemaRef.MaybeCreateExprWithCleanups(Result.get()));
         VD->setInitStyle(VarDecl::CallInit);
-        CheckCompleteVariableDeclaration(VD);
+        SemaRef.CheckCompleteVariableDeclaration(VD);
       }
     } else
-      ActOnUninitializedDecl(VD);
+      SemaRef.ActOnUninitializedDecl(VD);
   } else
-    ActOnUninitializedDecl(VD);
+    SemaRef.ActOnUninitializedDecl(VD);
 
   FD->addDecl(VD);
   return VD;
@@ -596,10 +601,10 @@ static FunctionScopeInfo *checkCoroutineContext(Sema &S, SourceLocation Loc,
   if (ScopeInfo->CoroutinePromise)
     return ScopeInfo;
 
-  if (!S.buildCoroutineParameterMoves(Loc))
+  if (!S.Coroutine().buildCoroutineParameterMoves(Loc))
     return nullptr;
 
-  ScopeInfo->CoroutinePromise = S.buildCoroutinePromise(Loc);
+  ScopeInfo->CoroutinePromise = S.Coroutine().buildCoroutinePromise(Loc);
   if (!ScopeInfo->CoroutinePromise)
     return nullptr;
 
@@ -666,13 +671,13 @@ static void checkNoThrow(Sema &S, const Stmt *E,
     }
 }
 
-bool Sema::checkFinalSuspendNoThrow(const Stmt *FinalSuspend) {
+bool SemaCoroutine::checkFinalSuspendNoThrow(const Stmt *FinalSuspend) {
   llvm::SmallPtrSet<const Decl *, 4> ThrowingDecls;
   // We first collect all declarations that should not throw but not declared
   // with noexcept. We then sort them based on the location before printing.
   // This is to avoid emitting the same note multiple times on the same
   // declaration, and also provide a deterministic order for the messages.
-  checkNoThrow(*this, FinalSuspend, ThrowingDecls);
+  checkNoThrow(SemaRef, FinalSuspend, ThrowingDecls);
   auto SortedDecls = llvm::SmallVector<const Decl *, 4>{ThrowingDecls.begin(),
                                                         ThrowingDecls.end()};
   sort(SortedDecls, [](const Decl *A, const Decl *B) {
@@ -684,14 +689,14 @@ bool Sema::checkFinalSuspendNoThrow(const Stmt *FinalSuspend) {
   return ThrowingDecls.empty();
 }
 
-bool Sema::ActOnCoroutineBodyStart(Scope *SC, SourceLocation KWLoc,
-                                   StringRef Keyword) {
+bool SemaCoroutine::ActOnCoroutineBodyStart(Scope *SC, SourceLocation KWLoc,
+                                            StringRef Keyword) {
   // Ignore previous expr evaluation contexts.
   EnterExpressionEvaluationContext PotentiallyEvaluated(
-      *this, Sema::ExpressionEvaluationContext::PotentiallyEvaluated);
-  if (!checkCoroutin...
[truncated]

@cor3ntin
Copy link
Contributor

Thanks! However, I am fairly against moving C and C++ features out of Sema, especially when the outcome is 50 Loc kess, at the cost of not knowing where to put or find things in the long term (as we add coroutines adjacent features, or code related to both coroutines and something else).

The previous sema changes made sense (despite the inevitable churn) - I am not sure this one does

@Endilll
Copy link
Contributor Author

Endilll commented May 18, 2024

As @erichkeane described it, we're clearing the woods of "core" C++ functionality before dealing with the hard parts, like SemaDeclCXX.cpp or SemaChecking.cpp.
Actual LoC wins are secondary, as they have been all along.

@cor3ntin
Copy link
Contributor

What is the goal here? there is a lot of code reuse in C++ and C. and having a file per feature is going to make thing worse (maintenance, unboarding, compile times).
There are a lot more low-hamging fruits we should do before considering that sort of split.

Wasm, AMD, Builtins, pragmas, random objects which could be in headers ( NestedNameSpecInfo, ContextRAII, NameClassification, AccessCheckingSFINAE, InstantiatingTemplate, SynthesizedFunctionScope, etc - I would recommend using Aliases so they are still functionally in Sema)

@Endilll
Copy link
Contributor Author

Endilll commented May 18, 2024

What is the goal here? there is a lot of code reuse in C++ and C. and having a file per feature is going to make thing worse (maintenance, unboarding, compile times).

Maintenance: one big blob of code which is Sema makes it virtually impossible to approach long-standing const-correctness issue in a systematic way, because enormous flat list of member functions doesn't help you identify the leaves of dependency chains to start there.

Onboarding: this effort started with an impression I had that it's almost impossible to wrap your head around Sema because of its size. This was an onboarding problem for me.

Compile times: per measurements taken in #84184 (comment), indirection that splitting of Sema introduces doesn't have a noticeable impact on both our compile times and runtime performance.

There are a lot more low-hamging fruits we should do before considering that sort of split.

Wasm, AMD, Builtins, pragmas, random objects which could be in headers ( NestedNameSpecInfo, ContextRAII, NameClassification, AccessCheckingSFINAE, InstantiatingTemplate, SynthesizedFunctionScope, etc - I would recommend using Aliases so they are still functionally in Sema)

Time for that will come. Irrespective of the order we move stuff out of Sema, the hard parts (basically all the code for C++98 and, likely, C++11) will remain, because we haven't come with a plan for them yet. And those will be minor wins in terms of volume of code, because OpenMP miracle is unlikely to manifest again (when I was able to reduce Sema.h by almost 1k LoC).

C++ features not present in C++98 were identified as potential low-hanging fruits, because they are implemented on top of what was already there, and coroutines were indeed rather easy to factor out. C++20 modules are definitely not a low-hanging fruit, though.

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said in previous comments, I don't think this is the right direction

@mizvekov
Copy link
Contributor

Was there ever an RFC on this whole 'splitting up Sema' project?

I have my own reservations as well.

I think starting up from the easier bits is a risky proposition, as we might realize this whole thing will fail when we get to the harder parts.

I also think this can lead in the future to the same sort of problems overusing const got us in the first place: needing large amount of repetitive changes all around when a commonly used or deeply nested component starts depending on a larger chunk of Sema.

Also, this will further cement current design lines, without considering where we want to be in the future.

And lastly, it seems this will lead to large amounts of code churn without proportional benefit, impacting in-tree development as well as external projects.

@Endilll
Copy link
Contributor Author

Endilll commented May 19, 2024

Was there ever an RFC on this whole 'splitting up Sema' project?

It was quite extensively discussed in #84184, but it wasn't deemed necessary to go through RFC process for a non-functional change that doesn't affect stable interfaces.

I think starting up from the easier bits is a risky proposition, as we might realize this whole thing will fail when we get to the harder parts.

Sema has grown so big and interconnected that it's virtually impossible to wrap a head around it to solve hard problems first.

I also think this can lead in the future to the same sort of problems overusing const got us in the first place: needing large amount of repetitive changes all around when a commonly used or deeply nested component starts depending on a larger chunk of Sema.

I recently looked into our const situation, and I definitely agree that it's overused. I believe one of the reasons it's hard to address systematically at the Sema level is again its size and tight coupling of everything to everything, which makes it hard to figure out "leaves" of call chains to start to rectify the situation without viral changes.

Also, this will further cement current design lines, without considering where we want to be in the future.

I believe cemented design line between Sema and Parser served us well, even if it's limiting sometimes. That said, changes I'm making doesn't really cement anything yet. Everything is still available from everywhere (save for static functions). The difference is that dependencies between parts of Sema become more visible.

And lastly, it seems this will lead to large amounts of code churn without proportional benefit, impacting in-tree development as well as external projects.

I believe that we're significantly improving maintainability here, and the churn is worth it in the long run. For instance, #82217 took me whopping 20-25 hours of non-stop work to finish. Doing that in one step was requested by downstreams.

@ChuanqiXu9
Copy link
Member

While I am not against the idea about splitting Sema, the implementation detail makes me slightly concerning. What I had in mind is to split SemaC, SemaCXX (this can be derived class of SemaCXX), SemaObjC and other dialects (e.g., ACC or CUDA) out of Sema. Then big features of C++ can be member of SemaCXX as SemaCoroutine, SemaConcept...

@Endilll
Copy link
Contributor Author

Endilll commented May 20, 2024

While I am not against the idea about splitting Sema, the implementation detail makes me slightly concerning. What I had in mind is to split SemaC, SemaCXX (this can be derived class of SemaCXX), SemaObjC and other dialects (e.g., ACC or CUDA) out of Sema.

Inheritance is problematic, because it can't work off forward declarations of base classes. I believe there is still hope that we can reduce the blast radius of header changes in terms of TUs being recompiled.

Also note that language dialects you have in mind have already been moved outside of Sema.

Then big features of C++ can be member of SemaCXX as SemaCoroutine, SemaConcept...

This would mean even more hierarchy than just Sema containing a flat list of references to its parts. I'm not particularly opposed to it, but it's not on my radar at the moment. It would impact usage side, e.g. Actions.CXX().Coroutine().ActOnCoawaitExpr(), and I can see why people would be opposed to it.

@cor3ntin
Copy link
Contributor

It was quite extensively discussed in #84184, but it wasn't deemed necessary to go through RFC process for a non-functional change that doesn't affect stable interfaces.

We had buy-in to split languages out and I think that has been successful (with painful churns).
Splitting out individual C++ features or even C from C++ is a different proposition all together because these things are naturally interweaved, and we quickly get diminishing returns (and increased churns), at the cost of more clumsy APIs
(ie things like SemaAccess() everywhere and thing harder to find or organized)

I think there are a lots of opportunities to remove some utilities objects out of Sema.h, move pragma handling etc which should be a bit more self contained before considering doing more surgery on C++ (and at some point we should pause and consider the impact of already done work)
There are a lot of architectures specific functions that we might be able to move, if we want to?

@AaronBallman
Copy link
Collaborator

What is the goal here?

The eventual goal is for Sema to have appropriate layering instead of being a monolithic design. "Appropriate layering" can be handled in many different ways, but the goal when this was discussed extensively in the past was to split by "category" where some categories are language-based (splitting OpenACC from HLSL from Objective-C, etc) and some categories are feature-based (splitting attributes from coroutines from constant expression evaluation).

Clang is designed to be a library and our semantic layer is... not. So the goal is to eventually partition these into separable concerns so it is easier to maintain and so some parts can be reused (for example, constant expression evaluation could be helpful for clang-tidy while still not needing to expose the entirety of Sema to achieve it). This is obviously a very long-term goal, and also a bit of a research effort in terms of where to draw the line for such a monolithic part of the compiler.

We had buy-in to split languages out and I think that has been successful (with painful churns).

FWIW, I have not received any reports about churn being painful from downstreams. This was discussed at a previous Clang Language WG meeting and the only pain this has caused that I'm aware of has been the usual "it's hard to track against ToT" variety that we get with any significant maintenance of the project. If there are worse problems than that, it would be good to get details about them.

We need the ability to maintain and evolve Clang, so fear of churn without evidence of harm is not a blocking concern. That said, if there's some concrete reason the churn outweighs the benefits of the split, that is a different situation.

Splitting out individual C++ features or even C from C++ is a different proposition all together because these things are naturally interweaved, and we quickly get diminishing returns (and increased churns), at the cost of more clumsy APIs
(ie things like SemaAccess() everywhere and thing harder to find or organized)

I think this is part of the investigative work and I disagree with a blanket prohibition against splitting out features, at least at this early of a stage. IMO, we should evaluate the benefits for each component on their own merits and push back splitting up where it doesn't make sense to do so.

Specific to splitting out coroutines, I can see arguments in either direction. It's pretty trivial to split out because it's built "on top" of Sema rather than deeply integrated with it like C and C++ support are. However, it's not clear that it can ever be a reusable component and it's also not clear whether we want to go as far as what Chuanqi's comment suggests in terms of layering. Given the concerns raised here, it might make sense to put this change on hold, do a few more similar experiments for other parts of C++, and see what sort of principled design approach makes sense to take here. My current thinking (which could be off-base!) is that for language features strongly tied to language semantics (like coroutines), if we do split them then it probably makes sense to split those off in a language-based approach (getSema().CXX().Coroutines().whatever) whereas more language-agnostic concepts (like constant expression evaluation) might make sense to split off into their own component whose language-specific behavior can be controlled by a traits object of some kind (getSema().ConstExpr<SemaCXX>().whatever).

@Endilll
Copy link
Contributor Author

Endilll commented May 20, 2024

Given the concerns raised here, it might make sense to put this change on hold, do a few more similar experiments for other parts of C++, and see what sort of principled design approach makes sense to take here.

It's not clear to me what do we want to clarify or try out before going forward with this PR. I moved concepts out of Sema in #92672, and that also went well in terms of resistance I felt while trying to get declarations out of Sema. We need to start somewhere, and I believe concepts and coroutines are the right first steps. Alternative is to come up with some kind of a grand plan for core C and C++ parts (Expr, ExprCXX, Decl, DeclCXX, etc.), but prior discussions on the topic suggested that it's not feasible at this point.

My current thinking (which could be off-base!) is that for language features strongly tied to language semantics (like coroutines), if we do split them then it probably makes sense to split those off in a language-based approach (getSema().CXX().Coroutines().whatever) whereas more language-agnostic concepts (like constant expression evaluation) might make sense to split off into their own component whose language-specific behavior can be controlled by a traits object of some kind (getSema().ConstExpr().whatever).

Whatever shape Sema is going to take in the future, splitting Sema into a flat list of components moves us forward, because merging components back into bigger components is trivial compared to splitting them up. One of the primary goals of this effort is not as much transforming Sema according to some vision, as it is highlighting and learning how various parts of it are connected. That should make our future design decisions more informed.

@erichkeane
Copy link
Collaborator

FWIW, I've been working actively on Open ACC Sema, and I found that the split so far has made development much more organized/easier, and makes naming of things a lot less awkward. I realize splitting out actual language features is concerning to some, but I'm cautiously optimistic with how successful I think the split of some of the dialects have been.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category coroutines C++20 coroutines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants