Skip to content

[clang-repl] Names declared in if conditions and for-init statements are local to the inner context (C++ 3.3.2p4) #84150

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

weliveindetail
Copy link
Member

A TopLevelStmtDecl must be a DeclContext itself to avoid variable definitions to be added to the outer TranslationUnitDecl context. Additionally, ActOnForStmt() requires a CompoundScope when processing a NullStmt body.

@weliveindetail weliveindetail requested a review from Endilll as a code owner March 6, 2024 11:02
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 6, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2024

@llvm/pr-subscribers-clang

Author: Stefan Gränitz (weliveindetail)

Changes

A TopLevelStmtDecl must be a DeclContext itself to avoid variable definitions to be added to the outer TranslationUnitDecl context. Additionally, ActOnForStmt() requires a CompoundScope when processing a NullStmt body.


Full diff: https://github.com/llvm/llvm-project/pull/84150.diff

9 Files Affected:

  • (modified) clang/include/clang/AST/Decl.h (+10-6)
  • (modified) clang/include/clang/AST/DeclBase.h (+1)
  • (modified) clang/include/clang/Basic/DeclNodes.td (+1-1)
  • (modified) clang/include/clang/Sema/Sema.h (+2-1)
  • (modified) clang/lib/AST/Decl.cpp (+8-3)
  • (modified) clang/lib/AST/DeclBase.cpp (+1)
  • (modified) clang/lib/Parse/ParseDecl.cpp (+16-8)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+13-3)
  • (modified) clang/test/Interpreter/execute-stmts.cpp (+13-1)
diff --git a/clang/include/clang/AST/Decl.h b/clang/include/clang/AST/Decl.h
index 61117cc5ce71f9..a5879591f4c659 100644
--- a/clang/include/clang/AST/Decl.h
+++ b/clang/include/clang/AST/Decl.h
@@ -4419,7 +4419,7 @@ class FileScopeAsmDecl : public Decl {
 ///
 /// \note This is used in libInterpreter, clang -cc1 -fincremental-extensions
 /// and in tools such as clang-repl.
-class TopLevelStmtDecl : public Decl {
+class TopLevelStmtDecl : public Decl, public DeclContext {
   friend class ASTDeclReader;
   friend class ASTDeclWriter;
 
@@ -4427,7 +4427,7 @@ class TopLevelStmtDecl : public Decl {
   bool IsSemiMissing = false;
 
   TopLevelStmtDecl(DeclContext *DC, SourceLocation L, Stmt *S)
-      : Decl(TopLevelStmt, DC, L), Statement(S) {}
+      : Decl(TopLevelStmt, DC, L), DeclContext(TopLevelStmt), Statement(S) {}
 
   virtual void anchor();
 
@@ -4438,15 +4438,19 @@ class TopLevelStmtDecl : public Decl {
   SourceRange getSourceRange() const override LLVM_READONLY;
   Stmt *getStmt() { return Statement; }
   const Stmt *getStmt() const { return Statement; }
-  void setStmt(Stmt *S) {
-    assert(IsSemiMissing && "Operation supported for printing values only!");
-    Statement = S;
-  }
+  void setStmt(Stmt *S);
   bool isSemiMissing() const { return IsSemiMissing; }
   void setSemiMissing(bool Missing = true) { IsSemiMissing = Missing; }
 
   static bool classof(const Decl *D) { return classofKind(D->getKind()); }
   static bool classofKind(Kind K) { return K == TopLevelStmt; }
+
+  static DeclContext *castToDeclContext(const TopLevelStmtDecl *D) {
+    return static_cast<DeclContext *>(const_cast<TopLevelStmtDecl *>(D));
+  }
+  static TopLevelStmtDecl *castFromDeclContext(const DeclContext *DC) {
+    return static_cast<TopLevelStmtDecl *>(const_cast<DeclContext *>(DC));
+  }
 };
 
 /// Represents a block literal declaration, which is like an
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index 9a4736019d1b1b..76810a86a78a46 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -2120,6 +2120,7 @@ class DeclContext {
     case Decl::Block:
     case Decl::Captured:
     case Decl::ObjCMethod:
+    case Decl::TopLevelStmt:
       return true;
     default:
       return getDeclKind() >= Decl::firstFunction &&
diff --git a/clang/include/clang/Basic/DeclNodes.td b/clang/include/clang/Basic/DeclNodes.td
index 8b1f415dd5fe2c..48396e85c5adac 100644
--- a/clang/include/clang/Basic/DeclNodes.td
+++ b/clang/include/clang/Basic/DeclNodes.td
@@ -95,7 +95,7 @@ def LinkageSpec : DeclNode<Decl>, DeclContext;
 def Export : DeclNode<Decl>, DeclContext;
 def ObjCPropertyImpl : DeclNode<Decl>;
 def FileScopeAsm : DeclNode<Decl>;
-def TopLevelStmt : DeclNode<Decl>;
+def TopLevelStmt : DeclNode<Decl>, DeclContext;
 def AccessSpec : DeclNode<Decl>;
 def Friend : DeclNode<Decl>;
 def FriendTemplate : DeclNode<Decl>;
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index ef4b93fac95ce5..db45de107163a2 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3307,7 +3307,8 @@ class Sema final {
                               SourceLocation AsmLoc,
                               SourceLocation RParenLoc);
 
-  Decl *ActOnTopLevelStmtDecl(Stmt *Statement);
+  TopLevelStmtDecl *ActOnStartTopLevelStmtDecl(Scope *S);
+  void ActOnFinishTopLevelStmtDecl(TopLevelStmtDecl *D, Stmt *Statement);
 
   /// Handle a C++11 empty-declaration and attribute-declaration.
   Decl *ActOnEmptyDeclaration(Scope *S, const ParsedAttributesView &AttrList,
diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp
index 5d6bb72a208a1a..5f2edee3e60a0d 100644
--- a/clang/lib/AST/Decl.cpp
+++ b/clang/lib/AST/Decl.cpp
@@ -5539,14 +5539,13 @@ FileScopeAsmDecl *FileScopeAsmDecl::CreateDeserialized(ASTContext &C,
 void TopLevelStmtDecl::anchor() {}
 
 TopLevelStmtDecl *TopLevelStmtDecl::Create(ASTContext &C, Stmt *Statement) {
-  assert(Statement);
   assert(C.getLangOpts().IncrementalExtensions &&
          "Must be used only in incremental mode");
 
-  SourceLocation BeginLoc = Statement->getBeginLoc();
+  SourceLocation Loc = Statement ? Statement->getBeginLoc() : SourceLocation();
   DeclContext *DC = C.getTranslationUnitDecl();
 
-  return new (C, DC) TopLevelStmtDecl(DC, BeginLoc, Statement);
+  return new (C, DC) TopLevelStmtDecl(DC, Loc, Statement);
 }
 
 TopLevelStmtDecl *TopLevelStmtDecl::CreateDeserialized(ASTContext &C,
@@ -5559,6 +5558,12 @@ SourceRange TopLevelStmtDecl::getSourceRange() const {
   return SourceRange(getLocation(), Statement->getEndLoc());
 }
 
+void TopLevelStmtDecl::setStmt(Stmt *S) {
+  assert(S);
+  Statement = S;
+  setLocation(Statement->getBeginLoc());
+}
+
 void EmptyDecl::anchor() {}
 
 EmptyDecl *EmptyDecl::Create(ASTContext &C, DeclContext *DC, SourceLocation L) {
diff --git a/clang/lib/AST/DeclBase.cpp b/clang/lib/AST/DeclBase.cpp
index 10fe8bb97ce660..fcedb3cfd176a0 100644
--- a/clang/lib/AST/DeclBase.cpp
+++ b/clang/lib/AST/DeclBase.cpp
@@ -1352,6 +1352,7 @@ DeclContext *DeclContext::getPrimaryContext() {
   case Decl::ExternCContext:
   case Decl::LinkageSpec:
   case Decl::Export:
+  case Decl::TopLevelStmt:
   case Decl::Block:
   case Decl::Captured:
   case Decl::OMPDeclareReduction:
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index edfab11c37cf0f..25634a5889271e 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -5676,24 +5676,32 @@ Parser::DeclGroupPtrTy Parser::ParseTopLevelStmtDecl() {
   // Parse a top-level-stmt.
   Parser::StmtVector Stmts;
   ParsedStmtContext SubStmtCtx = ParsedStmtContext();
-  Actions.PushFunctionScope();
+  ParseScope FnScope(this, Scope::FnScope | Scope::DeclScope |
+                               Scope::CompoundStmtScope);
+  TopLevelStmtDecl *TLSD = Actions.ActOnStartTopLevelStmtDecl(getCurScope());
   StmtResult R = ParseStatementOrDeclaration(Stmts, SubStmtCtx);
-  Actions.PopFunctionScopeInfo();
   if (!R.isUsable())
     return nullptr;
 
-  SmallVector<Decl *, 2> DeclsInGroup;
-  DeclsInGroup.push_back(Actions.ActOnTopLevelStmtDecl(R.get()));
+  Actions.ActOnFinishTopLevelStmtDecl(TLSD, R.get());
 
   if (Tok.is(tok::annot_repl_input_end) &&
       Tok.getAnnotationValue() != nullptr) {
     ConsumeAnnotationToken();
-    cast<TopLevelStmtDecl>(DeclsInGroup.back())->setSemiMissing();
+    TLSD->setSemiMissing();
   }
 
-  // Currently happens for things like  -fms-extensions and use `__if_exists`.
-  for (Stmt *S : Stmts)
-    DeclsInGroup.push_back(Actions.ActOnTopLevelStmtDecl(S));
+  SmallVector<Decl *, 2> DeclsInGroup;
+  DeclsInGroup.push_back(TLSD);
+
+  // Currently happens for things like -fms-extensions and use `__if_exists`.
+  for (Stmt *S : Stmts) {
+    // Here we should be safe as `__if_exists` and friends are not introducing
+    // new variables which need to live outside file scope.
+    TopLevelStmtDecl *D = Actions.ActOnStartTopLevelStmtDecl(getCurScope());
+    Actions.ActOnFinishTopLevelStmtDecl(D, S);
+    DeclsInGroup.push_back(D);
+  }
 
   return Actions.BuildDeclaratorGroup(DeclsInGroup);
 }
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 9fdd8eb236d1ee..438a4b7fbe844a 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -20446,12 +20446,22 @@ Decl *Sema::ActOnFileScopeAsmDecl(Expr *expr,
   return New;
 }
 
-Decl *Sema::ActOnTopLevelStmtDecl(Stmt *Statement) {
-  auto *New = TopLevelStmtDecl::Create(Context, Statement);
-  Context.getTranslationUnitDecl()->addDecl(New);
+TopLevelStmtDecl *Sema::ActOnStartTopLevelStmtDecl(Scope *S) {
+  auto *New = TopLevelStmtDecl::Create(Context, /*Statement=*/nullptr);
+  CurContext->addDecl(New);
+  PushDeclContext(S, New);
+  PushFunctionScope();
+  PushCompoundScope(false);
   return New;
 }
 
+void Sema::ActOnFinishTopLevelStmtDecl(TopLevelStmtDecl *D, Stmt *Statement) {
+  D->setStmt(Statement);
+  PopCompoundScope();
+  PopFunctionScopeInfo();
+  PopDeclContext();
+}
+
 void Sema::ActOnPragmaRedefineExtname(IdentifierInfo* Name,
                                       IdentifierInfo* AliasName,
                                       SourceLocation PragmaLoc,
diff --git a/clang/test/Interpreter/execute-stmts.cpp b/clang/test/Interpreter/execute-stmts.cpp
index 2d4c17e0c91e66..e341b06d154f18 100644
--- a/clang/test/Interpreter/execute-stmts.cpp
+++ b/clang/test/Interpreter/execute-stmts.cpp
@@ -9,7 +9,6 @@
 //CODEGEN-CHECK-COUNT-2: define internal void @__stmts__
 //CODEGEN-CHECK-NOT: define internal void @__stmts__
 
-
 extern "C" int printf(const char*,...);
 
 template <typename T> T call() { printf("called\n"); return T(); }
@@ -41,3 +40,16 @@ for (; i > 4; --i) { printf("i = %d\n", i); };
 
 int j = i; printf("j = %d\n", j);
 // CHECK-NEXT: j = 4
+
+if (int i = j) printf("i = %d\n", i);
+// CHECK-NEXT: i = 4
+
+for (int i = j; i > 3; --i) printf("i = %d\n", i);
+// CHECK-NEXT: i = 4
+
+for(int i=0; i<2; i+=1) {};
+
+for(int i=0; i<2; i+=1) ;
+
+int *aa=nullptr;
+if (auto *b=aa) *b += 1;

@weliveindetail
Copy link
Member Author

This fixes #83028 which provides context on investigation.

@Endilll
Copy link
Contributor

Endilll commented Mar 6, 2024

Can you please update your branch with main? I'd like to make sure that your new functions in Sema doesn't end up in a wrong place after #82217

Copy link
Contributor

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

I think it is a good idea to make the TopLevelStmtDecl a DeclContext because if we have a statement that defines variables they should be attached to the TopLevelDeclContext. This PR makes the TopLevelStmtDecl looking more like a FunctionDecl and that's fine because the FunctionDecl is very close in terms of semantics.

Although fixes the problems we have in described in the issue, I wanted to ping @zygoloid and @AaronBallman if they think of a more elegant or standard-compliant solution.

@@ -5676,24 +5676,32 @@ Parser::DeclGroupPtrTy Parser::ParseTopLevelStmtDecl() {
// Parse a top-level-stmt.
Parser::StmtVector Stmts;
ParsedStmtContext SubStmtCtx = ParsedStmtContext();
Actions.PushFunctionScope();
ParseScope FnScope(this, Scope::FnScope | Scope::DeclScope |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we could survive without this scope here or at least if we could drop the Scope::CompoundStmtScope part of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested without it the ParseScope and got a segfault, so I assume we can not drop it (without looking into the details). I kept both, FnScope and CompoundStmtScope, because in ActOnStartTopLevelStmtDecl() we push them on the Sema side as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit awkward to push scoping state in two places. I am wondering if moving the FnScope down in ActOnStartTopLevelStmtDecl and ActOnFinishTopLevelStmtDecl would be a good idea. It would make the implementation look cleaner but if an error occurs the state of the scopes will be inconsistent (the R.isUsable branch). Maybe it is fine as it is...

weliveindetail and others added 7 commits March 6, 2024 12:44
… statements

Interpreter/execute-stmts.cpp fails with:
```
i = 2
i = 5
i = 5
i = 5
j = 4
error: Parsing failed.
error: Parsing failed.

error: 'expected-error' diagnostics seen but not expected:
  Line 1: redefinition of 'i'
  Line 1: redefinition of 'i'
error: 'expected-note' diagnostics seen but not expected:
  Line 1: previous definition is here
  Line 1: previous definition is here
```
Drop duplicate test-case
ActOnForStmt() requires a CompoundScope when processing a NullStmt body
Add TopLevelStmtDecl to outer DeclContext (and push new scopes afterwards)
@weliveindetail weliveindetail force-pushed the clang-repl-var-def-in-if-and-for-init branch from 4d2259e to 68ed18f Compare March 6, 2024 12:05
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally reasonable to me; it makes sense that statements are tied to expecting a compound scope, so modeling that as though there's a function wrapping all the top-level statements seems like a good approach to accomplish that.

@weliveindetail
Copy link
Member Author

Thanks! Added while with empty body as well as switch. Refined test output a bit, so that it now prints:

i = 5 (global scope)
i = 1 (while condition)
i = 2 (if condition)
i = 3 (switch condition)
i = 4 (for-init)

Copy link
Contributor

@vgvassilev vgvassilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@weliveindetail weliveindetail merged commit 4b70d17 into llvm:main Mar 7, 2024
@weliveindetail weliveindetail deleted the clang-repl-var-def-in-if-and-for-init branch March 7, 2024 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants