[cling] The LookupHelper routines need the ROOT lock. #18522

Merged
merged 2 commits into from
Apr 29, 2025

Conversation

pcanal
Member

@pcanal pcanal commented Apr 25, 2025

These routines are, at the very least, looking up information in Clang and thus need to prevent concurrent updates. Some of the routines can also sometimes induce changes in Clang (e.g. template instantiation).

This fixes #18520 and #18519


github-actions bot commented Apr 26, 2025

Test Results

    18 files      18 suites   3d 23h 41m 6s ⏱️
 2 731 tests  2 073 ✅  0 💤 658 ❌
47 752 runs  47 082 ✅ 12 💤 658 ❌

For more details on these failures, see this check.

Results for commit e162f99.

♻️ This comment has been updated with latest results.

Member

@dpiparo dpiparo left a comment

The change looks fine to me, thanks, provided that I understood correctly what was the role of the RAII (the header cannot be found)

@vgvassilev
Member

Why not locking the underlying interface in the LookupHelper?

@pcanal
Member Author

pcanal commented Apr 26, 2025

Why not locking the underlying interface in the LookupHelper?

Because one needs to lock both the (potential) write access and the read accesses.

For example in GetPartiallyDesugaredNameWithScopeHandling we have:

   clang::QualType t = lh.findType(tname.c_str(), ToLHDS(WantDiags()));
   // Technically we ought to try:
   //   if (t.isNull()) t =  lh.findType(TClassEdit::InsertStd(tname), ToLHDS(WantDiags()));
   // at least until the 'normalized name' contains the std:: prefix.
   if (!t.isNull()) {

The lock needs to be held during both the first line (because of potential modification by this thread) and the last line (because of potential modification by another thread).

We could improve the scaling further by leveraging the fact that the ROOT lock is a read-write lock but that level of granularity is not available from Cling yet.
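
To illustrate why locking only inside the lookup is not enough, here is a minimal sketch. It is not ROOT or Cling code: `gLock`, `gTypes`, `findTypeLockedInside` and `describeType` are hypothetical names, and a plain `std::recursive_mutex` stands in for the ROOT lock.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the ROOT lock; recursive so that nested
// helpers can take it again on the same thread.
std::recursive_mutex gLock;

// Hypothetical stand-in for interpreter state that other threads may modify.
std::map<std::string, std::string> gTypes = {{"A", "class A"}};

// Locks only inside the lookup. The returned pointer is no longer
// protected once this function returns.
const std::string *findTypeLockedInside(const std::string &name)
{
   std::lock_guard<std::recursive_mutex> guard(gLock);
   auto it = gTypes.find(name);
   return it == gTypes.end() ? nullptr : &it->second;
}

// Caller-level lock: one guard covers both the lookup and the use of
// the result, so no other thread can change the entry in between.
std::string describeType(const std::string &name)
{
   std::lock_guard<std::recursive_mutex> guard(gLock);
   const std::string *t = findTypeLockedInside(name); // re-entrant lock
   if (!t)
      return "<unknown>";
   return *t; // copied while the lock is still held
}
```

With only the inner lock, a caller that dereferences the result of `findTypeLockedInside` does so unprotected; `describeType` avoids that by holding the guard across both steps.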

@pcanal
Member Author

pcanal commented Apr 26, 2025

the role of the RAII (the header cannot be found)

The header is interpreter/cling/lib/Interpreter/EnterUserCodeRAII.h and the RAII takes and releases the lock as needed. The name of the RAII is nowadays a misnomer, as it is now used beyond its originally intended purpose.
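
For readers unfamiliar with the pattern, an RAII of this kind could look roughly like the sketch below. This is an assumption-laden illustration, not the contents of EnterUserCodeRAII.h: `LockCallbacks` and `AccessGuard` are hypothetical names, and the real Cling callback interface may differ.

```cpp
#include <mutex>

// Hypothetical callback interface that owns the lock.
struct LockCallbacks {
   std::recursive_mutex fMutex;
   int fDepth = 0; // number of currently active guards (for illustration)

   void Lock()
   {
      fMutex.lock();
      ++fDepth;
   }
   void Unlock()
   {
      --fDepth;
      fMutex.unlock();
   }
};

// Hypothetical guard: takes the lock on construction and releases it
// on destruction. A null callback pointer means "no locking".
class AccessGuard {
   LockCallbacks *fCallbacks;

public:
   explicit AccessGuard(LockCallbacks *callbacks) : fCallbacks(callbacks)
   {
      if (fCallbacks)
         fCallbacks->Lock();
   }
   ~AccessGuard()
   {
      if (fCallbacks)
         fCallbacks->Unlock();
   }
   AccessGuard(const AccessGuard &) = delete;
   AccessGuard &operator=(const AccessGuard &) = delete;
};

// Nested guards on the same thread are fine because the mutex is recursive.
int depthInsideNestedGuards(LockCallbacks &callbacks)
{
   AccessGuard outer(&callbacks);
   AccessGuard inner(&callbacks);
   return callbacks.fDepth;
}
```

The guard releases the lock on every exit path, including early returns and exceptions, which is why a scope-based helper is convenient in lookup routines with many return statements.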

@vgvassilev
Member

Why not locking the underlying interface in the LookupHelper?

Because one needs to lock both the (potential) write access and the read accesses.

Why the read access? Do you mean because read access sometimes mutates the AST?

@pcanal
Member Author

pcanal commented Apr 26, 2025

Why the read access? Do you mean because read access sometimes mutates the AST?

No. Because another thread might mutate the area you are reading. For example:

thread 1 time 1: Take lock
thread 1 time 2:   Load header file with class A
thread 1 time 3:   Find class A
thread 1 time 4: Release lock
thread 1 time 7: Iterate through class A's content

and

thread 2 time 4: Take lock
thread 2 time 5:    Unload header file class A (revert Transaction)
thread 2 time 6: Release lock

Then during time 7, thread 1 is using data/memory that is either deleted or at least no longer what it was meant to be.

As long as there are 'destructive' or state-changing operations left in Cling (e.g. unloading a transaction), we do need to hold at least a read lock when reading data.

As a general rule we have been trying (and only incompletely succeeding) to always hold the ROOT lock when accessing Clang/Cling information.
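
The timeline above can be sketched with a read-write lock. This is a simplified illustration under stated assumptions: `gInterpMutex`, `gClasses`, `loadHeader`, `unloadHeader` and `listMembers` are hypothetical names, and `std::shared_mutex` (C++17, not recursive) stands in for the ROOT lock, which behaves differently in detail.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical read-write lock guarding the interpreter state.
std::shared_mutex gInterpMutex;

// Hypothetical stand-in for declarations known to the interpreter:
// class name -> member names.
std::map<std::string, std::vector<std::string>> gClasses;

// Writer: "load header file with class A".
void loadHeader()
{
   std::unique_lock<std::shared_mutex> writeLock(gInterpMutex);
   gClasses["A"] = {"x", "y"};
}

// Writer: "unload header file" (revert the transaction).
void unloadHeader()
{
   std::unique_lock<std::shared_mutex> writeLock(gInterpMutex);
   gClasses.erase("A");
}

// Reader: the find and the iteration happen under one shared lock, so
// an unload cannot slip in between them. The result is a copy, which
// stays valid after the lock is released.
std::vector<std::string> listMembers(const std::string &name)
{
   std::shared_lock<std::shared_mutex> readLock(gInterpMutex);
   auto it = gClasses.find(name);
   if (it == gClasses.end())
      return {};
   return it->second;
}
```

Several readers can run `listMembers` concurrently, while `unloadHeader` waits until all of them have released the shared lock. That is the "at least a read lock" requirement in code form.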

@dpiparo dpiparo added the "clean build" (Ask CI to do non-incremental build on PR) label Apr 28, 2025
@dpiparo
Member

dpiparo commented Apr 29, 2025

I think this can be merged. I do not believe the fedora42 error is real.

@pcanal pcanal merged commit 5dbed33 into root-project:master Apr 29, 2025
36 of 41 checks passed
@pcanal pcanal added this to the 6.38.00 milestone Apr 29, 2025
@hahnjo
Member

hahnjo commented Apr 29, 2025

Hm, so the first commit of this PR will not compile on its own? That's not great 😞

@pcanal pcanal deleted the master-18520 branch April 29, 2025 15:11
@hahnjo
Member

hahnjo commented Apr 29, 2025

And cling::InterpreterAccessRAII seems to literally duplicate cling::LockCompilationDuringUserCodeExecutionRAII!?!

@pcanal
Member Author

pcanal commented Apr 29, 2025

Hm, so the first commit of this PR will not compile on its own? That's not great

Nope :( I remembered right after clicking Merge that I was meant to click 'Squash' :( Sorry.

@pcanal
Member Author

pcanal commented Apr 29, 2025

And cling::InterpreterAccessRAII seems to literally duplicate cling::LockCompilationDuringUserCodeExecutionRAII!?!

Sort of. The new one is necessary because the old one is private to Cling and thus not accessible to ClingUtils. The old one could indeed be replaced by the new one; however, its name pairs it with a second RAII about user code.

Relatedly, the old name is used in many places that are unrelated to 'UserCode'. I propose to:

  • Replace the old name with the new name in all those places (but not necessarily in the places related to UserCode)
  • Remove the old class and possibly alias it to the new name.
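
The second bullet could be sketched as follows. The names here are hypothetical placeholders (`sketch::AccessGuard` for the new class, `sketch::UserCodeGuard` for the old name), and the counter only stands in for the real lock logic.

```cpp
#include <type_traits>

namespace sketch {

// Hypothetical single implementation; incrementing a counter stands in
// for taking the lock, decrementing it for releasing the lock.
class AccessGuard {
   int &fDepth;

public:
   explicit AccessGuard(int &depth) : fDepth(depth) { ++fDepth; }
   ~AccessGuard() { --fDepth; }
   AccessGuard(const AccessGuard &) = delete;
   AccessGuard &operator=(const AccessGuard &) = delete;
};

// The old name survives only as an alias, so there is exactly one
// implementation and the two names cannot diverge.
using UserCodeGuard = AccessGuard;

static_assert(std::is_same<UserCodeGuard, AccessGuard>::value,
              "the alias must refer to the single implementation");

} // namespace sketch
```

Code that still spells the old name keeps compiling, while the places unrelated to user code can switch to the new name at their own pace.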

@hahnjo
Member

hahnjo commented Apr 29, 2025

We must definitely get rid of the duplicated code, because it will be a nightmare to figure this out a few years from now, after the two copies have diverged...

@pcanal
Member Author

pcanal commented Apr 29, 2025

See #18551

Labels
clean build (Ask CI to do non-incremental build on PR), in:Cling
Development

Successfully merging this pull request may close these issues.

Missing lock deep inside TClassEdit::ResolveTypedef
Concurrency issue with TClassEdit::ResolveTypedef and TClass::GetListOfMethods
4 participants