cmd/vet: extend the loopclosure analysis to parallel subtests #55972

findleyr · 2022-09-30T14:42:39Z

I propose to update the loopclosure vet analyzer to check for references to loop variables from within parallel subtests. For example, with this change the analyzer would report a diagnostic for the use of tc after the call to t.Parallel() in the subtest body below:

https://go.dev/play/p/vekL_ZuhIpn

func TestLoopclosure(t *testing.T) {
	tests := []string{"a", "b", "c"}
	for _, tc := range tests {
		// Without this declaration, the test passes.
		// tc := tc
		t.Run(tc, func(t *testing.T) {
			t.Parallel()
			if tc != "c" {
				t.Errorf(`got %s, want "c"`, tc)
			}
		})
	}
}

Recently, @drchase has been experimenting with loop iterator capture semantics, and one of the outcomes of this experiment on large codebases is that parallel subtests frequently start failing, because they were not actually running each subtest due to the bug demonstrated above.

There is precedent for extending the loopclosure analyzer to recognize package APIs that are known to be concurrent: it already recognizes calls to x/sync/errgroup.Group.Go as invoking their argument concurrently. The case of parallel subtests is only slightly more complicated: look for calls to testing.T.Run that invoke T.Parallel. We added this in x/tools, guarded behind an internal variable, and the new check can be experimented with by building the command x/tools/internal/loopclosure, which enables the new check (caveat: the current logic could still be refined, for example by verifying that the receiver of t.Parallel matches the t passed in by t.Run).

A couple details about how this would be implemented:

The check would only consider statements that occur after the first call to t.Parallel within a function literal argument to t.Run. Users can and do rely on the fact that statements before the call to t.Parallel are executed synchronously.
optional: Whereas the loopclosure check for go, defer, and errgroup.Group.Go only consider invocations in the last statement of the loop, this check could consider all invocations of t.Run within the loop body. Unlike with go, defer, and errgroup.Group.Go, there is no common pattern for synchronizing parallel subtests. Attempting to do so would be problematic, as one would be fighting with the testing framework. (Note: it's also fine to not expand the check in this way, as I expect the overwhelming majority of parallel subtests are executed as the last statement in the loop body).

We found ~hundreds of bugs inside Google with this new check, and at least one user on Slack who noticed the in-progress change reported it as having found bugs in their codebase. We can do more research, but I suspect this new analysis would be accurate, actionable, and frequently prevent bugs.

CC @adonovan @lfolger @timothy-king

The text was updated successfully, but these errors were encountered:

timothy-king · 2022-09-30T19:07:28Z

Seems like a great idea to me overall. Minor comments about implementation details.

Whereas the loopclosure check for go, defer, and errgroup.Group.Go only consider invocations in the last statement of the loop, this check could consider all invocations of t.Run within the loop body. Unlike with go, defer, and errgroup.Group.Go, there is no common pattern for synchronizing parallel subtests.

There is one where an additional t.Run is used as parentheses to say when a group of parallel tests finishes: example https://go.dev/play/p/SZmQsSkFo8L and docs https://cs.opensource.google/go/go/+/refs/tags/go1.19.1:src/testing/testing.go;l=326-337;drc=690851ee3e48b85088698a9dbcc7dd112cef9eb5 . We probably should handle these ()'s. I don't think we need to worry about matching the pointer values of the *T.Testing objects, i.e. the parent of the receiver in T.Parallel matches the receiver of the immediately containing T.Run. This does not seem like an interesting detail to need to match.

The check would only consider statements that occur after the first call to t.Parallel within a function literal argument to t.Run.

To know if it is a race this should probably only be statements/expressions dominated by a t.Parallel() call. A very fast check with high precision is to just look at the statements in the range block (current implementation). I expect this to be somewhat acceptable as most reasonable usage is for it to be the first 1-2 statements. I think it is acceptable to not worry too hard about goto statements.

An alternative check is any usage reachable from a t.Parallel(). However this seems like a not helpful frequency improvement for a noticeable precision loss. (Imagine having conditional t.Parallel statements with conditional reads of variables.)

findleyr · 2022-09-30T19:43:20Z

There is one where an additional t.Run is used as parentheses to say when a group of parallel tests finishes:

Thanks for pointing that out. In the current implementation, we only consider t.Run statements that are at the at the top level of the loop body, and therefore wouldn't check the bodies of those sub-sub-tests. In other words, we won't produce false positives in that case (currently), but also wouldn't produce true positives. That seems OK to me, but we could certainly do more.

To know if it is a race this should probably only be statements/expressions dominated by a t.Parallel() call. A very fast check with high precision is to just look at the statements in the range block (current implementation).

Another great point. I suspect that just looking at statements in the loop body is probably good enough, and additional precision is not worth the complexity it would entail. I'll admit I hadn't sufficiently considered the case of goto (e.g. https://go.dev/play/p/8u87MIesRHC). It may be worthwhile to ignore any subtests that have labels.

timothy-king · 2022-09-30T20:14:31Z

In the current implementation, we only consider t.Run statements that are at the at the top level of the loop body, and therefore wouldn't check the bodies of those sub-sub-tests.

This seems reasonable to me. It may be good to document and/or add tests to ensure the very tempting switch from ranging over the body's statements to ast.Inspect to find T.Run/T.Parallel does not start introducing false positives.

rsc · 2022-10-12T17:53:03Z

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

rsc · 2022-10-20T18:04:39Z

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

rsc · 2022-10-26T18:04:50Z

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

gopherbot · 2022-11-01T21:50:02Z

Change https://go.dev/cl/446874 mentions this issue: go/analysis/passes/loopclosure: enable analysis of parallel subtests

gopherbot · 2022-11-02T14:39:01Z

Change https://go.dev/cl/447256 mentions this issue: go/analysis/passes/loopclosure: enable analysis of parallel subtests

…statements like if, switch, and for In golang/go#16520, Allan Donovan suggested the current loopclosure check could be extended to check more statements. The current loopclosure flags patterns like: for k, v := range seq { go/defer func() { ... k, v ... }() } The motivating example for this CL from golang/go#16520 was: var wg sync.WaitGroup for i := 0; i < 10; i++ { for j := 0; j < 1; j++ { wg.Add(1) go func() { fmt.Printf("%d ", i) wg.Done() }() } } wg.Wait() The current loopclosure check does not flag this because of the inner for loop, and the checker looks only at the last statement in the outer loop body. Allan suggested we redefine "last" recursively. For example, if the last statement is an if, then we examine the last statements in both of its branches. Or if the last statement is a nested loop, then we examine the last statement of that loop's body, and so on. A few years ago, Allan sent a sketch in CL 184537. This CL attempts to complete Allan's sketch, as well as integrates with the ensuing changes from golang/go#55972 to check errgroup.Group.Go, which with this CL can now be recursively "last". Updates golang/go#16520 Updates golang/go#55972

gopherbot · 2022-11-18T21:45:14Z

Change https://go.dev/cl/452155 mentions this issue: go/analysis/passes/loopclosure: recursively check last statements in statements like if, switch, and for

…statements like if, switch, and for In golang/go#16520, there was a suggestion to extend the current loopclosure check to check more statements. The current loopclosure flags patterns like: for k, v := range seq { go/defer func() { ... k, v ... }() } For this CL, the motivating example from golang/go#16520 is: var wg sync.WaitGroup for i := 0; i < 10; i++ { for j := 0; j < 1; j++ { wg.Add(1) go func() { fmt.Printf("%d ", i) wg.Done() }() } } wg.Wait() The current loopclosure check does not flag this because of the inner for loop, and the checker looks only at the last statement in the outer loop body. The suggestion is we redefine "last" recursively. For example, if the last statement is an if, then we examine the last statements in both of its branches. Or if the last statement is a nested loop, then we examine the last statement of that loop's body, and so on. A few years ago, Alan Donovan sent a sketch in CL 184537. This CL attempts to complete Alan's sketch, as well as integrates with the ensuing changes from golang/go#55972 to check errgroup.Group.Go, which with this CL can now be recursively "last". Updates golang/go#16520 Updates golang/go#55972 Fixes golang/go#30649 Fixes golang/go#32876 Change-Id: If66c6707025c20f32a2a781f6d11c4901f15742a GitHub-Last-Rev: 04980e0 GitHub-Pull-Request: #415 Reviewed-on: https://go-review.googlesource.com/c/tools/+/452155 Reviewed-by: Tim King <taking@google.com> Run-TryBot: Tim King <taking@google.com> Reviewed-by: Alan Donovan <adonovan@google.com> gopls-CI: kokoro <noreply+kokoro@google.com> Reviewed-by: Robert Findley <rfindley@google.com> Run-TryBot: Alan Donovan <adonovan@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>

findleyr added the Proposal label Sep 30, 2022

gopherbot added this to the Proposal milestone Sep 30, 2022

ianlancetaylor added this to Proposals Oct 5, 2022

ianlancetaylor moved this to Incoming in Proposals Oct 5, 2022

rsc moved this from Incoming to Active in Proposals Oct 12, 2022

rsc moved this from Active to Likely Accept in Proposals Oct 20, 2022

rsc added the Proposal-FinalCommentPeriod label Oct 20, 2022

rsc moved this from Likely Accept to Accepted in Proposals Oct 26, 2022

rsc changed the title ~~proposal: cmd/vet: extend the loopclosure analysis to parallel subtests~~ cmd/vet: extend the loopclosure analysis to parallel subtests Oct 26, 2022

rsc added the Proposal-Accepted label Oct 26, 2022

rsc modified the milestones: Proposal, Backlog Oct 26, 2022

findleyr self-assigned this Oct 27, 2022

gopherbot closed this as completed in golang/tools@d5e9e35 Nov 3, 2022

thepudds mentioned this issue Nov 4, 2022

staticcheck: flag testing.T.Run calls with closures that close over loop variables and run in parallel dominikh/go-tools#502

Open

This was referenced Nov 16, 2022

x/tools/go/analysis/passes/loopclosure: handle t.Parallel #28771

Closed

cmd/vet: check for improper use of t.Run and t.Parallel #18106

Closed

thepudds mentioned this issue Nov 18, 2022

go/analysis/passes/loopclosure: recursively check last statements in statements like if, switch, and for golang/tools#415

Closed

dmitshur removed this from the Backlog milestone Jun 4, 2023

dmitshur added this to the Go1.20 milestone Jun 4, 2023

rsc removed this from Proposals Nov 10, 2023

golang locked and limited conversation to collaborators Jun 3, 2024

gopherbot added the FrozenDueToAge label Jun 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/vet: extend the loopclosure analysis to parallel subtests #55972

cmd/vet: extend the loopclosure analysis to parallel subtests #55972

findleyr commented Sep 30, 2022 •

edited

Loading

timothy-king commented Sep 30, 2022

findleyr commented Sep 30, 2022

timothy-king commented Sep 30, 2022

rsc commented Oct 12, 2022

rsc commented Oct 20, 2022

rsc commented Oct 26, 2022

gopherbot commented Nov 1, 2022

gopherbot commented Nov 2, 2022

gopherbot commented Nov 18, 2022

cmd/vet: extend the loopclosure analysis to parallel subtests #55972

cmd/vet: extend the loopclosure analysis to parallel subtests #55972

Comments

findleyr commented Sep 30, 2022 • edited Loading

timothy-king commented Sep 30, 2022

findleyr commented Sep 30, 2022

timothy-king commented Sep 30, 2022

rsc commented Oct 12, 2022

rsc commented Oct 20, 2022

rsc commented Oct 26, 2022

gopherbot commented Nov 1, 2022

gopherbot commented Nov 2, 2022

gopherbot commented Nov 18, 2022

findleyr commented Sep 30, 2022 •

edited

Loading