-
Notifications
You must be signed in to change notification settings - Fork 18k
proposal: x/exp/xiter: new package with iterator adapters #61898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The duplication of each function is the first thing that catches the eye. Are there thoughts on why this is acceptable? |
What about an adapter that converts an |
Some typos: EqualFunc2, Map2, Merge2, and MergeFunc2 lack the 2 suffixes on their actual names. They're all correct in the corresponding documentation. |
May I humbly suggest that the name "iterutils" is less susceptible to, uh, unfortunate mispronunciation. |
For |
I'd actually prefer Edit: I just realized that if |
This proposal has been added to the active column of the proposals project |
The more I think about it, the more that I think that API design for this should wait until after a decision is made on #49085. Multiple other languages have proven over and over that a left-to-right chained syntax is vastly superior ergonomically to simple top-level functions for iterators. For example, compare nonNegative := xiter.Filter(
xiter.Map(
bufio.Lines(r),
parseLine,
),
func(v int) bool { return v >= 0 },
) vs. nonNegative := bufio.Lines(r).
Map(parseLine).
Filter(func(v int) bool { return v >= 0 }) Go's a little weird because of the need to put the lines := bufio.Lines(r)
intlines := xiter.Map(lines, parseLine)
nonNegative := xiter.Filter(func(v int) bool { return v >= 0 }) That works, but it clutters up the local namespace and it's significantly harder to edit. For example, if you decide you need to add a new step in the chain, you have to make sure that all of the variables for each iterator match up in the previous and succeeding calls. |
What type does |
You would probably have to wrap the base iterator like:
|
Sorry. I should have stuck a comment in. I was just coming up with some hypothetical function that would give an Not necessarily. The transformative and sink functions on iterators could just be defined as methods on |
I was wrong, it’s not an interface. |
Why do some functions take the names := xiter.Map(func (p Person) string {
return p.Name
}, people) // "people" gets lost
// vs
names := xiter.Map(people, func (p Person) string {
return p.Name
}) |
@DeedleFake There won't be a "decision" on #49085 anytime soon. There are good reasons not to do it yet, but we also don't want to say it never happens. The issue exists to reflect that state. What it comes down to is, would you rather have no iterators (for the foreseeable future) or ones which can't be "chained"? |
No iterators, definitely. I've done fine without them for over a decade. I can wait a bit longer. If a bad implementation goes in, I'll never get a good version. Plus, I can just write my own implementation of whatever iterator functions I need as long as |
Neither chaining nor functional programming has ever been a decisive or recommended technique in Go. Instead, iteration—specifically, procedural 'for' loops—has always been a core technique since the language's inception. The iterator proposals aim to enhance this core approach. While I don't know what the overall plans are, if you're hoping for Go to follow the path of Java Streams or C# LINQ, you might be in for disappointment. |
I think "a bit" is misleading. We are talking years - if at all. And I don't believe the second part of that sentence is true either, we could always release a v2 of the relevant packages, if we ever manage to do #49085 in a decade or so. |
Is that not the intention of these proposals? To build a standardized iterator system that works similarly to those? Why else is there a proposal here for
Edit: The way this proposal is phrased does actually imply that they may be heavily reevaluated enough in That issue has only been open for 2 years. I think assuming that it'll take a decade to solve is a bit unfair. Yes, a One of my favorite things about Go is how slow and methodical it (usually) is in introducing new features. I think that the fact that it took over a decade to add generics is a good thing, and I really wanted generics. One of the purposes of that approach is to try avoid having to fix it later. Adding those functions in the proposed manner will almost definitely necessitate that later fix, and I very much would like to avoid that if at all possible. |
Java Streams and .NET LINQ build on a standardized iterator system, but they are more than that. Both languages had a generic iterator system before. Iterators are useful without chaining or functional programming.
That would be this very proposal, and it comes with a caveat: "... or perhaps not. There are concerns about how these would affect idiomatic Go code. " This means that not everyone who has read these proposals in advance believes that this part is a good idea. |
Maybe chaining leads to too much of a good thing. It becomes more tempting to write long, hard-to-read chains of functions. You're less likely to do that if you have to nest calls. As an analogy, Go has |
Re #49085, generic methods either require (A) dynamic code generation or (B) terrible speed or (C) hiding those methods from dynamic interface checks or (D) not doing them at all. We have chosen option (D). The issue remains open like so many suggestions people have made, but I don't see a realistic path forward where we choose A, B, or C, nor do I see a fifth option. So it makes sense to assume generic methods are not going to happen and do our work accordingly. |
@DeedleFake The issue is not lack of understanding what a lack of parameterized methods means. It's just that, as @rsc said, wanting them doesn't make them feasible. The issue only being 2 years old is deceptive. The underlying problem is actually as old as Go and one of the main reasons we didn't have generics for most of that. Which you should consider, when you say
We got generics by committing to keep implementation strategies open, thus avoiding the generics dilemma. Not having parametric methods is a pretty direct consequence of that decision. |
Well, I tried. If that's the decision then that's the decision. I'm disappointed, but I guess I'll just be satisfied with what I do like about the current proposal, even if it has, in my opinion, some fairly major problems. Sorry for dragging this a bit off-topic there. |
Hope that it's not noise: I wondered if naming it the |
Python does the equivalent of You could also do cooldown with for i, item := range enumerate(stream) {
if i != 0 && i%chunkSize == 0 {
time.Sleep(cooldown)
}
doSomething(item)
} That seems clearer to me. |
I don't want to derail the discussion, but I want to report some benchmark results for my Edit: Unfortunately, I had a bug in my reference "upfront_sort" implementation. I've fixed it and re-run my benchmarks. The new results indicate that the binary-heap implementation is only advantageous as the total number pairs grows and
(no differences in terms of allocations) I still believe that a binary heap is valuable. If x/exp/xiter ever sees the light of day and provides something like |
a small idea to make xiter adapters composable Added pipe helpers package xiter
type Operator[I any, O any] func(in I) O
// pipe helpers
// added more Pipe to support n operators
func Pipe[I, O any](
in I,
op1 Operator[I, O],
) O {
return op1(in)
}
func Pipe2[I, A, O any](
in I,
op1 Operator[I, A],
op2 Operator[A, O],
) O {
return op2(op1(in))
}
func Pipe3[I, A, B, O any](
in I,
op1 Operator[I, A],
op2 Operator[A, B],
op3 Operator[B, O],
) O {
return op3(op2(op1(in)))
}
func Pipe4[I, A, B, C, O any](
in I,
op1 Operator[I, A],
op2 Operator[A, B],
op3 Operator[B, C],
op4 Operator[C, O],
) O {
return op4(op3(op2(op1(in))))
}
// convert any function as operator
func Do[I any, Arg1 any, O any](
fn func(src I, arg1 Arg1) O,
arg1 Arg1,
) Operator[I, O] {
return func(src I) O {
return fn(src, arg1)
}
}
func Do2[I any, Arg1 any, Arg2 any, O any](
fn func(src I, arg1 Arg1, arg2 Arg2) O,
arg1 Arg1, arg2 Arg2,
) Operator[I, O] {
return func(src I) O {
return fn(src, arg1, arg2)
}
} Then we could package main
func main() {
// Must define the iter.Seq[int] to Pipe* infer the correct type
var seq iter.Seq[int] = func(yield func(i int) bool) {
for i := range 10 {
if !yield(i) {
return
}
}
}
// progress and collect
_ = xiter.Pipe3(
seq,
xiter.Do(xiter.Filter, func(x int) bool { return x%2 == 0 }),
xiter.Do(xiter.Map, func(x int) string { return fmt.Sprintf("%d", x*x) }),
slices.Collect,
)
// compose for progress each sub chunk
// then collect
_ := xiter.Pipe3(
seq,
xiter.Do(xiter.Chunk[int], 3),
xiter.Do(xiter.Map, func(values []int) int {
return xiter.Pipe(
slices.Values(values),
xiter.Do2(xiter.Reduce, 0, func(ret int, v int) int { return ret + v }),
)
}),
slices.Collect,
)
} |
@jimmyfrasche would it be possible to link the proposals you listed in that comment in the issue description as well (or at least add a link to your comment, buried behind "Load more")? This is an enormous thread so most people probably won't see your quite helpful summary. The remarks about potentially dropping/renaming Zip and adding Map12/Map21 also seem useful to mention. This issue is quite discoverable through Google searches, not as much as other proposals :) I'm not sure if there's a point in piling things up in this proposal by suggesting yet another function but it does support the case for addition of "classical" func Zip[V0, V1 any](iter.Seq[V0], iter.Seq[V1]) iter.Seq2[V0, V1] would be something that allows creating an iterator that repeats the given item (or maybe items like // count == -1 could mean an infinite iterator
func Repeat[T any](x T, count int) iter.Seq[T] This would allow something like: list1 = []string{"apple", "banana", "orange"}
list2 = []string{"grapefruit", "banana", "pineapple"}
// merge into a deduplicated set
set := map[string]struct{}{} // potentially could split this into xiter.Repeat and xiter.RepeatN like in `strings` package
valueIter := xiter.Repeat(struct{}{}, -1)
maps.Insert(set, xiter.Zip(slices.Values(list1), valueIter))
maps.Insert(set, xiter.Zip(slices.Values(list2), valueIter)) This specific case is also solved by a less universal func Lift[V any](seq Seq[V]) Seq2[V, struct{}] but it's a lot more restrictive compared to func Map12(f func(V1) (K2, V2), Seq[V1]) Seq2[K2, V2] in the following way: valueFunc := func(string) (string, struct{}) { return struct{}{} }
maps.Insert(set, xiter.Map12(valueFunc, slices.Values(list1)))
maps.Insert(set, xiter.Map12(valueFunc, slices.Values(list2))) though it starts to get tricky when trying to replicate |
By the way, we can easily chain iterators like this: type Stream[V any] iter.Seq[V]
func FromSeq[V any](seq iter.Seq[V]) Stream[V] {
return Stream[V](seq)
}
func (s Stream[V]) Iterator() iter.Seq[V] {
return iter.Seq[V](s)
}
func (s Stream[V]) Filter(predicate func(V) bool) Stream[V] {
return FromSeq(xiter.Filter[V](s.Iterator(), predicate))
}
func (s Stream[V]) Map(mapper func(V) V) Stream[V] {
return FromSeq(xiter.Map[V](s.Iterator(), mapper))
} st := stream.FromSeq(seq).
Filter(func(i int) bool { return i != 2 }).
Map(func(i int) int { return i * 2 })
for s := range st {
fmt.Println(s)
} I created a draft of a package to explore this more, if you are interested :) |
@dzherb Note that |
I think it is probably time to wind down this proposal. Well-past time actually. It was a good discussion, but I don't think we have consensus on really adding any of these to the standard library and encourage what in many cases is overabstraction (even though in some cases it is not). Most importantly, if we're not going to do xiter, we should stop talking about it and create more space for third-party packages, instead of standing in their way. We licked the cookie, but the cookie is now quite stale. Better to toss it and let other people bake some new ones. Or something like that. |
Without a heterogeneous collection xiter is going nowhere. I am not sure about third-party, because I does not seem worth it to import one for an iterator adapter.
Isn't these already solved by making iterator adapters verbose to write. Coroutines and iterator enable reusable algorithms, both are better options than repeating for loops. |
@rsc, the problem with this proposal is not the idea itself. Most people in here seem to like the idea of having iterator adapters in the standard library. The problem is that without generic methods and/or a pipe operator and shorter, easier-to-write anonymous functions, they are ergonomically unsound. I have an iterator adapter library of my own and while I have used it, I sometimes don't purely because the syntax is so awkward to write, not because it wouldn't simplify logic flow. These problems make evaluation of this proposal skewed because the biggest issues are with syntax and ergonomics, not utility or functionality. |
@DeedleFake We have to evaluate library proposals in the context of the language that we have. That's not skewed. If the language ever changes, it seems easy enough to revisit this. But given that those changes are not really on the horizon, it makes no sense to add things to the standard library that aren't useful without them. |
I am basically in favor of withdrawing this proposal because in practice while I have used iterators for real code, I haven't needed a swiss army set of iterator utilities. Instead I generally just write the ones I need at the time. That said, it does feel a little incomplete not to have map/filter/reduce in the standard library. |
That's been my experience too. Perhaps at some point we can scan the corpus for the most commonly implemented helpers and consider adding them to |
You're asking for type inference and the omission of the return keyword in closures, both syntactic sugar since these don't add new language features. I consider it bad practice to chain more than 2 iterator adapters, especially for those with functions as parameters.
A lot of iterator adapters is what LLVM based languages can afford but unfortunately not go without sacrificing performance.
I don't think this requires research, either iterator adapters are added or not. |
I think that even if we don't include the rest, FWIW, I'd be fine with a "this is less performant than an equivalent loop" warning on any iterator adapter. |
I think it's probably true that I've not been using In those cases, I have a function that normally returns a more interesting sequence but has some situations where it returns a fixed result of either an empty sequence or a sequence containing just one placeholder value, primarily just to satisfy the function signature in a way that avoids the caller needing to make any special accommodation, similar to returning nil slices or maps so that a caller will just naturally take no action on the result. The only example from this proposal that I've seen arise more than once is "filter", but even that is marginal since we've typically been able to do the filtering either at the source or the destination, rather than as an intermediate step. Our main focus has been to avoid materializing temporary slices just to conveniently iterate over them, and not to significantly change approach otherwise. Of course I can't and don't intend to speak for anyone else's experience here, but I'd generally agree that it's not clear yet just how extensive the standard library needs to be in this area, or whether future language changes would want these to be designed in a different way, so I might suggest that this larger proposal be closed and instead focus on smaller proposals like #68947 that each solve a specific set of well-understood problems. (While hopefully keeping in mind all of the general API design ideas already worked through above so that the individual proposals can all have consistent design elements.) |
We propose to add a new package golang.org/x/exp/xiter that defines adapters on iterators. Perhaps these would one day be moved to the iter package or perhaps not. There are concerns about how these would affect idiomatic Go code. It seems worth defining them in x/exp to help that discussion along, and then we can decide whether they move anywhere else when we have more experience with them.
The package is called xiter to avoid a collision with the standard library iter (see proposal #61897). An alternative would be to have xiter define wrappers and type aliases for all the functions and types in the standard iter package, but the type aliases would depend on #46477, which is not yet implemented.
This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.
Edit, 2024-05-15: Added some missing 2s in function names, and also changed Reduce to take the function first, instead of between sum and seq.
Edit, 2024-07-17: Updated code to match the final Go 1.23 language change. Corrected various typos.
/*
Package xiter implements basic adapters for composing iterator sequences:
*/
The text was updated successfully, but these errors were encountered: