Add multithreading for singular files #468
Nice idea that is on my wish list of things to consider, but in practice it is not as simple to implement in a generic grep tool. It would work fine for options like `-c` (count matches) and `-l` (list matching files). If we restrict to these use cases, threaded search on a single file may be faster if we can generally assume that file IO is not the bottleneck. But often it is the bottleneck, since large files aren't cached in memory when they are searched for the first time, or when a file is several GB and won't fit in "spare" memory for caching. Furthermore, it's not going to speed up recursive searching, which uses worker thread pools that already saturate the CPU cores (with some limits, because saturating them all is not ideal for performance when other OS threads are busy). Therefore, it is nice to have, but there are caveats. IMHO a dedicated new utility to only count matches or find a matching file is more appropriate.
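To make the restriction concrete, here is a minimal C++ sketch of the counting-only case (the equivalent of `-c`): the file is divided into byte ranges snapped forward to newline boundaries, each range is scanned by its own thread, and the per-thread counts are summed. It uses `std::regex` rather than ugrep's matcher, and the chunking scheme and `count_range` helper are hypothetical illustrations, not ugrep internals; options that must print matches in order, or patterns that can span lines, would need far more machinery.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
#include <thread>
#include <vector>

// Count regex matches in [begin, end) of the file, where both offsets are
// already snapped to line boundaries so no line straddles two chunks.
static std::size_t count_range(const std::string& path, const std::regex& re,
                               std::streamoff begin, std::streamoff end) {
  std::ifstream in(path, std::ios::binary);
  in.seekg(begin);
  std::size_t count = 0;
  std::string line;
  while (in.tellg() >= 0 && in.tellg() < end && std::getline(in, line))
    if (std::regex_search(line, re))
      ++count;
  return count;
}

int main(int argc, char** argv) {
  if (argc != 3) {
    std::cerr << "usage: " << argv[0] << " PATTERN FILE\n";
    return 2;
  }
  const std::regex re(argv[1]);
  const std::string path = argv[2];

  std::ifstream in(path, std::ios::binary | std::ios::ate);
  const std::streamoff size = in.tellg();
  const unsigned n = std::max(1u, std::thread::hardware_concurrency());

  // Nominal chunk edges, each snapped forward to the next newline so that
  // every line belongs to exactly one chunk.
  std::vector<std::streamoff> edge{0};
  for (unsigned i = 1; i < n; ++i) {
    in.clear();
    in.seekg(size * i / n);
    std::string skip;
    std::getline(in, skip);  // discard the partial line at the chunk edge
    edge.push_back(in ? std::streamoff(in.tellg()) : size);
  }
  edge.push_back(size);

  std::vector<std::size_t> counts(edge.size() - 1);
  std::vector<std::thread> pool;
  for (std::size_t i = 0; i + 1 < edge.size(); ++i)
    pool.emplace_back(
        [&, i] { counts[i] = count_range(path, re, edge[i], edge[i + 1]); });
  for (auto& t : pool) t.join();

  std::size_t total = 0;
  for (std::size_t c : counts) total += c;
  std::cout << total << "\n";
}
```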
Currently ugrep (and, I believe, all other grep-inspired software) can only dedicate one thread to each file, which can be a major performance bottleneck when searching a single large file. One logical core simply cannot keep up with the 7 to 14 GB/s read speeds of today's consumer SSDs, leading to unused bandwidth and wasted time.
The obvious workaround is to split the file into smaller chunks so that ugrep can process them in parallel. That works, but it takes extra time, writing the chunks puts wear on the SSD, and it is an unnecessary extra step overall.
My suggestion is to add a mode of multithreading to ugrep in which multiple worker threads search the same file at once.
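As a rough sketch of what that could look like (assuming POSIX I/O and, for simplicity, a fixed-string pattern rather than a full regex): each worker thread reads its own byte range of the same file with `pread()`, so no temporary chunk files are ever written. Each range is extended by an overlap of the pattern length minus one so a match straddling a chunk edge is still seen, and a thread only counts matches that start inside its own range to avoid double counting. The `count_fixed` helper and the chunking scheme are illustrative, not anything ugrep implements.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <string_view>
#include <thread>
#include <vector>

// Count occurrences of a fixed string whose first byte lies inside
// [offset, offset + length) of the file behind fd. pread() lets all
// threads share one descriptor without a shared file position.
static std::size_t count_fixed(int fd, const std::string& needle,
                               off_t offset, std::size_t length) {
  // Read the range plus needle.size() - 1 overlap bytes so matches that
  // straddle the chunk edge are visible to this thread.
  std::string buf(length + needle.size() - 1, '\0');
  const ssize_t got = pread(fd, buf.data(), buf.size(), offset);
  if (got <= 0) return 0;
  const std::string_view view(buf.data(), static_cast<std::size_t>(got));
  std::size_t count = 0, pos = 0;
  // Only count matches that *start* in-range; a match starting in the
  // overlap belongs to the next chunk and is counted there.
  while ((pos = view.find(needle, pos)) != std::string_view::npos &&
         pos < length) {
    ++count;
    ++pos;
  }
  return count;
}

int main(int argc, char** argv) {
  if (argc != 3) {
    std::cerr << "usage: " << argv[0] << " STRING FILE\n";
    return 2;
  }
  const std::string needle = argv[1];
  const int fd = open(argv[2], O_RDONLY);
  if (fd < 0 || needle.empty()) return 2;

  const off_t size = lseek(fd, 0, SEEK_END);
  const unsigned n = std::max(1u, std::thread::hardware_concurrency());
  const off_t chunk = (size + n - 1) / n;  // bytes per worker, rounded up

  std::vector<std::size_t> counts(n);
  std::vector<std::thread> pool;
  for (unsigned i = 0; i < n; ++i)
    pool.emplace_back([&, i] {
      const off_t off = static_cast<off_t>(i) * chunk;
      if (off < size)
        counts[i] = count_fixed(
            fd, needle, off,
            static_cast<std::size_t>(std::min<off_t>(chunk, size - off)));
    });
  for (auto& t : pool) t.join();

  std::size_t total = 0;
  for (std::size_t c : counts) total += c;
  std::cout << total << "\n";
  close(fd);
}
```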