Skip to content

parquet reader: move pruning predicate creation from ParquetSource to ParquetOpener #15561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Apr 6, 2025

Conversation

adriangb
Copy link
Contributor

@adriangb adriangb commented Apr 3, 2025

Needed for #15301 and #15057.

Closes #15534

Additionally I think this will make predicate evaluation slightly more performant for files with missing columns.
There are actually 3 file schemas going around:

  1. The table schema, as returned by TableProvider, etc.
  2. The file_schema passed into FileScanConfig which is not the physical file schema, rather it's the table schema - partition columns.
  3. The physical file schema.

Currently we build predicates against (2), which means that a predicate may reference columns not found in the actual file. I believe this would result in null stats being created on the fly (some minimal work) and pointless evaluation of predicates (some more work).
I'm not sure how this stacks up with the extra work of creating the predicates multiple times, that also has a cost. But that should be easier to cache and is O(number of files) instead of O(number of row pages), so I think it should be better.

At the very least this is more correct in my mind.

@adriangb
Copy link
Contributor Author

adriangb commented Apr 3, 2025

cc @alamb

@github-actions github-actions bot added core Core DataFusion crate datasource Changes to the datasource crate labels Apr 3, 2025
}
.build();
builder = builder.set_bloom_filter_enabled(true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I'm now using the actual explain plan to check that bloom filters are used it's easiest to just enable them and check that they pruned. It's also a more realistic test in that it's what I as a user would do to check if bloom filters are working.

Comment on lines +210 to +223
let analyze_exec = Arc::new(AnalyzeExec::new(
false,
false,
// use a new ParquetSource to avoid sharing execution metrics
self.build_parquet_exec(
file_schema.clone(),
file_group.clone(),
self.build_file_source(file_schema.clone()),
),
Arc::new(Schema::new(vec![
Field::new("plan_type", DataType::Utf8, true),
Field::new("plan", DataType::Utf8, true),
])),
));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I'm refactoring RoundTrip this is that I need to create the file source twice: ParquetSource has internal Metrics so if we run the query against the same ParquetSource twice (once for the data and once for the explain analyze plan) then we end up with duplicate metrics. Cloning it doesn't help / work because the metrics themselves are Arced.

I think generally asserting against the explain plan and not a handle to the ParquetSource is more in line with how real world users use DataFusion to debug if the page index, stats, etc. are working / pruning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this sounds good to me

Comment on lines -1476 to -1477
#[tokio::test]
async fn parquet_exec_display_deterministic() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I canned this whole test because it seems the the point is to check that required_guarantees= doesn't change run to run but that is no longer part of the output, so I don't see what this test would be testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the guarantees were in a HashSet and printed to the explain plan so they could change from run to run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I guess the point is: now that they are no longer printed to the explain plan, is this test needed or can we bin it?

Comment on lines +1513 to +1514
// When both matched and pruned are 0, it means that the pruning predicate
// was not used at all.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super intuitive (I'd prefer it was null or none or the whole row_groups_matched_statistics was just not included. But I've left a comment to clarify for future readers.

@@ -498,6 +473,7 @@ impl FileSource for ParquetSource {
reorder_filters: self.reorder_filters(),
enable_page_index: self.enable_page_index(),
enable_bloom_filter: self.bloom_filter_on_read(),
enable_row_group_stats_pruning: self.table_parquet_options.global.pruning,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for consistency with the global option: before this I don't think this option was being referenced anywhere / it was broken. Maybe it's being decided in ListingTable? I feel like there's a couple options that you'd think apply to ParquetSource but only get handled via ListingTable.

Comment on lines 376 to 398
async fn load_page_index<T: AsyncFileReader>(
arrow_reader: ArrowReaderMetadata,
input: &mut T,
options: ArrowReaderOptions,
) -> Result<ArrowReaderMetadata> {
let parquet_metadata = arrow_reader.metadata();
let missing_column_index = parquet_metadata.column_index().is_none();
let missing_offset_index = parquet_metadata.offset_index().is_none();
if missing_column_index || missing_offset_index {
let m = Arc::try_unwrap(Arc::clone(&parquet_metadata))
.unwrap_or_else(|e| e.as_ref().clone());
let mut reader =
ParquetMetaDataReader::new_with_metadata(m).with_page_indexes(true);
reader.load_page_index(input).await?;
let new_parquet_metadata = reader.finish()?;
let new_arrow_reader =
ArrowReaderMetadata::try_new(Arc::new(new_parquet_metadata), options)?;
Ok(new_arrow_reader)
} else {
// No page index, return the original metadata
Ok(arrow_reader)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this should be in arrow alongside load_async, but putting it here for now

Comment on lines 382 to 384
let missing_column_index = parquet_metadata.column_index().is_none();
let missing_offset_index = parquet_metadata.offset_index().is_none();
if missing_column_index || missing_offset_index {
Copy link
Contributor Author

@adriangb adriangb Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different than the original logic: the original logic was if missing_column_index && missing_offset_index, which is probably fine in practice (I think they always get loaded at the same time) in theory you need to re-load if either is missing.

Comment on lines +130 to +133
// Note about schemas: we are actually dealing with **3 different schemas** here:
// - The table schema as defined by the TableProvider. This is what the user sees, what they get when they `SELECT * FROM table`, etc.
// - The "virtual" file schema: this is the table schema minus any hive partition columns and projections. This is what the file schema is coerced to.
// - The physical file schema: this is the schema as defined by the parquet file. This is what the parquet file actually contains.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually changing anything here, just documenting the current status quo because it's somewhat confusing.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Apr 3, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @adriangb -- I think this PR makes a lot of sense to me

I do think it would be good to restore the explain plan display of pruning predicates

Also, I want to run some performance benchmarks on this branch to ensure there isn't a slowdown. Otherwise looks great to me

Comment on lines +210 to +223
let analyze_exec = Arc::new(AnalyzeExec::new(
false,
false,
// use a new ParquetSource to avoid sharing execution metrics
self.build_parquet_exec(
file_schema.clone(),
file_group.clone(),
self.build_file_source(file_schema.clone()),
),
Arc::new(Schema::new(vec![
Field::new("plan_type", DataType::Utf8, true),
Field::new("plan", DataType::Utf8, true),
])),
));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this sounds good to me

Comment on lines -1476 to -1477
#[tokio::test]
async fn parquet_exec_display_deterministic() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the guarantees were in a HashSet and printed to the explain plan so they could change from run to run

{
// (bloom filters use `pruning_predicate` too).
// Because filter pushdown may happen dynamically as long as there is a predicate
// if we have *any* predicate applied, we can't guarantee the statistics are exact.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the correct check and makes sense to me

@@ -625,7 +625,7 @@ physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: column1@0 LIKE f%
03)----RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%, pruning_predicate=column1_null_count@2 != row_count@3 AND column1_min@0 <= g AND f <= column1_max@1, required_guarantees=[]
04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is basically my only concern with the entire PR -- that the explain plans now do not show pruning predicates / literal guarantees anymore.

This will make it harder to understand when optimizations are happening or not (because a particular predicate can't be converted for example)

Would it be possible to add this part of the explain back in (by trying to create a pruning predicate for only the explain plan, when needed)?

I realize that approach has the downside that the pruning predicate used for each file may be different than what is shown in the explain plan

Copy link
Contributor Author

@adriangb adriangb Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think we can do that. I feared that it would be more confusing because the pruning predicate you see is not what you get in the end...

Is there any way we can inject this information at runtime? Metrics already kind of do that. It'd be nice to record the per-file pruning predicates, per file schema mappings and per-file filters once those exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted this part. Do you think maybe we can rename pruning_predicate= to table_schema_pruning_predicate=, estimated_pruning_predicate= something to give some indication that it may not match what actually happens/happened at runtime?

@github-actions github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Apr 3, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @adriangb -- I am going to run the benchmarks on this PR to make sure we didn't mess anything up, but of that goes well i think this is ready to merge

// Didn't we explicitly *not* load it above?
// Well it's possible that a custom implementation of `AsyncFileReader` gives you
// the page index even if you didn't ask for it (e.g. because it's cached)
// so it's important to check that here to avoid extra work.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Comparing move-predicate to 28451b5 using Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Crossing my fingers that the overhead of building the predicates is not an issue. If it is... we could always create it once upfront and only re-create it if the predicate or schema changes. But I'm hoping that's not necessary.

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖: Benchmark completed

Details

Comparing HEAD and move-predicate
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │  2023.60ms │      2194.77ms │ 1.08x slower │
│ QQuery 1     │   743.75ms │      1055.10ms │ 1.42x slower │
│ QQuery 2     │  1516.95ms │      1998.04ms │ 1.32x slower │
│ QQuery 3     │   718.48ms │       704.61ms │    no change │
│ QQuery 4     │  1519.47ms │      1502.43ms │    no change │
│ QQuery 5     │ 17099.54ms │     16986.59ms │    no change │
│ QQuery 6     │  7014.18ms │      8055.53ms │ 1.15x slower │
└──────────────┴────────────┴────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 30635.96ms │
│ Total Time (move-predicate)   │ 32497.07ms │
│ Average Time (HEAD)           │  4376.57ms │
│ Average Time (move-predicate) │  4642.44ms │
│ Queries Faster                │          0 │
│ Queries Slower                │          4 │
│ Queries with No Change        │          3 │
└───────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.86ms │         1.96ms │  1.06x slower │
│ QQuery 1     │    35.59ms │        36.48ms │     no change │
│ QQuery 2     │    95.71ms │        91.77ms │     no change │
│ QQuery 3     │   101.49ms │       100.10ms │     no change │
│ QQuery 4     │   770.54ms │       770.11ms │     no change │
│ QQuery 5     │   880.81ms │      1082.81ms │  1.23x slower │
│ QQuery 6     │    32.08ms │        32.84ms │     no change │
│ QQuery 7     │    39.15ms │        40.89ms │     no change │
│ QQuery 8     │   962.18ms │       938.02ms │     no change │
│ QQuery 9     │  1235.92ms │      1241.56ms │     no change │
│ QQuery 10    │   281.92ms │       397.08ms │  1.41x slower │
│ QQuery 11    │   316.87ms │       425.90ms │  1.34x slower │
│ QQuery 12    │   951.88ms │      1176.17ms │  1.24x slower │
│ QQuery 13    │  1409.13ms │      1469.91ms │     no change │
│ QQuery 14    │   894.32ms │      1078.45ms │  1.21x slower │
│ QQuery 15    │  1074.48ms │      1084.12ms │     no change │
│ QQuery 16    │  1783.13ms │      2013.95ms │  1.13x slower │
│ QQuery 17    │  1659.72ms │      1861.72ms │  1.12x slower │
│ QQuery 18    │  3146.33ms │      3358.44ms │  1.07x slower │
│ QQuery 19    │    85.54ms │        88.95ms │     no change │
│ QQuery 20    │  1150.49ms │      1677.14ms │  1.46x slower │
│ QQuery 21    │  1393.94ms │      2094.43ms │  1.50x slower │
│ QQuery 22    │  2537.09ms │      4947.08ms │  1.95x slower │
│ QQuery 23    │  8843.92ms │     14846.84ms │  1.68x slower │
│ QQuery 24    │   488.32ms │       706.20ms │  1.45x slower │
│ QQuery 25    │   396.62ms │       609.53ms │  1.54x slower │
│ QQuery 26    │   550.11ms │       764.86ms │  1.39x slower │
│ QQuery 27    │  1715.82ms │      2180.49ms │  1.27x slower │
│ QQuery 28    │ 13288.62ms │     14036.38ms │  1.06x slower │
│ QQuery 29    │   545.67ms │       552.97ms │     no change │
│ QQuery 30    │   867.38ms │      1093.24ms │  1.26x slower │
│ QQuery 31    │   927.45ms │      1175.03ms │  1.27x slower │
│ QQuery 32    │  2759.69ms │      2823.48ms │     no change │
│ QQuery 33    │  3423.58ms │      4000.32ms │  1.17x slower │
│ QQuery 34    │  3448.65ms │      3971.63ms │  1.15x slower │
│ QQuery 35    │  1332.25ms │      1342.07ms │     no change │
│ QQuery 36    │   227.00ms │       257.92ms │  1.14x slower │
│ QQuery 37    │    89.59ms │       168.56ms │  1.88x slower │
│ QQuery 38    │   126.88ms │       161.53ms │  1.27x slower │
│ QQuery 39    │   428.24ms │       495.35ms │  1.16x slower │
│ QQuery 40    │    53.92ms │        50.93ms │ +1.06x faster │
│ QQuery 41    │    45.57ms │        47.94ms │  1.05x slower │
│ QQuery 42    │    56.85ms │        55.60ms │     no change │
└──────────────┴────────────┴────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 60456.30ms │
│ Total Time (move-predicate)   │ 75350.74ms │
│ Average Time (HEAD)           │  1405.96ms │
│ Average Time (move-predicate) │  1752.34ms │
│ Queries Faster                │          1 │
│ Queries Slower                │         27 │
│ Queries with No Change        │         15 │
└───────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 122.74ms │       124.82ms │     no change │
│ QQuery 2     │  24.28ms │        25.22ms │     no change │
│ QQuery 3     │  36.87ms │        38.19ms │     no change │
│ QQuery 4     │  21.18ms │        33.41ms │  1.58x slower │
│ QQuery 5     │  56.92ms │        58.97ms │     no change │
│ QQuery 6     │   8.09ms │         8.44ms │     no change │
│ QQuery 7     │ 108.40ms │       109.97ms │     no change │
│ QQuery 8     │  26.60ms │        27.36ms │     no change │
│ QQuery 9     │  62.58ms │        65.97ms │  1.05x slower │
│ QQuery 10    │  60.54ms │        61.64ms │     no change │
│ QQuery 11    │  12.95ms │        13.29ms │     no change │
│ QQuery 12    │  37.81ms │        34.19ms │ +1.11x faster │
│ QQuery 13    │  29.83ms │        30.74ms │     no change │
│ QQuery 14    │  10.25ms │        12.63ms │  1.23x slower │
│ QQuery 15    │  26.44ms │        25.89ms │     no change │
│ QQuery 16    │  24.77ms │        24.63ms │     no change │
│ QQuery 17    │  96.68ms │        99.84ms │     no change │
│ QQuery 18    │ 250.66ms │       250.73ms │     no change │
│ QQuery 19    │  29.46ms │        32.68ms │  1.11x slower │
│ QQuery 20    │  42.26ms │        40.72ms │     no change │
│ QQuery 21    │ 176.75ms │       179.41ms │     no change │
│ QQuery 22    │  17.38ms │        18.33ms │  1.05x slower │
└──────────────┴──────────┴────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary             ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 1283.43ms │
│ Total Time (move-predicate)   │ 1317.07ms │
│ Average Time (HEAD)           │   58.34ms │
│ Average Time (move-predicate) │   59.87ms │
│ Queries Faster                │         1 │
│ Queries Slower                │         5 │
│ Queries with No Change        │        16 │
└───────────────────────────────┴───────────┘

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

Crossing my fingers that the overhead of building the predicates is not an issue. If it is... we could always create it once upfront and only re-create it if the predicate or schema changes. But I'm hoping that's not necessary.

Hmm, the benchmark results suggest something else is going on -- I will try and reproduce them

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Comparing move-predicate to 28451b5 using Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

That's weird. I would not have been too surprised by a 1.01x slowdown or something. 1.4x seems really high unless (1) building pruning predicates was a significant portion of query time to begin with or (2) I did something wrong with the IO and it's now making 2 FS calls instead of 1 (even then was reading metadata that slow???)

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

I can reproduce the slowdown locally using this command

datafusion-cli  -c "SELECT * FROM hits WHERE \"URL\" LIKE '%google%' ORDER BY \"EventTime\" LIMIT 10;"
branch time
main Elapsed 3.533 seconds.
this pr Elapsed 5.076 seconds.

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Huh okay maybe generating the pruning predicates is just that slow. Is that surprising to you? I'll investigate a bit later.

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

I bet it is something silly -- I am doing some profiling now

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

I am poking at this

Finally off my calls. Me as well. Feel free to ping me for a call if you think I can be of any help.

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

I think I see the issue -- I am just verifying now

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

Here is a fix that restores the performance on my local tests:

(thanks to @zhuqi-lucas for spotting the problem)

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Comparing move-predicate (34993f2) to 28451b5 diff
Benchmarks: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Seems like the script died 😢

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖: Benchmark completed

Details

Comparing HEAD and move-predicate
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2002.87ms │      2151.55ms │  1.07x slower │
│ QQuery 1     │   753.29ms │       718.90ms │     no change │
│ QQuery 2     │  1532.55ms │      1431.18ms │ +1.07x faster │
│ QQuery 3     │   726.98ms │       720.73ms │     no change │
│ QQuery 4     │  1500.91ms │      1489.97ms │     no change │
│ QQuery 5     │ 16788.03ms │     17125.12ms │     no change │
│ QQuery 6     │  7019.99ms │      6945.01ms │     no change │
└──────────────┴────────────┴────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 30324.62ms │
│ Total Time (move-predicate)   │ 30582.45ms │
│ Average Time (HEAD)           │  4332.09ms │
│ Average Time (move-predicate) │  4368.92ms │
│ Queries Faster                │          1 │
│ Queries Slower                │          1 │
│ Queries with No Change        │          5 │
└───────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.85ms │         1.94ms │ 1.05x slower │
│ QQuery 1     │    35.08ms │        35.44ms │    no change │
│ QQuery 2     │    92.50ms │        94.47ms │    no change │
│ QQuery 3     │   106.20ms │       103.92ms │    no change │
│ QQuery 4     │   776.32ms │       773.03ms │    no change │
│ QQuery 5     │   882.36ms │       884.71ms │    no change │
│ QQuery 6     │    32.35ms │        31.67ms │    no change │
│ QQuery 7     │    40.31ms │        41.66ms │    no change │
│ QQuery 8     │   953.96ms │       943.64ms │    no change │
│ QQuery 9     │  1220.34ms │      1255.50ms │    no change │
│ QQuery 10    │   278.23ms │       279.19ms │    no change │
│ QQuery 11    │   317.71ms │       317.78ms │    no change │
│ QQuery 12    │   947.20ms │       949.88ms │    no change │
│ QQuery 13    │  1412.46ms │      1386.91ms │    no change │
│ QQuery 14    │   884.79ms │       883.66ms │    no change │
│ QQuery 15    │  1072.87ms │      1070.09ms │    no change │
│ QQuery 16    │  1816.87ms │      1803.56ms │    no change │
│ QQuery 17    │  1677.57ms │      1651.50ms │    no change │
│ QQuery 18    │  3184.96ms │      3173.13ms │    no change │
│ QQuery 19    │    88.89ms │        90.80ms │    no change │
│ QQuery 20    │  1158.71ms │      1166.17ms │    no change │
│ QQuery 21    │  1395.41ms │      1373.28ms │    no change │
│ QQuery 22    │  2537.23ms │      2485.89ms │    no change │
│ QQuery 23    │  8875.63ms │      8846.75ms │    no change │
│ QQuery 24    │   483.51ms │       480.44ms │    no change │
│ QQuery 25    │   397.92ms │       410.63ms │    no change │
│ QQuery 26    │   546.75ms │       558.85ms │    no change │
│ QQuery 27    │  1712.24ms │      1711.04ms │    no change │
│ QQuery 28    │ 13209.97ms │     12888.64ms │    no change │
│ QQuery 29    │   556.10ms │       531.86ms │    no change │
│ QQuery 30    │   862.45ms │       872.27ms │    no change │
│ QQuery 31    │   897.25ms │       907.89ms │    no change │
│ QQuery 32    │  2778.13ms │      2832.75ms │    no change │
│ QQuery 33    │  3438.39ms │      3477.69ms │    no change │
│ QQuery 34    │  3484.82ms │      3505.94ms │    no change │
│ QQuery 35    │  1339.99ms │      1333.17ms │    no change │
│ QQuery 36    │   229.56ms │       226.12ms │    no change │
│ QQuery 37    │    88.03ms │        89.69ms │    no change │
│ QQuery 38    │   129.53ms │       135.48ms │    no change │
│ QQuery 39    │   413.47ms │       426.51ms │    no change │
│ QQuery 40    │    50.59ms │        53.67ms │ 1.06x slower │
│ QQuery 41    │    46.50ms │        47.05ms │    no change │
│ QQuery 42    │    56.25ms │        57.23ms │    no change │
└──────────────┴────────────┴────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 60511.22ms │
│ Total Time (move-predicate)   │ 60191.50ms │
│ Average Time (HEAD)           │  1407.24ms │
│ Average Time (move-predicate) │  1399.80ms │
│ Queries Faster                │          0 │
│ Queries Slower                │          2 │
│ Queries with No Change        │         41 │
└───────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 122.44ms │       123.01ms │     no change │
│ QQuery 2     │  24.71ms │        24.33ms │     no change │
│ QQuery 3     │  35.46ms │        37.17ms │     no change │
│ QQuery 4     │  21.07ms │        21.63ms │     no change │
│ QQuery 5     │  57.25ms │        57.13ms │     no change │
│ QQuery 6     │   8.18ms │         8.09ms │     no change │
│ QQuery 7     │ 105.18ms │       102.08ms │     no change │
│ QQuery 8     │  27.72ms │        27.33ms │     no change │
│ QQuery 9     │  65.07ms │        62.34ms │     no change │
│ QQuery 10    │  60.31ms │        59.39ms │     no change │
│ QQuery 11    │  13.22ms │        13.10ms │     no change │
│ QQuery 12    │  38.69ms │        38.97ms │     no change │
│ QQuery 13    │  29.18ms │        30.41ms │     no change │
│ QQuery 14    │  10.03ms │         9.85ms │     no change │
│ QQuery 15    │  26.66ms │        25.24ms │ +1.06x faster │
│ QQuery 16    │  24.30ms │        23.56ms │     no change │
│ QQuery 17    │  97.96ms │        99.51ms │     no change │
│ QQuery 18    │ 248.64ms │       274.74ms │  1.10x slower │
│ QQuery 19    │  29.09ms │        29.40ms │     no change │
│ QQuery 20    │  40.45ms │        41.23ms │     no change │
│ QQuery 21    │ 178.10ms │       176.90ms │     no change │
│ QQuery 22    │  18.10ms │        17.01ms │ +1.06x faster │
└──────────────┴──────────┴────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary             ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 1281.80ms │
│ Total Time (move-predicate)   │ 1302.44ms │
│ Average Time (HEAD)           │   58.26ms │
│ Average Time (move-predicate) │   59.20ms │
│ Queries Faster                │         2 │
│ Queries Slower                │         1 │
│ Queries with No Change        │        19 │
└───────────────────────────────┴───────────┘

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Nice!

@alamb are you worried about the +1.06x changes or do you think that's noise?

I am happy to implement some basic caching here (compute a predicate in ParquetSource, only re-create it if the schemas or predicates don't match), but I want to make sure we want that added complexity before I do it.

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

@alamb are you worried about the +1.06x changes or do you think that's noise?

I think it is noise. I'll run clickbench_1 too just to be sure

I am happy to implement some basic caching here (compute a predicate in ParquetSource, only re-create it if the schemas or predicates don't match), but I want to make sure we want that added complexity before I do it.

This would be great -- maybe you can file a ticket to track it

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Looking at the numbers I think it's noise, e.g. the biggest slowdown in the most recent bench is:

--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 18    │ 248.64ms │       274.74ms │  1.10x slower │
└──────────────┴──────────┴────────────────┴───────────────┘

But in the previous run, which included the bug causing other very large slowdowns:

--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     HEAD ┃ move-predicate ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 18    │ 252.59ms │       245.94ms │     no change │
└──────────────┴──────────┴────────────────┴───────────────┘

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

This would be great -- maybe you can file a ticket to track it

I mean my point is: if we think the impact on performance is minimal / not measurable in a real world scenario, is it worth having the added code / complexity to do this sort of caching?

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

A caching implementation also adds overhead, if we can't measure it how do we know the caching version is not slower, etc.

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

I mean my point is: if we think the impact on performance is minimal / not measurable in a real world scenario, is it worth having the added code / complexity to do this sort of caching?

If the performance impact is not measurable then no we shouldn't add the code in my opinion

@adriangb
Copy link
Contributor Author

adriangb commented Apr 4, 2025

Agreed then let's just proceed as is. Thanks for your help with this; sorry I was a bit MIA this morning!

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.8.0-1016-gcp #18-Ubuntu SMP Fri Oct 4 22:16:29 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Comparing move-predicate (34993f2) to 28451b5 diff
Benchmarks: clickbench_partitioned clickbench_1
Results will be posted here when complete

@alamb
Copy link
Contributor

alamb commented Apr 4, 2025

🤖: Benchmark completed

Details

Comparing HEAD and move-predicate
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.57ms │         0.57ms │    no change │
│ QQuery 1     │    69.30ms │        77.50ms │ 1.12x slower │
│ QQuery 2     │   123.68ms │       123.27ms │    no change │
│ QQuery 3     │   129.49ms │       132.03ms │    no change │
│ QQuery 4     │   789.11ms │       779.46ms │    no change │
│ QQuery 5     │  1003.83ms │       997.61ms │    no change │
│ QQuery 6     │    67.28ms │        67.83ms │    no change │
│ QQuery 7     │    83.28ms │        96.23ms │ 1.16x slower │
│ QQuery 8     │   959.99ms │       979.21ms │    no change │
│ QQuery 9     │  1282.53ms │      1227.88ms │    no change │
│ QQuery 10    │   319.12ms │       312.95ms │    no change │
│ QQuery 11    │   344.28ms │       355.88ms │    no change │
│ QQuery 12    │  1037.59ms │      1065.96ms │    no change │
│ QQuery 13    │  1503.82ms │      1485.05ms │    no change │
│ QQuery 14    │   999.28ms │      1010.39ms │    no change │
│ QQuery 15    │  1121.93ms │      1119.16ms │    no change │
│ QQuery 16    │  1910.48ms │      1946.54ms │    no change │
│ QQuery 17    │  1751.38ms │      1746.21ms │    no change │
│ QQuery 18    │  3422.39ms │      3411.35ms │    no change │
│ QQuery 19    │   122.19ms │       131.84ms │ 1.08x slower │
│ QQuery 20    │  1316.48ms │      1318.55ms │    no change │
│ QQuery 21    │  1646.11ms │      1674.41ms │    no change │
│ QQuery 22    │  4132.96ms │      4111.39ms │    no change │
│ QQuery 23    │ 10198.84ms │     10131.07ms │    no change │
│ QQuery 24    │   636.01ms │       655.67ms │    no change │
│ QQuery 25    │   559.46ms │       582.87ms │    no change │
│ QQuery 26    │   709.63ms │       717.70ms │    no change │
│ QQuery 27    │  1947.19ms │      1963.65ms │    no change │
│ QQuery 28    │ 13932.39ms │     13445.05ms │    no change │
│ QQuery 29    │   580.07ms │       571.48ms │    no change │
│ QQuery 30    │  1033.80ms │      1027.96ms │    no change │
│ QQuery 31    │  1052.67ms │      1085.77ms │    no change │
│ QQuery 32    │  2802.96ms │      2832.94ms │    no change │
│ QQuery 33    │  3616.14ms │      3617.88ms │    no change │
│ QQuery 34    │  3583.77ms │      3615.75ms │    no change │
│ QQuery 35    │  1377.25ms │      1345.07ms │    no change │
│ QQuery 36    │   284.71ms │       279.33ms │    no change │
│ QQuery 37    │   176.81ms │       186.18ms │ 1.05x slower │
│ QQuery 38    │   176.79ms │       180.30ms │    no change │
│ QQuery 39    │   479.56ms │       491.66ms │    no change │
│ QQuery 40    │    86.30ms │        91.40ms │ 1.06x slower │
│ QQuery 41    │    79.70ms │        88.57ms │ 1.11x slower │
│ QQuery 42    │    86.43ms │        97.27ms │ 1.13x slower │
└──────────────┴────────────┴────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 67537.52ms │
│ Total Time (move-predicate)   │ 67178.85ms │
│ Average Time (HEAD)           │  1570.64ms │
│ Average Time (move-predicate) │  1562.30ms │
│ Queries Faster                │          0 │
│ Queries Slower                │          7 │
│ Queries with No Change        │         36 │
└───────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query        ┃       HEAD ┃ move-predicate ┃       Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0     │     1.87ms │         1.98ms │ 1.06x slower │
│ QQuery 1     │    34.65ms │        35.88ms │    no change │
│ QQuery 2     │    94.32ms │        93.54ms │    no change │
│ QQuery 3     │   102.82ms │       101.02ms │    no change │
│ QQuery 4     │   726.35ms │       736.18ms │    no change │
│ QQuery 5     │   879.31ms │       851.96ms │    no change │
│ QQuery 6     │    32.99ms │        32.31ms │    no change │
│ QQuery 7     │    40.27ms │        41.75ms │    no change │
│ QQuery 8     │   946.62ms │       941.03ms │    no change │
│ QQuery 9     │  1242.07ms │      1220.36ms │    no change │
│ QQuery 10    │   280.58ms │       275.37ms │    no change │
│ QQuery 11    │   317.57ms │       318.06ms │    no change │
│ QQuery 12    │   971.41ms │       952.06ms │    no change │
│ QQuery 13    │  1391.78ms │      1432.93ms │    no change │
│ QQuery 14    │   856.07ms │       888.45ms │    no change │
│ QQuery 15    │  1053.52ms │      1071.47ms │    no change │
│ QQuery 16    │  1830.79ms │      1772.41ms │    no change │
│ QQuery 17    │  1672.69ms │      1629.35ms │    no change │
│ QQuery 18    │  3303.49ms │      3177.30ms │    no change │
│ QQuery 19    │    89.20ms │        96.51ms │ 1.08x slower │
│ QQuery 20    │  1170.70ms │      1156.21ms │    no change │
│ QQuery 21    │  1402.88ms │      1379.77ms │    no change │
│ QQuery 22    │  2492.16ms │      2544.69ms │    no change │
│ QQuery 23    │  8800.13ms │      8683.87ms │    no change │
│ QQuery 24    │   501.96ms │       484.56ms │    no change │
│ QQuery 25    │   402.33ms │       402.05ms │    no change │
│ QQuery 26    │   552.56ms │       548.03ms │    no change │
│ QQuery 27    │  1726.17ms │      1692.30ms │    no change │
│ QQuery 28    │ 13125.21ms │     12607.53ms │    no change │
│ QQuery 29    │   531.46ms │       539.16ms │    no change │
│ QQuery 30    │   865.41ms │       840.98ms │    no change │
│ QQuery 31    │   896.05ms │       901.11ms │    no change │
│ QQuery 32    │  2764.85ms │      2826.69ms │    no change │
│ QQuery 33    │  3459.25ms │      3449.30ms │    no change │
│ QQuery 34    │  3450.70ms │      3432.15ms │    no change │
│ QQuery 35    │  1307.10ms │      1303.63ms │    no change │
│ QQuery 36    │   228.34ms │       235.58ms │    no change │
│ QQuery 37    │    91.39ms │        93.59ms │    no change │
│ QQuery 38    │   129.47ms │       130.20ms │    no change │
│ QQuery 39    │   418.50ms │       420.71ms │    no change │
│ QQuery 40    │    49.36ms │        52.13ms │ 1.06x slower │
│ QQuery 41    │    44.31ms │        47.14ms │ 1.06x slower │
│ QQuery 42    │    59.01ms │        56.18ms │    no change │
└──────────────┴────────────┴────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary             ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)             │ 60337.65ms │
│ Total Time (move-predicate)   │ 59497.48ms │
│ Average Time (HEAD)           │  1403.20ms │
│ Average Time (move-predicate) │  1383.66ms │
│ Queries Faster                │          0 │
│ Queries Slower                │          4 │
│ Queries with No Change        │         39 │
└───────────────────────────────┴────────────┘

Copy link
Contributor

@zhuqi-lucas zhuqi-lucas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks!

@alamb
Copy link
Contributor

alamb commented Apr 5, 2025

I plan to merge this tomorrow unless other people would like a chance to commetn

@alamb alamb merged commit dccf377 into apache:main Apr 6, 2025
27 checks passed
@alamb
Copy link
Contributor

alamb commented Apr 6, 2025

Onwards towards topk pushdown

@alamb alamb added the api change Changes the API exposed to users of the crate label Apr 11, 2025
@alamb
Copy link
Contributor

alamb commented Apr 11, 2025

This definitely is an API change -- I hit it in the delta-rs upgrade:

I'll make a note to add it to the upgrade guide

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate datasource Changes to the datasource crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove ParquetSource::pruning_predicate
3 participants