parquet reader: move pruning predicate creation from ParquetSource to ParquetOpener #15561
Conversation
cc @alamb
}
.build();
builder = builder.set_bloom_filter_enabled(true);
Since I'm now using the actual explain plan to check that bloom filters are used, it's easiest to just enable them and check that they pruned. It's also a more realistic test in that it's what I, as a user, would do to check whether bloom filters are working.
let analyze_exec = Arc::new(AnalyzeExec::new(
    false,
    false,
    // use a new ParquetSource to avoid sharing execution metrics
    self.build_parquet_exec(
        file_schema.clone(),
        file_group.clone(),
        self.build_file_source(file_schema.clone()),
    ),
    Arc::new(Schema::new(vec![
        Field::new("plan_type", DataType::Utf8, true),
        Field::new("plan", DataType::Utf8, true),
    ])),
));
The reason I'm refactoring RoundTrip is that I need to create the file source twice: ParquetSource has internal Metrics, so if we run the query against the same ParquetSource twice (once for the data and once for the explain analyze plan) then we end up with duplicate metrics. Cloning it doesn't help because the metrics themselves are Arc'ed.
I think generally asserting against the explain plan, and not against a handle to the ParquetSource, is more in line with how real-world users use DataFusion to debug whether the page index, stats, etc. are working / pruning.
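As a toy illustration (not DataFusion's actual types), cloning a struct whose metrics sit behind an Arc still shares the same underlying counters, which is why a fresh source is needed for the second run:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for a source whose metrics are Arc'ed internally.
#[derive(Clone)]
struct Source {
    metrics: Arc<Mutex<u64>>,
}

fn main() {
    let original = Source {
        metrics: Arc::new(Mutex::new(0)),
    };
    let cloned = original.clone(); // clones the Arc handle, not the counter

    *cloned.metrics.lock().unwrap() += 1;

    // The "independent" clone updated the original's metrics too.
    assert_eq!(*original.metrics.lock().unwrap(), 1);
}
```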
I agree this sounds good to me
#[tokio::test]
async fn parquet_exec_display_deterministic() {
I canned this whole test because it seems the point was to check that required_guarantees= doesn't change from run to run, but that is no longer part of the output, so I don't see what this test would be testing.
I think the guarantees were in a HashSet and printed to the explain plan so they could change from run to run
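As a standalone illustration (not DataFusion code), HashSet iteration order is unspecified, so any text rendered from one can differ between runs:

```rust
use std::collections::HashSet;

fn main() {
    let guarantees: HashSet<&str> = ["a", "b", "c"].into_iter().collect();
    // Iteration order is unspecified and can change from run to run,
    // so text rendered from the set is not deterministic.
    let rendered: Vec<_> = guarantees.iter().collect();
    println!("{rendered:?}");
}
```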
Right I guess the point is: now that they are no longer printed to the explain plan, is this test needed or can we bin it?
// When both matched and pruned are 0, it means that the pruning predicate
// was not used at all.
Not super intuitive (I'd prefer it was null or none, or that the whole row_groups_matched_statistics row was just not included), but I've left a comment to clarify for future readers.
@@ -498,6 +473,7 @@ impl FileSource for ParquetSource {
    reorder_filters: self.reorder_filters(),
    enable_page_index: self.enable_page_index(),
    enable_bloom_filter: self.bloom_filter_on_read(),
    enable_row_group_stats_pruning: self.table_parquet_options.global.pruning,
This is just for consistency with the global option: before this, I don't think this option was being referenced anywhere / it was broken. Maybe it's being decided in ListingTable? I feel like there are a couple of options that you'd think apply to ParquetSource but only get handled via ListingTable.
async fn load_page_index<T: AsyncFileReader>(
    arrow_reader: ArrowReaderMetadata,
    input: &mut T,
    options: ArrowReaderOptions,
) -> Result<ArrowReaderMetadata> {
    let parquet_metadata = arrow_reader.metadata();
    let missing_column_index = parquet_metadata.column_index().is_none();
    let missing_offset_index = parquet_metadata.offset_index().is_none();
    if missing_column_index || missing_offset_index {
        // Reload the metadata with the page indexes included, reusing the
        // existing ParquetMetaData instead of re-reading the footer.
        let m = Arc::try_unwrap(Arc::clone(&parquet_metadata))
            .unwrap_or_else(|e| e.as_ref().clone());
        let mut reader =
            ParquetMetaDataReader::new_with_metadata(m).with_page_indexes(true);
        reader.load_page_index(input).await?;
        let new_parquet_metadata = reader.finish()?;
        let new_arrow_reader =
            ArrowReaderMetadata::try_new(Arc::new(new_parquet_metadata), options)?;
        Ok(new_arrow_reader)
    } else {
        // Page indexes are already present, return the original metadata
        Ok(arrow_reader)
    }
}
I feel like this should be in arrow alongside load_async, but putting it here for now.
let missing_column_index = parquet_metadata.column_index().is_none();
let missing_offset_index = parquet_metadata.offset_index().is_none();
if missing_column_index || missing_offset_index {
This is different from the original logic: the original check was if missing_column_index && missing_offset_index, which is probably fine in practice (I think they always get loaded at the same time), but in theory you need to re-load if either is missing.
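A tiny standalone example of that boundary case (hypothetical helper names, not the PR's code): if only one of the two indexes is missing, the old && check skips the reload while the new || check reloads.

```rust
// Old check: only reload when *both* indexes are missing.
fn needs_reload_and(missing_column_index: bool, missing_offset_index: bool) -> bool {
    missing_column_index && missing_offset_index
}

// New check: reload when *either* index is missing.
fn needs_reload_or(missing_column_index: bool, missing_offset_index: bool) -> bool {
    missing_column_index || missing_offset_index
}

fn main() {
    // Column index already present, offset index missing:
    assert!(!needs_reload_and(false, true)); // old logic would not reload
    assert!(needs_reload_or(false, true)); // new logic reloads
}
```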
// Note about schemas: we are actually dealing with **3 different schemas** here:
// - The table schema as defined by the TableProvider. This is what the user sees, what they get when they `SELECT * FROM table`, etc.
// - The "virtual" file schema: this is the table schema minus any hive partition columns and projections. This is what the file schema is coerced to.
// - The physical file schema: this is the schema as defined by the parquet file. This is what the parquet file actually contains.
I'm not actually changing anything here, just documenting the current status quo because it's somewhat confusing.
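For illustration, a minimal sketch of how the three schemas relate (hypothetical column names, not from the PR):

```rust
use datafusion::arrow::datatypes::{DataType, Field, Schema};

fn main() {
    // 1. Table schema as defined by the TableProvider: what `SELECT *` returns,
    //    including the hive partition column `date`.
    let table_schema = Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("value", DataType::Utf8, true),
        Field::new("date", DataType::Utf8, false), // partition column
    ]);

    // 2. "Virtual" file schema: the table schema minus the partition columns;
    //    this is what each file's schema is coerced to.
    let virtual_file_schema = Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("value", DataType::Utf8, true),
    ]);

    // 3. Physical file schema: what one particular parquet file actually
    //    contains, e.g. an older file written before `value` was added.
    let physical_file_schema = Schema::new(vec![
        Field::new("id", DataType::Int64, false),
    ]);

    assert_eq!(table_schema.fields().len(), 3);
    assert_eq!(virtual_file_schema.fields().len(), 2);
    assert_eq!(physical_file_schema.fields().len(), 1);
}
```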
Thank you @adriangb -- I think this PR makes a lot of sense to me
I do think it would be good to restore the explain plan display of pruning predicates
Also, I want to run some performance benchmarks on this branch to ensure there isn't a slowdown. Otherwise looks great to me
{
    // (bloom filters use `pruning_predicate` too).
    // Because filter pushdown may happen dynamically as long as there is a predicate
    // if we have *any* predicate applied, we can't guarantee the statistics are exact.
I think this is the correct check and makes sense to me
@@ -625,7 +625,7 @@ physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: column1@0 LIKE f%
03)----RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%, pruning_predicate=column1_null_count@2 != row_count@3 AND column1_min@0 <= g AND f <= column1_max@1, required_guarantees=[]
04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/foo.parquet]]}, projection=[column1], file_type=parquet, predicate=column1@0 LIKE f%
this is basically my only concern with the entire PR -- that the explain plans now do not show pruning predicates / literal guarantees anymore.
This will make it harder to understand when optimizations are happening or not (because a particular predicate can't be converted for example)
Would it be possible to add this part of the explain back in (by trying to create a pruning predicate for only the explain plan, when needed)?
I realize that approach has the downside that the pruning predicate used for each file may be different than what is shown in the explain plan
Yes I think we can do that. I feared that it would be more confusing because the pruning predicate you see is not what you get in the end...
Is there any way we can inject this information at runtime? Metrics already kind of do that. It'd be nice to record the per-file pruning predicates, per file schema mappings and per-file filters once those exist.
I've reverted this part. Do you think maybe we can rename pruning_predicate= to table_schema_pruning_predicate=, estimated_pruning_predicate=, or something to give some indication that it may not match what actually happens/happened at runtime?
Thank you so much @adriangb -- I am going to run the benchmarks on this PR to make sure we didn't mess anything up, but if that goes well I think this is ready to merge
// Didn't we explicitly *not* load it above?
// Well it's possible that a custom implementation of `AsyncFileReader` gives you
// the page index even if you didn't ask for it (e.g. because it's cached)
// so it's important to check that here to avoid extra work.
👍
🤖
Crossing my fingers that the overhead of building the predicates is not an issue. If it is... we could always create it once upfront and only re-create it if the predicate or schema changes. But I'm hoping that's not necessary.
🤖: Benchmark completed
Hmm, the benchmark results suggest something else is going on -- I will try and reproduce them
🤖
That's weird. I would not have been too surprised by a 1.01x slowdown or something. 1.4x seems really high unless (1) building pruning predicates was a significant portion of query time to begin with or (2) I did something wrong with the IO and it's now making 2 FS calls instead of 1 (even then was reading metadata that slow???)
I can reproduce the slowdown locally using this command:
datafusion-cli -c "SELECT * FROM hits WHERE \"URL\" LIKE '%google%' ORDER BY \"EventTime\" LIMIT 10;"
Huh okay maybe generating the pruning predicates is just that slow. Is that surprising to you? I'll investigate a bit later.
I bet it is something silly -- I am doing some profiling now
Finally off my calls. Me as well. Feel free to ping me for a call if you think I can be of any help.
I think I see the issue -- I am just verifying now
Here is a fix that restores the performance on my local tests: (thanks to @zhuqi-lucas for spotting the problem)
🤖
Seems like the script died 😢
🤖: Benchmark completed
Nice! @alamb are you worried about the I am happy to implement some basic caching here (compute a predicate in
I think it is noise. I'll run
This would be great -- maybe you can file a ticket to track it
Looking at the numbers I think it's noise, e.g. the biggest slowdown in the most recent bench is:
But in the previous run, which included the bug causing other very large slowdowns:
I mean my point is: if we think the impact on performance is minimal / not measurable in a real world scenario, is it worth having the added code / complexity to do this sort of caching?
A caching implementation also adds overhead; if we can't measure it, how do we know the caching version is not slower, etc.?
If the performance impact is not measurable then no, we shouldn't add the code, in my opinion
Agreed, then let's just proceed as is. Thanks for your help with this; sorry I was a bit MIA this morning!
🤖
🤖: Benchmark completed
LGTM Thanks!
I plan to merge this tomorrow unless other people would like a chance to comment
Onwards towards topk pushdown
This definitely is an API change -- I hit it in the delta-rs upgrade: I'll make a note to add it to the upgrade guide
Needed for #15301 and #15057.
Closes #15534
Additionally I think this will make predicate evaluation slightly more performant for files with missing columns.
There are actually 3 file schemas going around:
1. The table schema as defined by the TableProvider, etc.
2. The file_schema passed into FileScanConfig, which is not the physical file schema; rather it's the table schema minus partition columns.
3. The physical file schema: the schema the parquet file actually contains.

Currently we build predicates against (2), which means that a predicate may reference columns not found in the actual file. I believe this would result in null stats being created on the fly (some minimal work) and pointless evaluation of predicates (some more work). I'm not sure how this stacks up with the extra work of creating the predicates multiple times, which also has a cost. But that should be easier to cache, and it is O(number of files) instead of O(number of row pages), so I think it should be better.
At the very least this is more correct in my mind.
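A rough sketch of the direction (illustrative only, not the PR's exact code; module paths and behavior may differ between DataFusion versions): build the pruning predicate per file, against that file's physical schema, and skip row-group pruning when the predicate can't be converted or prunes nothing.

```rust
use std::sync::Arc;

use datafusion::arrow::datatypes::SchemaRef;
use datafusion::physical_expr::PhysicalExpr;
use datafusion::physical_optimizer::pruning::PruningPredicate;

/// Hypothetical helper: try to build a pruning predicate against the physical
/// schema of one particular file, rather than against the table-level
/// file_schema. If conversion fails or the predicate can't prune anything,
/// fall back to reading all row groups for this file.
fn build_file_pruning_predicate(
    predicate: &Arc<dyn PhysicalExpr>,
    physical_file_schema: &SchemaRef,
) -> Option<Arc<PruningPredicate>> {
    match PruningPredicate::try_new(
        Arc::clone(predicate),
        Arc::clone(physical_file_schema),
    ) {
        Ok(p) if !p.always_true() => Some(Arc::new(p)),
        // Either the predicate can't be converted for this file's schema or
        // it prunes nothing; skip statistics pruning in that case.
        _ => None,
    }
}
```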