PySR Hyperparameter setup for large dataset and Dimension Consistency #950
Hi @TrailblazerH,
In SI units (the dimensional analysis system uses SI), note that the radian is dimensionless. If you would like the units to be cancelled out, one option is to switch to a different unit instead (like distance).
Also be sure to read the tuning page: https://ai.damtp.cam.ac.uk/pysr/tuning/
Cheers,
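To make the suggestion above concrete, here is a minimal sketch of tagging the two angle columns with a distance unit in place of `rad`, since the radian is dimensionless in SI and so adds no constraint. The column order, the choice of `"m"` as the stand-in unit, the operator lists, the penalty value, and the helper name `fit_with_placeholder_units` are all assumptions for illustration, not the setup from the original post.

```python
# Sketch only: column order and the stand-in unit "m" are assumptions.
# Column 0 is the dimensionless feature; columns 1 and 2 are the angles.
X_UNITS = ["", "m", "m"]  # "m" replaces "rad", which SI treats as dimensionless
Y_UNITS = ""              # the target is dimensionless


def fit_with_placeholder_units(X, y):
    """Fit PySR with the angle columns tagged by a unit that cannot vanish silently."""
    # Imported lazily so this sketch can be loaded without PySR installed.
    from pysr import PySRRegressor

    model = PySRRegressor(
        binary_operators=["+", "-", "*", "/"],
        unary_operators=["sin", "cos", "tan"],
        # Penalty applied to dimensionally inconsistent expressions
        # (the value here is an illustrative choice).
        dimensional_constraint_penalty=1000.0,
    )
    model.fit(X, y, X_units=X_UNITS, y_units=Y_UNITS)
    return model
```

With a real unit on the angle columns, a bare `sin(x1)` is itself dimensionally inconsistent, so the search would likely need a fitted constant that absorbs the unit (for example `sin(c * x1)`). This relies on constants being allowed to carry units, which is worth checking against the `dimensionless_constants_only` setting.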
Also, you can use regular |
Hi awesome PySR developers,
Thanks for building such a great tool! I'm using PySR on a relatively large dataset (~100,000 data points) and have tuned the hyperparameters following the suggested workflow. I'm using three input features: Feature 1 and Feature 2 each have 40 unique values and Feature 3 has 7 unique values.
The input data consists of two variables with units of radians and one dimensionless variable. Based on previous runs without dimensional constraints, symbolic regression typically generates expressions containing trigonometric functions like sine, cosine, or tangent, which is the expected behavior. However, I've observed cases where the radian-valued variables appear outside of trigonometric functions, resulting in dimensionally inconsistent expressions. To enforce dimensional correctness, I'm implementing dimensional constraints that should restrict radian-valued variables to only appear within trigonometric functions.
However, I am running into an issue with dimensional constraints. As shown below, I add dimensional constraints, but the results change unexpectedly once I add the rad units:
model.fit(X, y, X_units=["", "rad", "rad"], y_units="")
The expressions it finds no longer include cos/sin/tan functions. Do you have any idea what could be the cause?
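For reference, the intended restriction (radian-valued variables appearing only inside trigonometric functions) can be checked after the fact on any candidate expression. The sketch below is a purely syntactic filter built on Python's standard `ast` module. It assumes the expression string parses as Python arithmetic and that the angle columns are named `x1` and `x2`. The helper name `angles_only_in_trig` is hypothetical and not part of PySR.

```python
import ast

TRIG = {"sin", "cos", "tan"}


def angles_only_in_trig(expr, angle_vars):
    """Return True if every variable in angle_vars occurs only inside sin/cos/tan."""
    tree = ast.parse(expr, mode="eval")

    def ok(node, inside_trig):
        # An angle variable is acceptable only beneath a trig call.
        if isinstance(node, ast.Name) and node.id in angle_vars:
            return inside_trig
        # Entering sin/cos/tan: everything in its arguments counts as "inside".
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in TRIG
        ):
            return all(ok(arg, True) for arg in node.args)
        # Any other node: check its children with the current state.
        return all(ok(child, inside_trig) for child in ast.iter_child_nodes(node))

    return ok(tree, False)


# Assumed variable names: x0 is dimensionless, x1 and x2 are angles.
print(angles_only_in_trig("x0 * sin(x1) + cos(x2)", {"x1", "x2"}))
print(angles_only_in_trig("x1 + sin(x2)", {"x1", "x2"}))
```

This only inspects the structure of the expression. It does not evaluate units, so it complements the dimensional analysis rather than replacing it.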
If possible, could you please take a quick look and let me know if the whole setup seems reasonable for this large dataset symbolic regression?
Here’s the hyperparameter setup I’m using:
Any feedback or suggestions for improvement would be much appreciated!
Thanks again! 🙌