User Guide responses #10
General comments:
@jrising - Thank you for your comments! We are working on addressing the issues you've highlighted and preparing responses to some questions you've raised. Thanks for the time and effort you put into reviewing our work!
One more point: I was trying to follow the vignette, but the second example download (from https://zenodo.org/records/13976521/files/gaia_example_climate.zip) has timed out on me 5 times. I suggest you offer an option for people to download the files outside of that function, or reduce the size.
For #10 Update to clarify several instructions and languages in the user guides
… each function in the tutorial For #10
For #10 Better clarify the differences between two output tables and add explanations of the variables in the outputs
For #10 Added more instruction on adding crops to crop calendar
Dear @jrising, Thank you so much for your detailed review of
The following are more detailed responses to specific comments.
We appreciate the reviewer’s feedback regarding the WATCH dataset. We used the WATCH dataset because it is the same climate data used in the Waldhoff et al. (2020) study, which developed the core approach embedded in
The historical cropland-weighted climate data used in both Example 1 and Example 2 is sourced from example-data-1. The key difference between the two examples lies in the projected climate data. Example 1 uses a small, cropland-weighted future climate dataset for users who want to run a quick test of gaia. Example 2 relies on larger climate NetCDF files, takes longer to run, and is intended for users who want to reproduce a similar experiment with their own NetCDF data. To avoid redundant downloads of the historical data, we put all cropland-weighted climate data within example-data-1.
Thanks for your question. Currently, gaia supports outputs at both the country level and the region-basin level (specifically for the Global Change Analysis Model GCAM). The methodology is limited to the country scale because the model relies on global crop yield data from FAOSTAT, which is typically available at the country scale. Additionally, gaia was designed primarily to support global multisector dynamics models and integrated assessment models, which generally operate at the country level or even coarser scales.
Thank you for your question. gaia requires weather data with global coverage to get the best results. If the data does not have global coverage, the model will encounter missing weather information for certain crop types, which may cause errors, and it will have less data for empirical fitting.
Thanks for your question. The annual yield varies by country, crop, and year. The data is formatted to show monthly information on growing season, precipitation, and temperature, so the annual yield is repeated at each monthly time step. Table 3 contains the historical information and Table 4 contains the future weather. We have added more explanations in the section for clarity. Please see commit 6e0153a.
Thanks for your question. The min, max, and mean temperature and precipitation are all calculated based on cropland-weighted monthly temperature and precipitation for each year. We added explanations for the outputs. Please see commit 6e0153a.
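As a rough illustration of that calculation (not gaia's actual code), the sketch below summarizes one year of monthly values. The function name `annual_summary` and the temperature numbers are hypothetical, and the input is assumed to be already cropland-weighted; the summary covers whichever months are passed in.

```python
from statistics import fmean


def annual_summary(monthly_values):
    """Return the min, max, and mean of one year's monthly values.

    Hypothetical helper for illustration; the values are assumed to be
    cropland-weighted before they reach this function.
    """
    values = list(monthly_values)
    if not values:
        raise ValueError("need at least one monthly value")
    return {"min": min(values), "max": max(values), "mean": fmean(values)}


# Made-up cropland-weighted monthly mean temperatures (deg C) for one year.
monthly_temperature = [2.0, 4.0, 8.0, 12.0, 17.0, 21.0,
                       24.0, 23.0, 19.0, 13.0, 7.0, 3.0]
temperature_summary = annual_summary(monthly_temperature)
```

The same helper would apply to monthly precipitation; whether gaia summarizes all twelve months or only growing-season months is not stated here.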
Thank you. The yield shock is the fractional change in the yield of a given crop grown in a specific country in a future period compared to the baseline yield expected under stable climate conditions. We have added explanations of yield shock in this section. Please see commit b8b2ad5.
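To make the definition concrete, here is a minimal sketch that reads "fractional change" as (future - baseline) / baseline. That reading, the function name `yield_shock`, and the numbers are assumptions for illustration; gaia's output tables may use a different convention (for example, a ratio).

```python
def yield_shock(future_yield, baseline_yield):
    """Fractional change of a future yield relative to a baseline yield.

    Assumed formula: (future - baseline) / baseline. This is one reading
    of the definition above, not necessarily gaia's implementation.
    """
    if baseline_yield <= 0:
        raise ValueError("baseline yield must be positive")
    return (future_yield - baseline_yield) / baseline_yield


# Made-up yields in t/ha: a drop from 5.0 to 4.5 is a shock of about -0.1.
example_shock = yield_shock(4.5, 5.0)
```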
Thank you for your comment. The primary goal of this model is to equip researchers with tools to estimate country-level yield shocks under different future climate forcings using a peer-reviewed methodology. Gaia is designed as a crop model with built-in capabilities to process, clean, and format climate data, global crop data (e.g., yield, planting, and harvesting), and outputs. A key advantage of gaia is that it simplifies complex processes for researchers who may not have extensive experience in data manipulation or model development. It provides seamless integration for visualizing results and processing outputs in a format compatible with different versions of GCAM, which has a broad user community. In addition, the provided functions are modular and adaptable for different projects that require country-level yield shocks. Users can customize the inputs and workflow to suit their specific research needs, such as configuring the model, fitting the model with different climate and socioeconomic scenarios and CO2 emission trajectories, adding more crops, selecting time series of interest, and specifying the output options.
Thank you for pointing that out. To make the example data more suitable for download, we have reduced the size of the climate NetCDF file by shortening the time coverage from 2015-2100 to 2015-2030. This adjustment reduces the ZIP file size from 1.4 GB to 261 MB. Users who would like to try the full 21st-century time series can use the alternative dataset, example-data-1. We have also offered two options to download the data in the user guide: through a download URL and through gaia's
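For readers who prefer to fetch the archive outside of R, a direct download can be sketched as follows. This is only an illustration: `download_file` is a hypothetical helper, and the URL is the one quoted earlier in this thread, which may have changed since the file was resized.

```python
import shutil
import urllib.request
from pathlib import Path

# URL as quoted earlier in this thread; it may no longer be current.
EXAMPLE_URL = "https://zenodo.org/records/13976521/files/gaia_example_climate.zip"


def download_file(url, destination, timeout=60.0, chunk_size=1024 * 1024):
    """Stream `url` to `destination` in chunks and return the final path.

    `timeout` applies to individual blocking socket operations, not to the
    whole transfer. Data is written to a temporary ".part" file first, so an
    interrupted download does not leave a truncated archive under the final
    name.
    """
    destination = Path(destination)
    partial = destination.with_name(destination.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(partial, "wb") as out:
        shutil.copyfileobj(response, out, length=chunk_size)
    partial.replace(destination)
    return destination


# Example call (not executed here because it needs network access):
# download_file(EXAMPLE_URL, "gaia_example_climate.zip", timeout=120.0)
```

Streaming in chunks keeps memory use flat for a file of a few hundred megabytes, and a command-line tool or a browser would serve equally well.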
This is just a quick comment to thank you for your responses. I can't go through them now, but I will respond in a couple of weeks.
Thank you for your responses. Most of these are fine, but I still have the following concerns:

Re: "CMIP-ISIMIP format": The link directs one to the protocol for simulation formatting, not input climate formatting. Also, it is a 17-page document. If the goal is for users to be able to provide their own data, the specific requirements needed for gaia should be specified. Also, the new text "Processes CMIP climate NetCDF data in accordance with the ISIMIP simulation protocols" suggests that you are processing data in accordance with the protocols, not that the data is in accordance with the protocols.

Re: "Run gaia! Example 3": This is still confusing. You label it an example. Presumably, you intend for it to be an example of how to use gaia, so what is the goal of the example and how do I follow it?

Re: adding new crop calendars: Table 3 shows 0's for every row under the oil_palm column, because only the first 10 rows are included.

I don't see responses to the following questions:
Additional issues trying to run the vignette:
For #10 - Rename example 3 to be explore output to clarify the purpose of this section - Update the example code provided for users to modify crop calendar, so that people can see the immediate changes described in Table 3
For #10 Solve the error with writing processed example-data-2 into the example-data-1 folder.
Dear @jrising, Thank you for your comments. We have made several updates, as demonstrated in the linked commits above. Below are our responses to your comments:
Thank you for your comment. The ISIMIP climate forcing data format shares similarities with the impact model simulation output format. Unfortunately, there is no dedicated ISIMIP page detailing the input climate forcing format, which is why we referenced the simulation protocols. We agree that this document is quite long. To address this, we have summarized the key formatting requirements compatible with
Thank you for pointing this out. This section is intended to explain the intermediate and final outputs from running
Thank you for catching this! To keep the documentation relatively concise, we initially included only a limited number of rows for each table. We have now modified the example code to ensure that countries with newly added oil palm planting and harvesting months appear in the first 10 rows for demonstration purposes.
Thank you. In our last revision, we made several updates to improve this, which are summarized in our update log. Specifically:
We acknowledge that
We appreciate your thoroughness. However, we are unclear about which "log-sum-exponent" problem you are referencing. In our search, we found the LSE "trick" referenced only as a technique for dealing with very large or very small numbers, which does not seem applicable in this case. Could you please provide a reference or a more detailed description of the potential issue?
Thank you for your suggestion. We have added a note in the User Guide clarifying this requirement here.
Thank you for your comment. Could you clarify which specific aspect of uncertainty you are referring to? The regression model follows standard practices by using the best estimates. We’d appreciate any further details on how uncertainty should be addressed in this context.
Thank you for catching this. We have updated the example to include the missing comma.
Thank you for catching this error. The issue is caused by saving the processed climate data from example-data-2 into the I have thoroughly updated the example code to ensure separate output folders for Example 1 and Example 2. Please delete your current outputs and rerun the "Run gaia!" section from the beginning.
Re: log-sum-exp problem: The problem comes from the fact that there is no point-level data-generating process that is consistent with a log dependent variable regression. When you use different sized regions (countries), you are assuming that there is. If your regression is log(y) = f(T), but y = \sum_i w_i y_i, summing production over regions in a country, then you get log(y) = log(\sum_i w_i exp(f(T_i))). According to the log-sum-exp relationship (https://en.wikipedia.org/wiki/LogSumExp), that's approximately equal to f(T_j), where j is the largest yielding region, not f(\sum_i w_i T_i) like you use.

Re: Standard errors are reported in the regression: Standard practice when projecting econometric results under future climate is to report uncertainty around the final projection, as driven by uncertainty in the underlying coefficients. The normal way to do this is by taking Monte Carlo draws from the multivariate normal distribution of all parameters (using the variance-covariance matrix) and projecting each set of parameters, which then gives you an empirical distribution of final outcomes.

Further problems with the examples: I was able to get the Run gaia! example 2 working with data from example 1. It may be confusing to users that the example 1 data doesn't map to the example 1 running. I deleted all my data, updated the package, and ran the code from scratch, but when I try to do Run gaia! example 1 I get the error:
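The two statistical points in this comment can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: the quadratic response f(T) = b1*T + b2*T^2, the coefficient values, the variance-covariance matrix, and the function names are made up for illustration and are not taken from gaia. The first part compares log(\sum_i w_i exp(f(T_i))) with f(\sum_i w_i T_i); the second draws coefficients from a multivariate normal and projects each draw.

```python
import math
import random


def response(temp, b1, b2):
    """Hypothetical quadratic log-yield response f(T) = b1*T + b2*T^2."""
    return b1 * temp + b2 * temp ** 2


def log_of_summed_yield(weights, temps, b1, b2):
    """log(sum_i w_i * exp(f(T_i))): log yield after summing over regions."""
    total = sum(w * math.exp(response(t, b1, b2)) for w, t in zip(weights, temps))
    return math.log(total)


def response_at_mean_temp(weights, temps, b1, b2):
    """f(sum_i w_i * T_i): response at the weighted-mean temperature."""
    return response(sum(w * t for w, t in zip(weights, temps)), b1, b2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_coefficients(mean, cov, n_draws, rng):
    """Draw coefficient vectors from N(mean, cov) via the Cholesky factor."""
    lower = cholesky(cov)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        draws.append([m + sum(lower[i][k] * z[k] for k in range(i + 1))
                      for i, m in enumerate(mean)])
    return draws


def projection_interval(mean, cov, temp, n_draws=2000, seed=0):
    """Empirical 5th, 50th, and 95th percentiles of f(temp) across draws."""
    rng = random.Random(seed)
    outcomes = sorted(response(temp, b1, b2)
                      for b1, b2 in draw_coefficients(mean, cov, n_draws, rng))
    return (outcomes[n_draws * 5 // 100],
            outcomes[n_draws // 2],
            outcomes[n_draws * 95 // 100])


# Illustrative numbers only (three equally weighted regions in one country).
weights = [1 / 3, 1 / 3, 1 / 3]
temps = [10.0, 20.0, 30.0]
beta_hat = [0.5, -0.0125]                      # made-up point estimates
beta_cov = [[1e-4, -2e-6], [-2e-6, 1e-7]]      # made-up covariance matrix

aggregated = log_of_summed_yield(weights, temps, *beta_hat)
shortcut = response_at_mean_temp(weights, temps, *beta_hat)
low, median, high = projection_interval(beta_hat, beta_cov, temp=25.0)
```

With these made-up numbers, `aggregated` and `shortcut` differ noticeably, and `low`/`high` bracket the point projection. How large either effect is in gaia depends on the actual data and fitted model.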
#10 This will only filter the climate files for the right climate model, scenario, and time periods
Hi @jrising, Re: log-sum-exp problem: Re: Standard errors are reported in the regression Re: Further problems with the examples
Re: log-sum-exp problem: Re: Standard errors are reported in the regression: Re: Further problems with the examples:
openjournals/joss-reviews#7538
These comments refer to the User Guide vignette (https://jgcri.github.io/gaia/articles/vignette.html).