Thanks for releasing the first open-source diffusion model for EHR. When we ran the GAN baselines and EHRDiff on MIMIC or other datasets, we found that the correlation between feature prevalence in the synthetic data and feature prevalence in the real data was about 0.8 for both, much lower than 0.99. Are there any tricks to running the GAN baselines and EHRDiff?
One reason could be that your paper reports Pearson correlation, which is high for all methods, whereas we evaluate Spearman correlation, which is quite low for some methods.
I believe the issue lies with the metric. Because the MIMIC data consists predominantly of rare ICD codes, the differences in prevalence between these codes are very small. Spearman correlation, which compares ranks rather than values, can amplify these small differences. This is probably not critical, though: Pearson correlation tends to be more relevant in this context, and many other metrics are available for evaluating the results.
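To illustrate the point about rare codes, here is a minimal toy sketch, not the paper's evaluation code and not run on MIMIC. It assumes a made-up code distribution (20 common codes, 980 rare ones) and draws two independent binary datasets from the same probabilities, standing in for "real" and "synthetic" data. All names (`prevalence`, `average_ranks`, `pearson`, `spearman`) and all sizes are hypothetical choices for illustration.

```python
import numpy as np


def prevalence(data):
    """Fraction of patients (rows) in which each code (column) is present."""
    return data.mean(axis=0)


def average_ranks(x):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]


def pearson(x, y):
    """Pearson correlation of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Assumed toy setup: a few common codes and many rare ones.
rng = np.random.default_rng(0)
n_patients = 2000
code_probs = np.concatenate([
    rng.uniform(0.05, 0.5, size=20),       # common codes
    rng.uniform(0.0005, 0.003, size=980),  # rare codes
])

# Two independent samples from the SAME distribution, i.e. the best case
# for a generator: any disagreement below is pure sampling noise.
real = rng.random((n_patients, code_probs.size)) < code_probs
synthetic = rng.random((n_patients, code_probs.size)) < code_probs

pearson_r = pearson(prevalence(real), prevalence(synthetic))
spearman_r = spearman(prevalence(real), prevalence(synthetic))
print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

In this setup Pearson is driven by the handful of common codes and stays close to 1, while Spearman weights all 1000 codes equally by rank, so the noisy ordering among the rare codes pulls it down even though both samples come from identical probabilities. The exact gap depends on the assumed sizes and probabilities.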