Getting user / student level predictions #152

GregSzopinski · 2023-12-08T09:25:24Z

Great work, I managed to run the examples and prepare my own "mock", and it works just fine. I'm having some difficulty analyzing the results, though: I'd like to check individual users' predictions to see how they change when the sequences change, go through some examples, etc. Where should I look for that in the repository?

sonyawong · 2023-12-11T04:17:02Z

Hi, once wandb_predict.py has run successfully, you will get output files such as qid_test_question_window_predictions.txt. These files contain the prediction results. Please note that we split the original long student interaction sequences into sub-sequences (each sub-sequence contains up to 200 interactions). Therefore, if you want to check each user's predictions, you may need to do some further data preprocessing for your own needs.
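
As a rough illustration of what such preprocessing could look like, here is a minimal sketch that splits each student's interaction list into sub-sequences of at most 200 items while remembering which student each chunk came from. The function name `split_with_owner` and the tuple layout are hypothetical and not taken from the repository; the toolkit's own splitting may differ (for example, it may use sliding windows).

```python
from typing import Dict, List, Tuple

MAX_LEN = 200  # sub-sequence length mentioned above


def split_with_owner(
    sequences: Dict[str, List[int]], max_len: int = MAX_LEN
) -> List[Tuple[str, int, List[int]]]:
    """Split each student's responses into chunks of at most max_len.

    Returns (student_id, chunk_index, chunk) tuples so that anything computed
    per chunk can later be regrouped by student.
    """
    chunks = []
    for student_id, responses in sequences.items():
        for chunk_index, start in enumerate(range(0, len(responses), max_len)):
            chunks.append((student_id, chunk_index, responses[start:start + max_len]))
    return chunks


# Toy data (made up): one student with 450 interactions, one with 30.
toy = {"s1": [1] * 450, "s2": [0] * 30}
chunks = split_with_owner(toy)
chunk_sizes = [(sid, idx, len(chunk)) for sid, idx, chunk in chunks]
```

Keeping the student id next to every chunk is the design point here: once it is stored alongside the sub-sequence, per-student analysis becomes a simple group-by.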

GregSzopinski · 2023-12-11T08:35:56Z

Thanks for the reply. Sub-sequences are fine, and I had already found the files with predictions; I'm just a bit confused about what each row corresponds to. Hence, a few questions on how I should interpret the results:

If I get this correctly, each row in the output file is a sequence for which we're making the prediction, right?
Each "orirow" value in qid_test_question_window_predictions.txt corresponds to an individual student. That is at least my impression, since the number of unique values in this column matches the number of students in test_quelevel.csv. How do I go from "orirow" in the predictions file to a student and/or sequence id? In other words, I'd like to know what exactly (e.g. which sequence) the prediction in a particular row is being made for.
Last but not least, if I get this right, since we're evaluating at the question level, how do I check which question is predicted in a given row?

Thanks for the help + once again, great work. :)

sonyawong · 2023-12-12T07:19:10Z

Thank you very much for your recognition of our work. I hope the following explanation helps answer your questions:

In the output file, i.e., qid_test_question_window_predictions.txt, each row holds the prediction result for one question under the various fusion types (early fusion, late fusion mean, late fusion vote, late fusion all).
orirow denotes the row index of each question in test_quelevel.csv. Hence, to get the prediction results for a student, you can combine the prediction results that share the same orirow value.
Since we provide 4 fusion types (early fusion, late fusion mean, late fusion vote, late fusion all) for question-level prediction, you can choose any of them as the final result. In our experiments, late fusion mean performed better, so we prefer the late fusion mean results.
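
To make the grouping by orirow concrete, here is a minimal sketch. It assumes the predictions file has already been loaded into a list of dicts with the keys `orirow`, `late_trues` and `late_mean` (column names as they appear in this thread); the file's delimiter and full column set are not described here, so loading is left out. The helper `per_student_summary`, the toy rows, and the 0.5 decision threshold are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def per_student_summary(rows: List[dict], threshold: float = 0.5) -> Dict[int, dict]:
    """Group prediction rows by orirow and summarize each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["orirow"]].append(row)

    summary = {}
    for orirow, items in grouped.items():
        # A prediction counts as a hit when the thresholded late_mean score
        # matches the true label (threshold is an assumed cut-off).
        hits = sum(
            int(item["late_mean"] >= threshold) == int(item["late_trues"])
            for item in items
        )
        summary[orirow] = {
            "n_predictions": len(items),
            "accuracy": hits / len(items),
        }
    return summary


# Toy rows (made up), two per orirow value.
rows = [
    {"orirow": 0, "late_trues": 1, "late_mean": 0.81},
    {"orirow": 0, "late_trues": 0, "late_mean": 0.62},
    {"orirow": 1, "late_trues": 1, "late_mean": 0.55},
    {"orirow": 1, "late_trues": 0, "late_mean": 0.20},
]
summary = per_student_summary(rows)
```

Swapping `late_mean` for another prediction column would give the same per-orirow view for a different fusion type, provided that column holds scores comparable to the threshold.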

GregSzopinski · 2024-01-08T08:59:52Z

Thanks a lot. One more question regarding the columns in predictions file(s) - do I get it right?

concept_preds - mastery of given concept
late_trues - true values for late fusion
late_mean - predicted value for mean-based late fusion
late_vote - predicted value for majority voting late fusion
late_all - predicted value for late fusion merged methods
early_trues - true values for early fusion
early_preds - predicted values for early fusion
