
About the training loss function #188

Open
Xiao-congxi opened this issue Nov 21, 2024 · 3 comments

@Xiao-congxi

Xiao-congxi commented Nov 21, 2024

During pre-training, is the loss function a combination of the MSE loss computed on the first [mean] output head and the quantile loss computed on the other 9 [quantile] output heads?

@rajatsen91
Collaborator

Yes, that is what we have been using.
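For readers landing here, the confirmed setup can be sketched as follows. This is a minimal illustration, not the official TimesFM training code: the head ordering (mean first, then 9 quantile heads), the quantile levels 0.1–0.9, and the unweighted sum of the two terms are all assumptions based on this thread.

```python
import numpy as np

# Assumed quantile levels for the 9 quantile heads: 0.1, 0.2, ..., 0.9
QUANTILES = np.arange(0.1, 1.0, 0.1)

def combined_loss(outputs: np.ndarray, target: np.ndarray) -> float:
    """Sketch of MSE-on-mean-head plus quantile (pinball) loss.

    outputs: array of shape (..., 10); index 0 is the [mean] head,
             indices 1..9 are the [quantile] heads.
    target:  array of shape (...) with the ground-truth values.
    """
    # MSE on the first (mean) output head
    mean_pred = outputs[..., 0]
    mse = np.mean((mean_pred - target) ** 2)

    # Pinball loss summed over the 9 quantile heads
    pinball = 0.0
    for i, q in enumerate(QUANTILES):
        err = target - outputs[..., i + 1]
        pinball += np.mean(np.maximum(q * err, (q - 1.0) * err))

    # Unweighted sum of both terms (weighting, if any, is unknown here)
    return float(mse + pinball)
```

With a perfect prediction on every head the loss is zero, since both the squared error and the pinball terms vanish.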

@Xiao-congxi
Author

Xiao-congxi commented Nov 22, 2024

Thank you very much for your immediate reply. I have another question about the experimental results in the paper: why is the MAE of the fine-tuned TimesFM (Table 2) higher than the zero-shot results (Table 5) on ETTm1 and ETTm2?

@ani0135

ani0135 commented Jan 9, 2025

@Xiao-congxi Could you please share the loss function used here, i.e., the combination over both kinds of heads?
