
About the training loss function #188

Open
Xiao-congxi opened this issue Nov 21, 2024 · 3 comments

@Xiao-congxi

Xiao-congxi commented Nov 21, 2024

During pre-training, is the loss function a combination of the MSE loss computed on the first [mean] output head and the quantile loss computed on the other 9 [quantile] output heads?

@rajatsen91
Collaborator

Yes, that is what we have been using.
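For readers landing here, the confirmed setup can be sketched as follows. This is a minimal illustration, not the official TimesFM training code: the head ordering (mean first, then 9 quantile heads), the quantile levels 0.1–0.9, and the unweighted sum of the two terms are all assumptions based on this thread.

```python
import numpy as np

# Assumed quantile levels for the 9 quantile heads: 0.1, 0.2, ..., 0.9
QUANTILES = np.arange(0.1, 1.0, 0.1)

def combined_loss(outputs: np.ndarray, target: np.ndarray) -> float:
    """Sketch of MSE-on-mean-head plus quantile (pinball) loss.

    outputs: array of shape (..., 10); index 0 is the [mean] head,
             indices 1..9 are the [quantile] heads.
    target:  array of shape (...) with the ground-truth values.
    """
    # MSE on the first (mean) output head
    mean_pred = outputs[..., 0]
    mse = np.mean((mean_pred - target) ** 2)

    # Pinball loss summed over the 9 quantile heads
    pinball = 0.0
    for i, q in enumerate(QUANTILES):
        err = target - outputs[..., i + 1]
        pinball += np.mean(np.maximum(q * err, (q - 1.0) * err))

    # Unweighted sum of both terms (weighting, if any, is unknown here)
    return float(mse + pinball)
```

With a perfect prediction on every head the loss is zero, since both the squared error and the pinball terms vanish.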

@Xiao-congxi
Author

Xiao-congxi commented Nov 22, 2024

Thank you very much for your immediate reply. I have another question about the experimental results in the paper: why is the MAE of the fine-tuned TimesFM (Table 2) higher than the zero-shot results (Table 5) on ETTm1 and ETTm2?

@ani0135

ani0135 commented Jan 9, 2025

@Xiao-congxi Could you please share the loss function used here, i.e., the combination over both kinds of heads?
