Fitting the same model specification in two different software packages gives drastically different results.
For example, R reports log likelihood = 37.83 and AIC = -69.66, while Python reports log likelihood = -99.484 and AIC = 204.969.
Can you help me?
I don't know what objective function is used by statsmodels. But even if the docs say it is maximum likelihood, there are many variations. R is using a state space representation with a diffuse prior as explained in the documentation for stats::arima(): https://rdrr.io/r/stats/arima.html. Other objective functions may yield different results. See https://robjhyndman.com/hyndsight/estimation/
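To illustrate how two "likelihood" objectives can disagree on identical data and parameters, here is a minimal sketch for a toy MA(1) model, y_t = e_t + theta * e_{t-1}. It is not what stats::arima() or statsmodels actually computes; the function names and the sample series are made up for illustration. One function conditions on the pre-sample error being zero (a conditional sum of squares style objective), the other evaluates the exact Gaussian likelihood with the innovations recursion.

```python
import math


def css_loglik(y, theta, sigma2):
    """Conditional log-likelihood: assumes the pre-sample error e_0 is 0."""
    e_prev = 0.0
    css = 0.0
    for obs in y:
        e = obs - theta * e_prev  # invert y_t = e_t + theta * e_{t-1}
        css += e * e
        e_prev = e
    n = len(y)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - css / (2 * sigma2)


def exact_loglik(y, theta, sigma2):
    """Exact Gaussian log-likelihood via the innovations recursion."""
    r = 1.0 + theta ** 2  # one-step prediction variance, in units of sigma2
    pred = 0.0            # one-step-ahead prediction of the current value
    ll = 0.0
    for obs in y:
        innov = obs - pred
        ll += -0.5 * (math.log(2 * math.pi * sigma2 * r)
                      + innov ** 2 / (sigma2 * r))
        pred = theta * innov / r                 # prediction of the next value
        r = 1.0 + theta ** 2 - theta ** 2 / r    # update the prediction variance
    return ll


# Hypothetical toy series, same parameters for both objectives.
y = [0.5, -1.2, 0.3, 0.8]
print(css_loglik(y, theta=-0.6, sigma2=1.0))
print(exact_loglik(y, theta=-0.6, sigma2=1.0))
```

The two printed values differ even though the data and parameters are the same, and the gap matters most for short series. With theta = 0 the two functions agree, because there is no pre-sample error to condition on.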
AIC, BIC and similar criteria depend on the likelihood, so different likelihood functions lead to different information criteria. Even with the same likelihood function, some software implementations omit the constant in the calculation. See https://robjhyndman.com/hyndsight/lm_aic.html.
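As a quick check on the figures quoted in this issue, the sketch below recomputes AIC = 2k - 2 log L from each reported log-likelihood. The parameter count k = 3 is an assumption (ma1, sma1 and the innovation variance); the helper name is made up for illustration.

```python
def aic(loglik, k):
    """Akaike information criterion from a log-likelihood and parameter count."""
    return 2 * k - 2 * loglik


K = 3  # assumed: ma1, sma1 and sigma^2

print(f"R:      {aic(37.83, K):.2f}")    # log-likelihood reported by R
print(f"Python: {aic(-99.484, K):.2f}")  # log-likelihood reported by statsmodels
```

Under that assumption both reported AIC values are reproduced to within rounding, which suggests the two packages apply the same AIC formula here and that the whole gap comes from the log-likelihoods themselves.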
Thank you very much for your reply.
When I used StatsForecast to build an ARIMA model, the results still differed significantly from those obtained in R.
With the same parameters {order=(0, 1, 1), season_length=12, seasonal_order=(0, 1, 1)}, the MAPE is 4.922 in R and 14.463 in Python.
Could this be due to differences in the algorithms the packages use?
Anyway, thank you very much for your help.
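When comparing MAPE across packages, it can help to compute the metric with one shared function on the same hold-out values, so that any remaining gap is due to the forecasts and not to the metric definition. A minimal sketch follows; the function name and the example numbers are hypothetical, and it assumes MAPE is defined as the mean of |actual - forecast| / |actual|, expressed in percent.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hold-out values and forecasts exported from either package.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mape(actual, forecast))
```

Feeding the R forecasts and the Python forecasts through the same function, against the same test observations, removes one possible source of disagreement.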
R 4.2.1; forecast 8.22.0:
arima <- arima(train_data, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
summary(arima)
Series: train_data
ARIMA(0,1,1)(0,1,1)[12]
Coefficients:
         ma1    sma1
      -0.193  -0.791
s.e.   0.091   0.084
sigma^2 = 181: log likelihood = 37.83
AIC=-69.66 AICc=-69.45 BIC=-61.32
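Python 3.11; statsmodels 0.14.1:
from statsmodels.tsa.statespace.sarimax import SARIMAX
# Seasonal ARIMA(0,1,1)(0,1,1) with period 12 on the 'incidence' column
model = SARIMAX(train_data['incidence'], order=(0,1,1), seasonal_order=(0,1,1,12))
result = model.fit()
print(result.summary())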
SARIMAX Results
Dep. Variable: incidence No. Observations: 132
Model: SARIMAX(0, 1, 1)x(0, 1, 1, 12) Log Likelihood -99.484
Date: Mon, 29 Apr 2024 AIC 204.969
Time: 23:46:06 BIC 213.306
Sample: 0 HQIC 208.354
- 132
Covariance Type:opg
              coef  std err        z  P>|z|  [0.025  0.975]
ma.L1      -0.6900    0.048  -14.322  0.000  -0.784  -0.596
ma.S.L12   -0.8250    0.102   -8.081  0.000  -1.025  -0.625
sigma2      0.2766    0.019   14.838  0.000   0.240   0.313
Ljung-Box (L1) (Q): 0.73 Jarque-Bera (JB): 438.41
Prob(Q): 0.39 Prob(JB): 0.00
Heteroskedasticity (H): 1.21 Skew: -0.82
Prob(H) (two-sided): 0.56 Kurtosis: 12.26