Composite Model Analysis in Forecasting the Malaysian Imports

For more than a century, forecasting models have been crucial in a variety of fields. Models can offer the most accurate forecasting outcomes if the error terms are normally distributed. The main aim of this study is to identify a good statistical model for forecasting the time series of Malaysia's imports. The study compares three approaches: the Vector Error Correction Method (VECM), the composite model (combined regression-ARIMA), and the ARIMA model. The data are quarterly time series of Malaysia's imports from the first quarter of 1991 to the first quarter of 2023. The forecasting results show that the composite model provided more probabilistic information, which improved the forecasts of the volume of Malaysia's imports. The ARIMA model, the composite model, and the VECM in this study are linear models of Malaysia's imports. Future studies might compare the forecasting performance of nonlinear and linear models.


Introduction
Prediction is a difficult art, especially when the future is involved. Forecasting is the process of making statements about events whose actual outcomes have (typically) not yet occurred. It is a vital exercise in assessing the economic performance of countries. Malaysian economists would like to anticipate future imports in order to formulate policy properly, and Malaysian analysts would like to anticipate the future performance of imports in order to manage the factors that influence them.
Forecasting is beneficial because it provides useful data to various stakeholders and thus aids future decision making. Forecasts over short horizons are more dependable than estimates for the distant future. The future cannot be forecast with complete accuracy because forecasting comes with a margin of error, and this margin increases as the forecast reaches further into the future. Variables and their expected influence may change (with social, economic and political changes), and new variables may emerge. Forecast errors arise from inaccuracy in the base information, from the method used to forecast, and from the selection of an inappropriate methodological framework for time series analysis, which yields biased and unreliable estimates. Given this, the selection of a forecasting method is pivotal in predicting the future.

Literature Review
Many investigations have examined how Malaysian imports behave, including Alias (1978), Semudram (1982), and Awang (1988). These studies estimated a traditional (classical) import demand function in which the level of real income and relative prices serve as the explanatory variables and the volume of imports is the response variable. The fundamental presumption of these analyses is that the data are stationary. The studies mentioned above were conducted before 'co-integration analysis' and 'error correction models' (ECM) became standard practice in time series analysis. To estimate the import demand function, they employed conventional ordinary least squares (OLS) regression models or partial adjustment techniques. These studies presume that the model's explanatory variables and import volume have an underlying equilibrium relationship. According to Granger and Newbold (1974), violating the stationarity assumption can result in spurious regression. As a result, standard statistical inference based on the OLS method would be unreliable. In a later study, Tang and Alias (2000) used the Johansen (1988) multivariate co-integration method to determine the long-run elasticities of import demand. Employing the error correction model (ECM), they showed how current income and relative prices affect import growth in the short run. The estimated ECM's error correction term, however, was not significant at the 10% level, indicating the absence of a long-run relationship. Kremers et al. (1992) show that, in small samples, no co-integration relationship can be established among variables that are integrated of order one, I(1). (Mah,

Figure-1. Time Series of Malaysian Imports
Widely used accuracy measures are employed to assess the performance of each model described below: Akaike's Information Criterion (AIC), the mean absolute percentage error (MAPE), the coefficient of determination (R²) and the Ljung-Box test (Alzahrani et al., 2020; Campolieti and Ramos, 2022; Fisher, 2011; Milad M. A. H., 2020).
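As a minimal sketch of how two of these measures are computed, the following Python functions implement MAPE and R² from their textbook definitions. The function names and the toy numbers are illustrative assumptions and are not taken from this study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def r_squared(actual, forecast):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_a = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - sse / sst


# Hypothetical toy data (not from the paper).
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [102.0, 108.0, 123.0, 128.0]
print(round(mape(actual, forecast), 3))
print(round(r_squared(actual, forecast), 3))
```

A lower MAPE and a higher R² both point to a closer fit, which is how the models are ranked later in the paper.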

Stationarity Test
A time series is a collection of observations on a variable taken regularly over time at predefined intervals. If a time series' mean and variance are constant and its covariance depends only on the interval or lag between two periods rather than on the actual time at which the covariance is calculated, the time series is said to be covariance stationary (weakly or simply stationary) (Brockwell and Davis, 2013; Chatfield, 2000; Enders, 2008). To model a time series with ARIMA and exponential smoothing methods, the time series must be stationary. It is common practice to estimate the model coefficients using OLS regression. The stochastic process must be stationary for OLS to be effective. The use of OLS can result in inaccurate estimates when the stochastic process is nonstationary. Such estimates are what Granger (1981) referred to as "spurious regression" results, since they have high R² values and t-ratios but no discernible economic significance. The ADF and PP unit root tests of stationarity are run in this study to exclude structural effects (autocorrelation) in the time series. Additionally, this study uses the autocorrelation function (ACF) and partial autocorrelation function (PACF) to assess the stationarity of the data. The ACF of a nonstationary series displays a pattern of gradual decline in the size of the autocorrelations. Figure 2 shows six instances of such series.
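The slow decay of the ACF for a nonstationary series can be illustrated with a short simulation. The sketch below is an assumption-laden illustration and not the procedure used in this study: it generates a random walk (a textbook I(1) series), computes the sample ACF with a hypothetical helper named acf, and compares it with the ACF of the first-differenced series.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


random.seed(0)
# Simulated random walk: nonstationary in levels, stationary in differences.
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

diff = [b - a for a, b in zip(walk, walk[1:])]  # first difference

print([round(r, 2) for r in acf(walk, 5)])  # levels: large, slowly decaying values
print([round(r, 2) for r in acf(diff, 5)])  # differences: values near zero
```

In practice the visual check is complemented by formal unit root tests such as ADF and PP, as done in this study.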
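The ARIMA(p, d, q) model can be written as $\phi(B)(1-B)^{d} y_t = \theta(B)\varepsilon_t$, where $y_t$ is the time series, $\varepsilon_t$ is a white noise process with mean zero and variance $\sigma^2$, $B$ is the backshift operator, $d$ is the difference parameter and $\phi(B)$ and $\theta(B)$ are polynomials of orders $p$ and $q$, respectively.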
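As a worked illustration, the ARIMA(1,1,1) specification selected later in this study can be expanded into a difference equation. The sketch below assumes the sign convention $\phi(B) = 1 - \phi_1 B$ and $\theta(B) = 1 + \theta_1 B$; the symbols are illustrative, and some texts write the MA polynomial with a minus sign instead.

```latex
\begin{align*}
(1 - \phi_1 B)(1 - B)\, y_t &= (1 + \theta_1 B)\, \varepsilon_t \\
y_t - (1 + \phi_1)\, y_{t-1} + \phi_1\, y_{t-2} &= \varepsilon_t + \theta_1\, \varepsilon_{t-1} \\
y_t &= (1 + \phi_1)\, y_{t-1} - \phi_1\, y_{t-2} + \varepsilon_t + \theta_1\, \varepsilon_{t-1}
\end{align*}
```

The last line shows that a one-step forecast needs only the two most recent observations and the most recent residual.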

Composite Model
The composite (combined regression-ARIMA) model has proven useful in many areas, such as economic and business forecasting. The method is well documented (Co and Boosarawongse, 2007) and has been shown to be computationally efficient. The model is expressed as
$y_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt} + \phi^{-1}(B)\,\theta(B)\,\varepsilon_t$, (6)
where $y_t$ is the dependent variable, $x_{1t}, \dots, x_{kt}$ are the independent variables, $\beta_0, \dots, \beta_k$ are the regression parameters, $\phi$ and $\theta$ are the AR and MA parameters, respectively, and $\varepsilon_t$ is the random error variable.
The composite model can handle a high degree of autocorrelation in the residuals. Therefore, this study builds the composite model on VECM (CO-VECM) to improve its performance.
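The two-step logic of a combined regression and time series model can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: it uses a single regressor and an AR(1) model for the residuals as a stand-in for the ARIMA error model, and all data, names and parameter values are hypothetical.

```python
import random


def ols_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1


def ar1_coefficient(e):
    """Least-squares estimate of phi in e_t = phi * e_{t-1} + u_t."""
    return sum(a * b for a, b in zip(e[1:], e[:-1])) / sum(a * a for a in e[:-1])


random.seed(1)
# Hypothetical data: y depends on x, with AR(1) errors (true phi = 0.7).
x = [float(t) for t in range(200)]
err, e_prev = [], 0.0
for _ in x:
    e_prev = 0.7 * e_prev + random.gauss(0.0, 1.0)
    err.append(e_prev)
y = [2.0 + 0.5 * xi + ei for xi, ei in zip(x, err)]

# Step 1: structural (regression) part.
b0, b1 = ols_line(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Step 2: time series part fitted to the regression residuals.
phi = ar1_coefficient(resid)

# One-step-ahead composite forecast for a hypothetical next value of x.
x_next = 200.0
forecast = b0 + b1 * x_next + phi * resid[-1]
print(round(b1, 3), round(phi, 3), round(forecast, 2))
```

The forecast adds the predicted residual to the regression prediction, which is the sense in which the time series part explains what the structural part leaves unexplained.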

Vector Error Correction Method (VECM):
A vector error correction model (VECM) is a restricted vector autoregression (VAR) designed for use with nonstationary series that are known to be cointegrated. It is known as an error correction model (ECM) because the deviation from long-run equilibrium is corrected gradually through a series of partial short-run adjustments. In this study, the econometric VECM equation for Malaysia's imports can be specified as follows.

Stationarity Tests
The ADF and PP unit root tests were used; in both, the null hypothesis is that the series is nonstationary. Tables 1 and 2 show that the null hypothesis that (y_t, x_1t, x_2t) have a unit root cannot be rejected at the 5% level of significance in either the ADF or the PP test. Therefore, all variables are nonstationary in levels, and neither their mean nor their variance is constant. However, all variables become stationary after first differencing.

Lag Order Selection
Selecting the number of lags is crucial in specifying a VAR model. The lag length is often selected using a statistical criterion such as LR, FPE, AIC, SC or HQ. The results of LR, FPE, AIC and HQ, as shown in the above table, clearly indicate that the optimal number of lags in our model is 4, whereas SC indicates that it is 2. After comparing these lag lengths based on the accuracy of the model results, we find that the optimal number of lags in our model is 4.
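The disagreement between AIC and SC is a common pattern, because SC penalises additional parameters more heavily. The sketch below illustrates this with the Gaussian least-squares forms of the two criteria (up to an additive constant). The sample size, the residual sums of squares and the parameter count per lag are hypothetical values chosen for illustration and are not the estimates behind the table in this study.

```python
import math


def aic(sse, n, k):
    """Gaussian least-squares AIC up to a constant: n*ln(SSE/n) + 2k."""
    return n * math.log(sse / n) + 2 * k


def sc(sse, n, k):
    """Schwarz criterion (BIC) up to a constant: n*ln(SSE/n) + k*ln(n)."""
    return n * math.log(sse / n) + k * math.log(n)


def n_params(p):
    """Assumed count: one intercept plus p lags of each of three variables."""
    return 1 + 3 * p


n = 125  # hypothetical effective sample size
# Hypothetical residual sums of squares for lag lengths 1 to 5.
sse_by_lag = {1: 12.0, 2: 10.0, 3: 9.6, 4: 8.9, 5: 8.85}

best_aic = min(sse_by_lag, key=lambda p: aic(sse_by_lag[p], n, n_params(p)))
best_sc = min(sse_by_lag, key=lambda p: sc(sse_by_lag[p], n, n_params(p)))
print(best_aic, best_sc)  # prints: 4 2
```

With these illustrative numbers, the modest gain in fit from lag 2 to lag 4 outweighs the AIC penalty but not the larger SC penalty.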

(VECM) Vector Error Correction Approach
Panel A of Table 4 shows that the error correction term of the VECM is statistically significant at the 1% level and has a negative coefficient, which is desirable. Therefore, the model is reliable. The value of -0.64 suggests that the long-run equilibrium relationship eventually returns to its steady state when the system faces shocks. Moreover, the moderate size of the coefficient indicates that restoring the relationship to its steady state will not take long when the system faces a disturbance. This finding is consistent with that of Pindyck and Rubinfeld (1998), who considered the same restrictions for Malaysia's imports in their work.
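The speed of adjustment implied by the coefficient of -0.64 can be made concrete with a little arithmetic. The sketch below assumes a simple first-order adjustment in which the short-run dynamics are ignored, so it is an approximation rather than a result reported in this study; the variable names are illustrative.

```python
import math

ect = -0.64  # error correction coefficient reported in the text

# Share of a deviation from equilibrium that remains after one quarter.
remaining = 1.0 + ect

# Quarters needed for half of the deviation to be corrected.
half_life = math.log(0.5) / math.log(remaining)

# Share of the deviation still present after each of the first four quarters.
path = [round(remaining ** q, 4) for q in range(1, 5)]
print(round(half_life, 2))
print(path)
```

Under this simplification, about 64% of any disequilibrium is corrected within one quarter and less than 2% remains after four quarters.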

Diagnostic Tests
Panel B of Table 4 reports the diagnostic tests. The LM test, which is used to detect autocorrelation, indicates that no serial correlation exists. The results of the Jarque-Bera (JB) test confirm that the residuals are normally distributed. In addition, the results of the White test confirm that the model does not suffer from heteroscedasticity in the error variances.

Box-Jenkins Approach for Univariate Models (ARIMA)
The ARIMA model is typically applied to time series analysis, forecasting and control. The Box-Jenkins (ARIMA) modelling approach has three major stages: model identification, model estimation and validation, and model application.

Model Identification
First, the series must satisfy the stationarity conditions. To check this, the stationarity of the import series is analysed via the ADF and PP tests. The results are presented in Table 1 and Table 2. The series is stationary at first difference.

Model Estimation and Validation
This step begins by estimating the eight ARIMA specifications shown in Table 5, from which the optimal model can then be selected. The initial estimates are presented in Table 5. As indicated in Table 5, all the parameters in the first, second and third models are significant, whereas the parameters of the remaining models are insignificant. The (1,1,1), (1,1,0) and (0,1,1) models are therefore appropriate candidates for achieving part of the first objective of the present study, i.e., to forecast Malaysia's imports. The selected model also approximately fulfils the basic criteria for model selection: minimum values of the Bayesian information criterion (BIC), root-mean-square error (RMSE) and mean absolute error (MAE), a high correlation coefficient and an insignificant Ljung-Box value. Amongst the models assessed in the present study, the identified optimal model is the (1,1,1) model, whose RMSE, MAE and BIC values are slightly smaller than those of the other models. After differencing, the mean and the variance of the series become stationary, a condition that should hold for the appropriate model, i.e., the (1,1,1) model. Table 7 presents the p-values for the Ljung-Box test. A good forecasting model should have residuals that are simply white noise after fitting the model; insignificant values are therefore expected when evaluating the residuals. Table 7 shows that the Ljung-Box test yields an insignificant p-value, indicating that the residuals appear to be uncorrelated and that the model is suitable for prediction.
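The Ljung-Box statistic used in this validation step can be computed directly from the residual autocorrelations. The following is a minimal sketch, assuming the standard formula Q = n(n+2) Σ r_k²/(n−k); the helper name and the simulated series are hypothetical, and when the test is applied to residuals of a fitted ARMA model the degrees of freedom are usually reduced by the number of estimated AR and MA parameters.

```python
import random


def ljung_box_q(resid, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k)."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    c0 = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


random.seed(42)
# Simulated white noise, standing in for well-behaved residuals.
white = [random.gauss(0.0, 1.0) for _ in range(300)]
# Simulated AR(1) series, standing in for autocorrelated residuals.
ar = [0.0]
for _ in range(299):
    ar.append(0.8 * ar[-1] + random.gauss(0.0, 1.0))

CRITICAL_5PCT_DF10 = 18.307  # chi-square 5% critical value, 10 d.f.
q_white = ljung_box_q(white, 10)
q_ar = ljung_box_q(ar, 10)
print(round(q_white, 2), round(q_ar, 2), CRITICAL_5PCT_DF10)
```

A value of Q below the critical value corresponds to the insignificant p-value that the text describes as evidence of uncorrelated residuals.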

Composite Model
We develop a composite model that uses VECM to obtain short-term forecasts.
We construct an ARIMA model for the random error variable of the VECM in equation (5) by performing a time series analysis. The residuals of this model are then analysed using the ARIMA model.
The ARIMA(1,1,2) model of the residual series is combined with VECM to develop the MARMA composite model for forecasting Malaysia's imports. The results are presented in Table 8.
We substitute the ARIMA(1,1,2) model for the implicit error in the original regression model equation. As shown in Table 8, the MARMA model is a combination of the regression model and the time series model. The dependent variable and the independent variables are related, whilst the error term, which is partially "explained" by a time series model, is estimated. Table 8 shows that the explanatory variables and the AR and MA parameters explain nearly 88% of the error term.

Diagnostic Tests:
We evaluate the serial correlation, normality, heteroscedasticity and predictive ability of the composite model by performing diagnostic tests. The results (Table 9) show that the composite model passes all diagnostic tests: no autocorrelation is observed at the 5% significance level, and the mean and standard deviation of the residuals are 0.000419 and 0.036607, respectively. The error term is normally distributed based on the skewness and kurtosis values in the Jarque-Bera test.
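The Jarque-Bera statistic combines the skewness S and kurtosis K of the residuals as JB = n/6 · (S² + (K − 3)²/4). A minimal sketch follows; the function name and the residual values are hypothetical and are not the residuals of the composite model.

```python
def jarque_bera(resid):
    """Return (JB, skewness, kurtosis) using population moments."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


# Hypothetical residuals (not the paper's data).
resid = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
jb, skew, kurt = jarque_bera(resid)
print(round(jb, 3), round(skew, 3), round(kurt, 3))
```

Under the null hypothesis of normality, JB is compared with a chi-square distribution with two degrees of freedom (5% critical value 5.991); values below it, as in this toy example, do not reject normality.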
We test for heteroscedasticity by calculating the coefficients of the residual ACF and PACF for a certain number of lags. Table 10 shows that all ACF and PACF coefficients are insignificant, indicating the absence of correlation in the time series and of heteroscedasticity in the error variances.

Assessing Predictive Ability:
The difference between the adjusted R² and the predicted R² should be between 0 and 0.200 for the model to have adequate predictive ability. In our calculations, the difference between these values is 0.009, indicating that both values are in good agreement and that CM-VECM has a high predictive ability.
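The comparison of adjusted and predicted R² can be sketched for a simple regression. The example below assumes the usual definitions, with the predicted R² based on the leave-one-out PRESS statistic (predicted R² = 1 − PRESS/SST); the data and function names are hypothetical and the single-regressor setting is a simplification of the model in this study.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1


def adjusted_and_predicted_r2(x, y):
    n, p = len(x), 1  # p = number of regressors
    my = sum(y) / n
    sst = sum((b - my) ** 2 for b in y)
    b0, b1 = fit_line(x, y)
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    adj = 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))
    # PRESS: refit without observation i, then predict observation i.
    press = 0.0
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (c0 + c1 * x[i])) ** 2
    return adj, 1.0 - press / sst


# Hypothetical data (not from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
adj, pred = adjusted_and_predicted_r2(x, y)
print(round(adj, 4), round(pred, 4), 0.0 <= adj - pred <= 0.2)
```

A small gap between the two values suggests that the fit does not depend heavily on any single observation, which is the sense in which the text treats a difference below 0.200 as adequate.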

Analysis of the forecasting abilities of various models
The three models, namely the VECM, the ARIMA model and the composite model, are compared in Table 11 on a range of error metrics. Table 11 and Figure 3 summarise the forecasting performance of these models. The results shown in Table 11 and Figure 3 were evaluated and analysed by the author in light of the pertinent problems.
The selected model demonstrates excellent performance as reflected in its explained variability and predictive power.

Discussion
The results presented in Table 11 show that the MAPE and AIC of the composite model are 0.001 and 0.031, respectively, for the time series of Malaysia's imports. Both values are lower than those of the other models, and the R² of the composite model is higher than that of the other models. Since the composite model gave the best fit of all the models, it performed the best. Figure 4 displays the ACF and PACF of the residuals. For a forecasting model to be satisfactory, the residuals should contain only white noise after the model has been fitted, so insignificant values are expected for these statistics. The results of CO-VECM show that the dependent variable (Malaysia's imports) and the independent variables (GDP and exports) are related, that the error term is partially "explained" by a time series model, and that the explanatory variables together with the AR and MA parameters explain nearly 88% of the error term. These findings are in line with those of Shamsudin and Arshad (1990), Shamsudin and Arshad (2000), Islam (2007), Khin (2008), Aye et al. (2011) and Khin (2013). The composite model provides better forecasts than the regression equation or the time series model alone because it provides a structural explanation for the part of the variance that can be explained structurally and a time series explanation for the part that cannot. This result supports the findings of Milad M. et al. (2017) and Milad M. and Ross (2016).

Conclusion
This study proposed and assessed methods for predicting imports in Malaysia. The proposed models, namely the VECM, the ARIMA model and the composite model, were assessed by comparing them with one another using Malaysia's import time series. This study contributes to the literature as the first empirical study in this field to compare the VECM, ARIMA and composite models. The findings demonstrate the value of the composite model as a potent forecasting technique that improves the precision of import value prediction and strengthens forecasting techniques in the Malaysian context. Because the results show that the composite model is suitable for forecasting Malaysian imports, the author recommends it for this purpose. The proposed composite model is, however, a linear model of Malaysia's imports. Future research should therefore examine non-linear models, such as neural network models. The same procedures described in this study can also be applied to these models. Afterwards, the forecasting performance of non-linear and linear models may be compared.

Figure-3. The outcomes of comparing the forecasting abilities of the various models

Figure-4. PACF and ACF of the residuals of Malaysia's imports from the composite model

Table-1. Results of the ADF test for the linear variables

Table-4. VECM model results in the short run

Table-5. Initial estimates of the parameters of different ARIMA models

Table-6. Comparative results from various ARIMA models for Malaysia's imports

Table-8. Results of the MARMA composite model

Table-11. Statistical measures of forecast error for Malaysia's imports