Sasmita Manjari Nayak and Minakhi Rout*
School of Computer Engineering, Kalinga Institute of Industrial Technology (Deemed to be) University, Bhubaneswar, India
Abstract
Bankruptcy prediction is an active research area concerned with the state of insolvency, in which a company or a person is unable to repay the amount owed to creditors. Up to now, many statistical and machine learning based models have been introduced for bankruptcy prediction. The pre-processing phase is an important step in enhancing model performance, so one needs to choose effective pre-processing techniques suited to the data set considered. This chapter therefore focuses on both sides: ensemble models for classification, showing how improved models are developed by combining two or more simple techniques, and pre-processing techniques that address the imbalanced nature of the data and any outliers present. In most of the surveyed chapters, the authors compare the performance of their models with previously developed models. From the survey we observed that pre-processed datasets give better prediction outcomes, and that ensemble models are more powerful for bankruptcy prediction than single models.
Keywords: Bankruptcy prediction, ensemble models, imbalance dataset, outlier, pre-processing, machine learning, oversampling, undersampling
A company with bond non-payment, overdrawn bank accounts, non-payment of bonuses, or an announcement of bankruptcy is taken as a failed company. The bankruptcy of a company hampers both the company's goodwill and shareholders' benefits, and therefore affects the economic growth of the company. To minimize these effects, bankruptcy prediction (BP) is necessary [1]. If a corporation carries a high level of debt and this situation goes on for many years, the company becomes distressed: it is unable, or finds it difficult, to pay off the debt, which leads to bankruptcy. So in some cases bankruptcy occurs due to financial suffering or financial discomfort, while in other cases firms go bankrupt just after facing a shock, such as a major fraud [16].
Different techniques are generally used to find the interrelation between a company's bankruptcy and its financial ratios, such as the cash ratio, profitability, and solvency; such a technique is known as a bankruptcy prediction model (BPM). The BPM is constructed from the facts at hand, i.e., present information about a company, on the assumption that the same relationship holds in the future, so that the model can be used for future bankruptcy prediction. The performance of a BPM depends not only on which financial ratios are selected and which techniques are used, but also on the type of data taken for the model [2]. For choosing the dataset, one of the simplest methods is to use all data at hand, but for a dataset with a huge number of records this fails simply because of high computing time and space. Another simple sampling technique is random sampling. When datasets have fewer records, such computing time and space problems do not arise. The fact remains that the number of bankrupt companies is very small compared to non-bankrupt ones, from which an imbalance problem arises. Without a procedure for balancing the imbalanced dataset, one cannot get accurate classification performance, whatever the sampling strategy.
At the time of forecasting bankruptcy, the imbalance property of the datasets is ignored by some authors. In reality, the degree of imbalance, i.e., the ratio of the number of bankrupt to non-bankrupt firms, may be near 1 to 100 or even 1 to 1,000, so the imbalance problem should be addressed in bankruptcy prediction [2]. There are two main identified issues with imbalanced datasets. The first is that when a BPM uses a dataset with very few bankrupt firms, its performance on bankrupt firms is diminished. The second is handling imbalanced datasets so as to enhance the performance of the BPM [13].
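To make the degree of imbalance concrete, the bankrupt-to-non-bankrupt ratio can be computed directly from the class labels. A minimal sketch (the label names are purely illustrative):

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio of minority-class count to majority-class count (1.0 = balanced)."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())

# Illustrative example: 5 bankrupt firms among 500 companies,
# i.e. a degree of imbalance of roughly 1 to 100.
labels = ["bankrupt"] * 5 + ["non-bankrupt"] * 495
print(round(imbalance_ratio(labels), 4))  # 0.0101
```

A ratio far below 1 signals that accuracy alone is a misleading metric and that resampling (discussed below) is warranted.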
An outlier can be defined as a data point lying outside a dataset, or one with a completely dissimilar character from the others in the dataset, on the basis of some measurement. Outlier data often contain useful information about the dataset [14], so outliers should be handled very carefully. They can be handled either by omitting the outlier data or by winsorization. Omission, i.e., deleting outlier data from the dataset, is not always preferable, since outliers are also part of reality [7]. Before outliers are handled, they must first be detected, for example by setting a threshold value. After the outliers in a dataset are handled, the accuracy of the BPM increases, as the bad data are removed from the dataset [9].
By using ensemble learning, the predictive capability of single classifiers can be improved while error is simultaneously reduced. Through ensemble learning we can obtain a strong classifier, i.e., a highly accurate classifier, by combining a number of weak classifiers [15]. An ensemble classifier is constructed by combining multiple classifiers [16]. Three things should be considered before constructing an ensemble classifier: first, which of the available classifiers to take; second, how many classifiers to take; and, most importantly, which technique to use for constructing the ensemble model [17].
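The simplest way to combine several weak classifiers into a stronger one is majority voting. A minimal sketch, where the three hand-written rules stand in for trained weak classifiers and the feature names are assumptions for illustration only:

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Combine several weak classifiers by simple majority voting."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three illustrative weak rules on a feature vector (cash_ratio, debt_ratio).
rules = [
    lambda x: "bankrupt" if x[0] < 0.2 else "non-bankrupt",        # low cash
    lambda x: "bankrupt" if x[1] > 0.8 else "non-bankrupt",        # high debt
    lambda x: "bankrupt" if x[0] < x[1] - 0.5 else "non-bankrupt", # gap rule
]
print(majority_vote(rules, (0.1, 0.9)))  # all three vote "bankrupt"
```

Real ensemble methods such as bagging and boosting differ in how the constituent classifiers are trained and weighted, but the final combination step is often a (weighted) vote of this kind.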
Data preprocessing can be described as the set of techniques through which the data in a dataset are given a usable form. Several techniques currently exist for this purpose through which classification accuracy can be improved [8]. Balancing an imbalanced dataset and handling outliers mainly come under data preprocessing.
A dataset is called imbalanced when it has an unequal class distribution; in other words, a dataset is imbalanced when one class has a far different number of samples than another [13]. The imbalance problem arises when almost all data belong to the majority class, and it is mainly because of this problem that the predicting capability of a BPM deteriorates [10]. In practice, bankrupt cases are extremely rare compared to non-bankrupt cases, so the ratio of bankrupt to non-bankrupt cases is nearly zero, and a severely imbalanced problem arises. Therefore, a dataset should first be balanced before classification.
Through the review we find that, for the same classifier, the BP rate is higher on a balanced dataset than on an imbalanced one. To balance the datasets, the surveyed authors use the following balancing techniques. There are mainly two principal types of balancing techniques: oversampling and undersampling. In oversampling, sampling is done in the minority class over and over, i.e., new data are repeatedly added to the minority class, whereas in undersampling a part of the majority class is selected so as to obtain the same number of samples in both classes. Both oversampling and undersampling are called resampling. The works already reported on the imbalance factor of data and the techniques used to handle it are discussed in Table 18.1.
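The two resampling directions described above can be sketched in their simplest random forms, duplicating minority samples versus discarding majority samples. This is a minimal illustration, not any of the specific methods surveyed below (SMOTE and its variants interpolate new synthetic samples rather than duplicating existing ones):

```python
import random

def random_oversample(majority, minority, seed=0):
    """Duplicate randomly chosen minority samples until both classes are equal."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra

def random_undersample(majority, minority, seed=0):
    """Keep a random subset of the majority class of minority-class size."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), minority

majority = list(range(100))       # dummy non-bankrupt records
minority = list(range(100, 105))  # dummy bankrupt records
maj, mino = random_oversample(majority, minority)
print(len(maj), len(mino))  # 100 100
maj, mino = random_undersample(majority, minority)
print(len(maj), len(mino))  # 5 5
```

Oversampling preserves all majority information at the cost of repeated minority records; undersampling discards majority records, which is only safe when the majority class is large.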
A number of oversampling techniques presently exist. Through review we get the following oversampling methods.
As with oversampling, a number of undersampling techniques presently exist. Through review we get the following undersampling methods.
Table 18.1 Findings from different chapters working on imbalanced data, through review.
Chapter | Study | Dataset | Balancing strategies | Classifiers | Outcome | Future work
1 | Zhou, L. (2013) | | ROWR, SMOTE, RU, UBOCFNN, UBOCFGMD | LDA, LOGR, C4.5 (DT), NN, and SVM | The choice of sampling method depends on the number of bankrupt cases: when few bankrupt cases are present, the oversampling method SMOTE is used, but when a huge number of bankruptcy cases is present, a combination of SMOTE and an undersampling method may be used. | Identification of difficult and easy observations for use as testing and training samples in bankruptcy prediction models [2].
2 | Kim, T., & Ahn, H. (2015) | H Bank's bankruptcy data for non-externally audited companies in Korea | A new hybrid under-sampling method combining k-RNN and OCSVM is introduced | LR, DA, CART, and SVM | Whatever the sampling method, SVM shows the better prediction result here; the performance of LR, DA, and CART is improved by this hybrid approach compared to simple random under-sampling [3]. |
3 | Kim, H. J., Jo, N. O., & Shin, K. S. (2016) | Dataset of 22,500 small- and medium-sized Korean manufacturing firms | CBEUS (cluster-based evolutionary under-sampling) | A hybrid model combining Genetic Algorithms and the Artificial Neural Network model (GA-ANN) is introduced | GA-ANN with cluster-based evolutionary under-sampling was superior to ANN with random sampling, GA-ANN with random sampling, and GA-ANN with evolutionary sampling. | Clustering algorithms such as self-organizing maps and hierarchical agglomerative clustering should be tested in future [4].
4 | Sisodia, D. S., & Verma, U. (2018) | The Spanish bankruptcy dataset, collected from GitHub | Oversampling methods (SMOTE, BSMOTE, SLS, ROS) and undersampling methods (RUS, CNN) | Three individual classifiers (C4.5, LR, SVM) and three ensemble classifiers (AdaboostM, DTBagging, RF) | No single method is best overall: with oversampling, LR and DTBagging are better, whereas with undersampling, C4.5 and RF are better; oversampling with DTBagging gives the best prediction result compared to the others [5]. |
5 | Le, T., Le Son, H., Vo, M. T., Lee, M. Y., & Baik, S. W. (2018) | Korean bankruptcy dataset (KBD) | An undersampling method using the Instance Hardness Threshold (IHT) concept to remove noise | Cluster-based boosting (CBoost) classifier | The robust framework using the CBoost algorithm and IHT (RFCI) performs better than the GMBoost algorithm, the oversampling-based methods, and the clustering-based undersampling method. | To find a cost-sensitive method that can deal with the class imbalance problem, in order to obtain an optimized bankruptcy prediction model [6].
6 | Kim, M. J., Kang, D. K., & Kim, H. B. (2015) | Dataset of a Korean commercial bank | SMOTE is used to balance the data | GMBoost and other comparison classifiers | For both imbalanced and balanced data, GMBoost has the highest prediction power. | For multiclass classification problems with severe imbalance, the GMBoost algorithm can be applied [10].
7 | Vieira, A. S., Ribeiro, B., Mukkamala, S., Neves, J. C., & Sung, A. H. (2004) | Financial data comprising 780,000 financial statements of French companies | Two types of companies are considered | Three classifiers: linear genetic programming (LGP), support vector machines (SVM), and artificial neural networks (ANN) | LGP shows the best prediction on the balanced dataset, but its prediction power is not as good on the imbalanced dataset; HLVQ and SVM perform best on the imbalanced datasets. | Some financial ratios can have large annual variation, so future work should use records of these ratios from a longer period [20].
8 | Wang, M., Zheng, X., Zhu, M., & Hu, Z. (2016) | Data collected from the Wangdaizhijia website (http://www.wangdaizhijia.com/), a Chinese P2P online lending portal | The SMOTE algorithm is used to solve the imbalance problem | Four classifiers are used; a new model, FSVM-RI, is designed, which uses fuzzy SVM as the classifier with region information for BP | The designed model uses a fuzzy membership function, which decreases the effect of outliers; it has a higher prediction rate even when outliers and missing values are present in the database. | Statements of financial expertise should be included for prediction, and more datasets should be used to measure prediction accuracy [41].
9 | Smiti, S., & Soui, M. (2020) | Datasets obtained from the University of California, Irvine (UCI) Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets.html) | Borderline SMOTE is applied as a data preprocessing method to balance the original data set | Softmax classifier | The BSM-SAES approach, which combines Borderline Synthetic Minority oversampling (BSM) and a Stacked AutoEncoder (SAE) based on the softmax classifier, outperforms the other applied methods. | To enhance the AUC of the bankruptcy prediction model by using a comprehensible evaluation model based on IF-THEN rules [54].
10 | Sun, J., Li, H., Fujita, H., Fu, B., & Ai, W. (2020) | Sample data of Chinese public companies listed on the Shanghai and Shenzhen Stock Exchanges | SMOTE | Two class-imbalanced DFDP models based on an Adaboost-SVM ensemble combined with SMOTE and time weighting: 1. S-SMOTE-ADASVM-TW, 2. E-SMOTE-ADASVM-TW | The E-SMOTE-ADASVM-TW model significantly outperforms the S-SMOTE-ADASVM-TW model and is preferred for class-imbalanced DFDP. | The two models can be further applied to other problems such as default diagnosis, customer classification, and spam filtering [55].
11 | Shrivastava, S., Jeyanthi, P. M., & Singh, S. (2020) | Data collected on failed and surviving public- and private-sector banks (2000–2017) | 1. SMOTE is used to convert the imbalanced data into balanced form. 2. Lasso regression is used to remove redundant features from the failure-prediction model. | Random forest and AdaBoost (used to avoid bias and over-fitting), compared with logistic regression | AdaBoost gives the maximum accuracy in comparison to all other methods [56]. |
An outlier is a data point in a dataset that lies far away from the rest of the values; in other words, an outlier is an extreme value. Before outliers are handled, they should first be identified or detected. Several approaches are found for identifying and handling outliers.
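One common form of the threshold-based detection mentioned earlier is the z-score rule: a value is flagged when it lies more than a chosen number of standard deviations from the mean. A minimal sketch (the data and threshold are illustrative):

```python
def zscore_outliers(values, threshold=3.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = variance ** 0.5
    return [v for v in values if std > 0 and abs(v - mean) / std > threshold]

data = [10, 11, 9, 10, 12, 11, 10, 95]  # 95 is an extreme value
print(zscore_outliers(data, threshold=2.0))  # [95]
```

The z-score rule assumes roughly normal data and is itself distorted by the outliers it hunts for (the extreme value inflates the standard deviation), which is one reason the literature also uses robust detectors such as LOF or the isolation forest discussed later in this chapter.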
Through review we get the following outlier detection methods.
Outliers should be handled after their detection. Deletion and winsorization are the two main options for handling outliers. Although outliers may be extreme values, deleting them cannot be taken as an appropriate approach, as they are also part of the dataset. Outliers are even more important when only a limited number of observations is present, so they should not be omitted from the dataset; instead, they must go through proper preprocessing before use.
For preprocessing or handling outliers, winsorization is an appropriate method. In this technique, after an outlier is detected, it is replaced with its nearest value rather than omitted. It is possible that after an outlier is replaced by its "nearest neighbor", the substituted value is still detected as an outlier. To overcome this problem, "dynamic" winsorization is applied: after the nearest neighbor is substituted for an outlier, the new value is checked again; if it is still an outlier, it is again substituted by its nearest neighbor. This procedure is repeated until all values fall within the threshold set previously. The works based on this outlier problem and the techniques used to handle it are discussed in Table 18.2.
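The procedure above can be sketched as follows. This is a simplified illustration with fixed bounds: because each outlier is replaced directly by the nearest value that already lies inside the bounds, the repeated re-check described in the text terminates immediately here.

```python
def dynamic_winsorize(values, lower, upper):
    """Replace each outlier (outside [lower, upper]) with its nearest
    in-range neighbour, so every returned value lies within the bounds."""
    in_range = sorted(v for v in values if lower <= v <= upper)
    if not in_range:
        raise ValueError("no in-range values to substitute from")
    result = []
    for v in values:
        if v < lower:
            result.append(in_range[0])    # nearest in-range value from below
        elif v > upper:
            result.append(in_range[-1])   # nearest in-range value from above
        else:
            result.append(v)
    return result

data = [2.1, 2.4, 2.2, 9.7, 2.3, -5.0]
print(dynamic_winsorize(data, lower=0.0, upper=3.0))
# [2.1, 2.4, 2.2, 2.4, 2.3, 2.1]
```

In practice the bounds are usually not fixed constants but derived from the data (e.g., percentiles or a z-score threshold); when the bounds are recomputed after each substitution, the iterative re-checking of the dynamic variant becomes necessary.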
Classification literally means categorizing a collection of items into different groups: the assignment of the data items of a dataset to different classes is called classification. Through classification one can find the exact categories of the data items of a dataset. The functions used to classify a dataset in data mining are known as classifiers.
Through review we get the following classifiers.
Ensemble models are made by combining several simple single models. With this approach one can obtain better bankruptcy prediction results than with single models. Through review we got the following ensemble techniques.
The process of construction of a bankruptcy prediction model can be represented pictorially as follows:
The steps specified in Figure 18.1 form the general framework to be followed whenever bankruptcy prediction is carried out.
Many of the published works by different authors in the review report that ensemble models perform better than single classifiers. To verify this claim, we also applied some of the ensemble models to a bankruptcy dataset collected from www.kaggle.com. The simulation results are presented as bar charts and ROC curves in Figures 18.2 to 18.7. From these plots we observed that the claim many authors have made about ensemble models holds.
Through this review we surveyed different preprocessing techniques as well as ensembling techniques. It is found that for proper bankruptcy prediction one should first give emphasis to preprocessing of the dataset, i.e., balancing the imbalanced dataset and detecting and handling outliers. The choice of sampling technique depends on the number of bankrupt cases (BCs). When a small number of BCs is present, the oversampling technique SMOTE may be considered, but when there is an enormous number of BCs, SMOTE combined with an undersampling technique gives a better result [2]. A particular sampling technique cannot give the best result for the entire set of classifiers: with oversampling, the LR and DTBagging classifiers may be taken for better results, while with undersampling, the C4.5 and RF classifiers may be taken [5]. Whether the data are balanced or imbalanced, GMBoost has the highest BP capability among the compared methods [10], but RFCI achieves better prediction results than GMBoost, oversampling techniques, and clustering-based undersampling techniques [6]. A hybrid under-sampling technique constructed by combining k-RNN and OCSVM increases the accuracy of classifiers such as LR, LDA, DT, and SVM [3]. GA-ANN with CBEUS gives better prediction results than ANN with random sampling, GA-ANN with random sampling, and GA-ANN with evolutionary sampling [4]. LGP shows the best BP capability on balanced data, whereas HLVQ and SVM are best for imbalanced data [20].
Considering outliers, the BP capability of the FSVM-RI model is very high even when the database contains outliers and missing values [41]. When outliers are present in the dataset, decision trees can give the highest prediction results compared to others [7]. By removing 50% of the outliers, prediction capability can be optimized for SVM, ANN, LR, and DT; here SVM gives the highest BP rate [9]. BP power can also be improved through the LOF technique: applying LOF improves the predictive capability of LR more than that of LDA and CT [11]. Among anomaly-based outlier detection methods, the isolation forest (an unsupervised learning method) performs best for BP [42].
After preprocessing the dataset, classifiers are built to construct bankruptcy prediction models. Instead of choosing a single best classifier, one should always choose, from the possible classifiers, a set of classifiers that are independent and also optimal when constructing a classification system, in order to obtain the best result [27]. It is found that ensemble classifiers, made by combining a number of classifiers with the help of different ensembling methods, are always superior to single classifiers in terms of prediction rate. The predictive power of an ensemble model is basically controlled by the types of data available in a dataset [22]. From the survey we get ensemble models such as the fuzzy NN predictor, the hybrid financial distress model with the SOM clustering technique, XGBS, gXGBS and gXGBS_hist, the Grabit model, and the SOFM-assisted and MDA-assisted NN models, which are mentioned in Table 18.3. The survey shows that the prediction power of XGBS, gXGBS, and gXGBS_hist is extremely good for all types of data, whether balanced or imbalanced [18]. For smaller sample sizes, the predictive capability of the Grabit model is larger, whereas for the largest sample size its predictive power is almost the same as that of the boosted Logit model [19]. The ensemble model constructed by combining MDA, LR, CRT, ANN, and the AdaBoost methodology has an accuracy of 88.8% [24]. Even with class-imbalanced data and overlapping conditions, the HACT model performs well and is superior to others [36]. Results obtained with different classifiers are pictorially represented in Figures 18.1 to 18.8.
So, to obtain the best model for bankruptcy prediction, one first has to preprocess the data to obtain a proper dataset, and then develop an ensemble model using an appropriate ensembling technique.
1. Chou, C.-H., Hsieh, S.-C., Qiu, C.-J., Hybrid genetic algorithm and fuzzy clustering for bankruptcy prediction. Appl. Soft Comput., 56, 298–316, 2017.
2. Zhou, L., Performance of corporate bankruptcy prediction models on imbalanced dataset: The effect of sampling methods. Knowledge-Based Syst., 41, 16–25, 2013.
3. Kim, T. and Ahn, H., A hybrid under-sampling approach for better bankruptcy prediction. (Intelligence Research), 21, 2, 173–190, 2015.
4. Kim, H.-J., Jo, N.-O., Shin, K.-S., Optimization of cluster-based evolutionary undersampling for the artificial neural networks in corporate bankruptcy prediction. Expert Syst. Appl., 59, 226–234, 2016.
5. Sisodia, D.S. and Verma, U., The Impact of Data Re-Sampling on Learning Performance of Class Imbalanced Bankruptcy Prediction Models. Int. J. Electr. Eng. Inform., 10, 3, 433–446, 2018.
6. Le, T. et al., A cluster-based boosting algorithm for bankruptcy prediction in a highly imbalanced dataset. Symmetry, 10, 7, 250, 2018.
7. Nyitrai, T. and Virág, M., The effects of handling outliers on the performance of bankruptcy prediction models. Socio-Econ. Plann. Sci., 67, 34–42, 2019.
8. Son, H. et al., Data analytic approach for bankruptcy prediction. Expert Syst. Appl., 138, 112816, 2019.
9. Tsai, C.-F. and Cheng, K.-C., Simple instance selection for bankruptcy prediction. Knowledge-Based Syst., 27, 333–342, 2012.
10. Kim, M.-J., Kang, D.-K., Kim, H.B., Geometric mean based boosting algorithm with over-sampling to resolve data imbalance problem for bankruptcy prediction. Expert Syst. Appl., 42, 3, 1074–1082, 2015.
11. Figini, S., Bonelli, F., Giovannini, E., Solvency prediction for small and medium enterprises in banking. Decis. Support Syst., 102, 91–97, 2017.
12. Zhou, L. and Lai, K.K., AdaBoost models for corporate bankruptcy prediction with missing data. Comput. Econ., 50, 1, 69–94, 2017.
13. Veganzones, D. and Séverin, E., An investigation of bankruptcy prediction in imbalanced datasets. Decis. Support Syst., 112, 111–124, 2018.
14. Aggarwal, C.C. and Yu, P.S., Outlier detection for high dimensional data. Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, 2001.
15. Kim, M.-J. and Kang, D.-K., Classifiers selection in ensembles using genetic algorithms for bankruptcy prediction. Expert Syst. Appl., 39, 10, 9308–9314, 2012.
16. Tsai, C.-F., Combining cluster analysis with classifier ensembles to predict financial distress. Inform. Fusion, 16, 46–58, 2014.
17. Tsai, C.-F., Hsu, Y.-F., Yen, D.C., A comparative study of classifier ensembles for bankruptcy prediction. Appl. Soft Comput., 24, 977–984, 2014.
18. Le, T. et al., A fast and accurate approach for bankruptcy forecasting using squared logistics loss with GPU-based extreme gradient boosting. Inf. Sci., 494, 294–310, 2019.
19. Sigrist, F. and Hirnschall, C., Grabit: Gradient tree-boosted Tobit models for default prediction. J. Bank. Financ., 102, 177–192, 2019.
20. Vieira, A.S. et al., On the performance of learning machines for bankruptcy detection. Second IEEE International Conference on Computational Cybernetics, 2004. ICCC 2004, IEEE, 2004.
21. Cheng, C.-H., Chan, C.-P., Sheu, Y.-J., A novel purity-based k nearest neighbors imputation method and its application in financial distress prediction. Eng. Appl. Artif. Intell., 81, 283–299, 2019.
22. García, V., Marqués, A.I., Salvador Sánchez, J., Exploring the synergetic effects of sample types on the performance of ensembles for credit risk and corporate bankruptcy prediction. Inform. Fusion, 47, 88–101, 2019.
23. Tobback, E. et al., Bankruptcy prediction for SMEs using relational data. Decis. Support Syst., 102, 69–81, 2017.
24. Fedorova, E., Gilenko, E., Dovzhenko, S., Bankruptcy prediction for Russian companies: Application of combined classifiers. Expert Syst. Appl., 40, 18, 7285–7293, 2013.
25. Sánchez-Lasheras, F. et al., A hybrid device for the solution of sampling bias problems in the forecasting of firms’ bankruptcy. Expert Syst. Appl., 39, 8, 7512–7523, 2012.
26. Kim, M.-J. and Kang, D.-K., Classifiers selection in ensembles using genetic algorithms for bankruptcy prediction. Expert Syst. Appl., 39, 10, 9308–9314, 2012.
27. Kotsiantis, S. et al., Selective costing voting for bankruptcy prediction. Int. J. Knowledge-Based Intell. Eng. Syst., 11, 2, 115–127, 2007.
28. Cho, S., Kim, J., Bae, J.K., An integrative model with subject weight based on neural network learning for bankruptcy prediction. Expert Syst. Appl., 36, 1, 403–410, 2009.
29. Lee, K.C., Han, I., Kwon, Y., Hybrid neural network models for bankruptcy predictions. Decis. Support Syst., 18, 1, 63–72, 1996.
30. Wu, C.-H. et al., A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert Syst. Appl., 32, 2, 397–408, 2007.
31. Tsai, C.-F., Combining cluster analysis with classifier ensembles to predict financial distress. Inform. Fusion, 16, 46–58, 2014.
32. West, D., Dellana, S., Qian, J., Neural network ensemble strategies for financial decision applications. Comput. Oper. Res., 32, 10, 2543–2559, 2005.
33. Pal, R. et al., Business health characterization: A hybrid regression and support vector machine analysis. Expert Syst. Appl., 49, 48–59, 2016.
34. Fallahpour, S., Norouzian Lakvan, E., Zadeh, M.H., Using an ensemble classifier based on sequential floating forward selection for financial distress prediction problem. J. Retail. Consum. Serv., 34, 159–167, 2017.
35. Tsai, C.-F., Hsu, Y.-F., Yen, D.C., A comparative study of classifier ensembles for bankruptcy prediction. Appl. Soft Comput., 24, 977–984, 2014.
36. Cleofas-Sánchez, L. et al., Financial distress prediction using the hybrid associative memory with translation. Appl. Soft Comput., 44, 144–152, 2016.
37. Azayite, F.Z. and Achchab, S., The impact of payment delays on bankruptcy prediction: A comparative analysis of variables selection models and neural networks. 2017 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech), IEEE, 2017.
38. Mousavi, M.M., Ouenniche, J., Tone, K., A comparative analysis of two-stage distress prediction models. Expert Syst. Appl., 119, 322–341, 2019.
39. Chung, C.-C. et al., Bankruptcy prediction using cerebellar model neural networks. Int. J. Fuzzy Syst., 18, 2, 160–167, 2016.
40. Liang, D. et al., Financial ratios and corporate governance indicators in bankruptcy prediction: A comprehensive study. Eur. J. Oper. Res., 252, 2, 561–572, 2016.
41. Wang, M. et al., P2P lending platforms bankruptcy prediction using fuzzy SVM with region information. 2016 IEEE 13th International Conference on e-Business Engineering (ICEBE), IEEE, 2016.
42. Fan, S., Liu, G., Chen, Z., Anomaly detection methods for bankruptcy prediction. 2017 4th International Conference on Systems and Informatics (ICSAI), IEEE, 2017.
43. Uthayakumar, J., Vengattaraman, T., Dhavachelvan, P., Swarm intelligence based classification rule induction (CRI) framework for qualitative and quantitative approach: An application of bankruptcy prediction and credit risk analysis. J. King Saud Univ.-Comput. Inf. Sci., 36, 2020, 647–657, 2017.
44. Zoričák, M., Gnip, P., Drotár, P., Gazda, V., Bankruptcy prediction for small-and medium-sized companies using severely imbalanced datasets. Econ. Model., 84, 165–176, 2020.
45. Soui, M., Smiti, S., Mkaouer, M.W., Ejbali, R., Bankruptcy Prediction Using Stacked Auto-Encoders. Appl. Artif. Intell., 34, 1, 80–100, 2020.
46. Becerra-Vicario, R., Alaminos, D., Aranda, E., Fernández-Gámez, M.A., Deep Recurrent Convolutional Neural Network for Bankruptcy Prediction: A Case of the Restaurant Industry. Sustainability, 12, 12, 5180, 2020.
47. Nti, I.K., Adekoya, A.F., Weyori, B.A., A comprehensive evaluation of ensemble learning for stock-market prediction. J. Big Data, 7, 1, 1–40, 2020.
48. Pisula, T., An Ensemble Classifier-Based Scoring Model for Predicting Bankruptcy of Polish Companies in the Podkarpackie Voivodeship. J. Risk Financ. Manage., 13, 2, 37, 2020.
49. Lahmiri, S., Bekiros, S., Giakoumelou, A., Bezzina, F., Performance assessment of ensemble learning systems in financial data classification. Intell. Syst. Account. Finance Manage., 27, 1, 3–9, 2020.
50. Lee, S., Bikash, K.C., Choeh, J.Y., Comparing performance of ensemble methods in predicting movie box office revenue. Heliyon, 6, 6, e04260, 2020.
51. du Jardin, P., Forecasting corporate failure using ensemble of self-organizing neural networks. Eur. J. Oper. Res., 288, 2021, 869–885, 2020.
52. Chen, Z., Chen, W., Shi, Y., Ensemble learning with label proportions for bankruptcy prediction. Expert Syst. Appl., 146, 113155, 2020.
53. Aliaj, T., Anagnostopoulos, A., Piersanti, S., Firms Default Prediction with Machine Learning, in: Workshop on Mining Data for Financial Applications, pp. 47–59, Springer, Cham, 2019, September.
54. Smiti, S. and Soui, M., Bankruptcy prediction using deep learning approach based on borderline SMOTE. Inform. Syst. Front., 22, 1–17, 2020.
55. Sun, J., Li, H., Fujita, H., Fu, B., Ai, W., Class-imbalanced dynamic financial distress prediction based on Adaboost-SVM ensemble combined with SMOTE and time weighting. Inform. Fusion, 54, 128–144, 2020.
56. Shrivastava, S., Jeyanthi, P.M., Singh, S., Failure prediction of Indian Banks using SMOTE, Lasso regression, bagging and boosting. Cogent Econ. Finance, 8, 1, 1729569, 2020.
*Corresponding author: [email protected]