A Machine-Learning-Based Approach for Tourist-Arrival Trend Prediction
DOI:
https://doi.org/10.22334/jbhost.v9i2.480
Keywords:
Tourist-Arrival Trend, Machine Learning, News Headlines, Logistic Regression, Support Vector Machine
Abstract
This study proposes a machine-learning-based technique for predicting the trend in tourist arrivals from online news headlines and the number of previous tourist arrivals. Predicting tourist arrivals is important because it gives a destination's local government and businesses the information they need to prepare their services. We use Logistic Regression and Support Vector Machine to build models that predict, month by month, whether tourist arrivals will increase. The news headlines come from three Indonesian online news portals; 47,298 headlines were collected in total. Logistic Regression achieves an F-score of up to 67.4%, while Support Vector Machine achieves an F-score of up to 62.9%. These results indicate that adding online news headlines and machine-learning algorithms can significantly improve the prediction of tourist arrivals.
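The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy headlines, the arrival counts, the bag-of-words representation, the scaling of the arrival count, the chronological split, and the use of scikit-learn are all assumptions made for illustration. Each month's label is 1 if arrivals rose relative to the previous month, and the features are the previous month's headlines plus the previous month's arrival count.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.svm import SVC

# Hypothetical toy data: one merged headline string per month (months 0..7)
# and arrival counts for months 0..8. Not taken from the study.
headlines = [
    "new flight route opens to bali",
    "volcano eruption closes airport",
    "festival draws record crowds",
    "travel warning issued after flood",
    "visa free entry announced for tourists",
    "airport closed due to ash cloud",
    "beach festival announced for summer",
    "flood damages coastal roads",
]
arrivals = np.array([100, 120, 90, 130, 110, 140, 105, 150, 115], dtype=float)

# Label for target month t: 1 if arrivals[t] > arrivals[t-1], else 0.
prev_arrivals = arrivals[:-1]
labels = (arrivals[1:] > arrivals[:-1]).astype(int)

# Chronological split (assumed): first six months train, last two test.
train, test = slice(0, 6), slice(6, 8)

# Bag-of-words features fitted on the training headlines only.
vectorizer = CountVectorizer()
text_train = vectorizer.fit_transform(headlines[train])
text_test = vectorizer.transform(headlines[test])

# Scale the numeric feature with a statistic from the training months only.
scale = prev_arrivals[train].max()


def build_features(text_matrix, prev):
    """Append the scaled previous-arrival count as one extra column."""
    numeric = csr_matrix(prev.reshape(-1, 1) / scale)
    return hstack([text_matrix, numeric]).tocsr()


X_train = build_features(text_train, prev_arrivals[train])
X_test = build_features(text_test, prev_arrivals[test])
y_train, y_test = labels[train], labels[test]

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(kernel="linear"),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # F-score: harmonic mean of precision and recall for the "increase" class.
    scores[name] = f1_score(y_test, predictions, zero_division=0)

for name, score in scores.items():
    print(f"{name}: F-score = {score:.3f}")
```

With only eight toy months the printed scores carry no meaning; the sketch only shows how headline text and a previous-arrival count can feed the two classifiers and how an F-score is computed for each.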
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).