A Machine-Learning-Based Approach for Tourist-Arrival Trend Prediction

Authors

  • I Putu Edy Suardiyana Putra, Department of Digital Business, Institut Pariwisata dan Bisnis Internasional, Bali
  • Denok Lestari, Department of Digital Business, Institut Pariwisata dan Bisnis Internasional, Bali (ORCID: http://orcid.org/0000-0002-1073-4900)
  • Komang Ratih Tunjungsari, University of Otago, Dunedin

DOI:

https://doi.org/10.22334/jbhost.v9i2.480

Keywords:

Tourist-Arrival Trend, Machine Learning, News Headlines, Logistic Regression, Support Vector Machine

Abstract

This study proposes a machine-learning-based technique for predicting the trend in tourist arrivals from online news headlines and the number of previous tourist arrivals. Predicting tourist arrivals is important because it informs a destination's local government and businesses, allowing them to prepare their services. We use Logistic Regression and Support Vector Machine to build models that predict monthly increases in tourist arrivals. The news headlines come from three Indonesian online news portals; a total of 47,298 headlines were collected. The results show that Logistic Regression achieves an F-score of up to 67.4%, while Support Vector Machine achieves an F-score of up to 62.9%. These results indicate that combining online news headlines with machine-learning algorithms can give significantly better results in predicting tourist arrivals.
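
As a rough illustration of the evaluation setting described above, the following minimal sketch derives a binary "increase" label from a monthly arrival series and scores a set of predictions with the F-score. It rests on assumptions that the abstract does not confirm: that a month is labelled positive when its arrivals exceed those of the previous month, and that the reported F-score is the binary F1 of the positive class. The function names and the arrival counts are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def increase_labels(arrivals: Sequence[int]) -> List[int]:
    """Label each month after the first: 1 if arrivals rose versus the
    previous month, else 0 (an assumed definition of "increase")."""
    return [int(curr > prev) for prev, curr in zip(arrivals, arrivals[1:])]


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F-score: harmonic mean of precision and recall for class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical monthly arrival counts (illustrative only, not study data).
arrivals = [100, 120, 110, 130, 125, 140]
y_true = increase_labels(arrivals)

# Hypothetical classifier output for the same five months.
y_pred = [1, 1, 1, 0, 0]
score = f_score(y_true, y_pred)
print(f"F-score: {score:.3f}")
```

In this toy case the predictions contain two true positives, one false positive and one false negative, so precision and recall are both 2/3. A classifier such as Logistic Regression or Support Vector Machine would supply `y_pred` in practice.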

Published

2023-12-30

How to Cite

Putra, I. P. E. S., Lestari, D., & Tunjungsari, K. R. (2023). A Machine-Learning-Based Approach for Tourist-Arrival Trend Prediction. Journal of Business on Hospitality and Tourism, 9(2), 212–233. https://doi.org/10.22334/jbhost.v9i2.480