Combining the power of artificial intelligence and mathematical modelling: A hybrid technique for enhanced forecast of tourism receipts

Although Türkiye is one of the most visited countries in the world, its tourism revenue does not rank among the top ten. Tourist expenditures are therefore worth researching, as analysing these data could provide valuable insights. This research develops a novel approach to estimating and modelling tourism receipts by analysing expenditure types. Artificial intelligence-based methods, such as machine learning, have been increasingly used in the tourism literature to improve various aspects of the industry. However, little research has been conducted using a hybrid method to model and estimate tourist expenditure. This paper is the first to combine conventional mathematical analysis, specifically first-order two-variable polynomial equations, with artificial intelligence-based machine learning algorithms in a tourism setting. The research results indicate that expenditure types such as accommodation and food & beverage significantly impact Türkiye's tourism revenue and that Türkiye's total tourism revenue will not exceed 45 billion dollars by 2027. This study provides a valuable and practical contribution to improving the accuracy and efficiency of methods for managing tourism economics, particularly in European countries where the economy heavily relies on income generated by tourism. Additionally, it fills a gap in studies focused on tourists' expenditure types by combining artificial intelligence and traditional analysis.


Introduction
When tourists visit a country, they typically spend money on things like food and beverage, accommodation, health, transportation, sports, education, and culture. This spending can provide a significant boost to the country's economy by creating jobs, supporting small businesses, and generating tax revenue (Fleischer and Felsenstein, 2000). Before Covid-19, travel and tourism had become one of the most important sectors in the world economy, accounting for 10% of global GDP, 7.4% of total exports, $1.466 billion in income and more than 320 million jobs worldwide (International Monetary Fund, 2020). In addition to economic benefits, tourist expenditures can also have a positive impact on other areas of development. For instance, tourism can help to preserve cultural heritage and promote environmental conservation (Wager, 1995). It can also contribute to the development of infrastructure such as transportation, hotels, medical services and other facilities (Zaei and Zaei, 2013). Overall, international tourism receipts, also known as tourism revenue, can be a crucial source of income for developing countries (Sadler and Archer, 1975) such as Türkiye.
Researching tourist expenditures is essential for various reasons (Sun et al., 2022). To begin, it is beneficial to comprehend the economic impact of tourism on a specific destination or region (Pai et al., 2014). Researchers can estimate the amount of revenue generated by tourism and its contribution to the local economy by analysing how much money tourists spend on things like accommodation, food and beverage, transportation, entertainment, or shopping. Second, examining tourist expenditures can provide insights into tourist behaviour and preferences (Wu et al., 2012). Researchers, for instance, can determine which types of activities or attractions are most popular among tourists by analysing spending patterns, which can help destination marketers and tourism businesses tailor their offerings to better meet the needs and interests of their target market. Third, researching tourist expenditures can assist in identifying potential areas for tourism industry growth and development (Chen, 2011). By analysing spending trends over time, researchers can identify areas where investment is needed to improve infrastructure, services, or attractions in order to attract more tourists and increase revenue.
Prior to the Covid-19 pandemic, Türkiye was the sixth most visited destination in the world with 51.2 million visitors in 2019, according to the United Nations World Tourism Organization (UNWTO). However, it did not rank among the top ten in terms of tourism receipts (UNWTO, 2020). This stands out as a significant issue, as it may indicate that the country is not effectively monetizing its large number of tourists. This could lead to missed opportunities for economic growth and development. Therefore, tourist expenditures are worth researching, and analysing these data could provide valuable insights for policymakers, industry professionals and other stakeholders in Türkiye to improve the country's tourism industry and increase tourism revenue. Besides, accurately predicting tourist expenditures for the coming years can be a vital marketing insight for developing and implementing successful strategies such as planning and budgeting, improving decision-making, identifying trends, and forecasting economic impact for the country. Hence, the research will provide a practical contribution to improving accurate and efficient approaches for tourism management in European countries such as Türkiye whose economy is dependent on tourism.
Recently, artificial intelligence-based methods such as machine learning have been increasingly used in the tourism literature to improve various aspects of the industry, such as customer service, marketing and pricing, and demand forecasting. Rice, Park, Pan, and Newman (2019) adopted machine learning to forecast campground demand in US national parks. Li et al. (2021) employed machine learning-based feature selection methods to forecast tourist arrivals in Beijing, China, along with hotel occupancy in the city of Charleston, South Carolina, USA. Bi, Han and Li (2022) examined international tourist arrivals in 10 European countries to forecast tourism demand with machine learning models. Although it is frequently used for demand forecasting, the application area of machine learning is quite wide. For example, Sangkaew and Zhu (2022) tried to understand tourists' experiences at local markets in Phuket by analyzing TripAdvisor reviews using a machine learning-based algorithm. Similarly, Oh and Kim (2022) investigated emotions in online reviews of fine dining restaurants in Hong Kong and adopted semantic network analysis and a machine learning algorithm. In the field of customer service, Chen et al. (2021) proposed a customer purchase forecasting model for online tourism to obtain an accurate prediction result and the influence of variables as predictors. Machine learning is used not only for quantitative data but also for the analysis of photos on social media. Yu and Egger (2021) used machine learning approaches to investigate the effectiveness of engagement rate regarding touristic Instagram pictures. However, most ongoing studies based on the machine learning technique do not focus on tourists' expenditure types. Considering the gap in the literature, this paper employs machine learning algorithms to gain a deeper understanding of inbound tourist expenditures and to predict them.
The objective of this research is to develop a novel approach for estimating and modelling tourism receipts by analysing expenditure types using a hybrid method of conventional mathematical analysis and artificial intelligence-based machine learning algorithms. The study aims to gain a deeper understanding of tourist spending patterns and predict inbound tourist expenditures more accurately. By achieving this objective, the research intends to provide valuable insights for policymakers, industry professionals, and other stakeholders in countries heavily reliant on tourism income, such as Türkiye. The findings of this study can inform more targeted and effective policies and strategies for managing tourism, enhance decision-making processes, and contribute to the overall development and growth of the tourism industry. The paper contributes to the broader body of knowledge on research methods in tourism, particularly in data analysis and modelling.

Tourist Expenditure
Tourist expenditure refers to the money spent by travellers on various products and services while on vacation. This includes costs such as lodging, food and beverage, transportation, souvenirs, and entertainment. Tourist expenditure is analysed under the utility maximization theory. Based on this theory, tourism companies develop products that will maximize the utility (satisfaction) derived from tourists' time and money. They provide a package of benefits and experiences that tourists would not be able to assemble themselves at the same price. Maximizing the overall value for money impacts tourist expenditures. Understanding tourist expenditure is crucial for assessing the economic impact of tourism, analysing tourist behaviour and preferences, and developing effective strategies for destination management and marketing. Tourist spending can have a significant impact on a destination's economy, making it of interest to governments, businesses, and researchers (Alfarhan, Olya and Nusair, 2022). According to studies, tourists spend more money on trips to foreign destinations than on domestic trips, primarily due to the increased costs of transportation, accommodation, and other expenses in foreign destinations (Jiménez et al., 2023; Su et al., 2023; Ramos-Dominguez et al., 2023; Mariani et al., 2023). In addition, tourists tend to spend more in destinations that offer a high level of comfort, safety, and convenience (Brida and Scuderi, 2013). Research has also found that the length of stay is a significant determinant of tourist expenditure. Longer stays generally result in higher expenditure, as tourists have more time to explore the destination and engage in a wider range of activities. On the other hand, shorter stays may result in higher expenditure per day, as tourists try to maximize their experiences within a limited timeframe (Ramos and Murta, 2022).
Tourist expenditure patterns are also influenced by their motivations for travel. Tourists travelling for leisure, for example, tend to spend more on entertainment and recreation, whereas those travelling for business tend to spend less overall because they have less time for leisure activities (Erdem and Jiang, 2016). Finally, studies have also found that tourists' demographic characteristics, such as age, income, and education, can influence their spending habits. For example, younger tourists tend to have a lower average expenditure than older tourists, while tourists with higher incomes tend to spend more on luxury goods and services (Jurdana and Frleta, 2017). In conclusion, the literature on tourist expenditure highlights the importance of various factors in determining the amount of money that tourists spend on their trips. However, the current research differs from the existing literature by focusing on tourists' expenditure types using artificial intelligence.

Artificial Intelligence in Tourism
Tourism is one of the world's fastest-growing industries, and with technological advancements, Artificial Intelligence (AI) has become an essential aspect of the industry (Wong, Chau and Chan, 2023). Technology has the potential to transform the way in which tourism services are delivered, and it is being used to improve the customer experience, optimize operations, and personalize travel experiences (Ivanov and Webster, 2017). One study has shown that AI can be used to improve the customer experience by providing personal recommendations and real-time information to tourists (Zhu et al., 2023). For example, AI-powered virtual assistants can help tourists find the best deals on flights, accommodations, and activities. In addition, AI can be used to analyse customer behaviour and preferences, allowing tourism companies to personalize the customer experience and improve customer satisfaction (Li et al., 2023).
Another application of AI in the tourism industry is in operational optimization. AI can streamline processes, reduce costs, and improve efficiency in various areas, such as hotel operations, tour planning, and event management (Samara et al., 2020). For example, AI can be used to optimize hotel room pricing, tour schedules, and event attendance, leading to increased revenue and improved customer experiences. Besides, AI is being used to personalize travel experiences for tourists. AI-powered systems can analyse customer preferences, travel history, and other factors to recommend personalized itineraries, activities, and destinations. This can help tourists have a more unique and enjoyable travel experience, and it can also help tourism companies increase customer loyalty and repeat business (Oday et al., 2021). This study aims to contribute to the literature by modelling and estimating international tourism receipts in Türkiye according to expenditure types, using a hybrid approach that combines conventional mathematical analysis, namely a first-order two-variable polynomial equation, with artificial intelligence-based machine learning algorithms.

Types of Expenditures Dataset Description
Tourist expenditures are published quarterly by the Turkish Statistical Institute (Türkiye İstatistik Kurumu, 2022), commonly known as TÜİK, broken down by expenditure type. Specifically, Food-Beverage, Accommodation, Health, Transportation, Sports-Education-Culture, Tour, Clothes-Shoes, Souvenirs, and Other Expenditures are taken into consideration when determining the total revenue obtained from tourists. TÜİK is the official statistical institution in Türkiye responsible for collecting, analysing, and disseminating statistical data related to various aspects of the country's economy, society, and population. TÜİK plays a crucial role in providing reliable and comprehensive statistical information for research, policy-making, and decision-making processes. It conducts surveys, censuses, and other data collection activities to gather data from individuals, households, businesses, and other entities across Türkiye. Feature selection methods and machine learning (ML) algorithms are used to analyse and estimate tourism revenue. The proposed dataset consists of years, quarters, food and beverage, accommodation, health, transportation, sports-education-culture, tours, clothing and shoes, souvenirs, other expenditures, and the total incoming value from tourists. The detailed features with respect to the years and quarters are given in Figure 1, resulting in a dataset with 12 attributes (one of which is the class: total incoming value) and 75 instances.

Figure 1. The Detailed Analysis of the Expenditure Dataset
The statistical analysis of each attribute is presented in Table 1.

Machine Learning Algorithms
There are three main types of Machine Learning (ML) algorithms: unsupervised learning, supervised learning, and reinforcement learning. Unsupervised learning is generally suitable for clustering and dimensionality reduction problems, while supervised learning is related to classification and regression analysis. Reinforcement learning, although less relevant for tourism cases, is another type of ML algorithm. Additionally, natural language processing (NLP) is a specialized ML case, including algorithms for text classification, topic modelling, and sentiment analysis, among others.

Table 1. Statistics for Each Feature
There are several advantages to using supervised learning algorithms such as random forest or kNN over unsupervised or reinforcement learning for estimating tourism expenditure. First, supervised learning uses labelled data for training. To estimate tourism expenditure, the dataset should contain the actual spending amounts of tourists. Because the data are labelled, supervised algorithms can learn directly from these examples to predict spending for new cases. Unsupervised learning has no labels, so it cannot make direct predictions. Reinforcement learning requires a reward signal, which may be difficult to define for this task. Second, supervised techniques are designed to generate predictive models, whereas unsupervised learning focuses on pattern detection and reinforcement learning learns policies or actions rather than predictive models. Third, with supervised learning, the predictions of the model can be compared with the actual labelled expenditure amounts to evaluate accuracy. ML approaches can be classified based on the type of data and the availability of labels for the dependent variable. Supervised algorithms are used when labels are available for either continuous or discrete dependent variables, while unsupervised methods are applied when no labels are given. Hence, this study uses algorithms from the supervised learning family, because the dataset, whose labelled expenditure values are obtained from the TÜİK website, is suitable for supervised analysis.

Random Forest Algorithm
The Random Forest Algorithm was initially introduced by Breiman (2001) as a supervised method for solving regression and classification problems (Shalev-Shwartz and Ben-David, 2014). Its working principle builds on the Decision Tree Algorithm: a random forest combines many decision trees, each of which has a tree-like structure. Each tree consists of leaves and test nodes. A leaf is a terminal node that holds the class of the objects reaching it, based on their characteristics. A test node, on the other hand, is an internal node that compares a feature value and routes the instance to one of its branches (Quinlan, 1986).

k-Nearest Neighbors (kNN) Algorithm
The kNN algorithm classifies an instance according to its distance to the other instances (Sahoo and Kumar, 2012). In the kNN algorithm, the distance measure is a hyperparameter to be determined, such as Euclidean distance, Manhattan distance, or Chebyshev distance, among others. In this paper, Euclidean distance is selected as the distance measure, as given in Equation (1):

$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2} \tag{1}$$

where $x$ and $y$ are two instances described by $m$ features. Based on the results of the above equation, each instance is assigned the class of the instances nearest to it. The parameter k denotes the number of nearest instances considered. Although the algorithm is simple, it is one of the most popular and widely used classification algorithms in machine learning (ML) (Gençoğlan et al., 2020). In the kNN analysis, k is set to 1 so that each instance is assigned the class value of its single nearest neighbour.
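The nearest-neighbour rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `euclidean` and `knn_predict` and the toy rows are hypothetical, and the prediction for k > 1 is assumed to be a plain average of the neighbours' target values.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_X, train_y, query, k=1):
    """Average the targets of the k training rows closest to `query`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: euclidean(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


# Hypothetical toy rows: (year, quarter, accommodation spending).
train_X = [(2019, 1, 100.0), (2019, 2, 150.0), (2020, 1, 40.0)]
train_y = [500.0, 700.0, 200.0]  # made-up total revenue values

# With k = 1 the query simply inherits the target of its closest row.
prediction = knn_predict(train_X, train_y, (2019, 2, 140.0), k=1)
```

Because the features are not rescaled here, the attribute with the largest numeric range dominates the distance; whether the study normalised its attributes is not stated in the text.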

Locally Weighted Learning (LWL)
The LWL algorithm mainly fits approximate local models among instances in order to analyse the features (Atkeson et al., 1997). In this study, the algorithm is used together with the Naive Bayes approach, which is a supervised machine learning approach, and a multivariate analysis is performed. Moreover, LWL is categorised as a non-parametric supervised learning algorithm based on a probability density function.

LWL Algorithm Principle: Prediction of the inbound tourist expenditures
Dataset: The INPUTs, namely years, quarters, food-beverage, accommodation, health, transportation, sports-education-culture, tour, clothes and shoes, souvenirs, and other expenditures, are taken as query points. The OUTPUT is the total inbound tourist expenditure.
Given: $x_q$ denotes each feature of the dataset, where q runs from 1 to 13, and n is the total number of instances (the number of training samples), with n = 75. Additionally, $X$, $y$, $W$, $d$, $\beta$ and $\check{y}$ stand for the build matrix, build vector, diagonal weight matrix, distance metric, regression coefficient and predicted output, respectively.

Several measures, namely the correlation coefficient (CC), the mean absolute error (MAE), and the root mean square error (RMSE), are used to assess the performance of the algorithms. The correlation coefficient measures how strong a relationship is. It takes values between -1 and 1, and a value close to 1 indicates a strong positive relationship. The mean absolute error is the average of the absolute errors $|a_i - p_i|$ between the actual values ($a_i$) and the predicted values ($p_i$). The root mean square error is the square root of the average squared difference between the predicted and actual values. The formulas of the correlation coefficient, mean absolute error, and root mean square error are given in Equations 2-4, respectively:

$$CC = \frac{\sum_{i=1}^{n} (a_i - \bar{a})(p_i - \bar{p})}{\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2 \, \sum_{i=1}^{n} (p_i - \bar{p})^2}} \tag{2}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |a_i - p_i| \tag{3}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (a_i - p_i)^2} \tag{4}$$

Then, the kNN algorithm was applied to the reduced-dimension dataset. The analysed results showed that the CbFS algorithm outperformed the other methods (Begum et al., 2015).
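The three performance measures named in this section (CC, MAE and RMSE) can be computed as in the following sketch. It assumes the standard textbook definitions, with CC taken as the Pearson correlation between actual and predicted values; the function names and the toy numbers are hypothetical and are not taken from the study's data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def correlation(actual, predicted):
    """Pearson correlation coefficient between the two series."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Made-up values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(mae(actual, predicted))          # 0.5
print(rmse(actual, predicted))         # 1.0
print(correlation(actual, predicted))  # about 0.956
```

Note that a single large error (here the last value) raises RMSE more than MAE, which is why the two measures are usually reported together.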
In this paper, Correlation-based Feature Selection (CbFS) is utilized to eliminate irrelevant and redundant attributes of expenditure data for estimating total tourist expenditures.Twelve parameters used for estimating the total income obtained by inbound tourists are examined and analyzed to determine their effect on the accuracy of the estimation performance.CbFS is a filter-based feature selection technique that considers the consistency levels of class values to investigate the effectiveness of features (Liu and Setiono, 1996).
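As an illustration of how a correlation-based filter can score a candidate feature subset, the sketch below uses the commonly cited merit heuristic for correlation-based feature selection, merit = k·r_cf / sqrt(k + k(k-1)·r_ff), where r_cf is the mean absolute feature-class correlation and r_ff the mean absolute feature-feature correlation. Whether the study's CbFS uses exactly this heuristic is an assumption; the function names and the toy data are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equally long numeric series."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def cfs_merit(features, target):
    """Merit of a feature subset: rewards correlation with the target,
    penalises correlation among the selected features (assumed heuristic)."""
    k = len(features)
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    if pairs:
        r_ff = sum(abs(pearson(features[i], features[j]))
                   for i, j in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Made-up columns: one informative feature and one unrelated feature.
target = [1.0, 2.0, 3.0, 4.0]
informative = [2.0, 4.0, 6.0, 8.0]
unrelated = [1.0, -1.0, -1.0, 1.0]

merit_one = cfs_merit([informative], target)
merit_two = cfs_merit([informative, unrelated], target)
```

In this toy case adding the unrelated column lowers the merit, which is the behaviour a filter of this kind relies on when discarding irrelevant attributes.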

First-Order Two-Variable Polynomial Modelling
The fundamental objects handled in calculus are generally described as "functions." A function can be represented in various ways, including an equation, a graph, words, or a table. Functions arise because one variable depends on another. Dependent variables can be expressed through various independent variables. For example, the human population depends on time t. In other words, the population is a function of time. Functions can be defined as input-output relationships in terms of an unknown relationship or function (black box), as illustrated in Figure 2. The general first-order form is given in Equation 5:

$$Y = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_n x_n + c \tag{5}$$

where $Y$ stands for the dependent variable, $x_1, x_2, x_3, \dots, x_n$ represent the independent variables of the system, $a_1, a_2, a_3, \dots, a_n$ are the coefficients and $c$ is the constant term of the proposed approach. In this study, the tourism revenue in Türkiye with respect to the various expenditure types is modelled with the first-order two-variable polynomial equation so that it can be forecast via both the mathematical model and machine learning (ML) algorithms. In addition to machine learning, the study incorporates conventional mathematical analysis to enhance the accuracy and reliability of the predictive model. Initially, the years from 2012 to 2021 are represented in Figure 3 to visualize the first-order equation line, since the historical data give a hint for forecasting future data.
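To make the idea of a first-order two-variable polynomial concrete, the sketch below fits z = p00 + p10·x + p01·y to observations by ordinary least squares. The fitting criterion is an assumption, since the text does not state how the coefficients were obtained; the function name `fit_first_order` and the synthetic observations are hypothetical.

```python
def fit_first_order(xs, ys, zs):
    """Least-squares fit of z = p00 + p10*x + p01*y (assumed criterion)."""
    n = len(zs)
    mx, my, mz = sum(xs) / n, sum(ys) / n, sum(zs) / n
    # Centre the variables so that large year values do not hurt accuracy.
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    syz = sum((y - my) * (z - mz) for y, z in zip(ys, zs))
    # Solve the 2x2 centred normal equations with Cramer's rule.
    det = sxx * syy - sxy ** 2
    p10 = (sxz * syy - syz * sxy) / det
    p01 = (syz * sxx - sxz * sxy) / det
    p00 = mz - p10 * mx - p01 * my
    return p00, p10, p01


# Synthetic, made-up observations lying exactly on z = 5 + 2x + 3y,
# laid out as ten years of four quarters each.
years = [year for year in range(2012, 2022) for _ in range(4)]
quarters = [q for _ in range(2012, 2022) for q in range(1, 5)]
values = [5 + 2 * x + 3 * y for x, y in zip(years, quarters)]

p00, p10, p01 = fit_first_order(years, quarters, values)
```

On this noise-free synthetic grid the fit recovers the generating coefficients; with real quarterly data the coefficients would come with uncertainty intervals.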

Figure 3. The Total Tourism Revenue with respect to Years from 2012 to 2021
The total tourism revenue for the same years is also detailed in Table 2.

Findings
The ML algorithms discussed in the previous section have been applied to the proposed dataset for estimating the total revenue obtained from inbound tourists. A 10-fold cross-validation procedure has been utilized for training and testing. In other words, in each iteration about 68 of the 75 instances (90% of the dataset) are used for training the model and about 7 instances (10% of the dataset) are used for testing it. The results of the algorithms are listed in Table 3. According to the performance analysis, Random Forest outperforms the kNN and LWL algorithms in terms of a higher correlation coefficient. The CC for these algorithms is 0.98, 0.96, and 0.84, respectively. These results reveal that the total revenue obtained from tourists can be estimated using the 12 attributes used in this study with a correlation coefficient of 0.84 or higher. When CbFS is applied, the most relevant and crucial features are determined to be the year, food-beverage, and accommodation. The dataset reduced through CbFS is tested again with the machine learning algorithms for the estimation of the total revenue obtained from incoming tourists. In this case, each ML model has three inputs rather than twelve, while the model output remains the same. The results with the reduced dataset, which includes only the most relevant features identified by feature selection, are given in Table 4. After feature selection, all three algorithms experienced some degree of decrease in performance, as evidenced by decreases in the correlation coefficient and increases in both MAE and RMSE. However, it is important to note that the changes were relatively small, and the overall ranking of the algorithms remained the same, with Random Forest performing the best, followed by kNN, and then LWL. When the real and predicted lines are compared in Figure 4, it is clear that the random forest produces the most closely overlapping lines.

The tourist expenditure types are modelled through the first-order two-variable polynomial equation. The closed-form mathematical expression is given by Equation 6:

$$f(x, y) = p_{00} + p_{10}\,x + p_{01}\,y \tag{6}$$

where x and y stand for the year and quarter variables. Each of the eleven tourist expenditure types has its own coefficients, calculated from past data from 2005 to 2022. According to the past data, Table 5 outlines the polynomial coefficients of each expenditure type (bounds in parentheses):

8. Souvenirs: p00 = -6.086e+06 (-2.045e+07, 8.278e+06); p10 = 3125 (-4007, 1.026e+04); p01 = 3.211e+04 (-1538, 6.575e+04)
9. Others: p00 = -7.825e+07 (-1.016e+08, -5.486e+07); p10 = 3.899e+04 (2.738e+04, 5.061e+04); p01 = 6.854e+04 (1.375e+04, 1.233e+05)
10. Total Tourism Income: p00 = -7.536e+08 (-1.04e+09, -4.674e+08); p10 = 3.759e+05 (2.338e+05, 5.18e+05); p01 = 1.143e+06 (4.728e+05, 1.814e+06)

Based on the coefficients given in Table 5, tourist expenditure for each expenditure type is calculated with respect to year (from 2023 to 2027) and quarter (from 1 to 4). The Random Forest algorithm, which outperforms the other two algorithms in this study, is evaluated again alongside the mathematical modelling approach and is used to forecast the tourist expenditure types for the next five years. The calculated and forecasted values of each expenditure type are outlined in Table 6. It is clear from Table 6 that the total tourism income in the next five years is expected to reach up to 44,827,200 thousand dollars (calculated) or 43,543,949 thousand dollars (forecasted). In addition, the most relevant and crucial types of tourist expenditure are deduced to be accommodation and food & beverage due to their higher income compared to the other types.
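The calculated total can be checked directly from the Total Tourism Income coefficients listed for Table 5. The sketch below assumes that Equation 6 has the form f(x, y) = p00 + p10·x + p01·y, with x the year and y the quarter, and that the annual figure is the sum of the four quarterly values; the function name `total_income` is hypothetical.

```python
def total_income(year, quarter,
                 p00=-7.536e8, p10=3.759e5, p01=1.143e6):
    """Quarterly total tourism income in thousand dollars, using the
    Total Tourism Income coefficients and the assumed first-order form."""
    return p00 + p10 * year + p01 * quarter


# Sum the four quarters of 2027 to obtain the annual total.
annual_2027 = sum(total_income(2027, q) for q in range(1, 5))
print(f"{annual_2027:,.0f}")  # 44,827,200
```

Under these assumptions the sum for 2027 reproduces the calculated value of 44,827,200 thousand dollars reported in the text, which supports reading that figure as the 2027 annual total rather than a five-year cumulative sum.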

Discussion and Recommendations
This paper aims to model and estimate international tourism receipts in Türkiye based on expenditure types using conventional mathematical analysis, namely a first-order two-variable polynomial equation, and artificial intelligence-based machine learning algorithms. A hybrid approach was preferred in the study, which employed both conventional and cutting-edge methods. Tourism is undoubtedly an essential sector of the Turkish economy. Türkiye is a natural tourism centre due to its location, geographical features, and historical heritage. Nevertheless, this is not the reason for choosing Türkiye as the case in this paper, which examines tourist expenditure types through artificial intelligence-based machine learning algorithms. Despite being one of the most visited countries in the world (ranked sixth with 51 million tourists in 2019), Türkiye's share of tourism revenue (under 40 billion dollars in 2019 and not in the top ten) is not significant. Therefore, the reason why Türkiye is considered a "cheap" country, from which a tourist can return without spending even a thousand dollars, has been examined, and artificial intelligence has been used to predict whether this situation will change in the future. The reason for taking 2019 as the reference year is that the World Tourism Organization declared 2020 the worst year in tourism history, claiming that it would take four years to return to 2019 levels and fully recover from the destructive effects of the Covid-19 pandemic.
Firstly, using the feature selection method, the expenditure types that have the greatest impact on Türkiye's tourism revenue were identified, and accommodation and food & beverage proved to be the most prominent. This finding is consistent with previous studies showing that tourists tend to spend more on basic needs such as lodging and food. For example, Fleissig (2021) found that accommodation and food & beverage expenditures were inelastic and necessary needs. Gómez-Déniz et al. (2020) found that the most significant expense was expenditure on accommodation. Accommodation and food & beverage were also highlighted as the top two expenditure categories in Northern Italy's tourism sector by Disegna and Osti (2016). Similarly, Alegre et al. (2011), analysing the influence of tourist motivations on tourist expenditure, found that accommodation was a significant variable in explaining total expenditure levels and expenditure in the country of origin. Jingwen and Mingzhu (2018), on the other hand, found accommodation and food & beverage to have a limited pulling function on visitor expenditure when compared to other commodities. Since these types of expenditures are the basic needs of every tourist, their dominance helps explain why Türkiye's tourism income does not increase in proportion to the number of tourists. This finding suggests that most tourists visiting Türkiye rarely leave their hotels. They keep their extra expenses low, and this may be due to the all-inclusive system applied in most holiday destinations in the country. The all-inclusive system is a popular pricing strategy in the tourism industry, especially in resort destinations. Under this system, tourists pay a fixed price for their accommodation, food, and some additional services such as drinks, snacks, and entertainment. The popularity of this pricing model has increased in recent years, with more and more hotels and resorts adopting it (Bladh & Holm, 2013).
Many all-inclusive resorts offer a limited number of excursions or activities. Besides, all-inclusive resorts can limit tourists' interactions with local people and businesses, which can negatively impact local economies and the cultural exchange experience. Alegre and Pou (2008) have suggested that the all-inclusive system can have both positive and negative impacts on the local economy and the tourism industry. On the one hand, it can attract more tourists by providing them with a hassle-free and predictable holiday experience, and thus increase the revenue for the local businesses. On the other hand, it can discourage tourists from exploring the local attractions and spending money on local businesses outside the resort, resulting in a leakage of tourism revenue from the local economy.
Additionally, the all-inclusive system can lead to environmental problems, such as excessive waste generation and water usage, if not managed properly. Therefore, a more sustainable and responsible approach to the all-inclusive system in tourism is needed, such as promoting local food and culture, reducing waste, and involving local communities in the decision-making process. Originally, the all-inclusive system was designed to prevent tourists from going out in countries with security problems. However, it has turned into a model focused on hotels' profit margins and has led to very low per capita expenditures, so it is recommended that Türkiye gradually abandon this system.
The study then forecasted Türkiye's tourism revenue for the next few years using a machine learning (ML) algorithm with a 97% accuracy rate.According to the calculations, Türkiye's total tourism revenue will not exceed 45 billion dollars by 2027.Based on this finding, it is possible to say that Türkiye's tourism income will also remain low in the future.Türkiye's rival countries in Europe such as Spain, Italy and France, which are all Mediterranean countries like itself, had already exceeded 45 billion dollars even in 2019.In contrast, some studies offer a more optimistic outlook for Türkiye's tourism industry.For instance, a study by Mengu (2018) predicts that Türkiye's tourism revenue will increase in the future due to the country's strategic location, diverse natural and cultural resources, and improved tourism infrastructure.Another study highlights the potential of health tourism in Türkiye and its contribution to the growth of the country's tourism industry (Uygun & Ekiz, 2016).Türkiye should embark on a transformation process without wasting too much time and focus on tourism which has high expenditure levels but relatively low environmental damage and costs.To achieve this, encouraging sustainable tourism and promoting the country's unique culture, history, and natural beauty can be recommended because high-income level tourists often seek out destinations with rich cultural heritage, such as art, music, cuisine, and history.
It is also crucial to diversify the tourism services offered in this process. The diversified types of tourism cover a wide spectrum. For example, congress or fair tourism can take place in big cities. The country should be developed and promoted as a destination for business tourism such as conferences, trade fairs, and meetings. This can attract high-income tourists who are also business executives and entrepreneurs. High-income tourists may also seek out destinations that offer sustainable and environmentally friendly tourism options. Developing and promoting eco-tourism activities such as wildlife safaris, bird watching, and nature walks can attract this type of tourist, so rural tourism should take place in rural areas. It is equally important to develop and promote the country as a destination for medical tourism, such as cosmetic surgery, dental treatments, and other health-related services. This can attract high-income tourists who are looking for these types of services. Therefore, health tourism, medical tourism or thermal tourism should be developed in regions where health centres are concentrated. These types of tourism, which are generally tailored to individuals or small groups, can attract high-income visitors who both generate more revenue for the country and impose low societal costs. Last but not least, high-income tourists are often looking for unique and exclusive experiences such as escapism in casinos. In fact, a controlled reopening of casinos may be put on the agenda, taking as an example the United States of America, which has the highest tourism income with 214 billion dollars. These recommendations are applicable not only to Türkiye but also to other European countries whose economies rely on tourism yet cannot receive a sufficient share of tourism income worldwide.
The paper's theoretical framework draws on optimization theory to compare and select the best machine learning algorithm for estimating tourist expenditures. Optimization theory is a subfield of mathematics that deals with finding the best solution for a given problem by minimizing or maximizing an objective function subject to constraints. In the context of machine learning, optimization theory is used to find the model parameters that minimize a cost function. This paper contributes to the international body of knowledge by presenting a novel approach to estimating and modelling tourism receipts. The study provides a more accurate and efficient method for estimating tourism revenues by combining traditional mathematical analysis and artificial intelligence-based machine learning algorithms. The study contributes to tourism theory by identifying the most significant types of expenditure affecting tourism revenue, which can then be used to inform tourism strategies and policies. Furthermore, the use of machine learning algorithms to estimate future tourism revenue provides a valuable methodological approach that can be replicated in other studies. As a whole, this paper offers a useful and innovative contribution to the field of tourism research.
Besides, the study's forecasting of Türkiye's tourism revenue using machine learning algorithms can provide valuable insights for tourism policymakers and industry practitioners in planning and budgeting for the future. Knowing that the country's total tourism revenue may not exceed 45 billion dollars by 2027 can help them make informed decisions about where to allocate resources and how to develop and promote tourism services. As a result, the findings of this research can contribute to more effective and efficient tourism management in Türkiye and other European countries that heavily rely on tourism income, ultimately leading to economic growth and development. While this study raises awareness of the impact of tourist expenditure types on a country's tourism revenue and offers a practical approach to estimating future revenue, there is still much to be explored. Researchers should be encouraged to pursue further investigation in this area, particularly in terms of expanding the scope of the study and testing the validity of the assumptions made in the analysis. Using machine learning is a niche area in the tourism literature. In the future, different datasets linked to tourism can be integrated, and alternative machine learning algorithms can be employed. Lastly, Türkiye was chosen as the case in this study; other European countries can be selected in the future, and their tourism receipts can be analysed with artificial intelligence-based machine learning techniques.

Figure 2. Input-Output Relationship with respect to the Unknown Equation

Graph representation of any function includes permissible information for the behaviour of the structure or any historical-based data. Based on this information, the mathematical modelling of any objects or situations might be represented as a first-order polynomial equation as given in Equation 5.
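As an illustration of the first-order polynomial form mentioned above, the following minimal sketch fits y = a*x + b to a series of observations by ordinary least squares. The coefficient names `a` and `b`, the function names, and the sample values are assumptions introduced for illustration only; the sketch does not reproduce the paper's Equation 5 or its data.

```python
def fit_first_order(xs, ys):
    """Least-squares fit of y = a*x + b; returns the pair (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


def predict(a, b, x):
    """Evaluate the fitted first-order polynomial at x."""
    return a * x + b


# Hypothetical, exactly linear sample data (not taken from the paper).
periods = [1, 2, 3, 4]
values = [12.0, 14.0, 16.0, 18.0]
a, b = fit_first_order(periods, values)
next_value = predict(a, b, 5)  # extrapolate one period ahead
```

A closed-form fit of this kind is cheap and easy to interpret, which may be one reason to pair it with a machine learning model rather than replace it.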

Figure 5. Calculated and Forecasted Values for Each Tourist Expenditure Type

Based on the [...] each test node is defined, and the final attribute is attained from the class label of the reached leaf node.

The feature selection technique determines the most relevant features in the dataset and speeds up the learning process. Additionally, suitable feature extraction or selection procedures can improve the accuracy of the classification method. In other words, some of the attributes in the proposed dataset, which are expenditure types, might be redundant and irrelevant. Therefore, the irrelevant attributes may lead to lower accuracies.
A myriad of feature selection techniques have been suggested in the literature for determining crucial features. For instance, Wang et al. (2019) used different feature selection algorithms, namely Sequential Floating Forward Selection (SFFS), Fisher Score, Sequential Forward Selection (SFS), and Fast Correlation-Based Filter Solution (FCBF), to achieve higher accuracy and efficiency in sleep staging using only single-channel electroencephalograms (EEGs). Al-Batah et al. (2019) applied the Correlation-based Feature Selection (CbFS) algorithm to reduce dataset dimensionality and determine relevant genes for eleven microarray datasets. The results demonstrated that CbFS can efficiently eliminate redundant and irrelevant attributes and noise. Begum et al. (2015) focused on a classification technique based on the Leukemia dataset. Initially, Consistency-based Feature Selection (CbFS), Kernelized Fuzzy Rough Set (KFRS), and Fuzzy Preference Based Rough Set (FPRS) were utilized.
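To make the idea of correlation-based feature selection concrete, the following minimal sketch scores feature subsets with the commonly used merit heuristic k*r_cf / sqrt(k + k*(k-1)*r_ff), where r_cf is the mean absolute feature-target correlation and r_ff is the mean absolute feature-feature correlation, and grows the subset greedily. It assumes Pearson correlation as the correlation measure and a simple forward search; the feature names and values are hypothetical, and the sketch is not claimed to match the exact CbFS configuration used in the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def merit(subset, features, target):
    """Merit of a subset: high relevance to target, low mutual redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [(f, g) for i, f in enumerate(subset) for g in subset[i + 1:]]
        r_ff = sum(abs(pearson(features[f], features[g]))
                   for f, g in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def select_features(features, target):
    """Greedy forward search: add the best feature while merit improves."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        score, cand = max((merit(selected + [f], features, target), f)
                          for f in remaining)
        if score <= best:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best


# Hypothetical data: one informative feature, a near duplicate, and noise.
target = [1, 2, 3, 4, 5, 6]
features = {
    "accommodation": [2, 4, 6, 8, 10, 12],
    "accommodation_dup": [1, 2, 3, 4, 5, 6.5],
    "noise": [3, 1, 3, 1, 3, 1],
}
selected, best_merit = select_features(features, target)
```

In this toy setting the near duplicate and the noise column both lower the merit when added, so only the informative feature is retained, which mirrors how redundant or irrelevant expenditure types would be filtered out.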

Table 2. Total Tourism Revenue in Türkiye from the Year 2012 to 2021 (Thousand $)

The reduced dataset results are clearly given in Table 4, where the CbFS method is applied to three algorithms, namely Random Forest, kNN, and LWL.

[...] errors in the predictions. The root mean square error (RMSE) was 822.216, indicating overall small errors in the predictions. After feature selection, the CC decreased slightly to 0.97, indicating a slightly weaker but still strong correlation between the predicted and actual values. The MAE increased to 579.722, suggesting slightly larger average errors in the predictions. The RMSE also increased to 895.159, indicating slightly larger overall errors in the predictions. Feature selection led to a minor decrease in performance for the Random Forest algorithm, as indicated by a slightly lower correlation coefficient and slightly larger errors in both MAE and RMSE. However, the algorithm still performs well overall.

Before feature selection, the kNN algorithm had a high CC of 0.96, indicating a strong positive correlation between the predicted and actual values. The MAE was 788.611, suggesting larger average errors in the predictions compared to Random Forest. The RMSE was 1.067.718, indicating larger overall errors in the predictions compared to Random Forest. After feature selection, the CC decreased to 0.95, indicating a slightly weaker but still strong correlation between the predicted and actual values. The MAE increased to 843.735, suggesting slightly larger average errors in the predictions. The RMSE increased to 1.221.101, indicating slightly larger overall errors in the predictions. Similar to Random Forest, feature selection resulted in a minor decrease in performance for kNN. The correlation coefficient decreased slightly, and both MAE and RMSE increased, indicating slightly larger errors in the predictions.

The LWL algorithm had a CC of 0.84, indicating a weaker correlation between the predicted and actual values compared to Random Forest and kNN. The MAE was 1.685.289, suggesting larger average errors in the predictions. The RMSE was 2.064.462, indicating larger overall errors in the predictions. After feature selection, the CC remained the same at 0.84, indicating no change in the correlation between the predicted and actual values. The MAE increased to 1.713.489, suggesting slightly larger average errors in the predictions. The RMSE increased to 2.084.604, indicating slightly larger overall errors in the predictions. Feature selection did not significantly impact the performance of the LWL algorithm. The correlation coefficient remained the same, and there were slight increases in both MAE and RMSE, indicating slightly larger errors in the predictions.
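The three measures compared above can be computed as in the following minimal sketch. It assumes that CC denotes the Pearson correlation coefficient between actual and predicted values, and that MAE and RMSE carry their standard definitions; the function names and sample numbers are hypothetical and are not taken from the study's dataset.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: penalises large errors more than MAE."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical actual and predicted values (illustrative only).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 4.0, 4.0]
scores = {
    "CC": correlation_coefficient(actual, predicted),
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
}
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same predictions, which is consistent with the pattern in the values reported for each algorithm.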

Table 4. Performance Analysis of Reduced Dataset

Table 6. Each Pattern Analysis in terms of the future years from 2023 to 2027 (Quarterly)

Figure 5 visually presents the calculated and forecasted values for each tourist expenditure type.