An extended model of destination image formation: The inclusion of sensory images

The purpose of this study is to examine the development of destination image from the sensory form to the cognitive and affective forms, and the effects that these three types of destination image can have on tourist intention. Japan’s Tohoku district was selected as the destination, and Malaysian respondents were chosen as the potential tourists. The findings support a model of destination image formation that begins with the sensory images and continues with the cognitive and affective images. The linear correlations between the sensory, cognitive and affective elements further facilitate potential tourists' intention to visit the destination. Among the three, affective image has the largest effect on intention. However, the formation of the affective image is influenced by both the sensory and cognitive images, with the former having the larger influence. Implications for destination image promotion, with an emphasis on the sensory elements, are discussed.


Introduction
Nowadays, tourists have a great many destinations to choose from. As a result, destinations have to compete with one another to win tourists' awareness or loyalty. In this competition, certain destinations have become more attractive than others because they are associated with appealing attributes or images. Destination images are created by the direct and indirect activities of destination managers and marketers, and are also formed through non-commercial sources and personal experiences (Gartner, 1993). In all cases, the information about a destination must be received by the five senses (Hultén, Broweus, and van Dijk, 2009; Krishna, 2009). Consequently, the initial perceptions that an individual has of a destination must be in the sensory form.
Sensory images are the attributes that an individual holds of a destination through seeing (visual), hearing (auditory), smelling (olfactory), tasting (gustatory) and touching or being touched (tactile). Compared to the cognitive (the physical and non-physical attributes) and affective (the feelings) elements, the sensory component of destination image has largely been neglected in research. The sensory images have only been investigated in the past ten years (Agapito, Pinto, and Mendes, 2017; Govers, Go, and Kumar, 2007; Huang and Gross, 2010; Son and Pearce, 2005; Xiong, Hashim, and Murphy, 2015). However, the theoretical frameworks of many of these studies were insufficiently developed. Researchers did not differentiate sensory images from cognitive and affective images. As a consequence, the interpretation and presentation of the sensory images were vague about what was actually perceived. Moreover, the sensory attributes reported for different places seem to be similar. In other words, previous studies could not reveal or report the unique aspects of destinations (Echtner and Ritchie, 2003), which is an important issue in destination image communication.
It should be noted that each sensory image is an individual attribute. The combination of two or more sensory attributes or cues is a cognitive process (Shams and Seitz, 2008), which results in a cognitive or integrated image (Compeau, Grewal, and Monroe, 1998). The cognitive component can have further effects on the formation of the affective component (Baloglu and McCleary, 1999). The sensory, cognitive and affective elements all have important influences on tourists' intention to visit a destination (Kim and Kerstetter, 2016; Kim and Perdue, 2013; Ramkissoon and Uysal, 2011). However, little research has attempted to trace the whole process of image formation (sensory → cognition → affect) or the correlations between the three types of image and tourist intention (sensory → cognition → affect → intention).
Nowadays, social media, such as Facebook, have firmly established their role in the promotion of destination images in the visual and auditory forms. Conceptually, a visual image is regarded as a visible physical attribute of, or an object found at, a destination. An auditory image, in turn, is a sound that can be heard at the destination; this sound should be attached to a given object or source, such as the sound of sea waves. Regarding their importance, Ghosh and Sarkar (2016) found that the use of visual cues, together with olfactory and tactile cues, could influence the formation of the affective image of a destination as well as tourists' intentions to visit and to recommend it. However, how the visual and auditory images presented on the Facebook page of a destination actually affect the formation of the cognitive and affective images and tourist intention remains largely unknown.
The purpose of this study, therefore, is to examine the development of destination image from the sensory form to the cognitive and affective forms, and the effects that these three types of destination image can have on tourist intention. The focus of the study is on the pre-trip formation of destination image and tourist intention. Thus, this study significantly differs from previous research on sensory image, which mainly investigated the on-site or post-trip images (Agapito, Pinto, and Mendes, 2017; Xiong, Hashim, and Murphy, 2015). Its outcomes are meaningful to the projection of destination images to influence tourists' pre-trip intention (i.e., to visit).

Study setting and population
This study chose Tohoku, the northeastern region of Japan, as the setting due to the mixed nature of its images. On the one hand, Tohoku possesses many valuable attractions, including natural heritage (mountains, lakes and hot springs), traditional festivals (Akita's Namahage and Aomori's Nebuta), food and drinks (fruits, rice and sake) and crafts (lacquer). On the other hand, Tohoku has been associated with the image of a disaster area since the 2011 tsunami and nuclear power plant incidents. The promotion of the images of such a destination has both encouraging and restraining aspects, which are worth investigating.
In recent years, Tohoku has been working to recover and redevelop its tourism. For example, an English Facebook page (https://www.facebook.com/TohokuTourism/) has been created to promote the beautiful images of the region. As of September 2019, the Facebook page had approximately 194,000 likes and approximately 194,000 followers. This makes the Tohoku Tourism Facebook page a promising carrier of images and a busy forum of potential visitors. For this reason, this study opted to focus on this particular platform.
In addition, Malaysian tourists were chosen as the survey population. In 2018, approximately 468,000 tourists from Malaysia visited Japan, making Malaysia one of the top ten inbound markets of Japanese tourism. Within that population, younger adults (aged 18-24) were targeted since they are heavy Facebook users (Statista, 2018). Students at a university in Sabah state were specifically approached considering the exploratory nature of the research (Arnett, 2008). For the same reason, neither a general population nor the population of Facebook followers was surveyed.

Instrument
A structured instrument was developed to collect the necessary data (Appendix). Initially, seventeen visual attributes and four auditory attributes were gathered from the Tohoku Tourism Facebook page over a one-year period (April 2016 to March 2017). The content analysis of the page was undertaken twice in order (1) to capture the most diverse pool of attributes and to ensure the content validity of the measures (Kassarjian, 1977), and (2) to avoid miscounting and misinterpreting the attributes and to guarantee the reliability of the generation process (Given, 2008).
In addition, the three elements of cognitive destination image (functional, psychological and mixed) were also examined (Echtner and Ritchie, 2003). Specifically, the nineteen measures of the functional cognitive image were generated from the Tohoku Tourism Website, with strict reference to the categorization of attractions displayed on the site. The measures of the psychological cognitive image, the mixed cognitive image and the affective image (four measures each) were borrowed from the existing literature (Alcañiz, García, and Blas, 2009; Echtner and Ritchie, 2003; Russell and Pratt, 1980). Moreover, one measure to capture tourist intention to visit Tohoku was also included in the questionnaire (Ramkissoon and Uysal, 2011).
The visual and auditory attributes were measured on a five-point scale, from very unimpressive to very impressive. The cognitive attributes were evaluated on a five-point scale, from very unfavourable to very favourable. A "don't know" option was also included. The scale of the four bipolar measures of affective image and the measure of intention was also five-point, with the latter ranging from strongly disagree to strongly agree. In addition to these measures, the questionnaire also gathered some profile information of the respondents, including age, sex and previous experiences in Japan.
After the questionnaire, which was written in English, had been developed, a group of students at a university in Sabah, Malaysia completed it to verify its usability. No issues were detected during this pre-test, so the questionnaire was adopted without changes. An online version of the questionnaire was then prepared using the Google Forms application. This version was employed in the main survey.

Data collection and analysis
In the main survey, the participants, who were voluntarily recruited through the lecturers' and students' channels at the intended university in Sabah, were gathered in small groups. Initially, the participants were asked to visit the Facebook page of Tohoku Tourism using their mobile devices. After that, they were given the link to the online instrument to provide responses to the questions. A series of such group surveys was conducted between May and July of 2018, yielding a total of 119 valid answers. Among the 119 respondents, only 4 were below the age of 20 (the remainder were 20 or above) and only 7 had visited Japan before. Seventy-three of them were female (61.3%) and 44 were male (37.0%). Two respondents did not disclose their sex.
Once collected, the data were analysed in SPSS. The outcomes revealed that the respondents' perceptions of the attributes of Tohoku were positive. The mean values of the image components ranged from 3.68 to 4.36 out of 5 points (Appendix). Among them, the auditory image, psychological image, mixed image and affective image components were each unidimensional, that is, described by a single factor (Kaiser-Meyer-Olkin or KMO > 0.78, p < 0.001). In contrast, the visual image and the functional image components could be structured by two factors. However, the two-factor solution would require the deletion of five (visual image) to thirteen (functional image) items due to the cross-loading issue (Matsunaga, 2010). Therefore, a one-factor solution was adopted for both the visual image and the functional image components. The KMO values of 0.92 and 0.94 (p < 0.001) showed that this solution is valid for both components (Leech, Barrett, and Morgan, 2005).
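For readers unfamiliar with the KMO statistic, the following is a minimal sketch of how the overall measure can be computed from an item correlation matrix. It is not the SPSS procedure used in this study; the function name `kmo` and the synthetic ratings are assumptions introduced purely for illustration.

```python
import numpy as np

def kmo(data: np.ndarray) -> float:
    """Overall Kaiser-Meyer-Olkin measure for an (n_respondents, n_items) array."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)  # mask of off-diagonal cells
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return float(r2 / (r2 + p2))

# Synthetic ratings driven by one common factor (hypothetical, NOT the study's data).
rng = np.random.default_rng(0)
factor = rng.normal(size=(119, 1))
items = factor + 0.8 * rng.normal(size=(119, 6))

kmo_value = kmo(items)
print(f"KMO = {kmo_value:.2f}")
```

Values close to 1 indicate that the partial correlations among items are small relative to the raw correlations, which is the condition under which a factor solution is usually considered appropriate.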
In all cases, the corrected item-total correlation values of the measures exceeded 0.52 and the Cronbach's alphas of the components were larger than 0.82. This pattern suggests that the items of each component are strongly correlated with one another. Thus, the average value of each component was used in the verification of the theoretical model. As a consequence, only six measured variables were employed in the model (visual image, auditory image, functional image, psychological image, affective image and visit intention). In addition, all the hypothesized correlations among these variables are linear. Therefore, a hierarchical regression analysis was adopted to verify the model. Moreover, a series of multiple regression analyses was computed to examine the correlations between pairs of variables. With a sample of 119 respondents, the subject-to-variable ratio of 19.8 could guarantee a reliable analysis (Knofczynski and Mundfrom, 2008).
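The reliability checks and the composite scoring described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the SPSS output of the study: the function names and the synthetic four-item component are hypothetical, and only the sample size (119) and the number of model variables (six) are taken from the text.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) array."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return float(k / (k - 1) * (1.0 - sum_item_var / total_var))

def corrected_item_total(items: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Synthetic four-item component (hypothetical, NOT the study's data).
rng = np.random.default_rng(0)
factor = rng.normal(size=(119, 1))
items = factor + 0.8 * rng.normal(size=(119, 4))

alpha = cronbach_alpha(items)
item_total = corrected_item_total(items)
composite = items.mean(axis=1)   # one averaged score per respondent

# Subject-to-variable ratio as reported: 119 respondents, six model variables.
ratio = 119 / 6

print(f"alpha = {alpha:.2f}, ratio = {ratio:.1f}")
```

Averaging the items of a component is only defensible when the items are internally consistent, which is why the alpha and item-total checks come before the composite score.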

Findings
It was found that the visual image component did not have any significant effects on the three cognitive image components. However, this sensory component was a significant antecedent of both the affective image (β = 0.36, p = 0.004) and the visit intention (β = 0.32, p = 0.019). In contrast, the auditory image component had significant influences on both the cognitive images (β = 0.36 to 0.42, p < 0.05) and the affective image (β = 0.29, p = 0.021). Yet, it did not affect the intention to visit the destination. Among the remaining independent variables, only affective image had a significant effect on tourists' intention (β = 0.46, p < 0.001). The whole model was satisfactorily validated: R² = 0.31, p < 0.001 (Figure 1).
Note (Figure 1). Non-significant correlations are shown as dashed lines.
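To make the logic of a hierarchical regression concrete, the sketch below enters predictors in blocks and tracks the change in R² at each step. The entry order (sensory, then cognitive, then affective), the helper `r_squared` and the synthetic scores are all assumptions for illustration; the text does not report the exact blocks used, and the numbers produced here are unrelated to the findings above.

```python
import numpy as np

def r_squared(predictors: np.ndarray, outcome: np.ndarray) -> float:
    """R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    residuals = outcome - design @ coef
    centered = outcome - outcome.mean()
    return float(1.0 - (residuals @ residuals) / (centered @ centered))

# Synthetic composite scores (hypothetical, NOT the study's data).
rng = np.random.default_rng(1)
n = 119
visual = rng.normal(size=n)
auditory = 0.5 * visual + rng.normal(size=n)
cognitive = 0.4 * auditory + rng.normal(size=n)
affective = 0.4 * visual + 0.3 * auditory + 0.2 * cognitive + rng.normal(size=n)
intention = 0.3 * visual + 0.5 * affective + rng.normal(size=n)

# Assumed entry order: sensory block, then cognitive, then affective.
blocks = [
    ("sensory", [visual, auditory]),
    ("cognitive", [cognitive]),
    ("affective", [affective]),
]

entered = []
r2_steps = []
for name, variables in blocks:
    entered.extend(variables)
    r2_steps.append(r_squared(np.column_stack(entered), intention))

# Increment in explained variance contributed by each block.
r2_changes = np.diff([0.0] + r2_steps)
for (name, _), r2, delta in zip(blocks, r2_steps, r2_changes):
    print(f"{name:<10} R2 = {r2:.3f}  change = {delta:+.3f}")
```

Because each model nests the previous one, R² cannot decrease from one step to the next; the size of each increment indicates how much a block adds beyond the blocks already entered.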

Discussion
According to the existing literature, the visual cue is the most popular sensory cue (Krishna, 2012; Nghiêm-Phú, 2017). This component is the focus of almost all advertising and promotion practices. Yet, visual image is probably the least influential sensory cue (Balaji, Raghavan, and Jha, 2011). The findings of this study partly support this prior observation. Specifically, visual image did not play any significant role in the formation of cognitive image (functional, psychological and mixed). Nevertheless, this component is a significant antecedent of affective image and visit intention. This outcome is consistent with the finding of another study in the context of tourism (Ghosh and Sarkar, 2016).
After visual image, auditory image is another popular component of marketing practices (Krishna, 2012; Nghiêm-Phú, 2017). In this study, the auditory impressions were more important than the visual ones in the formation of the other image components (cognitive and affective). Yet, when tourist behaviour is taken into account, this sensory component is less significant than the visual image. This outcome, unfortunately, does not support the findings of previous studies (Stafford, 1996; Yalch and Spangenberg, 2000). Nevertheless, while the prior attempts employed a real setting and focused on daily goods, this study involves geographical distance and an expected trip. The differences in designs and purposes probably affect the outcomes of this and the other studies. Alternatively, when both the visual and the auditory cues are combined, the outcomes could be improved. Together these two sensory components could significantly explain 13-16% of the variances of the cognitive image components and 25-38% of those of the visit intention and the affective image component. This observation is consistent with the findings of previous studies (Balaji, Raghavan, and Jha, 2011; Stephenson and Carter, 2011). In addition, it was found that affective image had the largest effect on visit intention. This outcome is similar to those found in prior research (Nghiêm-Phú, 2015; Zhang, Fu, Cai, and Lu, 2014).

Theoretical implications
Destination image is indeed a multidimensional construct (Echtner and Ritchie, 2003). Yet, destination image has not only the cognitive and affective dimensions but also the five sensory elements (Baloglu and McCleary, 1999; Echtner and Ritchie, 2003; Son and Pearce, 2005). While previous studies have often treated these dimensions and elements separately, this study provides empirical support for a model of destination image formation that begins with the sensory images and continues with the cognitive and affective images. The linear correlations between the sensory, cognitive and affective elements further facilitate potential tourists' intention to visit the destination. Among the three, affective image has the largest effect on intention. However, the formation of the affective image is influenced by both the sensory and cognitive images, with the former having the larger influence. (It should be noted that the combined effect of sensory and cognitive images is larger than the individual effect of each image component.) Thus, as the most fundamental forms of information input, sensory images should be thoroughly understood and employed in the management and marketing of tourist destinations.
Fortunately, the development of social media platforms such as Facebook has supported the promotional efforts of destination managers and marketers. In addition to being an interactive platform (direct connections between or among providers and users), social media are also a multisensory environment (visual and auditory). The employment of social media in marketing helps promote the sensory attributes of the destination, create the more complex cognitive and affective impressions, and strengthen potential tourists' intention to visit. The multisensory approach of sensory marketing (Hultén, Broweus, and van Dijk, 2009), thus, has been successfully applied.

Managerial implications
The findings of this study support the application of the sensory marketing approach in promoting the images of tourism destinations. Specifically, both the visual and auditory forms of destination attributes should be employed in promotional practices, especially those implemented on social media platforms. The more positively the sensory images are perceived, the better the cognitive and affective images are, and the stronger the visit intention is.
For Tohoku tourism, more auditory cues should be promoted on the Facebook page in addition to the current festival-related ones, such as Tsugaru shamisen music, festival attendants' laughter and shouts, and festival music. Fortunately, auditory cues can be simulated or created to carry other sensory impressions (visual, olfactory, gustatory and tactile). This advantage should be further exploited. In addition, certain visual attributes (those with higher factor loadings) could be promoted more frequently. Among them, some may be communicated seasonally (snow kamakura and pink cherry blossoms), while some could be utilized all year round (blue sky and water, lanterns and straw sandals). Unfortunately, such attributes do not exclusively represent the Tohoku region. In other words, they could be found anywhere in Japan. Tohoku's unique attributes (Namahage and Neputa characters), however, are not considered as important as the other common ones (lower factor loadings). Therefore, more effort should be made to promote such uniqueness of Tohoku tourism, especially on social media platforms and other Internet-based portals (e.g., website).

Conclusion
Destination image is no doubt a complicated concept and construct. With the inclusion of the sensory elements, the model of destination image formation is now extended as follows: sensory → cognition → affect. Among the three, affective image is the most important antecedent of visit intention. Yet, affective image is influenced more by sensory images than by cognitive images. Consequently, more emphasis should be put on the presentation and promotion of the sensory elements of destination image, especially in the multisensory environment of social media platforms.
However, several limitations should be taken into account, and certain directions for future studies could be considered. First, the population of the study is not representative. The outcomes, thus, cannot be generalized to the whole inbound tourist market of Tohoku tourism. To attain this goal, future studies could enlarge the scale of the survey to include tourists from other markets and other age or occupation segments. Second, the sample of this study is rather small and the aim of the analysis is exploratory. The study, therefore, could not assess the relative importance of each image item. To solve this issue, future studies could collect a larger sample and use the structural equation modelling technique to reveal the contribution of each attribute and the mutual constraints from other image components. Third, the perceptions of the Facebook-follower community were not examined. In other words, the medium- to long-term exposure to the visual images and sounds displayed on the Facebook page was not taken into account. In the future, an examination or comparison of the short-, medium- and long-term exposures to Facebook information is also recommended. Fourth, the negative images of Tohoku and those circulated by social media users were not examined. Thus, the impacts of the negative images on the visit intention and on the relationship between the positive images and the visit intention remain unknown. These issues may also be further explored in the future.
In addition to the abovementioned directions, certain exploratory findings of this study could also be re-examined in future research. For example, are auditory cues really more powerful than visual images? If so, how could the other sensory cues be conveyed through sound? Similarly, are sensory images really more significant than cognitive images in the formation of the affective image? If the answer is yes, which components of cognitive image could be retained in and which could be excluded from future promotional efforts? Answers to these questions will help expand the understanding of destination image, one of the most important concepts in destination management and marketing.