Understanding tourists’ urban images with geotagged photos using convolutional neural networks

Spat. Inf. Res.
https://doi.org/10.1007/s41324-019-00285-x

Dongeun Kim1 • Youngok Kang1 • Yerim Park1 • Nayeon Kim1 • Juyoon Lee1

Received: 4 March 2019 / Revised: 21 July 2019 / Accepted: 25 July 2019
© The Author(s) 2019

Corresponding author: Youngok Kang
1 Department of Social Studies, Ewha Womans University, Seoul, South Korea

Abstract

This study identifies the representative images and elements of sightseeing attractions by applying an image mining technique to the photos that tourists in Seoul uploaded to Flickr. To this end, we crawled the photos uploaded to Flickr and classified their users as residents or tourists; delineated 11 regions of attraction (RoA) in Seoul by analyzing the spatial density of the photos; classified the photos into 1000 categories with the Inception v3 model and then grouped those 1000 categories into 14; and analyzed the characteristics of the photo images by RoA. The key findings are that tourists are interested in old palaces, historical monuments, stores, food, etc., and that these key elements differ across the major sightseeing attractions in Seoul. More specifically, tourists are more interested in palaces and cultural assets in Jongno and Namsan; food and restaurants in Shinchon, Hongdae, Itaewon, Yeouido, Garosu-gil, and Apgujeong; war monuments or specific artifacts at the War Memorial and the National Museum of Korea; facilities, temples, and pictures of cultural properties around Samsung Station; and toyshops in Jamsil. This study is meaningful in three respects. First, it analyzes urban image through the photos that tourists post on SNS. Second, it uses a deep learning technique to analyze the photos. Third, it classifies and analyzes all of the photos posted by Seoul tourists, whereas most other studies focus only on specific objects. However, the study has a limitation: the Inception v3 model used in this research is a pre-trained model trained on ImageNet data. In future research, it is necessary to classify photo categories according to the purpose of tourism and to retrain the model on a new training data set that focuses on elements of Korea.

Keywords Image data mining · Geotagged photos · Flickr · Convolutional neural network · Inception v3 model

1 Introduction

Today, people share posts such as texts, images, and videos with others via Social Network Services (SNS) regardless of time and location. Moreover, the geotagged photos that tourists upload to these sites reveal the tourists’ perceptions and actions as well as the images they hold of sightseeing attractions [1]. Because the images of tourist sites are closely associated with tourists’ attraction and intention, they serve as a reference for other tourists who seek to travel to those sites [2]. In addition, because tourist images on SNS are continually produced and reproduced, we can ascertain the perceptions and trends of representative sightseeing elements and locations by analyzing the images uploaded to SNS. Furthermore, this process contributes to basic research on tourism for discovering, developing, and improving sightseeing attractions [3].

Because geotagged photos contain locational information, we think it is possible to conduct in-depth analysis by combining the extracted information with existing methodologies of spatial data analysis. Flickr data are especially useful because they contain location and time information that is automatically associated with the photo metadata. Previous studies that have used geotagged data on SNS have mostly explored the locations that users occupied [4–6], the patterns of movement [7, 8], and the texts of uploaded photos [9–16].
However, as image analysis using deep learning technology has become available, studies using the photos posted on SNS have been increasing. Examples of studies that analyze photos posted on SNS include the classification of food [11], the analysis of bird observations by experts and ordinary people [17], and the estimation of weather preferences from visits to specific places [18]. Most of these studies focus on photos that contain specific objects. No study has yet analyzed tourists’ image of an area by classifying all of the photos posted by the tourists who visit that area.

The purpose of this study is to identify representative images and elements of sightseeing attractions by analyzing the photos uploaded to Flickr by Seoul tourists. For this purpose, first, we crawled the photos uploaded to Flickr, a Social Network Service (SNS) platform on which people can share geotagged photos, and classified users as residents or tourists. Second, we delineated 11 regions of attraction (RoA) in Seoul by analyzing the spatial density of the photos uploaded by tourists. Third, we classified the photos into 1000 categories using the Inception v3 model, a convolutional neural network (CNN) used in deep learning, and then grouped the 1000 categories into 14. Finally, we analyzed the characteristics of the photo images by RoA.

2 Research on image data mining via convolutional neural networks

Image data mining is the process of extracting information or knowledge from image data [19]. Recently, with the increase in the volume of image data and the improvement of training algorithms, image data mining techniques based on artificial neural networks have been applied to various fields such as medicine, environmental studies, information science, and computer graphics [20].
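As a rough illustration of the grouping step outlined in the Introduction (fine-grained classifier labels collapsed into a smaller set of categories and then summarized per RoA), a minimal Python sketch follows. The label-to-group mapping, the group names, and the sample records are hypothetical; they are not the 14 categories or the data used in this study. Each label stands in for the top-1 output that a pre-trained classifier such as Inception v3 might return for one photo.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from fine-grained labels to coarse groups.
# Illustrative only: these are NOT the 14 categories used in the study.
LABEL_TO_GROUP = {
    "palace": "heritage",
    "monastery": "heritage",
    "restaurant": "food",
    "plate": "food",
    "toyshop": "shopping",
}


def group_label(label, default="other"):
    """Map a fine-grained label to its coarse group (unmapped labels -> default)."""
    return LABEL_TO_GROUP.get(label, default)


def category_share_by_region(photos):
    """Return {region: {group: share}} from an iterable of (region, label) pairs."""
    counts = defaultdict(Counter)
    for region, label in photos:
        counts[region][group_label(label)] += 1

    shares = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        shares[region] = {group: n / total for group, n in counter.items()}
    return shares


# Made-up records: (region of attraction, assumed top-1 label for one photo).
photos = [
    ("Jongno", "palace"),
    ("Jongno", "palace"),
    ("Jongno", "restaurant"),
    ("Jongno", "umbrella"),
    ("Jamsil", "toyshop"),
    ("Jamsil", "plate"),
]

shares = category_share_by_region(photos)
for region, dist in shares.items():
    print(region, dist)
```

Keeping the mapping in a plain dictionary makes the grouping easy to inspect and revise, which matters when the fine-grained labels come from a general-purpose model rather than one trained for tourism.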
The convolutional neural network (CNN), one type of artificial neural network, was developed based on neurological knowledge of the visual cortex of humans and animals [21]. Because CNNs have proven effective at distinguishing and categorizing photo images, they are now used in most image data mining research. A CNN is basically composed of three types of layers: convolutional layers, pooling layers, and fully connected layers. One can not only produce a variety of models by changing the CNN configuration but also train the CNN by scanning the characteristics of images.

Research on classifying images by category using CNNs has been actively conducted in the field of medicine. Krishnan et al. [22] categorized liver diseases that appeared in ultrasound images. Sawant et al. [23] detected brain cancer through MRI, and Motlagh et al. [24] distinguished breast cancer from images of histopathological samples. CNNs have also been applied in other fields of image mining. Park and Shim [25] established a model for discerning genre from the images of movie posters, taking inspiration from the thought that elements such as (...truncated)