Discovering Geo-dependent Stories by Combining Density-based Clustering and Thread-based Aggregation techniques
Cerezo-Costas, Héctor; Vilas, Ana Fernández; Martín-Vicente, Manuela; Díaz-Redondo, Rebeca P.
arXiv.org Artificial Intelligence
Citizens actively interact with their surroundings, especially through social media. Shared posts give important information about what is happening (from the users' perspective), and the metadata linked to these posts also offer relevant data, such as the GPS location in Location-based Social Networks (LBSNs). In this paper we introduce a global analysis of geo-tagged social media posts which supports (i) the detection of unexpected behavior in the city and (ii) the analysis of the posts to infer what is happening. The former is obtained by applying density-based clustering techniques, whereas the latter is a consequence of applying natural language processing. We have applied our methodology to a dataset obtained from seven months of Instagram activity in New York City, obtaining promising results. The developed algorithms require very few resources: they can analyze millions of data points on commodity hardware in less than one hour without applying complex parallelization techniques. Furthermore, the solution can be easily adapted to other geo-tagged data sources without extra effort.

Nowadays, users are the main source of alternative sensor information in a city, although this huge source of information is often overlooked. Being ubiquitously connected to the Internet through their mobile phones, they intensively use services that promote user-generated content, such as Online Social Networks (OSNs), one of the most widely used alternatives. Content in OSNs is a combination of text/images (e.g. a user post, a reply to other users' posts, etc.) and metadata (number of likes, stars of user posts, number of posts made by the user, GPS location, etc.).
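To make the density-based clustering step named in the abstract more concrete, here is a minimal sketch of how geo-tagged posts could be grouped by spatial density. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which density-based algorithm is used, so plain DBSCAN with a haversine distance is assumed here, and the names `haversine_m`, `dbscan`, `eps_m` and `min_pts`, as well as the sample coordinates, are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def dbscan(points, eps_m, min_pts):
    """Assumed algorithm (DBSCAN): return one label per point, -1 meaning noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force scan over all points; the result includes point i itself.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


# Illustrative coordinates only (not taken from the paper's dataset):
# three posts within a few tens of metres of each other, one far away.
posts = [(40.7580, -73.9855), (40.7581, -73.9856),
         (40.7579, -73.9854), (40.6892, -74.0445)]
labels = dbscan(posts, eps_m=50, min_pts=3)
print(labels)  # [0, 0, 0, -1]
```

The brute-force neighbour search is quadratic in the number of posts, so this sketch would not reach the throughput reported in the abstract; a spatial index (for example a grid or a ball tree) would likely be needed for millions of data points.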
Dec-18-2023
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe
- North America > United States
- Massachusetts > Middlesex County
- Reading (0.04)
- New York (0.25)
- Genre:
- Research Report (0.40)
- Industry:
- Information Technology > Services (1.00)
- Technology: