From Place2Vec to Multi-Scale Built-Environment Representation: A General-Purpose Distributional Embedding for Urban Data Analysis
METADATA ONLY
Date
2020-11
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
Built environments such as cities, roads, and communities are rich sources of urban data. Many downstream applications, such as geographic information retrieval, recommender systems, geographic knowledge graphs, and, more generally, the understanding of urban spaces, require comprehensive analysis of this data [28]. Points of Interest (POIs), one of the most researched aspects of urban data, have been successfully modeled using concepts borrowed from Machine Learning (ML) and Natural Language Processing (NLP). Place2Vec [28] proposes a Word2Vec-like statistical model that represents spatial adjacency in a continuous embedding space. This method successfully models the functional semantics of POIs, as measured by several evaluations based on human assessment. However, although the Place2Vec model addresses the distributional heterogeneity within a given spatial context through ITDL augmentation, it does not address the spatial heterogeneity among different regions. To solve this problem, we propose a hierarchical, density-based, self-adjusting clustering mechanism. The boundary between relatedness and unrelatedness is learned from the given context: denser areas have tighter bounds, while sparser areas have looser ones. We train our model on both the baseline Yelp hierarchical dataset [28] and our OpenStreetMap dataset. We demonstrate that 1) our model significantly improves performance on 2 of the 3 baseline tasks as well as the stability of training, and 2) our model generalizes well across 112 cities of radically different scales (minimum 1,725 POIs, maximum 2,694,070 POIs), regions (North America, Europe, Asia, Africa), and types (commercial, tourist, industrial, etc.) without adjusting or tuning any hyperparameters. We also demonstrate that our model can be used to discover interesting facts about cities, such as inter-city semantic analogies and intra-city connectivity, which can be useful in urban planning, social computing, and public policy making.
© 2020 Association for Computing Machinery.
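The idea that denser areas get tighter bounds while sparser areas get looser ones can be illustrated with a minimal sketch. It is not the hierarchical clustering mechanism of the paper: it substitutes a much simpler k-nearest-neighbour radius as the density-adaptive bound, and the function names, the toy coordinates, the POI labels, and the parameter `k` are all assumptions made for illustration.

```python
import math


def adaptive_radius(points, k):
    """For each point, return the distance to its k-th nearest neighbour.

    Hypothetical stand-in for a density-adaptive bound: the radius is
    small where points are dense and large where they are sparse.
    """
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


def context_pairs(points, labels, k=2):
    """Build (center, context) label pairs using each point's own radius.

    Pairs of this kind are what a Word2Vec-like (skip-gram) trainer
    would consume; the training step itself is not shown here.
    """
    radii = adaptive_radius(points, k)
    pairs = []
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if i != j and math.dist(p, q) <= radii[i]:
                pairs.append((labels[i], labels[j]))
    return pairs


# Toy data (assumed): a dense block of POIs and a sparse rural group.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (13.0, 10.0), (10.0, 14.0)]
labels = ["cafe", "bar", "bakery", "farm", "barn", "silo"]

radii = adaptive_radius(points, k=2)
pairs = context_pairs(points, labels, k=2)
```

With a single fixed radius, either the sparse group would have no context at all or the dense group would be merged with everything nearby; a per-point bound avoids having to tune that radius per city, which is the motivation the abstract gives for a self-adjusting mechanism.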
Publication status
published
Book title
LocalRec'20: Proceedings of the 4th ACM SIGSPATIAL Workshop on Location-Based Recommendations, Geosocial Networks, and Geoadvertising
Pages / Article No.
1
Publisher
Association for Computing Machinery
Event
28th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2020) (virtual)
Subject
Points of Interest; Similarity; Geo-Semantics; Machine Learning
Notes
Due to the coronavirus (COVID-19) pandemic, the conference was held virtually.