Location uncertainty in user-generated geographic content

GIMA
M-GEO
M-SE
STAMP
Topic description

As a result of hardware miniaturization and ubiquitous internet access, there is a wealth of user- and sensor-generated geographic content available, from social media activity to private weather stations to embedded sensor technology for managing public transport or optimizing energy consumption.
However, citizens and governments are still far from fully exploiting the potential of these new data sources for improved decision-making, participation, and service provision. One obstacle is that the location of origin, or the extent of the geographic reference, of user-generated geographic content is often uncertain.
Several geosocial media platforms offered precise geolocation via GNSS coordinate metadata. Twitter, however, has dropped this support [1], leaving only less precise location information in the form of place-names at various granularities, either in the metadata or in the unstructured text field. Even on platforms such as OpenStreetMap (OSM), many ‘points’ of interest are, in fact, areal features.
This poses additional challenges for any attempt to make inferences about phenomena from this data, because during analysis the data is often aggregated into spatial units of regular (e.g. raster grids) or irregular (e.g. administrative units) shape [2].
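To illustrate the aggregation problem, the following minimal sketch spreads each observation over the cells of a regular grid in proportion to the overlap between the cell and the bounding box of the place the observation refers to. It assumes that the observation is uniformly distributed within that bounding box and that coordinates are in a projected (metric) system. The function names `spread_over_grid` and `aggregate` are hypothetical and chosen only for this illustration.

```python
import math
from collections import defaultdict


def spread_over_grid(bbox, cell_size):
    """Return {(col, row): weight} for one uncertain observation.

    bbox is (min_x, min_y, max_x, max_y) in projected coordinates.
    Assumption: the true location is uniformly distributed in the bbox,
    so each cell receives the fraction of the bbox area it overlaps.
    """
    min_x, min_y, max_x, max_y = bbox
    area = (max_x - min_x) * (max_y - min_y)
    if area <= 0:
        raise ValueError("bbox must have positive area")

    weights = {}
    # Range of grid columns and rows that can intersect the bbox.
    col_start, col_end = math.floor(min_x / cell_size), math.ceil(max_x / cell_size)
    row_start, row_end = math.floor(min_y / cell_size), math.ceil(max_y / cell_size)
    for col in range(col_start, col_end):
        for row in range(row_start, row_end):
            overlap_x = min(max_x, (col + 1) * cell_size) - max(min_x, col * cell_size)
            overlap_y = min(max_y, (row + 1) * cell_size) - max(min_y, row * cell_size)
            if overlap_x > 0 and overlap_y > 0:
                weights[(col, row)] = overlap_x * overlap_y / area
    return weights


def aggregate(bboxes, cell_size):
    """Sum the fractional weights of many observations per grid cell."""
    totals = defaultdict(float)
    for bbox in bboxes:
        for cell, weight in spread_over_grid(bbox, cell_size).items():
            totals[cell] += weight
    return dict(totals)
```

Calling `aggregate([(0.0, 0.0, 2.0, 1.0), (0.5, 0.5, 1.5, 1.5)], cell_size=1.0)` yields fractional counts per cell that sum to the number of observations. Assigning each observation to a single cell via its centroid would instead hide the fact that the location is uncertain.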
This MSc project has two main components: first, to investigate the precision and granularity of spatial information in user-generated geographic content from platforms such as Twitter or Flickr; second, to develop approaches to render this information usable for spatial analysis and inference.
Depending on the interests and skills of the student, as well as the scope of the MSc thesis research in the program (e.g. 45 ECTS vs. 30 ECTS), the student can address both components, or focus on one of the two (although the second component would benefit from insights of the first).
Useful skills include the willingness to look beyond GIScience disciplinary boundaries (e.g. basic geoparsing and natural language processing for finding and looking up place-names), the ability to script processes (preferably in Python), social data handling and transformation (e.g. using the Twitter API and data munging), and basic (geo)statistics.

Topic objectives and methodology

Quantify and reduce the uncertainty of locations derived from place-names in user-generated geographic content.
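One simple way to put a number on this uncertainty is sketched below. It assumes that a gazetteer lookup returns a latitude/longitude bounding box for a place-name, and it is not a prescribed method for this project. If the box centre is used as the point location, the great-circle distance from the centre to the farthest corner approximates the worst-case positional error. This holds for boxes that are small relative to the Earth and do not cross the antimeridian. The function names and the example box are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def centre_and_max_error_km(bbox):
    """Return ((lat, lon), max_error_km) for a place-name bounding box.

    bbox is (south, west, north, east) in degrees.
    Assumption: the box does not cross the antimeridian and is small
    enough that the farthest point from the centre is one of its corners.
    """
    south, west, north, east = bbox
    lat_c = (south + north) / 2
    lon_c = (west + east) / 2
    max_error = max(
        haversine_km(lat_c, lon_c, lat, lon)
        for lat in (south, north)
        for lon in (west, east)
    )
    return (lat_c, lon_c), max_error


# Hypothetical bounding box, for illustration only.
centre, error_km = centre_and_max_error_km((52.0, 6.0, 52.5, 7.0))
print(f"centre={centre}, worst-case error of about {error_km:.1f} km")
```

Comparing this value across place-names of different granularity (for example a neighbourhood versus a country) gives a first, coarse measure of how much positional precision is lost when coordinates are replaced by place-names.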

References for further reading
  • [1] Hu, Yingjie, and Ruo-Qian Wang. 2020. “Understanding the Removal of Precise Geotagging in Tweets.” Nature Human Behaviour, September. https://doi.org/10.1038/s41562-020-00949-x.

  • [2] Robertson, Colin, and Rob Feick. 2018. “Inference and Analysis across Spatial Supports in the Big Data Era: Uncertain Point Observations and Geographic Contexts.” Transactions in GIS 22 (2): 455–76. https://doi.org/10.1111/tgis.12321.