Contextualization of Geographical Scraped Data to Support Human Judgment and Classification
In information extraction and data mining for security, data cleaning is a prerequisite that deeply influences the final result. This is particularly true for data scraped automatically from online sources (web pages) that contain geographical or georeferenced information, such as movement reports. In this paper we present a model, and a first partial implementation, for resolving string descriptions to locations. The domain is the monitoring and analysis of maritime container traffic, relying on the status messages generated automatically by container carriers. The model combines different data dimensions, such as string similarity, trajectory similarity, and most frequent patterns. It relies on collected historical data to compute segments of the containers' trajectories. The implemented interface integrates the three spaces through a map-based view. This functionality supports human experts in associating a location with the string description provided in the raw record, in order to increase the number of messages usable for trajectory analysis. One of the final objectives of the ConTraffic project, within which this activity is carried out, is to provide additional information sources to customs authorities for container risk evaluation.
MAZZOLA Luca;
TSOIS Aris;
DIMITROVA Tatyana;
CAMOSSI Elena;
2014-08-18
The Institute of Electrical and Electronics Engineers (IEEE)
JRC81823
978-0-7695-5062-6
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6657143
https://publications.jrc.ec.europa.eu/repository/handle/JRC81823
10.1109/EISIC.2013.33