Spatial Reasoning
Parallel Computation of PDFs on Big Spatial Data Using Spark
Liu, Ji, Lemus, Noel Moreno, Pacitti, Esther, Porto, Fabio, Valduriez, Patrick
We consider big spatial data, which is typically produced in scientific areas such as geological or seismic interpretation. The spatial data can be produced by observation (e.g. using sensors or soil instruments) or numerical simulation programs and correspond to points that represent a 3D soil cube area. However, errors in signal processing and modeling create some uncertainty, and thus a lack of accuracy in identifying geological or seismic phenomena. Such uncertainty must be carefully analyzed. To analyze uncertainty, the main solution is to compute a Probability Density Function (PDF) of each point in the spatial cube area. However, computing PDFs on big spatial data can be very time-consuming (from several hours to even months on a parallel computer). In this paper, we propose a new solution to efficiently compute such PDFs in parallel using Spark, with three methods: data grouping, machine learning prediction and sampling. We evaluate our solution by extensive experiments on different computer clusters using big data ranging from hundreds of GB to several TB. The experimental results show that our solution scales up very well and can reduce the execution time by a factor of 33 (in the order of seconds or minutes) compared with a baseline method.
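The abstract names data grouping as one of its three methods but does not describe it. As a rough illustration only, the sketch below assumes that grouping means points with identical observation values share a single PDF fit, and it assumes a normal distribution for the fit. It is plain Python rather than Spark, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def fit_normal(values):
    """Fit a normal PDF (mean, standard deviation) to one point's values."""
    return (mean(values), stdev(values))


def pdfs_with_grouping(observations):
    """Fit one PDF per distinct observation vector and share it across points.

    observations: dict mapping a point id (x, y, z) to a tuple of values.
    Returns (pdf_by_point, number_of_fits_performed).
    """
    # Assumption: points with identical value vectors form one group.
    groups = defaultdict(list)
    for point, values in observations.items():
        groups[tuple(values)].append(point)

    pdf_by_point = {}
    for values, points in groups.items():
        params = fit_normal(values)  # the costly step runs once per group
        for point in points:
            pdf_by_point[point] = params
    return pdf_by_point, len(groups)


# Hypothetical data: two of the three points carry the same values.
observations = {
    (0, 0, 0): (1.0, 2.0, 3.0),
    (0, 0, 1): (1.0, 2.0, 3.0),
    (0, 1, 0): (2.0, 4.0, 6.0),
}
pdf_by_point, n_fits = pdfs_with_grouping(observations)  # n_fits is 2, not 3
```

Under these assumptions the saving grows with the number of points that share values. In a cluster setting, each group could plausibly become one independent task.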
Cluster-based trajectory segmentation with local noise
Damiani, Maria Luisa, Hachem, Fatima, Hamza, Issa, Ranc, Nathan, Moorcroft, Paul, Cagnacci, Francesca
We present a framework for the partitioning of a spatial trajectory into a sequence of segments based on spatial density and temporal criteria. The result is a set of temporally separated clusters interleaved by sub-sequences of unclustered points. A major novelty is the proposal of an outlier or noise model based on the distinction between intra-cluster (local noise) and inter-cluster noise (transition): the local noise models the temporary absence from a residence, while the transition models the definitive departure towards a next residence. We analyze in detail the properties of the model and present a comprehensive solution for the extraction of temporally ordered clusters. The effectiveness of the solution is evaluated first qualitatively and next quantitatively by contrasting the segmentation with ground truth. The ground truth consists of a set of trajectories of labeled points simulating animal movement. Moreover, we show that the approach can streamline the discovery of additional derived patterns, by presenting a novel technique for the analysis of periodic movement. From a methodological perspective, a valuable aspect of this research is that it combines the theoretical investigation with the application and external validation of the segmentation framework. This paves the way to an effective deployment of the solution in broad and challenging fields such as e-science.
A Trajectory Calculus for Qualitative Spatial Reasoning Using Answer Set Programming
Baryannis, George, Tachmazidis, Ilias, Batsakis, Sotiris, Antoniou, Grigoris, Alviano, Mario, Sellis, Timos, Tsai, Pei-Wei
Spatial information is often expressed using qualitative terms such as natural language expressions instead of coordinates; reasoning over such terms has several practical applications, such as bus route planning. Representing and reasoning on trajectories is a specific case of qualitative spatial reasoning that focuses on moving objects and their paths. In this work, we propose two versions of a trajectory calculus based on the allowed properties over trajectories, where trajectories are defined as a sequence of non-overlapping regions of a partitioned map. More specifically, if a given trajectory is allowed to start and finish at the same region, 6 base relations are defined (TC-6). If a given trajectory should have different start and finish regions but cycles are allowed within, 10 base relations are defined (TC-10). Both versions of the calculus are implemented as ASP programs; we propose several different encodings, including a generalised program capable of encoding any qualitative calculus in ASP. All proposed encodings are experimentally evaluated using a real-world dataset. Experimental results show that the best performing implementation can scale up to an input of 250 trajectories for TC-6 and 150 trajectories for TC-10 for the problem of discovering a consistent configuration, a significant improvement compared to previous ASP implementations for similar qualitative spatial and temporal calculi. This manuscript is under consideration for acceptance in TPLP.
Spatial Data Science and Applications Coursera
About this course: Spatial (map) data is considered a core infrastructure of the modern IT world, which is substantiated by business transactions of major IT companies such as Apple, Google, Microsoft, Amazon, Intel, and Uber, and even motor companies such as Audi, BMW, and Mercedes. Consequently, they are bound to hire more and more spatial data scientists. Based on this business trend, this course is designed to give a firm understanding of spatial data science to learners who have a basic knowledge of data science and data analysis, and eventually to differentiate their expertise from that of other data scientists and data analysts. Additionally, this course could make learners realize the value of spatial big data and the power of open source software to deal with spatial data science problems. In the first week, this course will start with defining spatial data science and answering why spatial is special from three different perspectives: business, technology, and data.
Towards integrating spatial localization in convolutional neural networks for brain image segmentation
Ganaye, Pierre-Antoine, Sdika, Michaël, Benoit-Cattin, Hugues
Semantic segmentation is an established yet rapidly evolving field in medical imaging. In this paper we focus on the segmentation of brain Magnetic Resonance Images (MRI) into cerebral structures using convolutional neural networks (CNN). CNNs achieve good performance by finding effective high-dimensional image features describing the patch content only. In this work, we propose different ways to introduce spatial constraints into the network to further reduce prediction inconsistencies. A patch-based CNN architecture was trained, making use of multiple scales to gather contextual information. Spatial constraints were introduced within the CNN through a distance-to-landmarks feature or through the integration of a probability atlas. We demonstrate experimentally that using spatial information helps to reduce segmentation inconsistencies.
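The abstract does not say how the distance to landmarks is computed or which landmarks are used. A minimal sketch, assuming Euclidean distance in voxel coordinates and a hypothetical list of landmark positions, could look like this; the function name is illustrative and is not taken from the paper.

```python
import math


def landmark_distance_features(patch_center, landmarks):
    """Return the Euclidean distance from a patch center to each landmark.

    Assumption: both are (x, y, z) coordinates in the same voxel space.
    """
    return [math.dist(patch_center, landmark) for landmark in landmarks]


# Hypothetical patch center and two hypothetical landmarks.
features = landmark_distance_features(
    (3.0, 4.0, 0.0),
    [(0.0, 0.0, 0.0), (3.0, 4.0, 12.0)],
)
```

Such a vector could be concatenated with the image features of a patch so that the network sees where the patch lies in the brain as well as what it contains.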
Learning from and improving upon ggplotly conversions
For a quick demonstration of geom_sf(), I'm using albersusa to access the laea projected boundaries of the United States as a simple features (sf) data structure, but sf also makes it easy to read various file formats and even convert various spatial objects to sf. There are also a bunch of other R packages that, like albersusa, make it easy to query geo-spatial data as sf data. The "Reverse dependencies" section of sf's CRAN page is a good place to discover them, but just to name a few: tidycensus, rnaturalearth, and mapsapi. One awesome consequence of using sf is that, since the data structure contains all the geo-spatial information, both plot() and geom_sf() just work™. The most brilliant thing about sf is that it stores geo-spatial structures in a special list-column of a data frame.
Visualizing geo-spatial data with sf and plotly
Here's a quick example of reading a shapefile into R as simple features via st_read(), then plotting those features (in this case, North Carolina counties) using each one of the four mapping approaches plotly provides. You might be wondering, "What can plotly offer over other interactive mapping packages such as leaflet, mapview, mapedit, etc?". One big feature is the linked brushing framework, which works best when linking plotly together with other plotly graphs (i.e., only a subset of brushing features are supported when linking to other crosstalk-compatible htmlwidgets). Another is the ability to leverage the plotly.js
The Future is in IoT, AI, Robotics – Rajesh Alla, IIC Technologies
Our lives today, and in the future, will necessarily pivot around the digitization of objects in the universe, through efficient land, sea, and aerial surveys. The data collected will embed locational intelligence that will help us create maps with enhanced and meaningful spatial properties. These maps will form the substrate upon which the DNA of physical objects and their thematic properties will be seamlessly interwoven. The resulting rich datasets will become amenable to real-time analysis through Cloud computing that can be shared anytime, anywhere! Temporal resolution of the data is going to be crucial for real-time and near-real-time applications and thus controlled crowdsourcing with automated validation tools is bound to lead to more opportunities.
Clustering to Reduce Spatial Data Set Size
Traditionally, the problem was that researchers did not have access to enough spatial data to answer pressing research questions or build compelling visualizations. Today, however, the problem is often that we have too much data. Spatially redundant or approximately redundant points may refer to a single feature (plus noise) rather than many distinct spatial features. We can use density-based clustering to compress such spatial data into a set of representative features. This paper demonstrates how to reduce the size of a spatial data set of GPS latitude-longitude coordinates using the Python programming language and its scikit-learn implementation of the DBSCAN density-based clustering algorithm. DBSCAN works very well in low-dimensional space, such as the two-dimensional feature space in this geospatial example.
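To make the idea concrete without extra dependencies, the sketch below uses a small hand-written DBSCAN in plain Python rather than the scikit-learn implementation the paper relies on. The haversine distance, the parameter values, the sample coordinates, and the choice of the member nearest each cluster centroid as its representative are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_samples):
    """Label each point with a cluster id, or -1 for noise (brute force)."""
    def neighbors(i):
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)  # includes the point itself
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:  # core point: keep expanding
                queue.extend(reachable)
    return labels


def representatives(points, labels):
    """Keep, per cluster, the member closest to the cluster's mean position."""
    clusters = {}
    for point, label in zip(points, labels):
        if label != -1:
            clusters.setdefault(label, []).append(point)
    reps = []
    for label in sorted(clusters):
        members = clusters[label]
        centroid = (sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members))
        reps.append(min(members, key=lambda p: haversine_km(p, centroid)))
    return reps


# Hypothetical GPS fixes: two dense groups and one isolated point.
points = [
    (41.0000, 2.0000), (41.0005, 2.0005), (41.0010, 2.0010),
    (48.0000, 11.0000), (48.0004, 11.0004), (48.0005, 11.0005),
    (10.0000, 10.0000),
]
labels = dbscan(points, eps_km=1.5, min_samples=2)
reduced = representatives(points, labels)  # 7 points reduced to 2
```

Treating the isolated point as noise and dropping it is a choice made for this sketch; setting min_samples to 1 would instead keep every point in some cluster. The brute-force neighbor search is quadratic, so a real data set would call for a spatial index or the library implementation.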