12. Cool technology..!! Can I use it in my application?
My pleasure. Here it is..
Oh..!! But, it is not made for me. Can’t make use of it as is
13.
14. Kindly let me get the technology you have
Kindly let me understand your needs
15.
16. HELP..!! I have too much data. Your technology is not helping me
mmm… Let me check with my good friends there.
Cool DBMS technology..!! Can I use it in my application?
My pleasure. Here it is..
Oh..!! But, it is not made for me. Can’t make use of it as is
17.
18. Kindly let me understand your needs
Kindly let me get the technology you have
19.
20.
21.
22.
23. HELP..!! Again, I have too much data. Your technology is not helping me
Sorry, seems like the DBMS technology cannot scale more
Let me check with my other good friends there.
Cool MapReduce technology..!! Can I use it in my application?
My pleasure. Here it is..
Oh..!! But, it is not made for me. Can’t make use of it as is
24.
25. Kindly let me understand your needs
Kindly let me get the technology you have
26. Kindly let me understand your needs
Kindly let me get the technology you have
aka GeoJinni
27. Tons of Spatial data out there…
■ Smart phones
■ Satellite Images
■ Sensor networks
■ VGI
■ Medical data
■ Traffic data
■ Geotagged Microblogs
■ Geotagged pictures
28. GeoJinni
Website: http://spatialhadoop.cs.umn.edu/
Download source code, binary distribution, and instructions
Email us at: shadoop@cs.umn.edu
■ Released in March 2013; 75,000 downloads since then
■ Spatial language
■ Built-in spatial data types
■ Spatial Indexes
■ Spatial Operations
29. The Built-in Approach of GeoJinni
■ The On-top Approach
User Programs / Pig Latin / Hadoop Java APIs / Job Monitoring and Scheduling / MapReduce Runtime / Storage (HDFS)
■ The From Scratch Approach
(Spatial) User Program + MapReduce APIs + Job Monitoring and Scheduling + MapReduce Runtime + Storage + …
■ The Built-in Approach (GeoJinni)
User Programs / Pig Latin / Hadoop Java APIs / Job Monitoring and Scheduling / MapReduce Runtime / Storage (HDFS)
Spatial Modules: Spatial Language, Spatial Operators, Early Pruning, Spatial Indexing
30. Spatial Data & Hadoop
Hadoop
points = LOAD 'points' AS
    (id:int, x:int, y:int);
result = FILTER points BY
    x < xmax AND x >= xmin AND
    y < ymax AND y >= ymin;
Takes 193 seconds
GeoJinni
points = LOAD 'points' AS
    (id:int, location:point);
result = FILTER points BY
    IsOverlap(location, rectangle
    (xmin, ymin, xmax, ymax));
Finishes in 2 seconds
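The two queries on this slide ask for the same records; only the way the predicate is expressed differs. The sketch below, in Python rather than Pig Latin, illustrates that equivalence. It assumes that IsOverlap on a point and a rectangle uses the same half-open bounds as the explicit comparisons (the slide does not state this), and the names `Point`, `Rectangle`, `is_overlap`, `filter_by_coordinates`, and `filter_by_location` are hypothetical, not GeoJinni APIs.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


class Rectangle(NamedTuple):
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def is_overlap(location: Point, rect: Rectangle) -> bool:
    # Assumed semantics: lower bounds inclusive, upper bounds exclusive,
    # mirroring "x >= xmin AND x < xmax AND y >= ymin AND y < ymax".
    return (rect.xmin <= location.x < rect.xmax
            and rect.ymin <= location.y < rect.ymax)


def filter_by_coordinates(rows, xmin, ymin, xmax, ymax):
    # Plain version: rows are (id, x, y) and the predicate is spelled out.
    return [(i, x, y) for (i, x, y) in rows
            if x < xmax and x >= xmin and y < ymax and y >= ymin]


def filter_by_location(rows, rect):
    # Typed version: rows are (id, Point) and the predicate is one call.
    return [(i, loc) for (i, loc) in rows if is_overlap(loc, rect)]
```

Because the typed predicate names a rectangle explicitly, a system can recognize it as a range query; the rest of the deck attributes the speedup to spatial indexing and early pruning, not to the predicate itself.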
36. Range Query
■ SpatialFileSplitter prunes blocks outside the query range
■ SpatialRecordReader passes local indexes to the map function
■ Map function selects records in range
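The three steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not GeoJinni's implementation: each block carries a bounding rectangle, the "local index" is simply the block's points sorted by x, and `Rect`, `Block`, `split_files`, `read_records`, `map_range`, and `range_query` are hypothetical names.

```python
from bisect import bisect_left
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Rect") -> bool:
        # Half-open rectangles: touching edges do not count as overlap.
        return (self.xmin < other.xmax and other.xmin < self.xmax
                and self.ymin < other.ymax and other.ymin < self.ymax)

    def contains(self, p: Point) -> bool:
        return self.xmin <= p[0] < self.xmax and self.ymin <= p[1] < self.ymax


class Block(NamedTuple):
    mbr: Rect            # boundary of the block (assumed to come from a global index)
    points: List[Point]  # records sorted by x, standing in for a local index


def split_files(blocks: List[Block], query: Rect) -> List[Block]:
    # Step 1: prune blocks whose boundary lies outside the query range.
    return [b for b in blocks if b.mbr.intersects(query)]


def read_records(block: Block) -> List[Point]:
    # Step 2: pass the block's local index to the map function.
    return block.points


def map_range(local_index: List[Point], query: Rect) -> List[Point]:
    # Step 3: select records in range, skipping points left of the query
    # with a binary search and stopping once x passes the right edge.
    start = bisect_left(local_index, (query.xmin, float("-inf")))
    selected = []
    for p in local_index[start:]:
        if p[0] >= query.xmax:
            break
        if query.contains(p):
            selected.append(p)
    return selected


def range_query(blocks: List[Block], query: Rect) -> List[Point]:
    result = []
    for block in split_files(blocks, query):
        result.extend(map_range(read_records(block), query))
    return result
```

With two blocks covering x in [0, 10) and [10, 20), a query confined to the first block never reads the second one, which is the early pruning the slide refers to.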
37. CG_Hadoop
■ Make use of GeoJinni to speed up computational geometry algorithms
Polygon union, Skyline, Convex Hull, Farthest/Closest Pair
■ Single machine implementation
E.g., Skyline of 4 billion points takes three hours
■ Straightforward implementation in Hadoop
Hadoop parallel execution
■ More efficient implementation in GeoJinni
Spatial indexing
Early pruning
■ Free open source as part of GeoJinni
Speedup: Single Machine 1x, Hadoop 29x, GeoJinni 260x
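As an illustration of the "parallel execution" idea for one of the listed operations, here is a minimal skyline sketch in Python. It is not CG_Hadoop's code. It assumes max-max dominance (a point dominates another if it is at least as large in both x and y and differs from it), and `skyline` and `partitioned_skyline` are hypothetical names. Each partition computes a local skyline, as parallel tasks might, and the local results are merged by one more skyline pass.

```python
def skyline(points):
    # Scan points from largest x to smallest; a point is kept only if its y
    # exceeds every y seen so far, i.e. nothing to its right dominates it.
    best_y = float("-inf")
    result = []
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def partitioned_skyline(partitions):
    # Local step: one skyline per partition (independent, so parallelizable).
    local = [p for part in partitions for p in skyline(part)]
    # Global step: the skyline of the local skylines is the overall skyline,
    # because a point dominated inside its partition is dominated globally.
    return skyline(local)
```

This sketch covers only the partition-and-merge structure; the spatial indexing and early pruning that the slide credits for the larger speedup are not modeled here.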
38. Convex Hull
Find the minimal convex polygon that contains all points
[Figure: input points and output convex hull]
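For reference, one standard way to compute a convex hull on a single machine is Andrew's monotone chain algorithm, sketched below in Python. The slides do not say which algorithm is used, so this is a swapped-in textbook technique; `cross`, `convex_hull`, and `partitioned_convex_hull` are hypothetical names. The second function shows the partition-and-merge idea: the hull of the union of per-partition hulls equals the hull of all points.

```python
def cross(o, a, b):
    # z-component of (a - o) x (b - o); positive means a left (counter-clockwise) turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    # Andrew's monotone chain: returns hull vertices in counter-clockwise
    # order starting from the lexicographically smallest point.
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would create a right turn or a collinear triple.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def partitioned_convex_hull(partitions):
    # Compute a hull per partition, then the hull of all partial hull vertices.
    partial = [p for part in partitions for p in convex_hull(part)]
    return convex_hull(partial)
```

For example, `convex_hull([(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)])` drops the interior point `(2, 2)` and returns the four corners.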
42. Multi-level Image
■ Many images at different zoom levels
Pan
Zoom in/out
Fly to
■ More details as the zoom level increases
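A common way to organize images at several zoom levels is a tile pyramid, where level z splits the data space into a 2^z by 2^z grid of tiles. The slide does not specify the layout, so the Python sketch below is an assumption for illustration only; `tile_of` and `tiles_for_point` are hypothetical names, and the default data space is the unit square.

```python
def tile_of(x, y, zoom, space=(0.0, 0.0, 1.0, 1.0)):
    # Map a point to the (zoom, column, row) of the tile containing it,
    # assuming level `zoom` is a 2**zoom by 2**zoom grid over `space`.
    xmin, ymin, xmax, ymax = space
    n = 1 << zoom
    col = min(int((x - xmin) / (xmax - xmin) * n), n - 1)  # clamp the far edge
    row = min(int((y - ymin) / (ymax - ymin) * n), n - 1)
    return zoom, col, row


def tiles_for_point(x, y, max_zoom, space=(0.0, 0.0, 1.0, 1.0)):
    # A point contributes to one tile at every level from 0 to max_zoom.
    return [tile_of(x, y, z, space) for z in range(max_zoom + 1)]
```

Under this layout each extra level quadruples the number of tiles, which matches the slide's point that more detail appears as the zoom level increases.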
43. MNTG - World-wide traffic generator for road networks
http://mntg.cs.umn.edu/
M. F. Mokbel, L. Alarabi, J. Bao, A. Eldawy, A. Magdy, M. Sarwat, E. Waytas, and S. Yackel. MNTG: An Extensible Web-based Traffic Generator. In SSTD, 2013
44. SHAHED – A tool for querying and visualizing spatio-temporal satellite data
http://shahed.cs.umn.edu/
A. Eldawy et al. SHAHED: A MapReduce-based System for Querying and Visualizing Spatio-temporal Satellite Data. In ICDE, 2015
48. TAREEG – Web-based extractor for OpenStreetMap data using MapReduce
http://tareeg.net/
L. Alarabi, A. Eldawy, R. Alghamdi, and M. F. Mokbel. TAREEG: A MapReduce-Based Web Service for Extracting Spatial Data from OpenStreetMap. In SIGMOD, 2014
50. GeoJinni
Analyze your spatial data efficiently
■ Language: Spatial high level language
Interact with the system and express your queries in a simple high level language with built-in spatial support
■ Data types: Built-in spatial data types
Have all your spatial datasets ready to load in SpatialHadoop with the built-in spatial data types
■ Indexes: Spatial Indexes
Datasets are organized efficiently using spatial indexes (Grid or R-tree) that are adapted to MapReduce
■ Operations: Efficient Spatial Operations
Analyze your data on large clusters with built-in spatial operations that run efficiently using spatial indexes
Website: http://spatialhadoop.cs.umn.edu/
Download source code, binary distribution, and instructions
Email us at: shadoop@cs.umn.edu