We need to transition from analysis to synthesis in large-scale, image-based studies of satellite and street-level images.
Large-scale, image-based studies can unlock human potential and address some of the most important societal problems. The real question is whether we are going to do that through analysis alone, or step up and start doing synthesis. Are we only going to study and observe, or are we going to make an impact on society?
Can global image repositories help achieve the UN's Sustainable Development Goals (SDGs)? Can they help us understand the social determinants of health? Satellite imagery, Google Street View, and user-contributed photos from global image repositories are being used for large-scale image-based studies, visual census, and sentiment analysis [Ermon][http://StreetScore.media.mit.edu]. But we need to go beyond simply relying on big data to investigate social questions via remote analysis. We need to transition from analysis to synthesis. For deployable social solutions, we need to consider the full stack of physical devices, organizational interests, and sector-specific resources.
Large-scale image-based studies allow us to predict poverty from daytime and nighttime satellite imagery, which can influence critical decisions for aid and development planning. In the ‘StreetScore’ project, our group has shown that semantic analysis of street-level imagery, such as Google Street View, can provide varied insights into urban perception; our recent project ‘StreetChange’ shows the benefits of time-series data in driving these insights (http://streetchange.media.mit.edu).
We have seen some amazing work, and you'll hear from Stefano about poverty mapping, and from my previous collaborators about population density and crop maps. So there has been fantastic progress in using global imagery in these areas, imagery that is taken from satellites or drones. Street-level imagery is also very widely available, either very structured, like Google Street View, or from user-contributed photos. Nikhil Naik and others in my group have been working on whether we can do sentiment analysis of this imagery, in this case sentiment analysis of perceived safety from Google Street View, and then create citywide maps of perceived safety that can be used by city planners and urban planners. Which is great. But coming back to analysis versus synthesis opportunities, I'm going to give you a flavor of one of the projects we worked on, which is street addresses.
7. Street Address: Assign | Adopt
• 75% of the world population without street addresses
• Timely ambulance delivery
• Stimulate digital economy via eCommerce
• Protect property rights of the marginalized
• Crisis response (Haiti, 48 hrs to coord aid)
8. Beyond Analysis Towards Synthesis
Actionable Insights
Traffic Nudge
Crisis Response
Street Addr
Govt Policies
Large Scale Visual Study
Visual Census
Socio-eco observations
Sentiment maps
13. Analyze Present Future
Low Level: Measure
• Visual Census
• Counting/Density
Present: Streetscore, Streetchange, Visual Census (Fei-Fei Li)
Future: Generative modeling of cities
Mid Level: Understand
• Geolocalization
• Crowds
• 3D Modeling
Present: DeepRoadMapper (Urtasun et al.)
Future: Design suggestions
High Level: Aggregate and Predict
• Economic activity
Present: Jean and Ermon; Jayachandran deforestation study
Future: Real-time data analysis. Actionable information
14. Socio-Economic Inference from Digital Patterns
Satellite Imagery, Street-level Imagery, Aerial Imagery
Phone Mobility Data
Social Networks, Photos
Device
Activity
Built
Environment
Human
Activity
84 countries, no data
Ack: Nikhil Naik
25. Example: Congestion Pricing
(Nikhil Naik et al.)
• Jakarta: Response of vehicles by types
• CV + traffic cameras across the city to detect types + number of vehicles
• Understand traffic flows before/after congestion pricing is introduced
• Adjust rates for different types of vehicles/ in different areas
27. Street Addresses from Satellite Imagery
İlke Demir, Forest Hughes,
Aman Raj, Kaunil Dhruv,
Suryanarayana M. Muddala,
Sanyam Garg, Barrett Doo,
Ramesh Raskar
IJGI 2018
28. Thanks
Nikhil Naik Tristan Swedish Praneeth Vepakomma Ilke Demir
Forest Hughes Jatin Malhotra SuryaNarayana Murthy
Kaunil Dhruv Aman Raj Barrett Doo Praveen Gedam Anna Roy
Cesar A. Hidalgo Guan Pang Jing Huang Daniel Aliaga
Manohar Paluri Pierre Roux Yael Maguire Leo Tsourides Divyaa
Ravichandran Sanyam Garg Sai Sri Sathya Grace Kermani
Tobias Tiecke Andreas Gros Santanu Bhattacharya
Kabir Rustogi Will Marshall
29. Pilot 1:
Will people use street names?
Community assigned names + Signs
In use after 6 months
30. Pilot 2
Will SMEs use the labeling scheme?
Measured relative efficiency for Pizza delivery
33. What3words:
A: parrot.casino.failed
B: issuer.lollipop.ripe
- Irrelevant words
based on lat/lon.
Robocodes:
75D.NE27.Dhule.MhIn
76C.NE27.Dhule.MhIn
- Hierarchical and
linear addresses.
Google Maps:
Near Green Park
Near Green Park
- No street names or numbers
Point vs Edges for Geometric Queries
34. Addressing Schemes Around the World
London postal code system:
Radial regions based on orientation and distance
South Korea streets:
Meter markers
Japan block system:
Hard to decipher
Dubai addressing:
Uses districts
Berlin numbering:
Zigzag house pattern
35. Robocode Scheme
• 5 alphanumeric fields
• Hierarchical and linear descriptors
• To close the gap between physical addresses and automated geocoding
Road naming scheme:
- distance from the center
- orientation in odd parity
e.g. WB14
Region naming scheme:
- orientation wrt downtown
- distance from downtown
e.g. WB
House numbering scheme:
- meter markers on the road
- block letters from the road
e.g. 38K WB14
“I7 Hacker Way, Menlo Park, CA, US”
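As a rough illustration of how such a hierarchical address could be handled in software, the sketch below parses and re-assembles a street-based Robocode. The field layout (meter marker, block letter, two-letter region, road number, city, state/country) is inferred from examples in this deck such as "75D.NE27.Dhule.MhIn"; the names `Robocode`, `parse_robocode`, and `format_robocode` are hypothetical and are not taken from the released code. The version field is left out.

```python
import re
from typing import NamedTuple


class Robocode(NamedTuple):
    """Hypothetical container; field meanings are inferred from the slides."""
    meter: int          # meter-marker number along the road
    block: str          # block letter: offset away from the road
    region: str         # orientation letter + distance letter, e.g. "WB"
    road: int           # road number within the region
    city: str
    state_country: str


# Assumed layout: <meter><block>.<region><road>.<city>.<state+country>
_PATTERN = re.compile(r"(\d+)([A-Z])\.([A-Z]{2})(\d+)\.([A-Za-z]+)\.([A-Za-z]+)")


def parse_robocode(text: str) -> Robocode:
    """Split a street-based Robocode string into its fields."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a street-based Robocode: {text!r}")
    meter, block, region, road, city, area = match.groups()
    return Robocode(int(meter), block, region, int(road), city, area)


def format_robocode(code: Robocode) -> str:
    """Re-assemble the fields into the dotted string form."""
    return (f"{code.meter}{code.block}.{code.region}{code.road}."
            f"{code.city}.{code.state_country}")


a = parse_robocode("75D.NE27.Dhule.MhIn")
b = parse_robocode("76C.NE27.Dhule.MhIn")
same_road = (a.region, a.road) == (b.region, b.road)
```

Because the fields are ordered from fine to coarse, two addresses on the same road share everything after the first dot, which is what makes simple prefix or suffix comparisons meaningful for proximity checks.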
36. Design Choices
Machine Needs: Structural Design Parameters for efficient computer implementation
Linear: similar addresses stored in a linear fashion
Hierarchical: top-down structure for spatial encapsulation
Compressible: 5x4 max (chars x words)
Universal: independent of local language
Inquirable: useful for geometric, proximity-based, and type-ahead queries
Extendible: dynamically modifiable for new places
Robust: flexible for overestimation and noise
Human Needs: Semantic Design Parameters for user friendliness
Linear: closer addresses are given related names
Hierarchical: top-down subdivision of the world
Memorable: short and alphanumeric, easily convertible
Intuitive: with a sense of direction and distance
Topological: consistent with road topology
Inclusive: with local names (city, state)
Physical: consistent with natural boundaries
41. Region Creation
[Figure labels: NF, NH, NE]
• Road graph: node = intersection, edge = road, weight = length
• Partition for max intra, min inter connectivity, using normalized min-cut.
• n_max = ceil(roads / 88)
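To make the normalized min-cut objective concrete, here is a minimal sketch that evaluates Ncut(A, B) = cut(A, B)/vol(A) + cut(A, B)/vol(B) on a toy road graph and finds the best two-way split by brute force. Real pipelines normally use a spectral approximation rather than enumeration; the toy graph, the function names, and the choice to use the edge weight directly as the connection strength are all assumptions made for illustration.

```python
from itertools import combinations


def ncut_value(edges, part_a):
    """Ncut(A, B) = cut(A, B) / vol(A) + cut(A, B) / vol(B)."""
    cut = vol_a = vol_b = 0.0
    for u, v, w in edges:
        u_in_a, v_in_a = u in part_a, v in part_a
        if u_in_a != v_in_a:
            cut += w  # edge crosses the partition
        # Each edge adds its weight to the degree of both endpoints.
        for inside in (u_in_a, v_in_a):
            if inside:
                vol_a += w
            else:
                vol_b += w
    if vol_a == 0 or vol_b == 0:
        return float("inf")
    return cut / vol_a + cut / vol_b


def best_bipartition(edges):
    """Exhaustively search two-way splits (toy sizes only)."""
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    anchor, rest = nodes[0], nodes[1:]  # fix one node in A to skip mirror splits
    best_score, best_part = float("inf"), None
    for size in range(len(rest)):  # always leaves at least one node for B
        for extra in combinations(rest, size):
            part = {anchor, *extra}
            score = ncut_value(edges, part)
            if score < best_score:
                best_score, best_part = score, part
    return best_part, best_score


# Toy road graph: two tightly connected neighborhoods joined by one road.
ROADS = [
    (0, 1, 2.0), (1, 2, 2.0), (0, 2, 2.0),
    (3, 4, 2.0), (4, 5, 2.0), (3, 5, 2.0),
    (2, 3, 1.0),  # the single connecting road
]
part, score = best_bipartition(ROADS)
```

On this graph the search separates the two triangles, cutting only the connecting road, which is the behavior the region-creation step relies on.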
42. Region and Road Naming
• C_max(roads / A) → C_A (downtown): the region with the highest road density is marked as downtown
• Orientation bucketing into N, S, W, E
• Trace regions based on distance to C_A
• Orientation bucketing into major axes
• Trace roads based on order
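The region-naming bullets can be sketched as follows, assuming region centroids in a local planar coordinate system (x to the east, y to the north). The first letter comes from bucketing the bearing from downtown into N, S, W, E; the second letter is assigned here by ranking regions by distance within each direction, which is an assumed rule, since the slides only say the second letter reflects distance from downtown. The function names are hypothetical.

```python
import math


def orientation(dx, dy):
    """Bucket a displacement (east = +x, north = +y) into N, E, S or W."""
    angle = math.degrees(math.atan2(dy, dx)) % 360  # 0 = east, 90 = north
    if 45 <= angle < 135:
        return "N"
    if 135 <= angle < 225:
        return "W"
    if 225 <= angle < 315:
        return "S"
    return "E"


def name_regions(downtown, centroids):
    """Return {region_id: two-letter name} for regions other than downtown."""
    by_direction = {}
    for region_id, (x, y) in centroids.items():
        dx, dy = x - downtown[0], y - downtown[1]
        by_direction.setdefault(orientation(dx, dy), []).append(
            (math.hypot(dx, dy), region_id))
    names = {}
    for direction, items in by_direction.items():
        # Assumed rule: nearest region gets "A", the next gets "B", and so on.
        for rank, (_, region_id) in enumerate(sorted(items)):
            names[region_id] = direction + chr(ord("A") + rank)
    return names


names = name_regions(
    downtown=(0.0, 0.0),
    centroids={"r1": (-1.0, 0.0), "r2": (-3.0, 0.5),
               "r3": (0.0, 2.0), "r4": (2.0, -1.0)},
)
```

With this toy input, the second-closest region to the west of downtown is named WB, matching the style of the region label used in the house example on the scheme slide. The sketch does not handle more than 26 regions per direction.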
43. Offsetting and Meter Marking
• 5 meter marker along the road
• Odd/even based on RHR
• Distance field of roads: block offset
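A small sketch of the house-numbering idea for a single straight road segment, under several loudly stated assumptions: coordinates are in meters, the number is derived from the 5 m marker index, odd numbers go on the left of the travel direction and even on the right (the slide only says odd/even follows the right-hand rule), and block letters advance every 5 m away from the road, which is consistent with the 26*5 m limit mentioned for inaccessible areas. The function name is hypothetical.

```python
import math

MARKER_M = 5.0  # meter-marker spacing, from the slide
BLOCK_M = 5.0   # assumed block depth per letter (26 letters x 5 m = 130 m)


def house_code(start, end, point):
    """Return '<number><block letter>' for a point next to a straight road."""
    (x0, y0), (x1, y1), (px, py) = start, end, point
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length      # unit vector along the road
    rx, ry = px - x0, py - y0
    along = rx * ux + ry * uy              # distance travelled along the road
    side = ux * ry - uy * rx               # > 0: left of travel direction
    if not 0 <= along <= length:
        raise ValueError("point does not project onto this road segment")
    block = int(abs(side) // BLOCK_M)
    if block >= 26:
        raise ValueError("farther than 26 * 5 m from the road")
    marker = int(along // MARKER_M)
    # Assumption: odd numbers on the left side, even numbers on the right.
    number = 2 * marker + (1 if side > 0 else 2)
    return f"{number}{chr(ord('A') + block)}"


left_house = house_code((0.0, 0.0), (100.0, 0.0), (47.0, 12.0))
right_house = house_code((0.0, 0.0), (100.0, 0.0), (47.0, -3.0))
```

Computing the code from geometry like this means nothing has to be stored per house, only the road geometry.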
45. Street Address with Robocodes
• From Satellite Imagery to Deployed Street Addresses
• Generative address : linear, hierarchical, and intuitive
• Human friendly rather than machine friendly
49. Act Alert Assist Change
Low/Mid/High Level
Alert: Alert about the state of people/economy/built environment (e.g., predict crop yield from satellite imagery, predict insurance price from street view)
Assist: Assist in acting on information by providing suggestions based on data (e.g., design optimal congestion pricing based on detected cars, design crisis response in hurricanes)
Change: ?
50. Inaccessible Areas
• To extend our format to areas that are not accessible by streets, we explored different implementations for areas that are more than 26*5 m away from any street.
• Geocoding as a function (excluding the version field): f(info, lat, lon) = x.y.z.t
• For places with roads, info = {road network, city, country}: f(R, C) = x.y.city.country
• Extreme case: the only reliable information is latitude/longitude!
51. Inaccessible Areas: Blackholes!
• Linear hashing:
- 26 letters + 10 digits
- 100 m x 100 m granularity
- Last letter is the hemisphere
- Range: 359.999, longitude: 7PRZ W
f(C, lat, lon) = hash(round(lat,3)) + dir(lat) . hash(round(lon,3)) + dir(lon) . C
L-A-T-dir.L-O-N-dir.name.area
• Hierarchical hashing:
- Enlarge the grid from 100 m x 100 m to 1 km x 1 km
- Using two decimal places = three letters
- Within each cell, re-hash it to a 36 x 36 grid = one letter
- New resolution: 30 m, represented by five letters
f(C, lat, lon) = hash(round(lat,2)) + hash(lat - round(lat,2)) + dir(lat) . hash(round(lon,2)) + hash(lon - round(lon,2)) + dir(lon) . C
LlatLlatHlatDlat.LlonLlonHlonDlon.name.Ocean/Continent/etc.
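The linear hashing variant can be sketched as a base-36 encoding of the coordinate rounded to three decimals, followed by a hemisphere letter. The digit order (0-9, then A-Z) is an assumption, although it does reproduce the 359.999 → 7PRZ example on the slide; appending the hemisphere letter without a separator and the helper names `base36` and `linear_code` are also assumptions. The hierarchical variant is not covered here.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 10 digits + 26 letters


def base36(n: int) -> str:
    """Encode a non-negative integer in base 36."""
    if n == 0:
        return "0"
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out


def linear_code(lat: float, lon: float, name: str, area: str) -> str:
    """hash(round(lat,3)) + dir(lat) . hash(round(lon,3)) + dir(lon) . C"""
    # Three decimals of a degree is the 100 m granularity named on the slide.
    lat_part = base36(round(abs(lat) * 1000)) + ("N" if lat >= 0 else "S")
    lon_part = base36(round(abs(lon) * 1000)) + ("E" if lon >= 0 else "W")
    return ".".join([lat_part, lon_part, name, area])


slide_example = base36(359999)
code = linear_code(37.485, -122.148, "PAC", "OCEA")
```

Because the encoding is monotonic in the coordinate, nearby latitudes or longitudes within the same hemisphere get nearby codes, which is the "linear" property the scheme asks for.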
52. Thanks!
What next?
Today
Tomorrow
Friday
A month
• Robocode.info
• Join our presentation in CVPR WiCV. Friday 10am
• Join us with your new ideas at SIGGRAPH 2018 Maps & Urban Data session.
Code: https://github.com/facebookresearch/street-addresses
Paper: https://research.fb.com/publications/generative-street-addresses-from-satellite-imagery/
53. Bonus: Blackholes!
Main aim: f(<place>)=robocode
Base case:
<place> = <house, street, city, country>
“12C.NA14.PALO.CAUS”
No street:
<place> = <lat, lon, city, country>
“F12.HN3.PALO.CAUS”
No city/country:
<place> = <lat, lon, other info (ocean, desert, etc.)>
“JK3.3DF.PAC.OCEA”
54. Region Experiments
• Experimented with (a) normalized min-cut, (b) Newman-Girvan, (c) modularity-based partitioning.
• Experimented with image-based methods (superpixels, region growing) and the dual of the road graph.
• Evaluated with urban rules (geography, population, road distribution)
55. Output Maps and Tools
• .osm maps with roads (meter marking and offsetting on the fly)
• ID-tool of MapBox for on-demand inverse/forward geocoding
• rtree extension for efficient spatial querying
• Experimental mobile app for self navigation
• 21.7% decrease in arrival time using Robocodes
56. Analyze: Three Types of Outputs
1. Semantic
2. Objective
Population Density
76000/sq. mile
3. Qualitative
Assign a semantic
label to each pixel
Label road quality
as
“Bad” or “Good”
Editor's notes
Various Street View datasets are now available online. Perhaps the most popular one is Google Street View, which has covered more than a hundred countries to date. Interestingly, Street View provides researchers with a new way to observe neighborhoods.
Check Steve Seitz and U of Washington Phototourism Page
So why are street addresses important, and why do we need adequate mapping? Let me ask you: how many of you had a unique address, down to your house or flat number, back home? According to geocoding companies, 75% of the world is unmapped, and the UN says 4 billion people are invisible because of that. This lack of addressing is even worse in disaster zones. For example, after the Haiti earthquake, the Humanitarian OpenStreetMap community started remotely mapping the disaster area within 48 hours, and the area was mapped adequately for NGOs and aid agencies to use within 6 months. But are curves on a plane enough to be a map? How do we refer to places, how do we define locations? Are those maps complete without any labels?
Capture, analyze, act
Socio Economic and Physical environment,
Cities are physical too: Using computer vision to measure the quality and impact of urban appearance
N Naik, R Raskar, CA Hidalgo - American Economic Review, 2016 - aeaweb.org
Streetscore-predicting the perceived safety of one million streetscapes
N Naik, J Philipoom, R Raskar… - Proceedings of the …, 2014 - openaccess.thecvf.com
Welcome to our presentation of Robocodes! Hopefully we'll jump-start the conference with an impactful presentation to keep you pumped up for the rest of the week. Today I will introduce our approach for generative street addressing from satellite imagery. This project was developed at Facebook and the MIT Media Lab.
Will people use street signs? $40 per sign
City council approved
Naming streets is politically unsavvy
20% of first time deliveries undelivered
$300M/year loss because of inadequate addressing
Need an addressing system for emergency, businesses, and residents
On the bright side, we don't need to reinvent addresses, as many addressing schemes have been developed organically and experimented with throughout history. Among these, the London postal code system caught our attention with its linear and hierarchical features. Other addressing schemes also guided some of the design choices that I will discuss next.
Our addresses consist of four alphanumeric fields, hierarchically designating larger areas in order. They resemble real-world addresses, which contain a house number, street name, city, state, and country.
The region naming scheme is inspired by the London scheme in the previous slide. The first letter gives the orientation from downtown (north, south, west, east) and the second letter cues the distance from downtown. So the yellow house on the left is in region WB. Then, within a region, the roads are named according to their orientation and order. Finally, the house numbering is based on the distances along and from the road.
We followed some design principles to make our system both efficient and user friendly. I won't go through them all, but, for example, the storage should support forward and inverse geoqueries, should be extendible for address changes, should be compressible, and so on. On the other hand, the linear and hierarchical properties of the scheme support user integration, intuitiveness, and memorability.
Back to our pipeline: we start by extracting road masks from satellite images with deep learning, then we find the individual road segments from those predictions. We create regions from the road graph, and label the parcels by distance fields from the roads.
Focusing on developing countries, the urban structure in those satellite images is not easy to detect and organize. There are also different weather and illumination conditions per country.
Our annotators created binary road masks from zoom level 19 satellite images. We then train a SegNet model on those images, and we are able to learn road predictions as shown here, with 72.6% precision and 57.2% recall.
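For readers unfamiliar with the two reported metrics, the sketch below shows how precision and recall could be computed for a predicted road mask against an annotated one. It assumes a pixel-wise comparison of binary masks; the notes do not say exactly how the reported figures were measured, and the function name and toy masks are made up for illustration.

```python
def precision_recall(predicted, truth):
    """Pixel-wise precision and recall for two binary masks (lists of rows)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # road predicted, road annotated
            elif p and not t:
                fp += 1      # road predicted, background annotated
            elif t and not p:
                fn += 1      # road missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
annotated_mask = [[1, 0, 0],
                  [1, 1, 1]]
precision, recall = precision_recall(predicted_mask, annotated_mask)
```

A precision higher than recall, as in the reported numbers, means that predicted road pixels are usually correct but a sizable share of annotated road pixels is missed.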
After we have the predictions, we threshold and thin the roads to find the skeleton. Then we use orientation bucketing to find individual road segments, which creates the base for streets.
After we have the road segments, we create a road graph where nodes represent intersections and edges represent roads weighted by their length. We partition the graph for maximum intra-cluster and minimum inter-cluster connectivity, using normalized min-cut. We also use some urban rules to limit the clusters.
After the regions are created, we mark the densest area as the downtown, and bucket the other regions based on their orientation with respect to the downtown, in the south, north, east, and west directions. After the regions are named, we find the two major axes of the roads within each region, and name the roads in each region based on their orientation and order.
Finally, we put meter markers on the roads by pixel distance, and we compute distance fields to create the block offsets for the house numbering. The gradient in the left image shows those offsets.
As defined in our motivation, we want to locate and connect the invisible 4 billion, right? So we tested our system on unmapped developing countries. The map coverage improved by up to 80% in some areas.
In conclusion, we presented a robust addressing scheme, a deep learning and graph partitioning approach for street extraction, and ready-to-deploy maps and tools for easy geoqueries.
To partition the road graph, we experimented with different algorithms, such as Newman-Girvan and modularity-based partitioning. Since there is no clear definition of regions, our domain experts evaluated the success of region creation using urban rules such as geography, population, and road distribution. We also experimented with partitioning the dual of the road graph and with traditional image-based approaches such as mean shift and superpixels.
As an output, our system generates .osm files with roads. However, since it does not make sense to output addresses for each five-by-five area, we compute the meter marking and offsetting on the fly. We modified the ID-tool of MapBox, which is also the tool that OpenStreetMap uses, to integrate that last-mile computation. We also developed an rtree extension for efficient querying. Lastly, as you can imagine, our addresses need to be evaluated by real users, so we developed a mobile app to enable self navigation with our Robocodes. In our user studies, we observed that using our smart addresses decreased the arrival time of the agents by 21.7 percent.