Emerging 3D Scanning Technologies for PropTech
Falling costs with rising quality via hardware innovations and deep learning
Outline of the presentation
Structure from Motion (SfM): low-cost passive sensing
360° imaging: omnidirectional immersive images and videos
Range sensing: structured light, Matterport, Kinect for example
Laser scanning: LiDARs from Velodyne for example
Data-driven processing: deep learning
3D datasets: with what to train your deep learning pipelines
Future prospects: short overview of future applications
The presentation is meant as a technical introduction to typical hardware and software
processing techniques used in real estate and construction site scanning.
Computer scientists new to proptech organizations and the real estate field in general might
find this presentation especially useful. The reader is assumed to be familiar with the basics
of deep learning.
Data structures for real estate scans
RGB+D: pixel grid representing color and depth
Example from Prof. Li
Mesh (Polygon): from voxel data (“3D pixels”)
Voxel grid meshing using marching cubes (StackExchange)
Point Cloud: typically unordered data (i.e. not on a grid but sparse)
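These representations are straightforward to move between in code. Below is a minimal sketch (assuming numpy and scikit-image are available; the intrinsic matrix K and the toy depth/voxel values are made up for illustration) that unprojects a depth grid into an unordered point cloud and meshes a voxel grid with marching cubes.

```python
# Minimal sketch of moving between the representations above.
# Assumptions: numpy + scikit-image; K, the depth map and the voxel grid are
# toy values chosen only for illustration.
import numpy as np
from skimage import measure

# RGB+D: a pixel grid of color and depth -> unprojected into a point cloud.
K = np.array([[525.0, 0, 319.5],
              [0, 525.0, 239.5],
              [0, 0, 1.0]])                      # assumed pinhole intrinsics
depth = np.full((480, 640), 2.0)                 # toy depth map, metres
v, u = np.indices(depth.shape)                   # pixel coordinates
z = depth.ravel()
x = (u.ravel() - K[0, 2]) * z / K[0, 0]
y = (v.ravel() - K[1, 2]) * z / K[1, 1]
point_cloud = np.column_stack([x, y, z])         # unordered N x 3 points

# Voxel grid ("3D pixels") -> polygon mesh via marching cubes.
voxels = np.zeros((32, 32, 32))
voxels[8:24, 8:24, 8:24] = 1.0                   # a solid cube as toy data
verts, faces, normals, values = measure.marching_cubes(voxels, level=0.5)
```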
PropTech resources for domain insights
https://www.inman.com/
Inman Hacker Connect is created by and for the real
estate technology community. Debate, discuss and
define the future of real estate’s most pressing tech
issues at Hacker Connect. Join more than 400
engineers, developers, designers, product managers,
database architects, webmasters, and technology
executives from across the real estate space. Build
partnerships, connect with peers, tackle thorny tech
issues, learn best practices, discover innovative
breakthroughs and collaborate during special
hands-on keyboard sessions at this day-long, tech-
first event.
WHY YOU SHOULD ATTEND Hear from industry
leaders on APIs, bots, data security, ownership, user
experience, blockchain and more. Take part in
collaborative hands-on-keyboard sessions and
come out with a new tool to apply to your job. Learn
how to better integrate data, workflows and be
competitive in your recruitment efforts
https://www.inman.com/event/hacker-17-sf/ http://www.moderneventures.com/accelerator/
https://gust.com/accelerators/moderne-accelerator
Pi Labs is Europe’s first venture capital platform
investing exclusively in early stage ventures in the
property tech vertical. London, United Kingdom.
http://pilabs.co.uk/
http://www.jamesdearsley.co.uk/
“The only PropTech site for the latest Property
Technology news and views”
#PropTech community across Europe. Join us for our next event in #Berlin
http://futureproptech.de/
Structure from Motion (SfM)
Low-cost passive sensing
Structure from Motion: basics
Structure-from-Motion (SfM). Instead of a
single stereo pair, the SfM technique requires
multiple, overlapping photographs as input to
feature extraction and 3-D reconstruction
algorithms. - Westoby et al
praehistorische-archaeologie.de - Florian Tubbesing
Structure from Motion can achieve good
accuracy compared to laser scanners.
James and Robson (2012)
Cited by 281 Articles, and see Related articles
This volcanic bomb (~10 cm across) from Soufrière Hills
volcano was scanned by an Arius3d laser scanner (
Stuart Robson, University College London) and also
reconstructed using the SfM-MVS technique, with the
results scaled by sfm_georef. Differences between cross
sections through the two models have RMS values of
~0.3 mm. Point cloud: low res (6 Mb)
http://www.lancaster.ac.uk/staff/jamesm/software/sfm_georef.htm
SfM method basically computes the relative camera
positions between all related photos. After every
relative camera position is found, the scheme uses
these matrices to reconstruct all feature points using
triangulation. Thus there are two main problems:
1) Image registration (e.g. SIFT, SURF, ORB, etc)
2) Pose Estimation (e.g. Perspective-n-Point with RANSAC)
By Dr Calle Olsson
https://www.youtube.com/watch?v=i7ierVkXYa8
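As a hedged illustration of those two steps (not the exact pipeline from the video), the OpenCV sketch below matches ORB features between two overlapping photos, estimates the relative pose from the essential matrix with RANSAC, and triangulates the inlier points; the file names and the intrinsic matrix K are assumptions.

```python
# Two-view SfM sketch with OpenCV: 1) image registration via feature matching,
# 2) pose estimation + triangulation. File names and K are assumed values.
import cv2
import numpy as np

K = np.array([[1000.0, 0, 960], [0, 1000.0, 540], [0, 0, 1]])  # assumed intrinsics

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# 1) Image registration: detect and match local features (ORB here).
orb = cv2.ORB_create(4000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2) Pose estimation: essential matrix with RANSAC, relative pose recovery,
#    then triangulation of the inlier feature points.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points3d = (pts4d[:3] / pts4d[3]).T              # sparse structure, up to scale
```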
Structure from Motion: literature references
https://doi.org/10.1016/j.geomorph.2012.08.021
Cited by 631 articles, and see Related articles
https://arxiv.org/abs/1701.08493
Structure-from-Motion’ (SfM) operates under
the same basic tenets as stereoscopic
photogrammetry, namely that 3-D structure
can be resolved from a series of overlapping,
offset images. However, it differs fundamentally
from conventional photogrammetry, in that the
geometry of the scene, camera positions and
orientation is solved automatically without the
need to specify a priori, a network of targets
which have known 3-D positions. Instead, these
are solved simultaneously using a highly
redundant, iterative bundle adjustment
procedure, based on a database of features
automatically extracted from a set of multiple
overlapping images (Snavely et al 2008).
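To make the "iterative bundle adjustment" concrete, here is a compact sketch on synthetic data (assumed intrinsics, two cameras, a handful of points) that jointly refines the second camera pose and the 3D points by minimizing reprojection error with a nonlinear least-squares solver; real SfM pipelines do the same over thousands of cameras and millions of points using sparse solvers.

```python
# Toy two-view bundle adjustment: refine camera pose + 3D points by minimizing
# reprojection error. Synthetic data; K and the scene are assumptions.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])

def project(points, rvec, tvec):
    """Project world points into a camera with pose (rvec, tvec)."""
    R = Rotation.from_rotvec(rvec).as_matrix()
    cam = points @ R.T + tvec                     # world -> camera frame
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]                 # perspective divide

# Synthetic scene: 20 points in front of two cameras.
rng = np.random.default_rng(0)
pts_true = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
rvec_true, tvec_true = np.array([0.0, 0.1, 0.0]), np.array([0.5, 0.0, 0.0])
obs1 = project(pts_true, np.zeros(3), np.zeros(3))    # camera 1 at the origin
obs2 = project(pts_true, rvec_true, tvec_true)

def residuals(x):
    rvec, tvec, pts = x[:3], x[3:6], x[6:].reshape(-1, 3)
    return np.concatenate([
        (project(pts, np.zeros(3), np.zeros(3)) - obs1).ravel(),
        (project(pts, rvec, tvec) - obs2).ravel()])

# Start from a perturbed guess and jointly refine pose and structure.
x0 = np.concatenate([rvec_true + 0.05, tvec_true + 0.05,
                     (pts_true + rng.normal(0, 0.1, pts_true.shape)).ravel()])
result = least_squares(residuals, x0)
print("final RMS reprojection error:", np.sqrt(np.mean(result.fun ** 2)))
```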
Finally, even though there exist various theoretical works in the literature
that study fundamental problems in SfM and/or provide rigorous analysis of
stability and robustness of specific methods, we believe that the SfM
community would still highly benefit from rigorous results on fundamental
problems (e.g., what is the theoretically maximal amount of mismatched
features or level of noise in the images that can be tolerated for a stable
structure recovery, and can this be achieved efficiently?) and theoretical
analysis of stability, robustness and computational efficiency of existing
or new methods
SLAM: Simultaneous localization and mapping
SLAM, Visual Odometry, Structure from Motion, Multiple View Stereo
Yu Huang, Senior Architect, Autonomous Driving@Baidu USA
https://www.slideshare.net/yuhuang/visual-slam-structure-from-motion-multiple-view-stereo
Samsung R&D Institute
Necessary Skills / Attributes:
● 5+ years’ experience delivering computer vision based products using C++ or Python
(Masters or PhD study will be considered).
● Theoretical and practical understanding of multi-view geometry and 3D
reconstruction.
● Experience with machine learning techniques within a computer vision context.
● PhD/MS in Computer Vision, Artificial Intelligence or Machine Learning.
● Expertise with Deep Neural Networks using TensorFlow or Keras.
SLAM stands for Simultaneous Localization and Mapping and one way to understand
it is to imagine yourself entering an unfamiliar building for the first time. As you move about
the building, you don't completely forget where you have already been. Indeed, at any
moment you have a pretty good idea where you are within the current map that you have
so far constructed in your head, and unless you have a really bad sense of direction, you
could probably turn around and get back out of the building without too much trouble.
Finding your way around the building is a good example of simultaneously
constructing a map and localizing yourself within that map.
http://www.pirobot.org/blog/0015/
SLAM: traditional algorithm comparison
http://dx.doi.org/10.1186/s41074-017-0027-2
The framework is mainly composed of three modules as follows.
1) Initialization
2) Tracking
3) Mapping
Additional modules for stable and accurate vSLAM
+ Relocalization
+Global map optimization
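A schematic of how these modules typically fit together in a vSLAM loop (the class and method names below are hypothetical placeholders, purely to illustrate the structure described by Taketomi et al.):

```python
# Skeleton of a vSLAM system following the module list above.
# All names are hypothetical placeholders, not an existing library API.
class VisualSLAM:
    def __init__(self):
        self.map, self.keyframes, self.pose = None, [], None

    def initialize(self, frame):
        """1) Initialization: build an initial map from the first frames."""
        ...

    def track(self, frame):
        """2) Tracking: estimate the camera pose against the current map."""
        ...

    def update_map(self, frame):
        """3) Mapping: add new landmarks and keyframes to the map."""
        ...

    def relocalize(self, frame):
        """+ Relocalization: recover the pose after tracking is lost."""
        ...

    def optimize_global_map(self):
        """+ Global map optimization: e.g. pose-graph optimization or bundle
        adjustment after loop closure, to reduce accumulated drift."""
        ...
```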
“ From the technical point of views, there is no definitive difference between SLAM and real-time SfM.”
Even though visual SLAM algorithms have been developed since 2003, vSLAM is
still an active research field. Each algorithm has different characteristics. We need
to choose an appropriate algorithm by considering a purpose of an application.
Visual Odometry
Taketomi et al. (2017):
http://dx.doi.org/10.1186/s41074-017-0027-2
“Odometry is to estimate the sequential changes of
sensor positions over time using sensors such as
wheel encoder to acquire relative sensor movement.
Camera-based odometry called visual odometry
(VO) is also one of the active research fields in the
literature [16, 17].
From the technical point of views, vSLAM and VO
are highly relevant techniques because both
techniques basically estimate sensor positions.
According to the survey papers in robotics [18, 19],
the relationship between vSLAM and VO can be
represented as follows.
vSLAM = VO + global map optimization
The relationship between vSLAM and VO can also
be found from the papers [20, 21] and the papers [22,
23]. In the paper [20, 22], a technique on VO was first
proposed. Then, a technique on vSLAM was
proposed by adding the global optimization in VO [21,
23].”
Towards stable visual odometry & SLAM solutions
for autonomous vehicles
https://www.youtube.com/watch?v=T5Y6OPG-d08
NavStik Hackerspace | Projects at Hackerspace
Visual Odometry using Optic Flow
Software: open-source VisualSFM
VisualSFM: A Visual Structure from Motion System
Changchang Wu
Cited by 326 articles, and see Related articles
VisualSFM is a GUI application for 3D reconstruction using structure
from motion (SFM). The reconstruction system integrates several of my
previous projects: SIFT on GPU(SiftGPU), Multicore Bundle Adjustment,
and Towards Linear-time Incremental Structure from Motion
. VisualSFM runs fast by exploiting multicore parallelism for feature
detection, feature matching, and bundle adjustment.
Using VisualSFM and Meshlab as an offline alternative
to Autodesk's excellent 123D catch. I walk you through my
workflow for converting multiple images into a 3D model
suitable for use in Blender.
Tutorial for amateur photographers by Jamie Fuller.
https://www.youtube.com/watch?v=V4iBb_j6k_g
Open Source Photogrammetry with VisualSFM:
Ditching 123D Catch. July 12, 2013 by Jesse
Indoor Navigation from Multiple Images
By Jaan Tollander de Balsch, 2016, Aalto
https://jaantollander.github.io/SCI-C1000/prototype.html
What is the best method for 3D object
modelling and reconstruction from photos
or videos taken by flying robots or drones?
What is the accuracy of such reconstruction
methods with regards to the vibrations of the
flying drones, quality of camera and resolution?
Is it possible to improve the results by organizing
multiple flights and overlaying/accumulating the
data in the point cloud? Is there any free
software available?
Software: Python Photogrammetry Toolbox (PPT) GUI
Real photo x SfM with texture color x SfM with simple shader. Made
with Python Photogrammetry Toolbox GUI and rendered in Blender
with Cycles.
http://184.106.205.13/arcteam/ppt.php
https://github.com/archeos/ppt-gui/
Converting pictures into a 3D mesh with PPT, MeshLab and Blender
http://arc-team-open-research.blogspot.co.uk/2012/09/converting-pictures-into-3d-mesh-with.html
Blender camera tracking + Python Photogrammetry Toolbox
http://arc-team-open-research.blogspot.co.uk/2012/11/blender-camera-tracking-python.html
The video shows the skull reconstructed in 3D with Python Photogrammetry Toolkit GUI.
Smilodon, the 3D reconstruction of the saber-toothed cat
http://arc-team-open-research.blogspot.co.uk/2013/03/
Open-source libraries for SfM
OpenSfM is a Structure from Motion
library written in Python on top of
OpenCV. The library serves as a
processing pipeline for reconstructing
camera poses and 3D scenes from
multiple images.
https://github.com/mapillary/OpenSfM
656 stars
OpenSfM
OpenMVG (Multiple View Geometry)
"open Multiple View Geometry" is a
library for computer-vision scientists and
especially targeted to the Multiple View
Geometry community.
https://github.com/openMVG/openMVG
1,1856 stars
OpenMVG
https://doi.org/10.1007/978-3-319-56414-2_5
http://imagine.enpc.fr/~marletr/publi/RRPR-2016-Moulon-et-al.pdf
Sung and Lin (2017): “VisualSFM uses the pre-
emptive feature matching, the incremental
structure from motion and the re-triangulation
techniques. The incremental feature matching
can greatly speed up the process because
this kind of matching will first sort all feature
points and match only first h feature points for
each photo.”
Sung and Lin (2017): “OpenMVG also
contains incremental structure from
motion technique. Besides that, they
proposed a new iterative sampling
method called a contrario Random
Sample Consensus (AC-RANSAC) as a
substitution to the original RANSAC in
order to acquire higher precision and
better performance. The AC-RANSAC
using the “a contrario” methodology in
order to find a model that best fits the
data with a threshold T that adapts
automatically to the noise. Hence, it is
able to find a model and its associated
noise without a fixed threshold.”
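For contrast with AC-RANSAC's adaptive threshold, the sketch below shows a plain RANSAC loop for 2D line fitting with the fixed inlier threshold T that AC-RANSAC removes (an illustrative toy, not the OpenMVG implementation).

```python
# Plain RANSAC with a fixed inlier threshold T; AC-RANSAC instead adapts the
# threshold to the noise automatically. Toy 2D line-fitting example.
import numpy as np

def ransac_line(points, threshold, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p1, p2 = points[rng.choice(len(points), 2, replace=False)]
        d = p2 - p1
        n = np.array([-d[1], d[0]]) / np.linalg.norm(d)    # line normal
        dist = np.abs((points - p1) @ n)                   # point-line distance
        inliers = dist < threshold                         # fixed threshold T
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

xs = np.linspace(0, 10, 100)
pts = np.column_stack([xs, 2 * xs + np.random.normal(0, 0.1, 100)])
inliers = ransac_line(pts, threshold=0.3)
```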
Open-source libraries for SfM + SLAM
OpenChisel
https://github.com/personalrobotics/OpenChisel
An open-source version of the Chisel chunked TSDF
library. It contains two packages:
open_chisel
open_chisel is an implementation of a generic
truncated signed distance field (TSDF) 3D mapping
library; based on the Chisel mapping framework
developed originally for Google's Project Tango. It is
a complete re-write of the original mapping system
(which is proprietary). open_chisel is chunked and
spatially hashed, inspired by this work from
Niessner et al., making it more memory-efficient than
fixed-grid mapping approaches, and more performant
than octree-based approaches. A technical
description of how it works can be found in our
RSS 2015 paper.
http://ri.cmu.edu/pub_files/2015/7/ChiselPaper.pdf
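open_chisel itself is a chunked, spatially hashed C++ library; as a rough illustration of the underlying TSDF update it performs, here is a dense-grid Python sketch that fuses one depth frame into a voxel volume (the intrinsics, grid size and placement are made-up values).

```python
# Dense-grid TSDF fusion sketch (illustration only; open_chisel uses chunked,
# spatially hashed storage in C++). Intrinsics and grid placement are assumed.
import numpy as np

K = np.array([[525.0, 0, 319.5], [0, 525.0, 239.5], [0, 0, 1]])
trunc = 0.05                                    # truncation band, metres
dim, voxel_size = 64, 0.05
tsdf = np.ones((dim, dim, dim), dtype=np.float32)
weight = np.zeros_like(tsdf)

def integrate(depth):
    """Fuse one depth image (H x W, metres, camera at origin) into the TSDF."""
    idx = np.indices((dim, dim, dim)).reshape(3, -1).T
    pts = (idx - dim / 2) * voxel_size + np.array([0, 0, dim * voxel_size / 2])
    uv = pts @ K.T                              # project voxel centres
    z = np.maximum(uv[:, 2], 1e-6)
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    valid = (pts[:, 2] > 0) & (u >= 0) & (u < depth.shape[1]) \
            & (v >= 0) & (v < depth.shape[0])
    d = np.zeros(len(pts))
    d[valid] = depth[v[valid], u[valid]]
    sdf = d - pts[:, 2]                         # + in front of surface, - behind
    keep = valid & (d > 0) & (sdf > -trunc)
    new = np.clip(sdf / trunc, -1.0, 1.0)
    t, w = tsdf.reshape(-1), weight.reshape(-1)                # views into grids
    t[keep] = (t[keep] * w[keep] + new[keep]) / (w[keep] + 1)  # running average
    w[keep] += 1

integrate(np.full((480, 640), 2.0, dtype=np.float32))   # flat wall 2 m away
```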
Research-grade SfM: old-school mono video
http://dx.doi.org/10.1186/s13640-017-0168-3
Inspired by the structure from motion systems, we
propose a system that reconstructs sparse feature
points to a 3D point cloud using a mono video
sequence so as to achieve higher computation
efficiency. The system keeps tracking all detected
feature points and calculates both the amount of these
feature points and their moving distances. We only use
the key frames to estimate the current position of the
camera in order to reduce the computation load and
the noise interference on the system. Furthermore, for
the sake of avoiding duplicate 3D points, the system
reconstructs the 2D point only when the point shifts
out of the boundary of a camera. In our experiments,
we show that our system is able to be implemented on
tablets and can achieve state-of-the-art accuracy with
a denser point cloud with high speed.
Research-grade SfM: deep learning-based #1
Research-grade SfM: deep learning-based #2
https://arxiv.org/abs/1702.01381, 2 May 2017
We evaluated the performance of our proposal on the DTU dataset comparing it
with two traditional feature based methods, namely SURF (Cited by 8683
articles) and ORB ( Cited by 2739 articles).
The system is trained in an end-to-end manner utilising transfer
learning from a large scale classification dataset. In addition, a
variant of the proposed architecture containing a spatial pyramid
pooling (SPP) layer is evaluated and shown to further improve the
performance.
RegNet is able to correct even large decalibrations such as
depicted in the top image. The inputs for the deep neural
network are an RGB image and a projected depth map. RegNet
is able to establish correspondences between the two
modalities which enables it to estimate a 6 DOF extrinsic
calibration.
Additionally, with an iterative execution of multiple CNNs, that
are trained on different magnitudes of decalibration, our
approach compares favorably to state-of-the-art methods in
terms of a mean calibration error of 0.28º for the rotational and
6 cm for the translation components even for large
decalibrations up to 1.5 m and 20º.
https://arxiv.org/abs/1702.02295
Research-grade Pose/Structure: deep learning-based #1
Essentially the same technology for stereo matching and depth map generation as for SfM
https://arxiv.org/abs/1703.04309 https://arxiv.org/abs/1704.07813
Empirical evaluation on the KITTI dataset
demonstrates the effectiveness of our
approach: 1) monocular depth performs
comparably with supervised methods that
use either ground-truth pose or depth for
training, and 2) pose estimation performs
favorably compared to established SLAM
systems under comparable input settings.
Research-grade Pose/Structure: deep learning-based #2
GANs on everything, so here as well :) The usefulness of VisualSFM/ openSFM/ openMVG for defensible startup products?
Inversion is often ambiguous, e.g., many compositions of 3D shape and camera pose give rise to the same 2D projection. To
address this ambiguity, we impose priors on the predicted latent factors, through an adversarial discriminator network
trained to discriminate between predicted factors and ground-truth ones. Training adversarial inversion does not require
input-output paired annotations, but merely a collection of ground-truth factors, unrelated (unpaired) to the current input.
Our model can thus be self-supervised by unlabelled image data, by minimizing a joint reconstruction and adversarial
loss, complementing any direct supervision provided by paired annotations.
Applying adversarial inversion to super-resolution and inpainting results in automated “visual plastic surgery”
Structure-from-motion(SfM) results with and without adversarial priors. The results of the baseline (columns 5th and 8th)
are obtained from a model with depth smoothness prior, trained with early stopping at 40K iterations (before divergence).
SfM on Mobile Devices
https://arxiv.org/abs/1611.09498
https://doi.org/10.1109/ICCV.2013.15 | Cited by 141 articles, see Related articles
https://doi.org/10.1016/j.cviu.2016.09.007
After introducing the reconstruction algorithms at the base of our approach, we show how to build
applications able to generate 3D floor plans scaled to their real-world metric dimensions and
capable of managing scenes not necessarily limited by Manhattan World assumptions. Then, exploiting
the resulting structural and visual model, we propose a client-server interactive exploration system
implementing a low-DOF navigation interface, specifically developed for touch interaction on
smartphones and tablets.
https://doi.org/10.1145/2999508.2999526
SfM on Mobile Devices: case Dacuda
Magic Leap, the augmented reality
startup that has raised $1.4 billion in
funding but has yet to release a product,
has made an acquisition to expand its
work in computer vision and deep
learning, and to build out its operations
into Europe.
The company has acquired the 3D division
of Dacuda, a computer vision startup
based out of Zurich. One of
Dacuda’s focuses had been
developing algorithms for consumer-
grade cameras (and not just cameras, but
any device with a camera function) to
capture 2D and 3D imaging in real time,
“making 3D content as easy as taking a
video.”
https://techcrunch.com/2017/02/18/confirmed-magic-leap-acquires-3d-division-of-d
As you can see, no detail about what the two might be working on. The acquisition was first rumored
last week — after Dacuda posted a note on its blog about selling its 3D division, and then
some Dacuda employees updated their LinkedIn profiles as Magic Leap employees (one example
here). Tom’s Hardware then speculated it could signal Magic Leap using technology developed by
Dacuda to enable room-scale, six degrees of freedom tracking (essentially to improve its image
capturing sensors in 3D environments).
The ecosystem there is attracting other big-name M&A. Faceshift, a motion capture startup
acquired by Apple in 2015, was also founded in Zurich. Facebook’s Oculus VR in August 2016
also quietly acquired a startup called Zurich Eye, incubated at the University of Zurich and ETH,
the federal institute of technology. Zurich Eye became the basis of Oculus and Facebook’s office in
the city. Zurich Eye, ironically, was co-founded by three former software engineers from Dacuda
(they all now work for Oculus VR).
For example, in October the company had linked up with MindMaze, another virtual/augmented
reality startup out of Switzerland, to build a platform they were calling “MMI, the world’s first
multisensory computing platform for mobile-based, immersive and social virtual reality
applications,” MindMaze noted.
MindMaze said it planned to “deploy the technology for users globally to address a void left by
Google’s DayDream View for positional tracking and multiplayer interactions.” We have contacted
Magic Leap for comment and will update this post if and when we learn more.
Apple ARKit: technology
https://developer.apple.com/arkit/
Since the iPhone 6, iPhones have used what Apple calls “Focus Pixels”, which is its term for phase
detection AF. Fast Company reports that system will be replaced with laser autofocus possibly as soon
as the next iPhone, which is set to debut this fall. It is likely that Apple would use both AF technologies,
as Google does in its Pixel line of phones. The technology would serve a dual purpose, also allowing for
better depth perception with the inbuilt camera for augmented reality apps. ARKit rolls out with iOS 11
this fall, so it would make sense to also include the VCSEL laser system in the phone launching at the
same time.
https://petapixel.com/2017/07/20/apple-bring-3d-laser-autofocus-iphone-cameras-report-says/
https://www.theverge.com/2017/6/26/15872332/apple-arkit-ios-11-augmented-reality-developer-excitement
Apple ARKit: example applications
https://twitter.com/madewithARKit
Measuring kitchen dimensions
http://bit.ly/2tJ5KV8 app by @SmartPicture3D
Measure distances with your
iPhone. Clever little #ARKit app by
@BalestraPatrick http://bit.ly/2sFl8RB
Inter-dimensional iPhone
AR portals are closer than they
appear http://bit.ly/2sufO0d ARkit
demo by @nedd
Demo Shows How Augmented Reality Will
Make Advertising More Immersive. Mixed
reality producer Bilawal Singh Sidhu shows a peek of
what the world of advertising could be with the
ARKit. #adtech
https://mobile-ar.reality.news/news/apple-ar-demo-shows-augmented-reality-will-make-advertising-more-immersive-0178905/
Google’s response to ARKit: ARCore
DAVID JAGNEUX, UPLOADVR@UPLOADVR SEPTEMBER 2, 2017 6:00 AM “Earlier this week, Google
announced ARCore, a software-based solution for making more Android devices AR-capable without the need for depth
sensors and extra cameras. It will even work on the Google Pixel, Galaxy S8, and several other devices very soon and
supports Java, Unity, and Unreal from day one. In short, it’s kind of like Google’s answer to Apple’s ARKit.”
- https://venturebeat.com/2017/09/02/googles-first-arcore-goal-100-million-ar-capable-android-phones/
“Another example, which is especially relevant for
developers that build traditional smartphone apps in
Java, is that we want to make it easier than ever for
people to get into 3D modeling that haven’t done it
before,” Bavor says. “We know there are a lot of people
that want to get into 3D development and AR but
aren’t experts in Maya, or Unity, or anything. So Blocks
is an app we built with the intention of enabling
people that have never done a 3D model in their
life to feel comfortable building 3D assets. We even
made it easy to export right from Blocks and pull into
ARCore apps you’re developing.”
ARCore: too early to tell how it will do against the “Apple Cult”
Verge Adi Robertson
https://youtu.be/NhJydpMkpug
FusedVR https://youtu.be/dNXBvDKRg1M
https://venturebeat.com/2017/08/29/google-launches-arcore-sdk-in-preview-ar-on-android-phones-no-extra-hardware-required/
https://youtu.be/ttdPqly4OF8
Super Ventures Blog Matt Miesnieks
CEO 6D.ai, Partner @Super_Ventures, AR technology & cycling
https://medium.com/super-ventures-blog/how-is-arcore-better-than-arkit-5223e6b3e79d
● Isn’t ARCore just Tango-lite?
● The iPhone-8-keynote sized elephant in the room
● So should I build on ARCore now?
● Is ARCore better than ARKit?
Scottie Gardonio Aug 30
AR / VR enthusiast. Creative Manager. Passionate graphic designer.
https://medium.com/iotforall/arcore-vs-arkit-google-counters-apple-33483c08d3da
ARCore vs. ARKit: Google Counters Apple
Let the Dueling Begin
Google announcing inside-out 6-DOF tracking support for Daydream back at Google IO earlier this year.
Deep Learning on Mobile Devices
https://techcrunch.com/2017/05/17/googles-tensorflow-lite-brings-machine-learning-to-android-devices/
http://blog.stratospark.com/creating-a-deep-learning-ios-app-with-keras-and-tensorflow.html
● 3D Face Capture
● 3D Scene Reconstruction
● 2.5D Scene Reconstruction and Computational Photography
● SLAM and Object Tracking
● Augmented Reality
● Google Cardboard SDK for iOS
https://doi.org/10.1109/IPSN.2016.7460664 | Cited by 28 articles, see Related articles
Thursday 20 July 2017, Movidius USB stick
https://techcrunch.com/2017/07/20/movidius-launches-a-79-deep-learning-usb-stick/
Snapchat secretly acquires Seene, a computer vision
startup that lets ...
https://techcrunch.com/.../snapchat-secretly-acquires-seene-a-computer-vision-startup-... 3 Jun 2016
https://doi.org/10.1109/PDP.2017.98
https://arxiv.org/abs/1705.06224
360° imaging
360° (omnidirectional imaging): introduction
The Panoptic Camera platform developed
jointly by Microelectronic Systems
Laboratory (LSM) and Signal Processing
Laboratory (LTS2) of EPFL.*
http://lsm.epfl.ch/page-52820-en.html
Wikipedia: “360-degree videos, also known as immersive videos[1] or spherical videos ,[2] are video recordings where a view in every direction is recorded
at the same time, shot using an omnidirectional camera or a collection of cameras. During playback the viewer has control of the viewing direction like a
panorama.”
Consumer-level camera review
http://thewirecutter.com/reviews/best-360-degree-camera/
By DANIEL CULPANWednesday 12 August 2015
http://www.wired.co.uk/article/9-mind-blowing-360-degree-videos
Scuba Diving Short Film in 360° Green Island, Taiwan
https://youtu.be/2OzlksZBTiA
360° as part of “10 Breakthrough Technologies of 2017”
https://www.technologyreview.com/s/603496/10-breakthrough-technologies-2017-the-360-degree-selfie/
Seasonal changes to vegetation fascinate Koen Hufkens. So last fall Hufkens, an
ecological researcher at Harvard, devised a system to continuously broadcast
images from a Massachusetts forest to a website called VirtualForest.io. And
because he used a camera that creates 360° pictures, visitors can do more than
just watch the feed; they can use their mouse cursor (on a computer) or finger (on a
smartphone or tablet) to pan around the image in a circle or scroll up to view the
forest canopy and down to see the ground.
Journalists from the New York Times and Reuters are using $350
Samsung Gear 360 cameras to produce spherical photos and videos that
document anything from hurricane damage in Haiti to a refugee camp in Gaza.
One New York Times video that depicts people in Niger fleeing the militant group
Boko Haram puts you in the center of a crowd receiving food from aid groups.
Or consider the spherical videos of medical procedures that the Los Angeles
startup Giblib makes to teach students about surgery. The company films the
operations by attaching a $500 360fly 4K camera, which is the size of a baseball,
to surgical lights above the patient. The 360° view enables students to see not just
the surgeon and surgical site, but also the way the operating room is organized and
how the operating room staff interacts.
These applications are feasible because of the smartphone boom and
innovations in several technologies that combine images from multiple lenses and
sensors. For instance, 360° cameras require more horsepower than regular
cameras and generate more heat, but that is handled by the energy-efficient chips
that power smartphones. Both the 360fly and the $499 ALLie camera use
Qualcomm Snapdragon processors similar to those that run Samsung’s high-
end handsets.
Once people discover spherical videos, research suggests, they shift their
viewing behavior quickly. The company Humaneyes, which is developing an
$800 camera that can produce 3-D spherical images, says people need to watch
only about 10 hours of 360° content before they instinctively start trying to interact
with all videos. When you see 360° imagery that truly transports you somewhere
else, you want it more and more.
Low-cost end: Samsung Gear and Galaxy
Samsung Gear360, ~£250
Samsung GearVR, ~£100
Samsung Galaxy S6-8, smartphone, ~£200-£700
http://www.samsung.com/uk/wearables/gear-360-c200/
If you’re clamoring to shoot in 360 degrees, the Gear 360 balances
simple design with workable image quality — but you really need a
Samsung phone (and a Gear VR, and a good hunk of money) to get
the most out of it. And, for now, that's fine.
This version of the Gear 360 is more likely to be looked back on as a
relic anyway, a recognizable but eventually dismissible attempt at a
new idea, and the foundation for whatever Samsung does next.
Low-cost end #2: Ricoh Theta
Ricoh’s Theta V 4K camera sports 360-
degree video and wireless playback
RYAN WINTERHALTER, UPLOADVR@UPLOADVR
SEPTEMBER 02, 2017 07:03 PM
https://venturebeat.com/2017/09/02/ricohs-theta-v-4k-camera-sports-360-degree-video-and-wireless-playback/
Ricoh is unveiling its latest 360-degree camera this morning. Dubbed the Ricoh Theta V, the $430 4K camera
is the latest in the line which launched in 2013 with the Ricoh Theta.
Available for pre-order now, and shipping in mid-September, the Theta V features 3,820-by-1,920 resolution
video capture. That’s a massive improvement on the earlier Theta S, which offered a sub-1,080p 1,920-by-960,
and the Theta SC, which allowed for 1,920-by-1,080 recording.
Perhaps the biggest usability improvement to the Theta V is the inclusion of remote playback. Users can now
wirelessly stream their video to an external display directly from the camera. Previous devices in the Theta line
(except the developer-only Theta R) required users to export their raw footage into a computer to stitch the
image and create a useable video. That’s now all done on the device. Videographers can watch their footage
on any display, and move the POV by moving the camera itself.
The Theta V boosts sound quality as well. Four microphones capture data from their respective dimensions,
creating spatial audio that allows users to hear where the sound is coming from within the recording.
Ricoh Theta V hands-on
Published Aug 31, 2017 | Jeff Keller
Based on some quick tests of a non-final Theta V,
both stills and videos are noticeably better than
those from its predecessor. We're looking forward
to getting our hands on a production model in a few
weeks and putting it through its paces.
For higher quality audio
capture, Ricoh is offering
the TA-1 3D Microphone
($269). Developed by
Audio Technica, the mic
attaches via the tripod
mount and uses a
standard 3.5mm audio
jack.
Higher end: GoPro, Nokia Ozo, Facebook Surround, etc.
GoPro (NASDAQ:GPRO) recently unveiled the Omni, a six-camera rig
for filming interactive spherical videos that can be explored through a
smartphone's movements, a user's finger swipes, or a virtual reality
headset. The device is the smaller sibling of the 16-camera Odyssey
rig ($15,000), which hasn't been launched despite being announced
nearly a year ago. Let's take a look at four key things investors should
know about the Omni ($3,500), and how they might impact GoPro's
future.
https://www.fool.com/investing/general/2016/04/14/4-things-investors-need-to-know-about-gopro-incs-o.aspx
What's next for GoPro? GoPro investors don't have many catalysts
to look forward to this year. The Omni is too pricey relative to its
peers to gain any mainstream traction. The Karma drone, which is
due to arrive within the next two months, faces tough competition
from market leader DJI Innovations. By the time the Hero 5 cameras
arrive near the end of the year, the mainstream market could be
saturated with cheap VR and flying cameras.
Introducing Facebook Surround
360: An open, high-quality 3D-360
video capture system
Brian K Cabral, April 12, 2016
● Facebook has designed and built a durable, high-
quality 3D-360 video capture system.
● The system includes a design for camera hardware
and the accompanying stitching code, and we will
make both available on GitHub this summer. We're
open-sourcing the camera and the software to
accelerate the growth of the 3D-360 ecosystem —
developers can leverage the designs and code, and
content creators can use the camera in their
productions.
● The system exports 4K, 6K, and 8K video for each
eye. The 8K videos double industry standard output
and can be played on Gear VR with Facebook's
custom Dynamic Streaming technology.
https://code.facebook.com/posts/1755691291326688/introducing-facebook-surround-360-an-open-high-quality-3d-360-video-capture-system/
https://www.theverge.com/2016/4/25/11421992/disney-nokia-ozo-camera-virtual-reality-star-wars-marvel
Ever since Nokia announced its
360-degree Ozo virtual reality camera it has positioned the
system as a high-end option for Hollywood filmmakers, and
today the company is announcing a partnership with Disney
that should help deliver on that promise. As part of the deal,
Ozo cameras will be put into the hands of Disney filmmakers
and its marketing teams to create 360-degree, virtual reality
content across all of the studio’s various brands.
Lytro Immerge: the world's first professional Light Field solution for cinematic VR
roadtovr.com/lytros-immerge-360
https://www.lytro.com/immerge
Consequently, to create a virtual reality that even the human eye cannot distinguish from the real
world, we must achieve the perfect immersive viewing experience, such that human viewers feel
they can walk into the scene. This is known as the virtual walk-in effect, and it requires light-field
technology—3D imaging technology that emerged from the field of computational
imaging/photography to capture the light rays that people perceive from different locations and
directions. When combined with computer vision and deep learning, light- field technology
provides a viable path for producing low-cost, high-quality VR content, positioning this technology
to be the most profitable segment of the VR industry.
“Depth Lytro”: depth sensing with light field techniques
Refocusing in spite of foreground occlusions: (a) Scene containing a
monkey toy being partially occluded by a plant in the foreground, (b)
traditional synthetic aperture refocusing on light field is partially effective in
removing the effect of foreground plants, (c) synthetic aperture refocusing
of depth displays corruption due to occlusion, (d) histogram of depth
clearly shows two clusters corresponding to plant and monkey, (e) virtual
aperture refocusing after removal of plant pixels shows sharp depth image
of monkey, (f) Quantitative comparison of indicated scan line of the
monkey’s head for (c) and (e)
We use coding techniques from Tadano et al. (2015) to image beyond
backscattering nets. Notice how the corrupted depth maps are improved
using the codes. We show how digital refocusing can be performed on the
images without the scattering occluders by combining depth fields with
coded TOF.
https://arxiv.org/abs/1509.00816
Post-processing for 360° imaging
https://doi.org/10.1007/s00371-017-1368-7
Overall process. a Input image. b Lines detected and classified: red for
vertical lines and yellow for horizontal lines. c Great circles from the
classified lines. Green dots are vanishing points computed from
horizontal (yellow) lines. d Upright adjustment result
We implemented our method using C++ and the OpenCV library on a 64-bit Windows
PC with an Intel i7- 6700K 4.00GHz CPU and 32GB RAM. For an input image of size
5376 × 2688 px, it takes a few hundred milliseconds (less than one second) to
obtain the final rotation matrix R for upright adjustment.
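The line and vanishing-point reasoning above operates on the viewing sphere; its basis is the equirectangular pixel-to-ray mapping sketched below (a minimal illustration; the longitude/latitude axis conventions are one common choice, not necessarily those of the cited paper).

```python
# Equirectangular pixel -> unit ray on the viewing sphere. A detected line in
# the panorama maps to a great circle of such rays, and the upright rotation R
# aligns the vertical vanishing direction with the world up-axis.
import numpy as np

def pixel_to_ray(u, v, width, height):
    lon = (u / width - 0.5) * 2.0 * np.pi        # longitude in [-pi, pi]
    lat = (0.5 - v / height) * np.pi             # latitude in [-pi/2, pi/2]
    return np.array([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)])  # unit direction vector

ray = pixel_to_ray(2688, 672, 5376, 2688)         # a pixel of a 5376 x 2688 image
```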
https://arxiv.org/abs/1703.10798
http://vllab1.ucmerced.edu/~wlai24/360hyperlapse
Pipeline of the proposed algorithm. Given a 360° video, we first stabilize the sequence to smooth the relative rotation
between adjacent frames. We estimate the focus of expansion (i.e., the direction of forward motion) as a prior information for
our camera path planning. To extract the regions of interest, we compute the spatial-temporal saliency and semantic
segmentation. The detected regions of interest are used to guide the camera path planning. Finally, we use an adaptive 2D
video stabilization to render a smooth hyperlapse.
360° Deep Learning #1
http://dx.doi.org/10.3390/s17061341
https://arxiv.org/abs/1705.01759
Watching a 360º sports video
requires a viewer to
continuously select a viewing
angle, either through a
sequence of mouse clicks or
head movements. To relieve
the viewer from this “360
piloting” task, we propose
“deep 360 pilot” – a deep
learning-based agent for
piloting through 360º sports
videos automatically
Panel (a) overlaps three panoramic frames
sampled from a 360° skateboarding video
with two skateboarders. One skateboarder
is more active than the other in this
example. For each frame, the proposed
“deep 360 pilot” selects a view – a
viewing angle, where a Natural Field of View
(NFoV) (cyan box) is centered at. It first
extracts candidate objects (yellow boxes),
and then selects a main object (green dash
boxes) in order to determine a view (just like
a human agent). Panel (b) shows the NFoV
from a viewer’s perspective.
360° Deep Learning #2
Flat2Sphere: Learning Spherical Convolution for Fast Features from 360° Imagery
Yu-Chuan Su, Kristen Grauman (Submitted on 2 Aug 2017) https://arxiv.org/abs/1708.00919
We propose to learn a spherical
convolutional network that translates a
planar CNN to process 360° imagery
directly in its equirectangular projection.
Our approach learns to reproduce the flat
filter outputs on 360° data, sensitive to
the varying distortion effects across the
viewing sphere. The key benefits are
1) Efficient feature extraction for
360°images and video, and
2) The ability to leverage powerful pre-
trained networks researchers have
carefully honed (together with massive
labeled image training sets) for
perspective images.
We validate our approach compared to
several alternative methods in terms of
both raw CNN output accuracy as well as
applying a state-of-the-art "flat" object
detector to 360° data. Our method yields
the most accurate results while saving
orders of magnitude in computation
versus the existing exact reprojection
solution.
360°: the role in PropTech? #1a
Use for real estate agents: still a novelty/gimmick? (from 2014 until 2017)
MAY 26, 2014 By James Dearsley
http://www.jamesdearsley.co.uk/is-the-property-industry-interested-in-360-degree-hd-filming/
USES OF 360 DEGREE HD FILMING IN REAL ESTATE:
1. Sales and Marketing. Firstly, from a realtor or estate agent perspective there are several uses
here of 360 degree cameras, the first being obvious, that of sales and marketing. It will be simple
and efficient to take a quick film of each room, or just walk through the property with these devices
to record what you need
2. Property Management issues. We have also seen interest from companies looking to use these
bits of equipment for inventory taking. Seeing as they are of HD quality it means you can quickly
take photographs of properties which can later be looked at in more detail should problems arise in
letting disputes.
3. Virtual Reality. With Facebook recently buying Oculus Rift for $2 Billion, it is getting less far
fetched. Considering the price of an Oculus is relatively cheap (reckoned to be less than
$500/£360 when released next year) it would not be surprising if Facebook are hoping for a lot of
people to be purchasing these (Candy Crush Saga in Virtual Reality anyone?!). It isn’t just Facebook
though; Sony have a VR headset in production as does Samsung (it was recently announced) and so
this space is going to move quickly. By using these cameras you can put your clients into these
homes very quickly and easily – either in the office, if you get a set of these yourself, or, in time, in
their own home if Facebook get their way.
https://www.forbes.com/sites/forbesagencycouncil/2017/06/28/want-to-use-360-degree-photo-and-video-11-things-to-consider/#22fffa955002
1. I would recommend that marketers stay on the sidelines until the industry
matures. - Kristopher Jones, LSEO.com
4. Use A Strategic Approach The capabilities of 360-degree photo/video have
powerful applications in many industries, including real estate, retail and tourism. A
360-degree view has a better chance of selling a house than a static image. -
Brock Murray, seoplus+
7. Prepare For Tomorrow's Consumer Expectations Today, 360-degree photos
and videos are very helpful in industries such as the auto industry or real estate where
visualizing the product is essential. As VR continues to grow, 360-degree photos and
videos will likely become a standard. The consumers' expectations will likely adjust to
needing to learn more about the overall "360-degree" experience of the restaurant for
example, not just a picture of the dish. - Ahmad Kareh, Twistlab Marketing
11. Create An Emotional Connection 360-degree multimedia is a brilliant tool for
meaningful storytelling, as it allows the consumer to be transported to the experience
you want them to have, bringing the story to life. Companies should take advantage of
these tools to transform products into experiences, cultivating an immersive and
emotional connection with the brand. - Joey Hodges, Demonstrate PR
JUN 28, 2017 by Forbes Agency Council
360°: the role in PropTech? #1b
Use for real estate agents
A four-wheeled tripod outfitted with a computer, 360-
degree camera and sensors can roam properties,
producing highly choreographed, immersive videos that
would be difficult — if not impossible — to replicate with
a normal video camera.
VirtualAPT (Brooklyn, NYC) offers residential tour service at now $1/ft² (~10.8$/m²), and for commercial uses,
for a monthly fee per building or $0.50/ft² (~5.4$/m²) for separate units.
Generated by technology from companies such as Matterport, 3-D home tours allow users to jump between
360-degree photos — sometimes situated within a 3-D model.
● A rover can shoot 360-degree footage of
a home while moving along a pre-plotted
route.
● Made by VirtualAPT, the videos can
include on-camera presentations from
real estate agents.
● They're an alternative to 3-D homes tours
from companies such as Matterport.
https://www.youtube.com/watch?v=JhfQK-tDvGU
360°: the role in PropTech? #2a
Use for construction and as a tool for constructing 4D/5D/6D BIM (Building Information Model)
Construction site manager
manually taking photos of the
progress.
- Time-consuming to walk through
and take photos
- No full coverage of site
- Might forget some spots
- Nice initial 3D BIM not properly
maintained during construction site.
+ Ideally have a drone inspecting the
whole construction site with an on-
board 360 degree video and a
LIDAR / laser scanner.
+ One can go back in time and see
who of the subcontractors for
example are responsible for possible
problems
https://doi.org/10.1186/s40327-014-0016-9
360°: the role in PropTech? #2b
360 videos registered or not to 3D BIM model allows inspection of the progress (“4D BIM”) in the
construction site also retrospectively, and can possibly reduce legal battles when it is clearer who is
the one to be held responsible in case of discrepancies between as-built and as-planned data.
VISUAL ASSET MANAGEMENT Visual Asset Management (VAM) service digitizes industrial
and infrastructure assets using 360 degree images, 3D Models, and relative asset information.
3D MODELING We thrive on enabling 3D realistic visualization to projects while preserving the
minute details necessary to portray our world.
360 VIDEO 360 video enables viewers to be at the center of any medium, allowing for a unique
visual experience and situational awareness from any device.
VIRTUAL REALITY OcuTech’s virtual reality solutions stimulate creative thinking and enhanced
information sharing, allowing for a one-of-a-kind virtual experience.
Ocutech from Houston, Texas, USA is
already providing these types of
services
https://ocutech360.com/3d-architectural-visualization-solution/#3dvrvideo
360° imaging + SfM
360° into smartphones: how big will it be?
https://www.engadget.com/2017/07/10/future-of-smartphone-camera/
1) Augmented reality
2) Dual-lens cameras
3) Better lenses
4) 4K recording
5) Thermal imaging
6) Optical zoom
7) 360 video
“Several smartphone makers, including Samsung and Huawei, have already released add-on 360-
degree cameras for their handsets, but this is something that could eventually be integrated into the
phones themselves. Immersive 360-degree videos are gradually making their mark, with Facebook
among the big firms pushing the technology, while virtual reality companies are gradually introducing
more 360-VR content that can be viewed from mobile phones.”
https://techcrunch.com/2016/08/30/the-future-of-mobile-video-is-virtual-reality/
Are 360 cameras the future?
https://youtu.be/i8EUerX90-0 TechAltar
So will teens in big
numbers ever apply
Snapchat bunny ears to
immersive 360 degree
videos?
360° into smartphones: plenty of options coming #1
Acer’s new Holo 360 degree camera
is essentially a smartphone
Acer has announced its entry into the VR
video market with a device that’s half
360-degree camera, half smartphone.
http://www.trustedreviews.com/news/acer-s-new-holo-360-degree-camera-is-essentially-a-smartphone-2953609
Paul Monckton CONTRIBUTOR
I write about photography and related subjects
https://www.forbes.com/sites/paulmonckton/2016/05/31/worlds-first-live-smartphone-vr-camera/#9fea6921a8b0
Yesterday at this year’s Computex trade show in Taipei,
Quanta Computer and ImmerVision jointly announced what
is claimed to be the world’s first 360-degree live VR
streaming camera for smartphones, with demos starting from
today. The, as yet unnamed, camera fits in the palm of the
hand and is designed to attach magnetically to any
smartphone. It comes with a 360-degree by 187-degree lens
and uses a Sony Exmor-HDR imaging sensor to produce 16
megapixel panoramic images.
ImmerVision's Panamorph lens makes more efficient use of an image sensor
(Image credit: ImmerVision)
THIS ADD-ON CAMERA WILL TURN YOUR
SMARTPHONE INTO A 360 CAMERA. JULY 26, 2017
ION360 U 4K 360-Degree Smartphone Camera
is comprised of a 360 camera that goes on top of
Essential's 360 Camera Is the World's Smallest
360-Degree Personal Camera for a Smartphone
30 May 2017
http://gadgets.ndtv.com/mobiles/news/essentials-360-camera-is-the-worlds-smallest-360-degree-personal-camera-for-a-smartphone-1705826
After months of teasing, Android creator Andy Rubin has
finally unveiled the Essential Phone that features a near
bezel-less display that tries to outdo Samsung's Galaxy
S8. Essential's 360 camera, which weighs around 35
grams and is being called the world's smallest 360-
degree personal camera by the company, includes dual
12-megapixel fisheye sensors that can capture 4K 360
video at 30fps. The camera also features 4 microphones
to capture sound in 3D. The 360 camera can be bought
along with the Essential Phone for an additional $50, or
can be bought separately which will cost you $199.
@essential, Palo Alto, CA, essential.com
360° into smartphones: plenty of options coming #2
ProTruly’s Darling
https://www.theverge.com/2017/3/5/14809182/protruly-darling-360-degree-camera-smartphone
A company called HT Optical
that makes the cameras
found on ProTruly’s devices.
The company said that it is
working on a much smaller
360 camera module that will
actually fit into a 7.6 mm thick
smartphone and will be
capable of capturing 16 MP
photos and shoot 4K videos.
What’s even more interesting
is that the module will only
add an extra 1 mm to the
overall thickness of a device.
https://www.theverge.com/circuitbreaker/2017/2/22/14698026/huawei-360-degree-camera-honor-vr-smartphones
http://360rumors.com/
https://www.vrfocus.com/2017/07/360-degree-video-editing-app-for-smartphones/
V360 - 360 video editor, Avincel Group Inc
360-Degree Video Editing App For Smartphones: V360 editing suite already out for Android, with iOS version coming soon.
360° into smartphones: convergence with AI players of course
https://www.embedded-vision.com/news/movidius-low-power-vpu-technology-delivers-4k-vr-pixel-processing-performance-motorola%E2%80%99s-newest
Movidius’ Myriad 2 Vision Processing Unit (VPU) technology,
known for its image signal processing and computer vision
capabilities with high energy efficiency, was selected by
Motorola Mobility to power their newest Moto Mod: the 360
Camera. Moto Mods are unique modular accessories for
Motorola smartphones that bring advanced functionality
beyond traditional smartphone features. Motorola’s newest
Moto Mod brings users the ability to live stream 360° videos
while preserving battery life.
Say Hello to the moto z² Force Edition with moto mods
https://www.youtube.com/watch?v=0moMnChM6Ds
https://www.wsj.com/articles/intel-to-buy-semiconductor-startup-movidius-1473170441
https://www.altera.com/solutions/industry/automotive/applications/drive-assistance/surround-view-camera.html
http://www.nvidia.co.uk/object/drive-px-uk.html
360° Video SfM: obvious extension to combine both
Instead of manually rotating your camera, image all angles simultaneously while going through the
rooms in an apartment
https://uploadvr.com/adobe-algorithm-6dof-360-cam/
http://variety.com/2017/digital/news/adobe-6dof-vr-video-algorithms-1202394491/
Adobe Motion Parallax demo
https://youtu.be/37Z4f6p1HOY
https://www.roadtovr.com/adobes-new-research-aims-give-depth-monoscopic-360-video/: Other techniques to achieve 6-DoF VR video
usually require light-field cameras like HypeVR’s crazy 6k/60 FPS, LiDAR rig or Lytro’s giant Immerge camera. While these undoubtedly will
produce a higher quality 3D effect, they’re also custom-built and ungodly expensive.
6-DOF VR videos with a single 360-camera
Jingwei Huang ; Zhili Chen ; Duygu Ceylan ; Hailin Jin, Virtual Reality (VR), 2017 IEEE
http://dx.doi.org/10.1109/VR.2017.7892229, 18-22 March 2017
Given a 360-video captured by a single spherical panorama camera, in an offline pre-processing stage, we recover
the camera motion and the scene geometry first by performing structure-from-motion (SfM) followed by dense
reconstruction. Then, in real-time we playback the video in a VR headset where we track the 6-DOF motion of the
headset and synthesize new views by a novel warping algorithm.
360° Video SfM: Korea Advanced Institute of Science and Technology (KAIST)
Spherical panoramic cameras (Ricoh Theta S, Samsung
Gear 360 and LG 360)
Our sphere sweeping algorithm
enables to compute all-around
dense depth maps, minimizing the
loss of spatial resolution. With the
estimated all-around image and
depth map, we have shown
practical utilities by introducing
360° stereoscopic and anaglyph
images as VR contents.
European Conference on Computer Vision ECCV
2016: Computer Vision – ECCV 2016 pp 156-172
https://doi.org/10.1007/978-3-319-46487-9_10
All-Around Depth from Small Motion with a Spherical Panoramic Camera. Sunghoon Im, Hyowon Ha, François Rameau, Hae-Gon Jeon, Gyeongmin Choe, In So Kweon
Range Sensing
Structured Light and Time-of-Flight
Microsoft Kinect: democratizing structured light scanning
https://arxiv.org/abs/1505.05459
Structured light A sequence of known patterns is
sequentially projected onto an object, which gets
deformed by geometric shape of the object. The
object is then observed from a camera from a
different direction. By analyzing the distortion of
the observed pattern, i.e. the disparity from the
original projected pattern, depth information can
be extracted
The Time-of-Flight (ToF) technology is based on
measuring the time that light emitted by an illumination
unit requires to travel to an object and back to the sensor
array. The Kinect ToF camera applies this CW intensity
modulation approach. Due to the distance between the
camera and the object (sensor and illumination are
assumed to be at the same location), and the finite speed
of light c, a time shift φ [s] is caused in the optical signal
which is equivalent to a phase shift in the periodic signal.
This shift is detected in each sensor pixel by a so-called
mixing process. The time shift can be easily transformed
into the sensor-object distance as the light has to travel
the distance twice,
Cited by 65 articles - see Related articles
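As a back-of-the-envelope check of that relation (the 16 MHz modulation frequency below is just an example value, not a claim about the Kinect's actual setting): the measured phase shift gives a time shift, and halving the resulting time-of-flight distance accounts for the round trip.

```python
# Continuous-wave ToF: phase shift -> time shift -> distance (round trip).
import math

C = 299_792_458.0                     # speed of light, m/s

def tof_distance(phase_shift_rad, modulation_freq_hz):
    time_shift = phase_shift_rad / (2.0 * math.pi * modulation_freq_hz)
    return C * time_shift / 2.0       # divide by two: light travels out and back

print(tof_distance(math.pi / 2, 16e6))    # ~2.34 m for a quarter-cycle shift
```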
KinectFusion: scanning with Kinect
https://doi.org/10.1145/2047196.2047270 Cited by 1356 articles, see Related articles
https://arxiv.org/abs/1704.01047
https://arxiv.org/abs/1612.02859
The semantic cue from floorplan
(i.e., door detection) resolves
ambiguities. The figure shows the
best placement based on the unary
potential with or without the
semantic cue
We show qualitative results on ModelNet using the TSDF encoding (Curless and Levoy, 1996) and 4 views. The
same TSDF truncation threshold has been used for traditional fusion, our OctNetFusion approach and the ground
truth generation process. While the baseline approach is not able to resolve conflicting TSDF information from
different viewpoints, our approach learns to produce a smooth and accurate 3D model from highly noisy input.
By learning the structure of real world 3D objects and scenes, our approach is further able to
reconstruct occluded regions and to fill gaps in the reconstruction. We evaluate our approach
extensively on both synthetic and real-world datasets for volumetric fusion. Further, we apply
our approach to the problem of 3D shape completion from a single view where our approach
achieves state-of-the-art results.
Kinect tweaks: depth resolution improvements with polarization measurement?
http://news.mit.edu/2015/object-recognition-robots-0724
https://youtu.be/m6sStUk3UVk
http://news.mit.edu/2015/algorithms-boost-3-d-imaging-resolution-1000-times-1201
https://doi.org/10.1007/s11263-017-1025-7
https://doi.org/10.1364/OE.25.001173
Range Sensing: plenty of options
http://3dscanexpert.com/photogrammetry-benchmarks-remake-vs-photoscan-vs-realitycapture-vs-zephyr/
This post is just an example based on a single photoset from a single
object. That makes it zero percent scientific. Also, RealityCapture
might have won this Drag Race in terms of both speed with the Fast
preset and quality with the Normal preset, but an organic object like
this is very favorable to its algorithms. Read my Full RC Review to see
that it can’t always handle non-organic objects well.
COMMERCIAL SOFTWARE
http://3dscanexpert.com/
By Nick Lievendag Entrepreneur at the intersection of Creativity × Technology. Writes, Speaks and Consults about 3D
Capture (3D Scanning & Photogrammetry). Founder of 3D Scan Expert.
Matterport dominating Real Estate scanning
This $4,500 camera turns the real world into the virtual one. Today, Matterport
’s hardware is a hit with real estate agents. But fueled by the $30 million Series C
it just raised, Matterport’s software and partnership with Google’s Project Tango
could let you wave your phone around to create VR tours of anywhere you want.
https://techcrunch.com/2015/06/25/matterport/
https://www.crunchbase.com/organization/matterport#/entity
Matterport spawned out of the Xbox Kinect hacker scene in 2010. Founder
Matt Bell had been working for a gesture recognition company that relied on a
$50,000 camera and expert operators to produce a huge CAD file that could
only be accessed through a specialized application. Bell was flabbergasted by
the power of the $150 Kinect. He realized the potential for a relatively cheap
device with similar technology that could let anyone map out rooms to create
3D models accessible straight from the web.
https://youtu.be/HZX8RupfQls
Matterport: research on semantic indoor segmentation
We collected the data using the Matterport Camera, which combines 3
structured-light sensors to capture 18 RGB and depth images during a
360° rotation at each scan location. The output is the reconstructed 3D
textured meshes of the scanned area, the raw RGB-D images, and camera
metadata. We used this data as a basis to generate additional RGB-D data
and make point clouds by sampling the meshes. We semantically annotated
the data directly on the 3D point cloud, rather than images, and then
projected the per point labels on the 3D mesh and the image domains.
https://arxiv.org/abs/1702.01105 | Cited by 3 - Related articles
https://arxiv.org/abs/1702.07600
https://www.fastcompany.com/3059281/introducing-hover-an-ai-powered-indoor-safe-camera-drone
+
Indoor scanning with tripod-based Matterport
still requires a lot of manual work, and at some
point will be updated to autonomous AI-
powered indoor drone for better user
experience.
Matterport: technology patents
Capturing and aligning multiple 3-dimensional scenes
www.google.com/patents/US8879828 Grant - Filed Jun 29, 2012 - Issued Nov 4, 2014 - Matthew Bell - Matterport, Inc.
Multi-modal method for interacting with 3d models
www.google.com/patents/US20130342533 App. - Filed Jun 24, 2013 - Published Dec 26, 2013 - Matthew Bell - Matterport, Inc.
Identifying and filling holes across multiple aligned three-dimensional scenes
www.google.com/patents/US8861840 Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc.
Building a three-dimensional composite scene
www.google.com/patents/US8861841 Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc.
Processing and/or transmitting 3D data
www.google.com/patents/US9396586 Grant - Filed Mar 14, 2014 - Issued Jul 19, 2016 - Matthew Tschudy Bell - Matterport, Inc.
Semantic understanding of 3d data
www.google.com/patents/US20160055268 App. - Filed Jun 6, 2014 - Published Feb 25, 2016 - Matthew Tschudy Bell - Matterport, Inc.
Selecting two-dimensional imagery data for display within a three-dimensional model
www.google.com/patents/EP3120329A1?cl=en App. - Filed Mar 13, 2015 - Published Jan 25, 2017 - Matthew Tschudy Bell - Matterport, Inc.
Classifying, separating and displaying individual stories of a three-dimensional model of a multi-story structure based on captured image data of the multi-story structure
www.google.com/patents/US20160217225 App. - Filed Jan 28, 2016 - Published Jul 28, 2016 - Matthew Tschudy Bell - Matterport, Inc.
Semantic understanding of 3d data
US 20160055268 A1
ABSTRACT Systems and techniques for processing three-
dimensional (3D) data are presented. Captured three-
dimensional (3D) data associated with a 3D model of an
architectural environment is received and at least a portion of
the captured 3D data associated with a flat surface is
identified. Furthermore, missing data associated with the
portion of the captured 3D data is identified and additional 3D
data for the missing data is generated based on other data
associated with the portion of the captured 3D data.
REFERENCED BY
US9576184 Textura Planswift Corporation
Detection of a perimeter of a region of interest in a floor plan document
US20130328872 Tekla Corporation
Computer aided modeling
US20150227644 Pictometry International Corp.
Method and system for displaying room interiors on a floor plan
US20160063722 Textura Planswift Corporation
Detection of a perimeter of a region of interest in a floor plan document
US20160379405 Jim S Baca
Technologies for generating computer models, devices, systems, and
methods utilizing the same
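The abstract above boils down to finding flat surfaces in the capture and synthesizing points where the scan left gaps. As an illustration only (not the patented method), a minimal sketch that fits a plane to a patch of captured points by SVD and resamples the patch on a regular grid, which fills holes inside the patch as a side effect:

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a point set: returns (centroid, unit normal)."""
    centroid = points.mean(axis=0)
    # Smallest singular vector of the centred points is the plane normal
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]

def fill_plane_hole(points, spacing=0.02):
    """Resample a (roughly planar) patch on a regular grid, covering gaps in the scan."""
    centroid, normal = fit_plane(points)
    # Build an orthonormal basis (u, v) spanning the plane
    u = np.cross(normal, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-6:           # normal was parallel to the x-axis
        u = np.cross(normal, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    # Project the captured points into plane coordinates to get the patch extent
    rel = points - centroid
    pu, pv = rel @ u, rel @ v
    gu, gv = np.meshgrid(np.arange(pu.min(), pu.max(), spacing),
                         np.arange(pv.min(), pv.max(), spacing))
    # Lift the grid back to 3D: these points also cover holes inside the patch
    return centroid + gu.ravel()[:, None] * u + gv.ravel()[:, None] * v
```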
GoogleTangoTechnology
http://www.deccanchronicle.com/technology/gadgets/210717/i
s-google-tango-relevant-in-2017.html
https://arstechnica.co.uk/gadgets/2016/12/google-
tango-phab-2-pro-review/
A Project Tango device ‘sees’ the environment around it
through a combination of three core functions.
First up is motion tracking, which allows the device to
understand its position and orientation using a range of
sensors (including accelerometer and gyroscope).
Then there’s depth perception, which examines the
shape of the world around you. Intel provides a vital cog in
this respect with its RealSense 3D camera. With this
component on board, a device can gain accurate gesture
control and snappy 3D object rendering among other
things.
Finally, Project Tango incorporates area learning, which
means that it maps out and remembers the area around it.
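Depth perception in practice means unprojecting every depth pixel into a 3D point with the camera intrinsics. A minimal pinhole-camera sketch, with made-up intrinsics standing in for the values a Tango/RealSense device would report:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Unproject a depth image (metres) into camera-space 3D points (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]          # drop pixels with no depth reading

# Example with made-up intrinsics for a 640x480 depth sensor:
# cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```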
Point Cloud Framework for Rendering 3D
Models Using Google Tango
Maxen Chung, Santa Clara University
Julian Callin, Santa Clara University
http://scholarcommons.scu.edu/cseng_senior/84
https://doi.org/10.1007/s11227-016-1891-8
Project Tango Tablet Development Kit, recently introduced by
Google, Inc. Equipped with the most powerful processor available
to date on a consumer-level mobile platform (i.e., NVIDIA Tegra K1
whose 192 programmable CUDA-enabled GPU cores use the
same efficient Kepler architecture found in the world’s most
powerful supercomputers and workstations) along with several
sensors (motion tracking camera, 3D depth sensor,
accelerometer, ambient light sensor, barometer, compass, GPS,
gyroscope), this mobile device can readily utilize GPU computing
making it an ideal platform for developing real-time contextual
awareness applications for the visually impaired (VI). Moreover,
being compact, lightweight, potentially wearable, relatively
discreet and affordable render it aesthetically appealing, socially
acceptable and accessible for VI users
GoogleTangoExampleApplications#1
We broke the news yesterday that Google
was producing a prototype 3D sensing
smartphone called Project Tango. We also
broke down the capabilities of the vision
processor inside the device and talked
about what it means for the future of
phones.
Now, we’ve got an exclusive look in the
video below at a real 3D indoor map of a
room captured with one of the prototype
devices by Matterport.
https://techcrunch.com/2014/02/21/heres-an-actual-3d-indoor-map-of-a-room-captured-with-googles-project-tango-phone/
https://matterport.com/mobile-3d-capture/
https://developers.google.com/tango/apis/overview
Daydream is Google’s platform for virtual
reality. It consists of Daydream-ready phones,
Daydream-ready headsets and controllers, and
Daydream apps. Daydream View is the first
Daydream-ready headset and controller
designed and developed by Google. It also
comes with a touch-and-motion enabled
controller so you can easily interact with VR
apps.
With the Daydream View, you will be able to
explore new worlds through Google Street View
and Fantastic Beasts. Kick back in your
personal cinema with YouTube, Netflix, Hulu,
and HBO. Get in the game with Gunjack 2,
LEGO® BrickHeadz, and Need for Speed.
That’s just the beginning of the VR possibilities
with Daydream.
http://www.techphlie.com/
2017/07/what-is-google-ta
ngo-and-daydream.html
Google has notably been pushing AR/VR
technologies with its latest Android OS. The
most prominent introduction however, has
been the ASUS ZenFone AR launch that took
place at CES, 2017, earlier this year.
GoogleTangoExampleApplications#2
Google Tango SDK
examples: how to
make a floor plan in
50 seconds
Alexander Grau
Google Tango and
Revit
Leonardo Manzione
https://www.youtube.com/watch?v=A-4cuJ1kOQ4
“GoogleTango”withoutdepth sensors
I have always believed that bringing 3D to consumers could only work without the need for
dedicated depth sensors. This pure-software approach is already being embraced for
Augmented Reality with Apple’s upcoming ARKit and Google’s ARCore which was announced
last week. Both can give modern smartphones AR-capabilities by just using the regular camera(s),
instead of using dedicated sensors like Tango.
https://3dscanexpert.com/sony-3d-creator-brings-sensor-less-3d-scanning-consumers/
But yesterday, at IFA Berlin, Sony announced its
latest smartphone, the XZ1. Which has all the
bells and whistles you expect from a flagship
Android phone but also an app called 3D Creator
. It basically does exactly what Microsoft showed
last year, but is actually available — albeit
exclusive for the XZ1.
https://www.sonymobile.com/global-en/products/phones/xperia
-xz1/3d-creator/
AppleDepthSensing
The iPhone X’s notch is basically a Kinect
by Paul Miller @futurepaul, Sep 17, 2017, 10:00am EDT
https://www.theverge.com/circuitbreaker/2017/9/17/16315510/iphone-x-notch-kinect-apple-primesense-microsoft
And now, in late 2017, Apple is going to sell a phone with a front-facing depth camera. Unlike the original Kinect,
which was built to track motion in a whole living room, the sensor is primarily designed for scanning faces and
powers Apple’s Face ID feature. Apple’s “TrueDepth” camera blasts “more than 30,000 invisible dots” and can
create incredibly detailed scans of a human face. In fact, while Apple’s Animoji feature is impressive,
the developer API behind it is even wilder: Apple generates, in real time, a full animated 3D mesh of your face,
while also approximating your face’s lighting conditions to improve the realism of AR applications.
How Apple’s iPhone X
TrueDepth Camera Works
By David Cardinal on September 14, 2017
Beyond the Camera: Facial Motions and
Changing Features Getting a depth estimate for
portions of a scene is only the beginning of what’s
required for Apple’s implementation of secure facial
recognition and Animojis. For example, a mask could
be used to hack a facial recognition system that relied
solely on the shape of the face. So Apple is using
processing power to learn and recognize 50 different
facial motions that are much harder to forge. They also
provide the basis for making Animoji figures seem to
mimic the phone’s owner.
How Secure is Face ID? Given how willing Apple is
to commit to using Face ID for financial transactions,
I’m sure they have pushed the limits beyond either
simple 3D models or 2D motion. It is likely they are
relying on the phone’s ability to recognize minute facial
movements and feed them into a machine learning
system on the A11 Bionic chip that will add another
layer of security to the system. That piece will also be
key in helping the phone decide whether you’re the
same person when you put on a pair of glasses, a hat,
or grow a beard — all of which Apple claims Face ID
will handle.
Laserscanning
LIDARtechnology
LaserScanning LiDAR(LightDetection AndRanging)
http://dx.doi.org/10.1038/nphoton.2010.148
http://dx.doi.org/10.1080/19479832.2013.811124
3D building modeling
(BIM) using images and
LiDAR: a review
https://techcrunch.com/2017/07/12/nyu-releases-the-largest-lidar-
dataset-ever-to-help-urban-development/
http://ia.cr/2017/613
https://www.theregister.co.uk/2017/06/27/lidar_spoofed_bad_news_for_self_driving_cars/
Velodyne The most in the news due to autonomous driving
http://velodynelidar.com/
https://www.youtube.com/watch?v=8nTFjVm9sTQ https://www.youtube.com/watch?v=nXlqv_k4P8Q
http://spectrum.ieee.org/cars-that-think/transportation/se
nsors/velodyne-announces-a-solidstate-lidar
http://spectrum.ieee.org/cars-that-think/transportati
on/sensors/israeli-stealth-startup-innoviz-promises-1
00-solidstate-automotive-lidar-by-2018
http://spectrum.ieee.org/transportation/advanced-cars/cheap-lidar-the-k
ey-to-making-selfdriving-cars-affordable
RieglA rangeof differentlaserscanners
http://www.riegl.com/products/unmanned-scanning/
RIEGL VZ-400 Indoor Scanned Data
by Jamis Choi, Published on Apr 1, 2010
https://www.youtube.com/watch?v=hOf0hpCn92I
Scanning made simple with RiSOLVE - RIEGL's new 3D Scene Capture Software
Published on Oct 4, 2012 (feat. horrible lounge music)
https://www.youtube.com/watch?v=lbxvzMlTWyg
Rieglsystemin practice
https://doi.org/10.1109/IROS.2016.7759501
Namely, we propose a method for the automatic selection of feature coordinate
locations, and introduce the concept of localized automatic relevance
determination (LARD) to the Hilbert Maps framework, in which different
dimensions in the projected Hilbert space operate within independent length scale
values. The proposed technique was tested against other state-of-the-art 3D
scene reconstruction tools in three different datasets: a simulated indoors
environment, RIEGL laser scans and dense LSD-SLAM pointclouds. The results
testify to the proposed framework’s ability to model complex structures and
correctly interpolate over unobserved areas of the input space while achieving
real-time training and querying performances.
HandheldScanning GeoSLAMZEB-REVO
Handheld Laser Scanning -
ZEB-REVO
The ZEB-REVO is the latest, lightweight
revolving laser scanner from GeoSLAM.
Handheld, pole-mounted or attached to a
mobile platform, the ZEB-REVO can
record more than 40,000 measurement
points per second from the survey
environment.
NEW ZEB-CAM
The new ZEB-CAM is an optional upgrade
for standard ZEB-REVO systems. Simply
attach ZEB-CAM to the underside of a
standard REVO and begin scanning
immediately.
The ZEB-CAM captures live video footage
of the survey environment and adds
contextual video and imagery to scan data
to aid feature identification.
Optical flow technology is utilised to
accurately synchronise the video and scan
together in GeoSLAM's Desktop software.
http://www.3dlasermapping.com/zeb-revo-
handheld-laser-scanning/
https://youtu.be/k8q5xr_eLgk
GeoSlamvs.Leica Portablescanningquality
http://dx.doi.org/10.1117/12.2270761
The paper investigates the performances of two portable
mobile mapping systems (MMSs), the handheld GeoSLAM
ZEB-REVO and Leica Pegasus:Backpack, in two typical
user-case scenarios: an indoor two-floors building and an
outdoor open city square.
Note! This paper would have
been even nicer with a
‘gold standard’ giving the
“correct measurements”
instead of just comparing
two “good enough” scanners.
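Lacking a gold standard, the usual sanity check is a cloud-to-cloud comparison between the two registered scans. A minimal sketch of the nearest-neighbour RMS distance that such comparisons typically report, assuming both clouds are already in the same coordinate frame:

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud_rms(reference, test):
    """RMS of nearest-neighbour distances from every test point to the reference cloud.
    Both clouds are (N, 3) arrays, assumed registered in the same coordinate frame."""
    dists, _ = cKDTree(reference).query(test, k=1)
    return float(np.sqrt(np.mean(dists ** 2)))
```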
ResearchScanners SensorFusion
The Indoor Multi-sensor Acquisition System
(IMAS) presented in this paper consists of a wheeled
platform equipped with two 2D laser heads, RGB
cameras, thermographic camera, thermohygrometer,
and luxmeter. One of the laser scanning sensors is
foreseen to obtain the building map and the navigation
information, and the other one to the 3D environment
reconstruction. The thermographic and optical
images, and the geometric and comfort data are
synchronized and automatically linked to trajectory
positions, so that they are georeferenced in the
building in terms of a relative positioning system.
Software interface for virtual immersive navigation and ex situ data analysis.
http://dx.doi.org/10.3390/s16060785
AppliedPointCloud Scans Accessibility
Point Clouds to Indoor/Outdoor Accessibility
Diagnosis
J. Balado, L. Díaz-Vilariño, P. Arias, I. Garrido
https://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/IV-2-W4/287/2017/isprs-annals-IV-2-
W4-287-2017.pdf
This work presents an approach to automatically detect structural floor elements such as steps or ramps in
the immediate environment of buildings, elements that may affect the accessibility to buildings. The
methodology is based on Mobile Laser Scanner (MLS) point cloud and trajectory information. The
methodology is tested in a real case study, consisting of 100 m of an urban street. Ground elements are
correctly classified in an acceptable computation time. Steps and ramps also are exported to GIS software to
enrich building models from Open Street Map with information about accessible/inaccessible entrances and
their locations.
http://www.wired.co.uk/article/wayfindr-app
A project initiated by the Royal London Society for the
Blind's (RLSB) Youth Forum has led to the prototyping of
a new app called Wayfindr, which has been built especially
to help blind and partially sighted people use London's
transport network independently. The app relies on
smartphones and iBeacons and has been developed in
collaboration with global digital product design studio
ustwo
Our Open Standard gives you
the tools to create inclusive
and consistent experiences for
your vision impaired
customers. From transport
networks and shopping
centres, to hospitals and any
other indoor space - we can
help. Through our on-site trials
and consultancy we will work
together with you to
understand how digital
wayfinding can make your
estate accessible.
https://www.wayfindr.net/
Post-processing
Raw point clouds are massive and possibly contain a lot of
redundant data points
DataQuality compromise between file size, computational time and quality
3D model reconstruction from point cloud processed either with OpenSFM,
VisualSFM or Pix4D (top row) to mesh model (middle row) to final textured 3D
model (bottom row) across a series of downsampled Sky Ranger UAV imagery, including full
resolution (first column), half resolution (second column) and quarter resolution (last
column).
Bolick and Harguess (2016), http://dx.doi.org/10.1117/12.2224677
Garbage in, garbage out holds true as always. The
more high-quality images / points you have as input, the
higher the reconstruction quality will be.
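The standard compromise is to downsample before heavy processing. A minimal voxel-grid downsampling sketch in NumPy (the `voxel_size` value is a placeholder; pick whatever resolution the application tolerates):

```python
import numpy as np

def voxel_downsample(points, voxel_size=0.05):
    """Replace all points falling into the same voxel with their centroid."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel key and average each group
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    sums = np.zeros((len(counts), 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]
```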
Top-left: points sampled on a sphere and corrupted
with a lot of noise. Top-right: reconstructed surface
mesh. Bottom-left: smoothed point set. Bottom-
right: reconstructed surface mesh.
Reconstruction error (mm) against number of points
for the Bimba con Nastrino point set with 1.6M points
as well as for simplified versions.
CGAL 4.10 - Poisson Surface Reconstruction
The sensitivity of biological finite element models to the
resolution of surface geometry: a case study of
crocodilian crania: “Example of the simplified models. C.
moreletti models composed of 20k, 30k, 90k and 300k
surface (mesh) elements.”
https://doi.org/10.7717/peerj.988
point cloud & mesh processing
MAY 27 2017, posted by Taylor Wang
The final goal is to get a fully editable NURBS CAD
model so that it can be modified by any CAD
software to improve the design or reproduce the
product.
PointCloudLibrary (PCL) The most popular open-source library
http://unanancyowen.com/en/pcl-with-velodyne/
https://www.youtube.com/watch?v=7BUFxkyH1r0
https://doi.org/10.1109/MRA.2012.2206675
Cited by 186 articles - see Related articles
Otherlibraries CGALandresearchcode
Driftcorrection forproperimageregistration
https://doi.org/10.1109/ROBOT.2010.5509312
Correcting for drift (distortion) between different
scans or overlapping point clouds with added
velocity information for ICP (Iterative Closest Point)
algorithm.
(a) is a given environment. Blue points in (b) shows distortion of
the scan, and red points in (b) show compensated scan.
Transformation estimated using distorted data includes inevitable
errors (c). Transformation estimated from the rectified scan gives
us more accurate results (d).
Kaarta - Common point cloud registration issues
http://www.kaarta.com/cloud-registration-issues/
Published: 8 March 2017
http://dx.doi.org/10.3390/s17030539
Keywords: LiDAR; inertial measurement unit; iterative closest
point; iterated sigma point Kalman filter; time delay calibration
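For reference, the baseline these drift-corrected variants extend is plain point-to-point ICP: find nearest-neighbour correspondences, solve for the rigid transform, repeat. A minimal sketch (not the velocity-augmented method cited above):

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch/SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:           # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(source, target, iterations=30):
    """Basic point-to-point ICP: returns the source cloud aligned to the target."""
    tree = cKDTree(target)
    current = source.copy()
    for _ in range(iterations):
        _, idx = tree.query(current, k=1)          # nearest-neighbour correspondences
        R, t = best_rigid_transform(current, target[idx])
        current = current @ R.T + t
    return current
```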
DataReduction andsimplificationfor storage
Imran Ashraf ; Soojung Hur ; Yongwan Park
https://doi.org/10.1109/ACCESS.2017.2699686
LIDAR produces a large point cloud, but, while generating
images for a limited field of view, data sparsity results in poor
quality images. Moreover, 3D to 2D data transformation also
involves data reduction, which further deteriorates the
quality of images.
http://dx.doi.org/10.1117/12.2270833
31 October 2016
https://doi.org/10.1109/TIP.2016.2623488
https://www.google.com/patents/US9582939
https://arxiv.org/abs/1609.00893
Keywords: Tensor networks, Function-related tensors, CP decomposition,
Tucker models, tensor train (TT) decompositions, matrix product states (MPS),
matrix product operators (MPO), basic tensor operations, multiway component
analysis, multilinear blind source separation, tensor completion,
linear/multilinear dimensionality reduction, large-scale optimization problems,
symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear
equations, pseudo-inverse of very large matrices, Lasso and Canonical
Correlation Analysis (CCA)
https://doi.org/10.1016/j.isprsjprs.2016.06.012
In-base point cloud management pipeline in the point cloud server (PCS).
DataReduction Compressing Point Clouds
Dynamic polygon cloud compression
Eduardo Pavez ; Philip A. Chou (2017)
https://doi.org/10.1109/ICASSP.2017.7952694
We introduce a compressible representation of 3D
geometry (including its attributes, such as color texture)
intermediate between polygonal meshes and point clouds
called a polygon cloud. Polygon clouds, compared to
polygonal meshes, are more robust to live capture noise
and artifacts. Furthermore, dynamic polygon clouds,
compared to dynamic point clouds, are easier to
compress, if certain challenges are addressed. In this
paper, we propose methods for compressing dynamic
polygon clouds using transform coding of color and
motion residuals.
Real-time compression of point cloud
streams
Julius Kammerl ; Nico Blodow ; Radu Bogdan Rusu ;
Suat Gedikli ; Michael Beetz ; Eckehard Steinbach
(2012)
https://doi.org/10.1109/ICRA.2012.6224647
We present a novel lossy compression approach for point
cloud streams which exploits spatial and temporal
redundancy within the point data. Our proposed compression
framework can handle general point cloud streams of
arbitrary and varying size, point order and point density.
Furthermore, it allows for controlling coding complexity and
coding precision. To compress the point clouds, we perform
a spatial decomposition based on octree data structures.
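The essence of the octree approach is coordinate quantization plus de-duplication of occupied cells; the real codecs then entropy-code the occupancy bits. A lossy, greatly simplified sketch of just the quantization step (not the streaming codec described above):

```python
import numpy as np

def quantize_cloud(points, depth=10):
    """Lossy-compress a point cloud by snapping coordinates to a 2^depth grid
    (the leaf level of an octree) and dropping duplicate occupied cells."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.maximum(hi - lo, 1e-12)                    # avoid division by zero
    cells = (1 << depth) - 1
    codes = np.round((points - lo) / span * cells).astype(np.uint32)
    codes = np.unique(codes, axis=0)                     # one entry per occupied cell
    decoded = codes / cells * span + lo                  # reconstruction for inspection
    return codes, decoded

# Three small integer indices per occupied cell instead of three float64 coordinates
# per raw point; a real codec would additionally entropy-code the octree occupancy.
```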
3D Reconstruction Framework for
Multiple Remote Robots on Cloud
System
Phuong Minh Chu, Seoungjae Cho, Simon Fong, Yong Woon
Park and Kyungeun Cho (2017)
http://dx.doi.org/10.3390/sym9040055
This paper proposes a cloud-based framework that
optimizes the three-dimensional (3D) reconstruction of multiple
types of sensor data captured from multiple remote robots. A
working environment using multiple remote robots requires
massive amounts of data processing in real-time, which cannot
be achieved using a single computer. In the proposed
framework, reconstruction is carried out in cloud-based servers
via distributed data processing.
Data-drivenprocessing
Like in all the fields of computer vision, real-time scanning, post-
processing and semantic understanding are improved with
recent deep learning and artificial intelligence techniques
DeepLearningbeyondnon-euclidean problems
Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, andPierre Vandergheynst
https://doi.org/10.1109/MSP.2017.2693418
https://arxiv.org/abs/1705.10819
DeepLearningPointclouds
https://arxiv.org/abs/1704.03847
https://arxiv.org/abs/1705.03428
DeepLearningPointNet++
PointNet++: Deep Hierarchical Feature Learning on
Point Sets in a Metric Space
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas
Stanford University, (Submitted on 7 Jun 2017)
https://arxiv.org/abs/1706.02413
Illustration of our hierarchical feature learning architecture and its application for set segmentation and classification using points in 2D
Euclidean space as an example. Single scale point grouping is visualized here.
Left: Point cloud with random point
dropout.
Right: Curve showing advantage of
our density adaptive strategy in
dealing with non-uniform density.
DP means random input dropout
during training; otherwise training is
on uniformly dense points
Scannet labeling results. PointNet captures the
overall layout of the room correctly but fails to
discover the furniture. Our approach, in contrast,
is much better at segmenting objects besides
the room layout.
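The first operation in each PointNet++ set-abstraction level is farthest point sampling of the centroids. A minimal NumPy sketch of that sampling step alone (the grouping and per-group PointNets are not shown):

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Iteratively pick the point farthest from the already-selected set,
    giving well-spread centroids for PointNet++-style set abstraction."""
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(points)))]
    dist = np.full(len(points), np.inf)
    for _ in range(n_samples - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[selected[-1]], axis=1))
        selected.append(int(np.argmax(dist)))
    return points[selected]
```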
DeepLearning2DFeatureDescriptors
Instead of using the old-school SIFT, SURF, ORB, etc., feature
description and matching can be done with a data-driven
deep learning network as well
Note This model was trained with SfM data, which does not have strong
rotation changes. Newer models work better in this case, which will be
released soon. In the meantime, you can also use the models in the
learn-orientation, benchmark-orientation.
https://github.com/cvlab-epfl/LIFT
https://arxiv.org/abs/1603.09114 | Cited by 23 Related articles
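Whichever descriptor is used, hand-crafted or learned, the matching step itself is usually nearest-neighbour search plus Lowe's ratio test. A minimal sketch assuming two hypothetical per-keypoint descriptor arrays `desc_a` and `desc_b`:

```python
import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.
    desc_a, desc_b: (N, D) arrays of descriptors (hand-crafted or learned)."""
    dists, idx = cKDTree(desc_b).query(desc_a, k=2)     # two nearest matches per query
    keep = dists[:, 0] < ratio * dists[:, 1]            # keep unambiguous matches only
    return np.column_stack([np.nonzero(keep)[0], idx[keep, 0]])
```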
DeepLearning3DFeatureDescriptors
https://arxiv.org/abs/1706.04496
We present a view-based convolutional network that produces local, point-based shape descriptors.
The network is trained such that geometrically and semantically similar points across different 3D
shapes are embedded close to each other in descriptor space (left). Our produced descriptors are
quite generic — they can be used in a variety of shape analysis applications, including dense
matching, prediction of human affordance regions, partial scan-to-shape matching, and shape
segmentation (right).
In contrast to findings in the image analysis community where learned 2D
descriptors are ubiquitous and general (e.g. LIFT), learned 3D descriptors have
not been as powerful as 2D counterparts because they (1) rely on limited training
data originating from small-scale shape databases, (2) are computed at low spatial
resolutions resulting in loss of detail sensitivity, and (3) are designed to operate on
specific shape classes, such as deformable shapes.
We generate training correspondences
automatically by leveraging highly structured
databases of consistently segmented shapes
with labeled parts. The largest such database
is the segmented ShapeNetCore dataset [
Yi et al. 2016, https://www.shapenet.org/] that
includes 17K man-made shapes distributed in
16 categories
Meshgenerativeshapeswith GAN
https://arxiv.org/abs/1705.02090
Our key insight is that 3D shapes are effectively
characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as
adjacency and symmetry. We develop a recursive neural net
(RvNN) based autoencoder to map a flat, unlabeled, arbitrary
part layout to a compact code. The code effectively captures
hierarchical structures of man-made 3D objects of varying
structural complexities despite being fixed-dimensional: an
associated decoder maps a code back to a full hierarchy. The
learned bidirectional mapping is further tuned using an
adversarial setup to yield a generative model of plausible
structures, from which novel structures can be sampled.
It would be interesting to thoroughly investigate the effect
of code length on structure encoding. Finally, it is worth
exploring recent developments in GANs, e.g. Wasserstein
GAN [Arjovsky et al. 2017], in our problem setting. It would
also be interesting to compare with plain VAE and other
generative adaptations.
PointCloud generativeGANsforpointclouds #1a
https://arxiv.org/abs/1707.02392
We build an end-to-end pipeline for 3D point clouds that uses an autoencoder (AE) to
create a latent representation, and a Generative Adversarial Networks (GAN) to generate
new samples in that latent space. Our AE is designed with a structural loss tailored to
unordered point clouds. Our learned latent space, while compact, has excellent class-
discriminative ability: per our classification results, it outperforms recent GAN-based
representations by 4.3%. In addition, the latent space allows for vector arithmetic, which
we apply in a number of shape editing scenarios, such as interpolation and structural
manipulation.
We argue that jointly learning the representation and training the GAN is unnecessary for
our modality. We propose a workflow that first learns a representation by training an AE
with a compact bottleneck layer, then trains a plain GAN in that fixed latent
representation. One benefit of this approach is that AEs are a mature technology: training
them is much easier and they are compatible with more architectures than GANs. We
point to theory that supports this idea, and verify it empirically: we show that GANs
trained in our learned AE-based latent space generate visibly improved results,
even with a generator and discriminator as shallow as a single hidden layer. Within a
handful of epochs, we generate geometries that are recognized in their right object class at
a rate close to that of ground truth data. Importantly, we report significantly better diversity
measures (10x divergence reduction) over the state of the art, establishing that we cover
more of the original data distribution. In summary, we contribute:
● An effective cross-category AE-based latent representation on point clouds.
● The first (monolithic) GAN architecture operating on 3D point clouds.
● A surprisingly simpler, state-of-the-art GAN working in the AE’s latent space.
1) Autoencoder
For fixed latent representation
Vector arithmetic
2) Generative Adversarial Network
Using the fixed latent representation
In our latent-space GAN, instead of operating on the raw point cloud input, we pass the data through
our pre-trained autoencoder, trained separately for each object class with the Earth Mover’s distance
(EMD) loss function. Both the generator and the discriminator of the GAN then operate on the 512-
dimensional bottleneck variable of the AE. Finally, once the GAN training is over, the output of the
generator is decoded to a point cloud via the AE decoder. We found that very shallow designs for both
the generator and discriminator (in our case, 1 hidden layer for the generator and 2 for the
discriminator) are sufficient to produce realistic results
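A minimal PyTorch sketch of the latent-space GAN idea under the assumptions stated above: a frozen, pre-trained autoencoder supplies 512-dimensional codes, the generator has a single hidden layer and the discriminator two. This only illustrates the architecture sizes described in the paper; it is not the authors' code, and the `encoder`/`decoder` referenced in the comments are assumed to exist.

```python
import torch
import torch.nn as nn

LATENT = 512   # bottleneck width of the (pre-trained, frozen) point cloud autoencoder
NOISE = 128

class Generator(nn.Module):            # one hidden layer, operating on latent codes
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(NOISE, 256), nn.ReLU(),
                                 nn.Linear(256, LATENT))
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):        # two hidden layers on latent codes
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT, 256), nn.LeakyReLU(0.2),
                                 nn.Linear(256, 128), nn.LeakyReLU(0.2),
                                 nn.Linear(128, 1))
    def forward(self, x):
        return self.net(x)

# Training loop outline (vanilla GAN loss):
#   real_codes = encoder(point_clouds)                 # encoder frozen, per object class
#   fake_codes = g(torch.randn(batch, NOISE))
#   ...optimize d on real_codes vs fake_codes.detach(), then g through d(fake_codes)...
# Sampling: new_cloud = decoder(g(torch.randn(1, NOISE)))   # decoder from the same AE
```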
PointCloud generativeGANsforpointclouds #1b
Interpolating between different point clouds, using our latent
space representation. Note the interpolation between
structurally and topologically different shapes.
Generative results using our latent-space GAN. Note the
variability and fidelity of the result.
For a recap on GANs, you could see for example:
https://arxiv.org/abs/1701.07875
Cited by 106 - Related articles
What do GANs for point clouds mean in practice?
Point-cloud super-resolution (e.g. Ledig et al. 2016 for natural images), to improve
model appearance (e.g. remove staircasing), and inpainting (e.g. Iizuka et al. 2017)
to handle occlusion and gaps from indoor scans (“shape completion”). “Visual
plastic surgery” in other words (Tung et al. 2017)
Sung et al. (2015)
Data-driven Structural Priors for Shape Completion
Mönch et al. (2010)
Staircase-Aware Smoothing of Medical Surface Meshes
HardwarePointCloud Super-resolution multiplescans
https://doi.org/10.2312/SPBG/SPBG06/009-015
Cited by 47 articles
On the left, one scan of the parrot
statue, with a sample spacing of
about 1mm. Center, we combine 100
nearly identical such scans to
produce the surface in the center,
produced on a grid with sample
spacing of about 0.3mm. Notice the
noise reduction and the improvement
in the detail, for instance in the face,
neck and wing feathers. On the right,
a photograph of the parrot statue.
Super-resolution reconstruction
using only 30 input scans at the left
and increasing to 140 at the right.
Noise is reduced dramatically at the
beginning but more slowly at the end.
Surfaces were reconstructed from
subsets which were pre-registered
using all 140 scans.
For absolute measurement accuracy (e.g. Biljecki et al. 2017), one can scan the same space multiple times
A thin strip of the super-resolved
surface, and the nearby sample
points from the input scans. The
input is very noisy, but the points are
densely and randomly distributed
near the surface with few outliers, so
the average gives an accurate
representation of the surface.
(a) One scan. (b) Final super-resolved surface from 100 scans. (c) Photo of
the object (a plaster cast of a subway token). The bottom row shows some
results of other kinds of processing, to evaluate the importance of the various
steps of the algorithm. (d) One scan, bilinearly interpolated onto the finer grid
and smoothed. Detail is missing. (e) The entire algorithm except for the final
bilateral filtering step. The noise removed by the filtering seems to be residual
registration error, which perhaps could be improved. (f) Just averaging 100
scans taken without moving the scanner, using the same Gaussian kernel. Noise
is decreased, but there is aliasing from the lower-resolution grid obscuring detail
visible in (b).
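The core of this approach is registering many noisy scans and averaging them onto a finer grid with a small smoothing kernel. A greatly simplified 2.5-D sketch (height-field case, plain per-cell averaging instead of the paper's Gaussian kernel), assuming the scans are already registered:

```python
import numpy as np

def average_scans_on_fine_grid(scans, cell=0.0003):
    """Fuse registered scans of a height field ((x, y, z) points) on a finer grid by
    averaging all samples that fall into each fine cell; a simplified stand-in for
    Gaussian-kernel averaging in super-resolution from repeated scans."""
    pts = np.vstack(scans)                               # (N, 3) combined samples
    lo = pts[:, :2].min(axis=0)
    ij = np.floor((pts[:, :2] - lo) / cell).astype(np.int64)
    shape = tuple(ij.max(axis=0) + 1)
    height = np.zeros(shape)
    count = np.zeros(shape)
    np.add.at(height, (ij[:, 0], ij[:, 1]), pts[:, 2])
    np.add.at(count, (ij[:, 0], ij[:, 1]), 1)
    with np.errstate(invalid="ignore"):
        return height / count                            # NaN where no scan covered a cell
```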
DeepLearningSuper-Resolution
Plenty of options for image/video/volume super-resolution
https://arxiv.org/abs/1706.03142
https://arxiv.org/abs/1704.02738
https://arxiv.org/abs/1704.02470 https://arxiv.org/abs/1612.00085
Novel texture enhancement framework
creates an HR style image that is rich in
details, which can be used to restore
high-frequency texture details back into
the initial HR image via the style transfer
algorithm.
Four examples of SR results for nearest
neighbor and cubic interpolation, the
best-performing sparse coding, 3D-
FSRCNN, and 3D-SRU-Net
configurations. Arrows indicate regions
in which at least one SR result mis-
interprets a cell boundary or an
ultrastructural feature. Scale bar 500
nm.
Our method includes a sub-pixel
motion compensation (SPMC) layer
that can better handle inter-frame
motion for this task. Our detail
fusion (DF) network that can
effectively fuse image details from
multiple images after SPMC
alignment
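The simplest member of this family is an SRCNN-style stack of three convolutions applied to a bicubically upsampled input. A minimal PyTorch sketch of that idea (not any of the specific architectures cited above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySRCNN(nn.Module):
    """Three-layer super-resolution CNN: feature extraction, non-linear mapping,
    reconstruction. The low-res input is upsampled to the target size first."""
    def __init__(self, channels=1):
        super().__init__()
        self.extract = nn.Conv2d(channels, 64, kernel_size=9, padding=4)
        self.map = nn.Conv2d(64, 32, kernel_size=1)
        self.reconstruct = nn.Conv2d(32, channels, kernel_size=5, padding=2)

    def forward(self, low_res, scale=2):
        x = F.interpolate(low_res, scale_factor=scale, mode="bicubic",
                          align_corners=False)
        x = F.relu(self.extract(x))
        x = F.relu(self.map(x))
        return self.reconstruct(x)

# Trained with an L1/L2 loss against high-res targets; the same idea extends to 3D
# volumes (Conv3d) or to video once a motion-compensation step aligns the frames.
```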
Point-cloudsuper-resolution
Upsampling ‘on-the-fly’ to avoid “data explosion”?
Jason Schreier
4/17/17 12:05pm Horizon Zero Dawn, Kotaku
http://kotaku.com/horizon-zero-dawn-uses-all-sorts-
of-clever-tricks-to-lo-1794385026
Games like this don’t just look incredible because of ‘hyper-realism’
but because their engineers use all sorts of tricks [LOD’ing, or Level
of Detail; Mipmapping; frustum culling, etc.] to save memory.
The engine is designed to produce models in CityGML and does so in multiple
LODs. Besides the generation of multiple geometric LODs, we implement the
realisation of multiple levels of spatiosemantic coherence, geometric reference
variants, and indoor representations. The datasets produced by Random3Dcity
are suited for several applications, as we show in this paper with documented
uses. The developed engine is available under an open-source licence at Github
at http://github.com/tudelft3d/Random3Dcity
http://doi.org/10.5194/isprs-annals-IV-4-W1-51-2016
Filip Biljecki, Hugo Ledoux, Jantien Stoter
Level of detail texture filtering with dithering
and mipmaps US 5831624 A
Original Assignee 3Dfx Interactive Inc
https://www.google.com/patents/US5831624
Level-of-detail rendering: colors identify different
subdivision levels as stated in the top left corner.
Feature-Adaptive Rendering of Loop
Subdivision Surfaces on Modern GPUs
November 2014 DOI: 10.1007/s11390-014-1486-x
ManyLoDs: Parallel Many-View
Level-of-Detail Selection for Real-
Time Global Illumination
Matthias Hollander, Tobias Ritschel, Elmar Eisemann, Tamy Boubekeur
(2011) http://dx.doi.org/10.1111/j.1467-8659.2011.01982.x
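LOD'ing itself reduces to picking a coarser representation as the camera moves away, so that the geometric error stays below a screen-space budget. A minimal sketch with made-up camera parameters, assuming pre-simplified levels that double the point spacing at each step:

```python
import numpy as np

def select_lod(distance, base_spacing=0.005, screen_error_px=1.5,
               focal_px=1000.0, n_levels=5):
    """Pick a level of detail so that the point spacing of the chosen level projects
    to roughly `screen_error_px` pixels at the given viewing distance.
    Level 0 is the full-resolution cloud; each level doubles the point spacing."""
    # Spacing that would project to the error budget at this distance
    target_spacing = screen_error_px * distance / focal_px
    level = int(np.clip(np.floor(np.log2(max(target_spacing / base_spacing, 1.0))),
                        0, n_levels - 1))
    return level

# e.g. select_lod(2.0)  -> a fine level for a nearby wall,
#      select_lod(40.0) -> a coarse level for the far end of a large hall
```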
3DContentgeneration VolumetricCapture
Generate content by scanning real-life scenes and objects
Kul Wadhwa's and Roddy O'Hara's Uncorporeal
http://www.uncorporeal.com/
Uncorporeal: volumetric capture systems for VR & AR content
creation. The team includes a technical Oscar-winner and
engineering and product leadership from WETA, Google X, Lucas
ILM, and Wikimedia.
https://venturebeat.com/2016/10/13/pathbreaker-ventures-raises-12-milli
on-to-invest-in-emerging-tech-such-as-vr-ar-and-robotics/
Ryan Gembala, founder of Pathbreaker Ventures
believes connected homes and cars and
autonomous vehicles will create a lot of
opportunities in vertical applications for startups.
And he also thinks that space technologies such as
small satellites, analysis of space-captured data,
consumer transport, space mining, and others are
interesting.
REALITYVIRTUAL.CO - A NEW ZEALAND-BASED
CREATIVE TECHNOLOGIES RESEARCH &
DEVELOPMENT COLLECTIVE WITH AN ENTHUSIASM
TOWARDS THE VISUAL REALM:
● unique post production & signal processing techniques
including the development of deep learning image
enhancement & automation throughout our 3D pipeline
for PBR workflow
● strong emphasis on advanced robotics & autonomous
operations for large data acquisition of 3D
environments.
3D Scene Creation with Photogrammetry
3DContentgeneration Automaticphotorealism#1
Still can be quite labor-intensive to create realistic content
Get to know Rense de Boer, a technical art director from
Sweden, who is not only pushing the envelope of photo-real
CGI environments, but he’s doing it all in a real-time engine!
Art by Rens
https://news.developer.nvidia.com/artist-spotlight-creating-photorealistic-cgi-environments-in-real-time/
https://www.youtube.com/watch?v=bXouFfqSfxg
One Ph.D. position (supervision by Profs Niessner and Rüdiger
Westermann) is available at our chair in the area of photorealistic rendering
for deep learning and online reconstruction
Research in this project includes the development of photorealistic realtime rendering
algorithms that can be used in deep learning applications for scene understanding, and for
high-quality scalable rendering of point scans from depth sensors and RGB stereo image
reconstruction. If you are interested in applying, you should have a strong background in
computer science, i.e., efficient algorithms and data structures, and GPU programming,
have experience implementing C/C++ algorithms, and you should be excited to work on
state-of-the-art research in the 3D computer graphics.
https://wwwcg.in.tum.de/group/joboffers/phd-position-photorealistic-rendering-for-deep-le
arning-and-online-reconstruction.html
Ph.D. Position – Photorealistic Rendering for
Deep Learning and Online Reconstruction
3DContentgeneration Automaticphotorealism#2
Converting LiDAR scans to visually high-quality 3D content
Atom View is a new piece of software that allows content creators to
translate real-world scans into assets for virtual environments. Not only
does it aim to produce realistic results but also reduce the workflow for
content creation. The standalone app takes files captured from
volumetric cameras, offline graphics renderers, 360 lidar and more.
Volumetric capture is a promising area of development that could one day
allow content creators to skip over several of the more laborious steps of
traditional 3D content creation with better results. With Atom View, users can
even edit objects once they’ve been imported.
https://youtu.be/YxRI_3gKP8g
3DContentgeneration Styletransfer formaps
Neural Networks and The Future of 3D Procedural Content Generation
by Sam Snider-Held, Creative Technologist at MediaMonks, focusing on the intersection of AR, VR, AI, UX, and
Style transfer output on the left, real terrain on the right. Both are planes
whose vertices are being displaced by the height map texture.
Now was time to create my own style transfer light field and light field renderer. I
basically reimplemented Andrew Lowndes’ WebGl light field renderer in Unity.
What this post demonstrates is the idea that neural network could
radically change how we generate 3D content. I went with light fields
because currently my GPU is not fast enough to run style transfer or any
other generative network at 60 FPS. But if we do get to that point, it’s
entirely possible to see generative neural networks become an alternative
rendering pipeline to the standard rasterization approach. In this way,
neural networks could generate each frame of a game in real time,
based on realtime feedback from the user.
But it also potentially allows for a much more powerful creative approach, for
the creator and the end user. Imagine playing Gears of War, but then telling the
computer “Keep the gameplay, story, and 3d models, but make it look like
Zelda: Breath of the Wild.” This is how creating or playing a future gaming
experience could be, all because computers now know what things “look like”
and can make other things “look like” them too.
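The displaced planes in the figure above are just a flat vertex grid offset by the height map, which is a few lines of array math. A minimal NumPy sketch, assuming a hypothetical 2D `heightmap` array with values in [0, 1]:

```python
import numpy as np

def displace_plane(heightmap, size=100.0, height_scale=10.0):
    """Turn a 2D height map (values in [0, 1]) into a displaced vertex grid,
    the same operation a vertex shader would perform per frame."""
    h, w = heightmap.shape
    xs = np.linspace(0.0, size, w)
    zs = np.linspace(0.0, size, h)
    x, z = np.meshgrid(xs, zs)
    y = heightmap * height_scale                 # displacement along the up axis
    return np.stack([x, y, z], axis=-1)          # (h, w, 3) vertex positions

# A style-transferred height map (e.g. the network output rendered to a texture)
# can be fed through the same function to get the stylised terrain vertices.
```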
3DContentgeneration from Videoto3D
Production-Level Facial Performance Capture Using Deep
Convolutional Neural Networks In Proceedings of SCA'17, Los Angeles,
CA, USA, July 28-30, 2017
http://research.nvidia.com/publication/facial-performance-capture-deep
-neural-networks
Samuli Laine, Tero Karras, Timo Aila, Antti Herva (Remedy
Entertainment), Shunsuke Saito (Pinscreen, University of Southern
California), Ronald Yu (Pinscreen, University of Southern California), Hao
Li (USC Institute for Creative Technologies, University of Southern
California, Pinscreen), Jaakko Lehtinen (NVIDIA, Aalto University)
NVIDIA and game developer Remedy (Alan Wake, Quantum Break) showcased their
team-up solution to streamlining motion capture and animation using a deep learning
neural network, running on NVIDIA’s powerful DGX-1 server. After being “trained” with
information on previously produced animations, the network is able to generate
sophisticated 3D facial animation from videos of live actors, greatly alleviating the
time and labor burden of traditional mo-cap animation — it can even learn enough to
generate facial animation from just an audio clip. The companies believe this system
could eventually produce animation that’s just as good or better than traditionally
produced fare.
http://www.animationmagazine.net/events/siggraph-facial-animation-advances-fabri
c-engine-the-french-contingent/
“We present a real-time deep learning framework for video-based facial
performance capture -- the dense 3D tracking of an actor's face given a monocular
video. Our pipeline begins with accurately capturing a subject using a high-end
production facial capture pipeline based on multi-view stereo tracking and artist-
enhanced animations.
With 5-10 minutes of captured footage, we train a convolutional neural network to
produce high-quality output, including self-occluded regions, from a monocular
video sequence of that subject. Since this 3D facial performance capture is fully
automated, our system can drastically reduce the amount of labor involved in the
development of modern narrative-driven video games or films involving realistic
digital doubles of actors and potentially hours of animated dialogue per character. “
3DContentgeneration from Video(&Audio) toVideo
Face2Face: Real-time Face Capture and Reenactment of RGB Videos
Justus Thies (1), Michael Zollhöfer (2), Marc Stamminger (1), Christian Theobalt (2), Matthias Nießner (3)
(1) University of Erlangen-Nuremberg, (2) Max Planck Institute for Informatics, (3) Stanford University
http://www.graphics.stanford.edu/~niessner/thies2016face.html
https://doi.org/10.1109/CVPR.2016.262
Neural Face Editing
with Intrinsic Image
Disentangling
Zhixin Shu, Ersin Yumer,
Sunil Hadap, Kalyan Sunkavalli,
Eli Shechtman, Dimitris Samaras
(Submitted on 13 Apr 2017)
https://arxiv.org/abs/1704.04131
University of Washington researchers have developed new
algorithms that solve a thorny challenge in the field of computer
vision: turning audio clips into a realistic, lip-synced video of the
person speaking those words.
As detailed in a paper to be presented Aug. 2 at SIGGRAPH 2017,
the team successfully generated highly-realistic video of former
president Barack Obama talking about terrorism, fatherhood, job
creation and other topics using audio clips of those speeches and
existing weekly video addresses that were originally on a different
topic.
Synthesizing Obama: learning lip sync from audio
Supasorn Suwajanakorn, Steven M. Seitz, Ira Kemelmacher-Shlizerman
ACM Transactions on Graphics (TOG), Volume 36 Issue 4,
July 2017, https://doi.org/10.1145/3072959.3073640
http://www.washington.edu/news/2017/07
/11/lip-syncing-obama-new-tools-turn-a
udio-clips-into-realistic-video/
Weitere ähnliche Inhalte

Was ist angesagt?

Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningPRATHAMESH REGE
 
Volker infra kennissessie relatics informatiemanagement tender - definitiev...
Volker infra kennissessie relatics   informatiemanagement tender - definitiev...Volker infra kennissessie relatics   informatiemanagement tender - definitiev...
Volker infra kennissessie relatics informatiemanagement tender - definitiev...Relatics
 
Image Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionImage Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionPetteriTeikariPhD
 
Qgis raster 3.16
Qgis raster 3.16Qgis raster 3.16
Qgis raster 3.16Jyun Tanaka
 
BeyondCorp - Google Security for Everyone Else
BeyondCorp  - Google Security for Everyone ElseBeyondCorp  - Google Security for Everyone Else
BeyondCorp - Google Security for Everyone ElseIvan Dwyer
 
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNINGAnalytics India Magazine
 
Introduction to mago3D, an Open Source Based Digital Twin Platform
Introduction to mago3D, an Open Source Based Digital Twin PlatformIntroduction to mago3D, an Open Source Based Digital Twin Platform
Introduction to mago3D, an Open Source Based Digital Twin PlatformSANGHEE SHIN
 
Cognitive Digital Twin by Fariz Saračević
Cognitive Digital Twin by Fariz SaračevićCognitive Digital Twin by Fariz Saračević
Cognitive Digital Twin by Fariz SaračevićBosnia Agile
 
4차산업혁명과 드론의 역할
4차산업혁명과 드론의 역할4차산업혁명과 드론의 역할
4차산업혁명과 드론의 역할왕구 강
 
환경영향평가 의사결정지원 시공간 표출기술
환경영향평가 의사결정지원 시공간 표출기술 환경영향평가 의사결정지원 시공간 표출기술
환경영향평가 의사결정지원 시공간 표출기술 SANGHEE SHIN
 
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜SSII
 
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成MobileRoboticsResear
 
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向SSII
 
ORB SLAM Proposal for NTU GPU Programming Course 2016
ORB SLAM Proposal for NTU GPU Programming Course 2016ORB SLAM Proposal for NTU GPU Programming Course 2016
ORB SLAM Proposal for NTU GPU Programming Course 2016Mindos Cheng
 
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説Yusuke Uchida
 

Was ist angesagt? (20)

Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learning
 
Volker infra kennissessie relatics informatiemanagement tender - definitiev...
Volker infra kennissessie relatics   informatiemanagement tender - definitiev...Volker infra kennissessie relatics   informatiemanagement tender - definitiev...
Volker infra kennissessie relatics informatiemanagement tender - definitiev...
 
Image Restoration for 3D Computer Vision
Image Restoration for 3D Computer VisionImage Restoration for 3D Computer Vision
Image Restoration for 3D Computer Vision
 
Qgis raster 3.16
Qgis raster 3.16Qgis raster 3.16
Qgis raster 3.16
 
BeyondCorp - Google Security for Everyone Else
BeyondCorp  - Google Security for Everyone ElseBeyondCorp  - Google Security for Everyone Else
BeyondCorp - Google Security for Everyone Else
 
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING
[Paper Presentation] EMOTIONAL STRESS DETECTION USING DEEP LEARNING
 
Introduction to mago3D, an Open Source Based Digital Twin Platform
Introduction to mago3D, an Open Source Based Digital Twin PlatformIntroduction to mago3D, an Open Source Based Digital Twin Platform
Introduction to mago3D, an Open Source Based Digital Twin Platform
 
Cognitive Digital Twin by Fariz Saračević
Cognitive Digital Twin by Fariz SaračevićCognitive Digital Twin by Fariz Saračević
Cognitive Digital Twin by Fariz Saračević
 
Digital twins
Digital twinsDigital twins
Digital twins
 
PointNet
PointNetPointNet
PointNet
 
4차산업혁명과 드론의 역할
4차산업혁명과 드론의 역할4차산업혁명과 드론의 역할
4차산업혁명과 드론의 역할
 
환경영향평가 의사결정지원 시공간 표출기술
환경영향평가 의사결정지원 시공간 표출기술 환경영향평가 의사결정지원 시공간 표출기술
환경영향평가 의사결정지원 시공간 표출기술
 
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜
SSII2022 [TS2] 自律移動ロボットのためのロボットビジョン〜 オープンソースの自動運転ソフトAutowareを解説 〜
 
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成
大域マッチングコスト最小化とLiDAR-IMUタイトカップリングに基づく三次元地図生成
 
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向
SSII2019企画: 画像および LiDAR を用いた自動走行に関する動向
 
ORB SLAM Proposal for NTU GPU Programming Course 2016
ORB SLAM Proposal for NTU GPU Programming Course 2016ORB SLAM Proposal for NTU GPU Programming Course 2016
ORB SLAM Proposal for NTU GPU Programming Course 2016
 
LiDARとSensor Fusion
LiDARとSensor FusionLiDARとSensor Fusion
LiDARとSensor Fusion
 
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説
"Scale Aware Face Detection"と"Finding Tiny Faces" (CVPR'17) の解説
 
Big data and analytics
Big data and analyticsBig data and analytics
Big data and analytics
 
Computer vision
Computer visionComputer vision
Computer vision
 

Ähnlich wie Emerging 3D Scanning Technologies for PropTech

Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)PetteriTeikariPhD
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...IRJET Journal
 
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHMA ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHMcsandit
 
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object TrackingIntegrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object Trackingijsrd.com
 
CV_sarah_frisken_05.15.2016
CV_sarah_frisken_05.15.2016CV_sarah_frisken_05.15.2016
CV_sarah_frisken_05.15.2016Sarah Frisken
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processingvivatechijri
 
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013Dariolakis
 
Dj31514517
Dj31514517Dj31514517
Dj31514517IJMER
 
Dj31514517
Dj31514517Dj31514517
Dj31514517IJMER
 
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSINGHOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSINGcscpconf
 
slam_research_paper
slam_research_paperslam_research_paper
slam_research_paperVinit Payal
 
Robust techniques for background subtraction in urban
Robust techniques for background subtraction in urbanRobust techniques for background subtraction in urban
Robust techniques for background subtraction in urbantaylor_1313
 
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro..."High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...Edge AI and Vision Alliance
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance VideoIRJET Journal
 
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionSecure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionIJAEMSJORNAL
 
Effective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IEffective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IIJMTST Journal
 

Ähnlich wie Emerging 3D Scanning Technologies for PropTech (20)

Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)
 
Introduction of slam
Introduction of slamIntroduction of slam
Introduction of slam
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
 
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHMA ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
 
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object TrackingIntegrated Hidden Markov Model and Kalman Filter for Online Object Tracking
Integrated Hidden Markov Model and Kalman Filter for Online Object Tracking
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
CV_sarah_frisken_05.15.2016
CV_sarah_frisken_05.15.2016CV_sarah_frisken_05.15.2016
CV_sarah_frisken_05.15.2016
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processing
 
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013
Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013
 
paper
paperpaper
paper
 
Dj31514517
Dj31514517Dj31514517
Dj31514517
 
Dj31514517
Dj31514517Dj31514517
Dj31514517
 
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSINGHOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
 
slam_research_paper
slam_research_paperslam_research_paper
slam_research_paper
 
Robust techniques for background subtraction in urban
Robust techniques for background subtraction in urbanRobust techniques for background subtraction in urban
Robust techniques for background subtraction in urban
 
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro..."High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...
"High-resolution 3D Reconstruction on a Mobile Processor," a Presentation fro...
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance Video
 
X36141145
X36141145X36141145
X36141145
 
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionSecure IoT Systems Monitor Framework using Probabilistic Image Encryption
Secure IoT Systems Monitor Framework using Probabilistic Image Encryption
 
Effective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IEffective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.I
 

Emerging 3D Scanning Technologies for PropTech

  • 7. Structure from Motion: Literature References. https://doi.org/10.1016/j.geomorph.2012.08.021 (cited by 631 articles, see related articles); https://arxiv.org/abs/1701.08493. Structure-from-Motion (SfM) operates under the same basic tenets as stereoscopic photogrammetry, namely that 3-D structure can be resolved from a series of overlapping, offset images. However, it differs fundamentally from conventional photogrammetry in that the geometry of the scene and the camera positions and orientations are solved automatically, without the need to specify a priori a network of targets with known 3-D positions. Instead, these are solved simultaneously using a highly redundant, iterative bundle adjustment procedure, based on a database of features automatically extracted from a set of multiple overlapping images (Snavely et al. 2008). Finally, even though various theoretical works in the literature study fundamental problems in SfM and/or provide rigorous analysis of the stability and robustness of specific methods, the SfM community would still benefit greatly from rigorous results on fundamental problems (e.g., what is the theoretically maximal amount of mismatched features or level of image noise that can be tolerated for stable structure recovery, and can this be achieved efficiently?) and from theoretical analysis of the stability, robustness, and computational efficiency of existing and new methods.
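To make the SfM recipe described above concrete (image registration, then pose estimation and triangulation), here is a minimal two-view sketch using OpenCV and NumPy. The image file names and the intrinsic matrix K are placeholders, and a real pipeline would add many more views plus bundle adjustment.

```python
import cv2
import numpy as np

# Hypothetical inputs: two overlapping photos and the camera intrinsics K.
img1 = cv2.imread("img1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.jpg", cv2.IMREAD_GRAYSCALE)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# 1) Image registration: detect and match ORB features.
orb = cv2.ORB_create(4000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 2) Pose estimation: essential matrix with RANSAC, then relative pose.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

# 3) Triangulate correspondences into a sparse point cloud
#    (up to scale; a full pipeline would refine this with bundle adjustment).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points3d = (pts4d[:3] / pts4d[3]).T
print(points3d.shape, "3D points reconstructed (up to scale)")
```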
  • 8. SLAM: Simultaneous Localization and Mapping. "SLAM, Visual Odometry, Structure from Motion, Multiple View Stereo", Yu Huang, Senior Architect, Autonomous Driving @ Baidu USA, https://www.slideshare.net/yuhuang/visual-slam-structure-from-motion-multiple-view-stereo. Samsung R&D Institute, necessary skills / attributes: ● 5+ years' experience delivering computer vision based products using C++ or Python (Masters or PhD study will be considered). ● Theoretical and practical understanding of multi-view geometry and 3D reconstruction. ● Experience with machine learning techniques within a computer vision context. ● PhD/MS in Computer Vision, Artificial Intelligence or Machine Learning. ● Expertise with Deep Neural Networks using TensorFlow or Keras. SLAM stands for Simultaneous Localization and Mapping, and one way to understand it is to imagine yourself entering an unfamiliar building for the first time. As you move about the building, you don't completely forget where you have already been. Indeed, at any moment you have a pretty good idea where you are within the current map that you have so far constructed in your head, and unless you have a really bad sense of direction, you could probably turn around and get back out of the building without too much trouble. Finding your way around the building is a good example of simultaneously constructing a map and localizing yourself within that map. http://www.pirobot.org/blog/0015/
  • 9. SLAM: Traditional algorithm comparison. http://dx.doi.org/10.1186/s41074-017-0027-2. The framework is mainly composed of three modules: 1) initialization, 2) tracking, and 3) mapping, plus additional modules for stable and accurate vSLAM: relocalization and global map optimization. "From the technical point of view, there is no definitive difference between SLAM and real-time SfM." Even though visual SLAM algorithms have been developed since 2003, vSLAM is still an active research field. Each algorithm has different characteristics; an appropriate algorithm needs to be chosen with the purpose of the application in mind.
  • 10. Visual Odometry. Taketomi et al. (2017): http://dx.doi.org/10.1186/s41074-017-0027-2. "Odometry is to estimate the sequential changes of sensor positions over time using sensors such as a wheel encoder to acquire relative sensor movement. Camera-based odometry, called visual odometry (VO), is also one of the active research fields in the literature [16, 17]. From the technical point of view, vSLAM and VO are highly relevant techniques because both techniques basically estimate sensor positions. According to the survey papers in robotics [18, 19], the relationship between vSLAM and VO can be represented as follows: vSLAM = VO + global map optimization. The relationship between vSLAM and VO can also be found in the papers [20, 21] and the papers [22, 23]. In the papers [20, 22], a technique on VO was first proposed. Then, a technique on vSLAM was proposed by adding the global optimization to VO [21, 23]." Towards stable visual odometry & SLAM solutions for autonomous vehicles: https://www.youtube.com/watch?v=T5Y6OPG-d08. NavStik Hackerspace | Projects at Hackerspace: Visual Odometry using Optic Flow.
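A toy monocular visual odometry loop in the spirit of the definition above (sequential relative pose from tracked features); a minimal sketch assuming OpenCV and NumPy, a list of grayscale frames, and known intrinsics K. The translation scale of a single camera is unobservable, so the trajectory is only defined up to scale.

```python
import cv2
import numpy as np

def monocular_vo(frames, K):
    """Accumulate relative camera poses over a grayscale image sequence.

    frames: iterable of grayscale numpy arrays; K: 3x3 intrinsic matrix.
    Returns a list of 4x4 camera-to-world poses (scale is arbitrary).
    """
    poses = [np.eye(4)]
    prev = None
    prev_pts = None
    for frame in frames:
        if prev is None:
            prev = frame
            prev_pts = cv2.goodFeaturesToTrack(prev, maxCorners=2000,
                                               qualityLevel=0.01, minDistance=7)
            continue
        # Track features with pyramidal Lucas-Kanade optical flow.
        next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, frame, prev_pts, None)
        good_prev = prev_pts[status.ravel() == 1]
        good_next = next_pts[status.ravel() == 1]
        # Relative pose from the essential matrix (translation up to scale).
        E, mask = cv2.findEssentialMat(good_prev, good_next, K,
                                       method=cv2.RANSAC, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, good_prev, good_next, K, mask=mask)
        T = np.eye(4)
        T[:3, :3], T[:3, 3] = R, t.ravel()
        poses.append(poses[-1] @ np.linalg.inv(T))
        # Re-detect features so the tracker stays well populated.
        prev, prev_pts = frame, cv2.goodFeaturesToTrack(frame, 2000, 0.01, 7)
    return poses
```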
  • 11. Software: Open-source VisualSFM. "VisualSFM: A Visual Structure from Motion System", Changchang Wu (cited by 326 articles, see related articles). VisualSFM is a GUI application for 3D reconstruction using structure from motion (SfM). The reconstruction system integrates several of the author's previous projects: SIFT on GPU (SiftGPU), Multicore Bundle Adjustment, and Towards Linear-time Incremental Structure from Motion. VisualSFM runs fast by exploiting multicore parallelism for feature detection, feature matching, and bundle adjustment. "Using VisualSFM and MeshLab as an offline alternative to Autodesk's excellent 123D Catch. I walk you through my workflow for converting multiple images into a 3D model suitable for use in Blender." Tutorial for amateur photographers by Jamie Fuller: https://www.youtube.com/watch?v=V4iBb_j6k_g. "Open Source Photogrammetry with VisualSFM: Ditching 123D Catch", July 12, 2013, by Jesse. "Indoor Navigation from Multiple Images", by Jaan Tollander de Balsch, 2016, Aalto: https://jaantollander.github.io/SCI-C1000/prototype.html. What is the best method for 3D object modelling and reconstruction from photos or videos taken by flying robots or drones? What is the accuracy of such reconstruction methods with regard to the vibrations of the flying drones and the quality and resolution of the camera? Is it possible to improve the results by organizing multiple flights and overlaying/accumulating the data in the point cloud? Is there any free software available?
  • 12. Software: Python Photogrammetry Toolbox (PPT) GUI. Real photo vs. SfM with texture color vs. SfM with simple shader. Made with Python Photogrammetry Toolbox GUI and rendered in Blender with Cycles. http://184.106.205.13/arcteam/ppt.php, https://github.com/archeos/ppt-gui/. "Converting pictures into a 3D mesh with PPT, MeshLab and Blender": http://arc-team-open-research.blogspot.co.uk/2012/09/converting-pictures-into-3d-mesh-with.html. "Blender camera tracking + Python Photogrammetry Toolbox": http://arc-team-open-research.blogspot.co.uk/2012/11/blender-camera-tracking-python.html. The video shows the skull reconstructed in 3D with Python Photogrammetry Toolbox GUI. "Smilodon, the 3D reconstruction of the saber-toothed cat": http://arc-team-open-research.blogspot.co.uk/2013/03/
  • 13. Open-source libraries for SfM. OpenSfM: a Structure from Motion library written in Python on top of OpenCV. The library serves as a processing pipeline for reconstructing camera poses and 3D scenes from multiple images. https://github.com/mapillary/OpenSfM (656 stars). OpenMVG (Multiple View Geometry): "open Multiple View Geometry" is a library for computer-vision scientists, especially targeted at the Multiple View Geometry community. https://github.com/openMVG/openMVG (1,856 stars). https://doi.org/10.1007/978-3-319-56414-2_5, http://imagine.enpc.fr/~marletr/publi/RRPR-2016-Moulon-et-al.pdf. Sung and Lin (2017): "VisualSFM uses the pre-emptive feature matching, the incremental structure from motion and the re-triangulation techniques. The incremental feature matching can greatly speed up the process because this kind of matching will first sort all feature points and match only the first h feature points for each photo." Sung and Lin (2017): "OpenMVG also contains an incremental structure from motion technique. Besides that, they proposed a new iterative sampling method called a contrario Random Sample Consensus (AC-RANSAC) as a substitution for the original RANSAC in order to acquire higher precision and better performance. AC-RANSAC uses the 'a contrario' methodology in order to find a model that best fits the data with a threshold T that adapts automatically to the noise. Hence, it is able to find a model and its associated noise without a fixed threshold."
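For orientation only, a hedged sketch of driving OpenSfM from Python. It assumes the repository's documented bin/opensfm_run_all convenience script and the reconstruction.json output layout; check the current OpenSfM README before relying on either, since the interface may have changed.

```python
import json
import subprocess
from pathlib import Path

dataset = Path("data/my_house_scan")        # hypothetical project folder
assert (dataset / "images").is_dir(), "put your overlapping photos in images/"

# Run the full incremental SfM pipeline (feature extraction, matching,
# reconstruction); opensfm_run_all is the convenience wrapper shipped in bin/.
subprocess.run(["bin/opensfm_run_all", str(dataset)], check=True)

# The reconstruction is written as JSON: a list of models, each with
# per-camera poses ("shots") and a sparse point cloud ("points").
recon = json.loads((dataset / "reconstruction.json").read_text())
print(len(recon[0]["shots"]), "registered cameras,",
      len(recon[0]["points"]), "sparse points")
```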
  • 14. Open-source libraries for SfM + SLAM: OpenChisel. https://github.com/personalrobotics/OpenChisel. An open-source version of the Chisel chunked TSDF library. It contains two packages; open_chisel is an implementation of a generic truncated signed distance field (TSDF) 3D mapping library, based on the Chisel mapping framework developed originally for Google's Project Tango. It is a complete re-write of the original mapping system (which is proprietary). open_chisel is chunked and spatially hashed, inspired by this work from Niessner et al., making it more memory-efficient than fixed-grid mapping approaches and more performant than octree-based approaches. A technical description of how it works can be found in the RSS 2015 paper: http://ri.cmu.edu/pub_files/2015/7/ChiselPaper.pdf
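The core operation behind TSDF mapping libraries such as open_chisel is a per-voxel weighted running average of truncated signed distances along the camera ray. The following generic NumPy sketch (not OpenChisel's actual API) assumes a metric depth image, intrinsics K, and a camera-to-world pose.

```python
import numpy as np

def integrate_tsdf(tsdf, weights, voxel_origin, voxel_size, depth, K, cam_to_world,
                   trunc=0.05):
    """Fuse one depth frame into a dense TSDF volume (weighted running average).

    tsdf, weights: (X, Y, Z) float arrays updated in place.
    voxel_origin: world coordinates of voxel (0, 0, 0); voxel_size in metres.
    depth: (H, W) depth image in metres; K: 3x3 intrinsics; cam_to_world: 4x4 pose.
    """
    X, Y, Z = tsdf.shape
    ii, jj, kk = np.meshgrid(np.arange(X), np.arange(Y), np.arange(Z), indexing="ij")
    world = voxel_origin + voxel_size * np.stack([ii, jj, kk], axis=-1)  # (X,Y,Z,3)

    # Transform voxel centres into the camera frame and project with K.
    world_to_cam = np.linalg.inv(cam_to_world)
    cam = world @ world_to_cam[:3, :3].T + world_to_cam[:3, 3]
    u = np.round(K[0, 0] * cam[..., 0] / cam[..., 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * cam[..., 1] / cam[..., 2] + K[1, 2]).astype(int)

    H, W = depth.shape
    valid = (cam[..., 2] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    d = np.zeros_like(cam[..., 2])
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0

    # Truncated signed distance along the viewing ray, then running average.
    sdf = d - cam[..., 2]
    valid &= sdf >= -trunc
    tsdf_obs = np.clip(sdf / trunc, -1.0, 1.0)
    w_old = weights[valid]
    tsdf[valid] = (tsdf[valid] * w_old + tsdf_obs[valid]) / (w_old + 1.0)
    weights[valid] = w_old + 1.0
```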
  • 15. Research-grade SfM: old-school mono video. http://dx.doi.org/10.1186/s13640-017-0168-3. "Inspired by structure from motion systems, we propose a system that reconstructs sparse feature points to a 3D point cloud using a mono video sequence, so as to achieve higher computational efficiency. The system keeps tracking all detected feature points and calculates both the number of these feature points and their moving distances. We only use the key frames to estimate the current position of the camera, in order to reduce the computation load and the noise interference on the system. Furthermore, to avoid duplicate 3D points, the system reconstructs a 2D point only when the point shifts out of the camera's field of view. In our experiments, we show that our system can be implemented on tablets and can achieve state-of-the-art accuracy with a denser point cloud at high speed."
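The keyframe idea above (track all features, reconstruct only when enough of them have moved far enough) can be expressed as a simple selection rule; the thresholds below are hypothetical and the function is only an illustration, not the authors' exact criterion.

```python
import numpy as np

def is_keyframe(tracked_prev, tracked_curr, min_tracks=150, min_mean_shift_px=20.0):
    """Decide whether the current frame should become a keyframe.

    tracked_prev / tracked_curr: (N, 2) pixel positions of the same features
    in the last keyframe and in the current frame (lost tracks already removed).
    A frame is promoted either when tracking is starving (too few survivors)
    or when the surviving features have, on average, moved far enough to give
    a usable triangulation baseline.
    """
    if len(tracked_curr) < min_tracks:
        return True            # too few tracks left; force a new keyframe
    mean_shift = np.linalg.norm(tracked_curr - tracked_prev, axis=1).mean()
    return mean_shift > min_mean_shift_px
```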
  • 17. Research-grade SfM: deep learning-based #2. https://arxiv.org/abs/1702.01381, 2 May 2017. We evaluated the performance of our proposal on the DTU dataset, comparing it with two traditional feature-based methods, namely SURF (cited by 8683 articles) and ORB (cited by 2739 articles). The system is trained in an end-to-end manner, utilising transfer learning from a large-scale classification dataset. In addition, a variant of the proposed architecture containing a spatial pyramid pooling (SPP) layer is evaluated and shown to further improve the performance. RegNet is able to correct even large decalibrations such as depicted in the top image. The inputs for the deep neural network are an RGB image and a projected depth map. RegNet is able to establish correspondences between the two modalities, which enables it to estimate a 6 DOF extrinsic calibration. Additionally, with an iterative execution of multiple CNNs that are trained on different magnitudes of decalibration, our approach compares favorably to state-of-the-art methods, with a mean calibration error of 0.28° for the rotational and 6 cm for the translation components, even for large decalibrations up to 1.5 m and 20°. https://arxiv.org/abs/1702.02295
  • 18. Research-grade pose/structure: deep learning-based #1. Essentially the same technology is used for stereo matching and depth map generation as for SfM. https://arxiv.org/abs/1703.04309, https://arxiv.org/abs/1704.07813. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performs comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performs favorably compared to established SLAM systems under comparable input settings.
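The unsupervised training signal behind the KITTI work above is view synthesis: warp a source frame into the target view using the predicted depth and relative pose, then penalise the photometric difference. A simplified, non-differentiable NumPy sketch of that reprojection step, with nearest-neighbour sampling and hypothetical inputs:

```python
import numpy as np

def photometric_loss(target, source, depth, K, T_target_to_source):
    """L1 view-synthesis loss between a target frame and a warped source frame.

    target, source: (H, W, 3) images; depth: (H, W) predicted target depth;
    K: 3x3 intrinsics; T_target_to_source: 4x4 relative pose.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    ones = np.ones_like(u)

    # Back-project target pixels to 3D, move them into the source frame.
    pix = np.stack([u, v, ones], axis=-1).reshape(-1, 3).astype(float)
    cam = (np.linalg.inv(K) @ pix.T) * depth.reshape(1, -1)          # (3, HW)
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])
    src_cam = (T_target_to_source @ cam_h)[:3]

    # Project into the source image and sample (nearest neighbour).
    proj = K @ src_cam
    us = np.round(proj[0] / proj[2]).astype(int)
    vs = np.round(proj[1] / proj[2]).astype(int)
    valid = (proj[2] > 0) & (us >= 0) & (us < W) & (vs >= 0) & (vs < H)

    warped = np.zeros_like(target, dtype=float).reshape(-1, 3)
    warped[valid] = source[vs[valid], us[valid]]
    diff = np.abs(warped - target.reshape(-1, 3).astype(float))[valid]
    return diff.mean()
```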
  • 19. Research-grade pose/structure: deep learning-based #2. GANs on everything, so here as well :) The usefulness of VisualSFM / OpenSfM / OpenMVG for defensible startup products? Inversion is often ambiguous, e.g., many compositions of 3D shape and camera pose give rise to the same 2D projection. To address this ambiguity, we impose priors on the predicted latent factors through an adversarial discriminator network trained to discriminate between predicted factors and ground-truth ones. Training adversarial inversion does not require input-output paired annotations, but merely a collection of ground-truth factors, unrelated (unpaired) to the current input. Our model can thus be self-supervised by unlabelled image data, by minimizing a joint reconstruction and adversarial loss, complementing any direct supervision provided by paired annotations. Applying adversarial inversion to super-resolution and inpainting results in automated "visual plastic surgery". Structure-from-motion (SfM) results with and without adversarial priors: the results of the baseline (5th and 8th columns) are obtained from a model with a depth smoothness prior, trained with early stopping at 40K iterations (before divergence).
  • 20. SfM on Mobile Devices. https://arxiv.org/abs/1611.09498, https://doi.org/10.1109/ICCV.2013.15 (cited by 141 articles, see related articles), https://doi.org/10.1016/j.cviu.2016.09.007. After introducing the reconstruction algorithms at the base of our approach, we show how to build applications able to generate 3D floor plans scaled to their real-world metric dimensions and capable of managing scenes not necessarily limited by Manhattan World assumptions. Then, exploiting the resulting structural and visual model, we propose a client-server interactive exploration system implementing a low-DOF navigation interface, specifically developed for touch interaction on smartphones and tablets. https://doi.org/10.1145/2999508.2999526
  • 21. SfM on Mobile Devices: Case Dacuda. Magic Leap, the augmented reality startup that has raised $1.4 billion in funding but has yet to release a product, has made an acquisition to expand its work in computer vision and deep learning, and to build out its operations into Europe. The company has acquired the 3D division of Dacuda, a computer vision startup based out of Zurich. One of Dacuda's focuses had been developing algorithms for consumer-grade cameras (and not just cameras, but any device with a camera function) to capture 2D and 3D imaging in real time, "making 3D content as easy as taking a video." https://techcrunch.com/2017/02/18/confirmed-magic-leap-acquires-3d-division-of-d As you can see, no detail about what the two might be working on. The acquisition was first rumored last week, after Dacuda posted a note on its blog about selling its 3D division and some Dacuda employees updated their LinkedIn profiles as Magic Leap employees (one example here). Tom's Hardware then speculated it could signal Magic Leap using technology developed by Dacuda to enable room-scale, six-degrees-of-freedom tracking (essentially to improve its image-capturing sensors in 3D environments). The ecosystem there is attracting other big-name M&A. Faceshift, a motion capture startup acquired by Apple in 2015, was also founded in Zurich. Facebook's Oculus VR in August 2016 also quietly acquired a startup called Zurich Eye, incubated at the University of Zurich and ETH, the federal institute of technology. Zurich Eye became the basis of Oculus and Facebook's office in the city. Zurich Eye, ironically, was co-founded by three former software engineers from Dacuda (they all now work for Oculus VR). For example, in October the company had linked up with MindMaze, another virtual/augmented reality startup out of Switzerland, to build a platform they were calling "MMI, the world's first multisensory computing platform for mobile-based, immersive and social virtual reality applications," MindMaze noted. MindMaze said it planned to "deploy the technology for users globally to address a void left by Google's Daydream View for positional tracking and multiplayer interactions." We have contacted Magic Leap for comment and will update this post if and when we learn more.
  • 22. Apple ARKit: Technology. https://developer.apple.com/arkit/. Since the iPhone 6, iPhones have used what Apple calls "Focus Pixels", which is its term for phase-detection AF. Fast Company reports that this system will be replaced with laser autofocus, possibly as soon as the next iPhone, which is set to debut this fall. It is likely that Apple would use both AF technologies, as Google does in its Pixel line of phones. The technology would serve a dual purpose, also allowing for better depth perception with the built-in camera for augmented reality apps. ARKit rolls out with iOS 11 this fall, so it would make sense to also include the VCSEL laser system in the phone launching at the same time. https://petapixel.com/2017/07/20/apple-bring-3d-laser-autofocus-iphone-cameras-report-says/, https://www.theverge.com/2017/6/26/15872332/apple-arkit-ios-11-augmented-reality-developer-excitement
  • 23. Apple ARKit: Example Applications. https://twitter.com/madewithARKit. Measuring kitchen dimensions, http://bit.ly/2tJ5KV8, app by @SmartPicture3D. Measure distances with your iPhone: clever little #ARKit app by @BalestraPatrick, http://bit.ly/2sFl8RB. Inter-dimensional iPhone AR portals are closer than they appear, http://bit.ly/2sufO0d, ARKit demo by @nedd. Demo shows how augmented reality will make advertising more immersive: mixed reality producer Bilawal Singh Sidhu shows a peek of what the world of advertising could be with ARKit. #adtech https://mobile-ar.reality.news/news/apple-ar-demo-shows-augmented-reality-will-make-advertising-more-immersive-0178905/
  • 24. Google's response to ARKit: ARCore. DAVID JAGNEUX, UploadVR, SEPTEMBER 2, 2017, 6:00 AM. "Earlier this week, Google announced ARCore, a software-based solution for making more Android devices AR-capable without the need for depth sensors and extra cameras. It will even work on the Google Pixel, Galaxy S8, and several other devices very soon and supports Java, Unity, and Unreal from day one. In short, it's kind of like Google's answer to Apple's ARKit." https://venturebeat.com/2017/09/02/googles-first-arcore-goal-100-million-ar-capable-android-phones/ "Another example, which is especially relevant for developers that build traditional smartphone apps in Java, is that we want to make it easier than ever for people to get into 3D modeling that haven't done it before," Bavor says. "We know there are a lot of people that want to get into 3D development and AR but aren't experts in Maya, or Unity, or anything. So Blocks is an app we built with the intention of enabling people that have never done a 3D model in their life to feel comfortable building 3D assets. We even made it easy to export right from Blocks and pull into ARCore apps you're developing."
  • 25. ARCore: too early to tell how it will do against the "Apple cult". The Verge, Adi Robertson: https://youtu.be/NhJydpMkpug. FusedVR: https://youtu.be/dNXBvDKRg1M. https://venturebeat.com/2017/08/29/google-launches-arcore-sdk-in-preview-ar-on-android-phones-no-extra-hardware-required/, https://youtu.be/ttdPqly4OF8. Super Ventures Blog, Matt Miesnieks (CEO 6D.ai, Partner @Super_Ventures, AR technology & cycling): https://medium.com/super-ventures-blog/how-is-arcore-better-than-arkit-5223e6b3e79d ● Isn't ARCore just Tango-lite? ● The iPhone-8-keynote sized elephant in the room ● So should I build on ARCore now? ● Is ARCore better than ARKit? Scottie Gardonio, Aug 30 (AR/VR enthusiast, creative manager, passionate graphic designer): https://medium.com/iotforall/arcore-vs-arkit-google-counters-apple-33483c08d3da "ARCore vs. ARKit: Google Counters Apple. Let the Dueling Begin." Google announcing inside-out 6-DOF tracking support for Daydream back at Google I/O earlier this year.
  • 26. Deep Learning on Mobile Devices. https://techcrunch.com/2017/05/17/googles-tensorflow-lite-brings-machine-learning-to-android-devices/, http://blog.stratospark.com/creating-a-deep-learning-ios-app-with-keras-and-tensorflow.html ● 3D Face Capture ● 3D Scene Reconstruction ● 2.5D Scene Reconstruction and Computational Photography ● SLAM and Object Tracking ● Augmented Reality ● Google Cardboard SDK for iOS. https://doi.org/10.1109/IPSN.2016.7460664 (cited by 28 articles, see related articles). Thursday 20 July 2017, Movidius USB stick: https://techcrunch.com/2017/07/20/movidius-launches-a-79-deep-learning-usb-stick/. "Snapchat secretly acquires Seene, a computer vision startup that lets ..." https://techcrunch.com/.../snapchat-secretly-acquires-seene-a-computer-vision-startup-... 3 Jun 2016. https://doi.org/10.1109/PDP.2017.98, https://arxiv.org/abs/1705.06224
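TensorFlow Lite, mentioned above, is the usual route for shipping such models on-device; a minimal conversion sketch assuming the TensorFlow 2.x API (the 2017-era tf.contrib.lite interface differed), with a purely illustrative toy model standing in for a real scene-understanding network:

```python
import tensorflow as tf

# A tiny Keras model standing in for a network you might run on a phone
# (the architecture here is purely illustrative).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(128, 128, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert to TensorFlow Lite with default optimizations (e.g. dynamic-range
# quantization) so the model is small and fast enough for mobile inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("scene_model.tflite", "wb") as f:
    f.write(tflite_model)
```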
  • 28. 360° (omnidirectional) imaging: Introduction. The Panoptic Camera platform developed jointly by the Microelectronic Systems Laboratory (LSM) and the Signal Processing Laboratory (LTS2) of EPFL: http://lsm.epfl.ch/page-52820-en.html. Wikipedia: "360-degree videos, also known as immersive videos or spherical videos, are video recordings where a view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras. During playback the viewer has control of the viewing direction like a panorama." Consumer-level camera review: http://thewirecutter.com/reviews/best-360-degree-camera/. By DANIEL CULPAN, Wednesday 12 August 2015: http://www.wired.co.uk/article/9-mind-blowing-360-degree-videos. Scuba Diving Short Film in 360°, Green Island, Taiwan: https://youtu.be/2OzlksZBTiA
  • 29. 360° as part of the "10 Breakthrough Technologies of 2017". https://www.technologyreview.com/s/603496/10-breakthrough-technologies-2017-the-360-degree-selfie/. Seasonal changes to vegetation fascinate Koen Hufkens. So last fall Hufkens, an ecological researcher at Harvard, devised a system to continuously broadcast images from a Massachusetts forest to a website called VirtualForest.io. And because he used a camera that creates 360° pictures, visitors can do more than just watch the feed; they can use their mouse cursor (on a computer) or finger (on a smartphone or tablet) to pan around the image in a circle or scroll up to view the forest canopy and down to see the ground. Journalists from the New York Times and Reuters are using $350 Samsung Gear 360 cameras to produce spherical photos and videos that document anything from hurricane damage in Haiti to a refugee camp in Gaza. One New York Times video that depicts people in Niger fleeing the militant group Boko Haram puts you in the center of a crowd receiving food from aid groups. Or consider the spherical videos of medical procedures that the Los Angeles startup Giblib makes to teach students about surgery. The company films the operations by attaching a $500 360fly 4K camera, which is the size of a baseball, to surgical lights above the patient. The 360° view enables students to see not just the surgeon and surgical site, but also the way the operating room is organized and how the operating room staff interacts. These applications are feasible because of the smartphone boom and innovations in several technologies that combine images from multiple lenses and sensors. For instance, 360° cameras require more horsepower than regular cameras and generate more heat, but that is handled by the energy-efficient chips that power smartphones. Both the 360fly and the $499 ALLie camera use Qualcomm Snapdragon processors similar to those that run Samsung's high-end handsets. Once people discover spherical videos, research suggests, they shift their viewing behavior quickly. The company Humaneyes, which is developing an $800 camera that can produce 3-D spherical images, says people need to watch only about 10 hours of 360° content before they instinctively start trying to interact with all videos. When you see 360° imagery that truly transports you somewhere else, you want it more and more.
  • 30. Low-cost end: Samsung Gear and Galaxy. Samsung Gear 360, ~£250; Samsung Gear VR, ~£100; Samsung Galaxy S6-S8 smartphone, ~£200-£700. http://www.samsung.com/uk/wearables/gear-360-c200/. If you're clamoring to shoot in 360 degrees, the Gear 360 balances simple design with workable image quality, but you really need a Samsung phone (and a Gear VR, and a good hunk of money) to get the most out of it. And, for now, that's fine. This version of the Gear 360 is more likely to be looked back on as a relic anyway, a recognizable but eventually dismissible attempt at a new idea, and the foundation for whatever Samsung does next.
  • 31. Low-cost end #2: Ricoh Theta. "Ricoh's Theta V 4K camera sports 360-degree video and wireless playback", RYAN WINTERHALTER, UploadVR, SEPTEMBER 02, 2017, 07:03 PM, https://venturebeat.com/2017/09/02/ricohs-theta-v-4k-camera-sports-360-degree-video-and-wireless-playback/. Ricoh is unveiling its latest 360-degree camera this morning. Dubbed the Ricoh Theta V, the $430 4K camera is the latest in the line, which launched in 2013 with the Ricoh Theta. Available for pre-order now, and shipping in mid-September, the Theta V features 3,820-by-1,920 resolution video capture. That's a massive improvement on the earlier Theta S, which offered a sub-1080p 1,920-by-960, and the Theta SC, which allowed for 1,920-by-1,080 recording. Perhaps the biggest usability improvement to the Theta V is the inclusion of remote playback. Users can now wirelessly stream their video to an external display directly from the camera. Previous devices in the Theta line (except the developer-only Theta R) required users to export their raw footage to a computer to stitch the image and create a usable video. That's now all done on the device. Videographers can watch their footage on any display, and move the POV by moving the camera itself. The Theta V boosts sound quality as well. Four microphones capture data from their respective dimensions, creating spatial audio that allows users to hear where the sound is coming from within the recording. Ricoh Theta V hands-on, published Aug 31, 2017, Jeff Keller. Based on some quick tests of a non-final Theta V, both stills and videos are noticeably better than those from its predecessor. We're looking forward to getting our hands on a production model in a few weeks and putting it through its paces. For higher quality audio capture, Ricoh is offering the TA-1 3D Microphone ($269). Developed by Audio-Technica, the mic attaches via the tripod mount and uses a standard 3.5mm audio jack.
  • 32. Higher end: GoPro, Nokia Ozo, Facebook Surround, etc. GoPro (NASDAQ: GPRO) recently unveiled the Omni, a six-camera rig for filming interactive spherical videos that can be explored through a smartphone's movements, a user's finger swipes, or a virtual reality headset. The device is the smaller sibling of the 16-camera Odyssey rig ($15,000), which hasn't been launched despite being announced nearly a year ago. Let's take a look at four key things investors should know about the Omni ($3,500), and how they might impact GoPro's future. https://www.fool.com/investing/general/2016/04/14/4-things-investors-need-to-know-about-gopro-incs-o.aspx. What's next for GoPro? GoPro investors don't have many catalysts to look forward to this year. The Omni is too pricey relative to its peers to gain any mainstream traction. The Karma drone, which is due to arrive within the next two months, faces tough competition from market leader DJI Innovations. By the time the Hero 5 cameras arrive near the end of the year, the mainstream market could be saturated with cheap VR and flying cameras. Introducing Facebook Surround 360: an open, high-quality 3D-360 video capture system. Brian K. Cabral, April 12, 2016. ● Facebook has designed and built a durable, high-quality 3D-360 video capture system. ● The system includes a design for camera hardware and the accompanying stitching code, and we will make both available on GitHub this summer. We're open-sourcing the camera and the software to accelerate the growth of the 3D-360 ecosystem: developers can leverage the designs and code, and content creators can use the camera in their productions. ● The system exports 4K, 6K, and 8K video for each eye. The 8K videos double industry standard output and can be played on Gear VR with Facebook's custom Dynamic Streaming technology. https://code.facebook.com/posts/1755691291326688/introducing-facebook-surround-360-an-open-high-quality-3d-360-video-capture-system/. https://www.theverge.com/2016/4/25/11421992/disney-nokia-ozo-camera-virtual-reality-star-wars-marvel. Ever since Nokia announced its 360-degree Ozo virtual reality camera, it has positioned the system as a high-end option for Hollywood filmmakers, and today the company is announcing a partnership with Disney that should help deliver on that promise. As part of the deal, Ozo cameras will be put into the hands of Disney filmmakers and its marketing teams to create 360-degree, virtual reality content across all of the studio's various brands.
  • 33. Lytro Immerge: the world's first professional light field solution for cinematic VR. roadtovr.com/lytros-immerge-360, https://www.lytro.com/immerge. Consequently, to create a virtual reality that even the human eye cannot distinguish from the real world, we must achieve the perfect immersive viewing experience, such that human viewers feel they can walk into the scene. This is known as the virtual walk-in effect, and it requires light-field technology: 3D imaging technology that emerged from the field of computational imaging/photography to capture the light rays that people perceive from different locations and directions. When combined with computer vision and deep learning, light-field technology provides a viable path for producing low-cost, high-quality VR content, positioning this technology to be the most profitable segment of the VR industry.
  • 34. "Depth Lytro": depth sensing with light field techniques. Refocusing in spite of foreground occlusions: (a) scene containing a monkey toy partially occluded by a plant in the foreground; (b) traditional synthetic aperture refocusing on the light field is partially effective in removing the effect of foreground plants; (c) synthetic aperture refocusing of depth displays corruption due to occlusion; (d) a histogram of depth clearly shows two clusters corresponding to plant and monkey; (e) virtual aperture refocusing after removal of plant pixels shows a sharp depth image of the monkey; (f) quantitative comparison of the indicated scan line of the monkey's head for (c) and (e). We use coding techniques from Tadano et al. (2015) to image beyond backscattering nets. Notice how the corrupted depth maps are improved using the codes. We show how digital refocusing can be performed on the images without the scattering occluders by combining depth fields with coded ToF. https://arxiv.org/abs/1509.00816
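Synthetic aperture refocusing, used above to see past foreground occluders, amounts to shifting each sub-aperture view according to its baseline and a chosen focal plane, then averaging; a minimal NumPy sketch with hypothetical inputs and integer shifts only:

```python
import numpy as np

def refocus(views, offsets, disparity):
    """Shift-and-average synthetic aperture refocusing.

    views: list of (H, W, 3) sub-aperture images from a light field or camera array.
    offsets: list of (dx, dy) baselines of each view relative to the reference view,
             in aperture units.
    disparity: pixels of shift per unit baseline; choosing it selects the focal
               plane (objects at that depth align and stay sharp, occluders blur out).
    """
    acc = np.zeros_like(views[0], dtype=float)
    for img, (dx, dy) in zip(views, offsets):
        shift_x = int(round(dx * disparity))
        shift_y = int(round(dy * disparity))
        # np.roll implements the integer shift; a real pipeline would interpolate.
        acc += np.roll(np.roll(img.astype(float), shift_y, axis=0), shift_x, axis=1)
    return (acc / len(views)).astype(views[0].dtype)
```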
  • 35. Post-processing for 360° imaging. https://doi.org/10.1007/s00371-017-1368-7. Overall process: (a) input image; (b) lines detected and classified, red for vertical lines and yellow for horizontal lines; (c) great circles from the classified lines, with green dots marking vanishing points computed from horizontal (yellow) lines; (d) upright adjustment result. We implemented our method using C++ and the OpenCV library on a 64-bit Windows PC with an Intel i7-6700K 4.00 GHz CPU and 32 GB RAM. For an input image of size 5376 × 2688 px, it takes a few hundred milliseconds (less than one second) to obtain the final rotation matrix R for upright adjustment. https://arxiv.org/abs/1703.10798, http://vllab1.ucmerced.edu/~wlai24/360hyperlapse. Pipeline of the proposed algorithm: given a 360° video, we first stabilize the sequence to smooth the relative rotation between adjacent frames. We estimate the focus of expansion (i.e., the direction of forward motion) as prior information for our camera path planning. To extract the regions of interest, we compute the spatial-temporal saliency and semantic segmentation. The detected regions of interest are used to guide the camera path planning. Finally, we use an adaptive 2D video stabilization to render a smooth hyperlapse.
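Once an upright rotation R has been estimated as described above, applying it to an equirectangular panorama is a per-pixel resampling on the sphere. A nearest-neighbour NumPy sketch, assuming R is given (e.g. from the line and vanishing-point step) and the usual longitude/latitude pixel mapping:

```python
import numpy as np

def rotate_equirectangular(pano, R):
    """Resample an equirectangular panorama under a 3x3 rotation matrix R.

    pano: (H, W, 3) image where x maps to longitude [-pi, pi) and
    y maps to latitude [pi/2, -pi/2]. Nearest-neighbour sampling for brevity.
    """
    H, W, _ = pano.shape
    x, y = np.meshgrid(np.arange(W), np.arange(H))
    lon = (x + 0.5) / W * 2 * np.pi - np.pi
    lat = np.pi / 2 - (y + 0.5) / H * np.pi

    # Output pixel -> unit ray; rotate by R^T to find where it came from.
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)       # (H, W, 3)
    src = dirs @ R                                              # applies R^T row-wise
    src_lon = np.arctan2(src[..., 0], src[..., 2])
    src_lat = np.arcsin(np.clip(src[..., 1], -1.0, 1.0))

    sx = ((src_lon + np.pi) / (2 * np.pi) * W).astype(int) % W
    sy = ((np.pi / 2 - src_lat) / np.pi * H).astype(int).clip(0, H - 1)
    return pano[sy, sx]
```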
  • 36. 360° Deep Learning #1. http://dx.doi.org/10.3390/s17061341, https://arxiv.org/abs/1705.01759. Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements. To relieve the viewer from this "360 piloting" task, we propose "deep 360 pilot", a deep learning-based agent for piloting through 360° sports videos automatically. Panel (a) overlaps three panoramic frames sampled from a 360° skateboarding video with two skateboarders. One skateboarder is more active than the other in this example. For each frame, the proposed "deep 360 pilot" selects a view: a viewing angle at which a Natural Field of View (NFoV) (cyan box) is centered. It first extracts candidate objects (yellow boxes), and then selects a main object (green dashed boxes) in order to determine a view (just like a human agent). Panel (b) shows the NFoV from a viewer's perspective.
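Rendering the NFoV crop that a chosen viewing angle defines is a gnomonic (rectilinear) projection from the equirectangular frame; a nearest-neighbour NumPy sketch with hypothetical yaw, pitch, and field-of-view parameters:

```python
import numpy as np

def extract_nfov(pano, yaw, pitch, fov_deg=90.0, out_w=640, out_h=360):
    """Render a perspective (NFoV) view from an equirectangular panorama.

    pano: (H, W, 3) equirectangular image; yaw/pitch in radians select the
    viewing direction; fov_deg is the horizontal field of view.
    """
    H, W, _ = pano.shape
    f = 0.5 * out_w / np.tan(0.5 * np.radians(fov_deg))
    x, y = np.meshgrid(np.arange(out_w) - out_w / 2, np.arange(out_h) - out_h / 2)

    # Rays of the virtual pinhole camera, rotated by yaw (around y) and pitch.
    rays = np.stack([x, -y, np.full_like(x, f, dtype=float)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)],
                   [0, 1, 0],
                   [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(pitch), -np.sin(pitch)],
                   [0, np.sin(pitch), np.cos(pitch)]])
    d = rays @ (Ry @ Rx).T

    # Spherical coordinates of each ray -> source pixel in the panorama.
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    sx = ((lon + np.pi) / (2 * np.pi) * W).astype(int) % W
    sy = ((np.pi / 2 - lat) / np.pi * H).astype(int).clip(0, H - 1)
    return pano[sy, sx]
```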
  • 37. 360° Deep Learning #2. "Flat2Sphere: Learning Spherical Convolution for Fast Features from 360° Imagery", Yu-Chuan Su, Kristen Grauman (submitted on 2 Aug 2017), https://arxiv.org/abs/1708.00919. We propose to learn a spherical convolutional network that translates a planar CNN to process 360° imagery directly in its equirectangular projection. Our approach learns to reproduce the flat filter outputs on 360° data, sensitive to the varying distortion effects across the viewing sphere. The key benefits are 1) efficient feature extraction for 360° images and video, and 2) the ability to leverage powerful pre-trained networks researchers have carefully honed (together with massive labeled image training sets) for perspective images. We validate our approach against several alternative methods in terms of both raw CNN output accuracy and applying a state-of-the-art "flat" object detector to 360° data. Our method yields the most accurate results while saving orders of magnitude in computation versus the existing exact reprojection solution.
  • 38. 360°: the role in PropTech? #1a. Use for real estate agents: still a novelty/gimmick? (from 2014 until 2017). MAY 26, 2014, by James Dearsley: http://www.jamesdearsley.co.uk/is-the-property-industry-interested-in-360-degree-hd-filming/. USES OF 360 DEGREE HD FILMING IN REAL ESTATE: 1. Sales and Marketing. Firstly, from a realtor or estate agent perspective there are several uses here of 360 degree cameras, the first being obvious, that of sales and marketing. It will be simple and efficient to take a quick film of each room, or just walk through the property with these devices to record what you need. 2. Property Management issues. We have also seen interest from companies looking to use these bits of equipment for inventory taking. Seeing as they are of HD quality, it means you can quickly take photographs of properties which can later be looked at in more detail should problems arise in letting disputes. 3. Virtual Reality. With Facebook recently buying Oculus Rift for $2 billion, it is getting less far-fetched. Considering the price of an Oculus is relatively cheap (reckoned to be less than $500/£360 when released next year), it would not be surprising if Facebook are hoping for a lot of people to be purchasing these (Candy Crush Saga in Virtual Reality, anyone?!). It isn't just Facebook though; Sony have a VR headset in production, as does Samsung (it was recently announced), and so this space is going to move quickly. By using these cameras you can put your clients into these homes very quickly and easily, either in the office, if you get a set of these yourself, or, in time, in their own home if Facebook get their way. https://www.forbes.com/sites/forbesagencycouncil/2017/06/28/want-to-use-360-degree-photo-and-video-11-things-to-consider/#22fffa955002 (JUN 28, 2017, by Forbes Agency Council): 1. I would recommend that marketers stay on the sidelines until the industry matures. - Kristopher Jones, LSEO.com. 4. Use a Strategic Approach. The capabilities of 360-degree photo/video have powerful applications in many industries, including real estate, retail and tourism. A 360-degree view has a better chance of selling a house than a static image. - Brock Murray, seoplus+. 7. Prepare for Tomorrow's Consumer Expectations. Today, 360-degree photos and videos are very helpful in industries such as the auto industry or real estate, where visualizing the product is essential. As VR continues to grow, 360-degree photos and videos will likely become a standard. Consumers' expectations will likely adjust to needing to learn more about the overall "360-degree" experience of the restaurant, for example, not just a picture of the dish. - Ahmad Kareh, Twistlab Marketing. 11. Create an Emotional Connection. 360-degree multimedia is a brilliant tool for meaningful storytelling, as it allows the consumer to be transported to the experience you want them to have, bringing the story to life. Companies should take advantage of these tools to transform products into experiences, cultivating an immersive and emotional connection with the brand. - Joey Hodges, Demonstrate PR.
  • 39. 360° The role in PropTech? #1b Use for real estate agents. A four-wheeled tripod outfitted with a computer, 360-degree camera and sensors can roam properties, producing highly choreographed, immersive videos that would be difficult — if not impossible — to replicate with a normal video camera. VirtualAPT (Brooklyn, NYC) offers a residential tour service now at $1/ft² (~$10.8/m²), and for commercial uses a monthly fee per building or $0.50/ft² (~$5.4/m²) for separate units. Generated by technology from companies such as Matterport, 3-D home tours allow users to jump between 360-degree photos — sometimes situated within a 3-D model. ● A rover can shoot 360-degree footage of a home while moving along a pre-plotted route. ● Made by VirtualAPT, the videos can include on-camera presentations from real estate agents. ● They're an alternative to 3-D home tours from companies such as Matterport. https://www.youtube.com/watch?v=JhfQK-tDvGU
  • 40. 360° The role in PropTech? #2a Use for construction and as a tool for constructing 4D/5D/6D BIM (Building Information Model). A construction site manager manually taking photos of the progress: - Time-consuming to walk through and take photos - No full coverage of the site - Might forget some spots - A nice initial 3D BIM is often not properly maintained during construction. + Ideally, have a drone inspecting the whole construction site with an on-board 360° video camera and a LiDAR / laser scanner. + One can go back in time and see which of the subcontractors, for example, are responsible for possible problems. https://doi.org/10.1186/s40327-014-0016-9
  • 41. 360° The role in PropTech? #2b 360° videos, whether or not registered to a 3D BIM model, allow inspection of the progress ("4D BIM") on the construction site also retrospectively, and can possibly reduce legal battles when it is clearer who is to be held responsible in case of discrepancies between as-built and as-planned data. VISUAL ASSET MANAGEMENT The Visual Asset Management (VAM) service digitizes industrial and infrastructure assets using 360-degree images, 3D models, and relative asset information. 3D MODELING We thrive on enabling realistic 3D visualization of projects while preserving the minute details necessary to portray our world. 360 VIDEO 360 video enables viewers to be at the center of any medium, allowing for a unique visual experience and situational awareness from any device. VIRTUAL REALITY OcuTech's virtual reality solutions stimulate creative thinking and enhanced information sharing, allowing for a one-of-a-kind virtual experience. OcuTech from Houston, Texas, USA is already providing this type of service https://ocutech360.com/3d-architectural-visualization-solution/#3dvrvideo
  • 43. 360° into smartphones: how big will it be? https://www.engadget.com/2017/07/10/future-of-smartphone-camera/ 1) Augmented reality 2) Dual-lens cameras 3) Better lenses 4) 4K recording 5) Thermal imaging 6) Optical zoom 7) 360 video "Several smartphone makers, including Samsung and Huawei, have already released add-on 360-degree cameras for their handsets, but this is something that could eventually be integrated into the phones themselves. Immersive 360-degree videos are gradually making their mark, with Facebook among the big firms pushing the technology, while virtual reality companies are gradually introducing more 360-VR content that can be viewed from mobile phones." https://techcrunch.com/2016/08/30/the-future-of-mobile-video-is-virtual-reality/ Are 360 cameras the future? https://youtu.be/i8EUerX90-0 TechAltar So will teens in big numbers ever apply Snapchat bunny ears to immersive 360-degree videos?
  • 44. 360°intosmartphones plentyofoptionscoming#1 Acer’s new Holo 360 degree camera is essentially a smartphone Acer has announced its entry into the VR video market with a device that’s half 360-degree camera, half smartphone. http://www.trustedreviews.com/news/acer-s-new-ho lo-360-degree-camera-is-essentially-a-smartphone -2953609 Paul Monckton CONTRIBUTOR I write about photography and related subjects https://www.forbes.com/sites/paulmonckton/2016/05/31/worlds-first-live-smartphone-vr-camera/#9 fea6921a8b0 Yesterday at this year’s Computex trade show in Taipei, Quanta Computer and ImmerVision jointly announced what is claimed to be the world’s first 360-degree live VR streaming camera for smartphones, with demos starting from today. The, as yet unnamed, camera fits in the palm of the hand and is designed to attach magnetically to any smartphone. It comes with a 360-degree by 187-degree lens and uses a Sony Exmor-HDR imaging sensor to produce 16 megapixel panoramic images. ImmerVision's Panamorph lens makes more efficient use of an image sensor (Image credit: ImmerVision) THIS ADD-ON CAMERA WILL TURN YOUR SMARTPHONE INTO A 360 CAMERAJULY 26, 2017 ION360 U 4K 360-Degree Smartphone Camera is comprised of a 360 camera that goes on top of Essential's 360 Camera Is the World's Smallest 360-Degree Personal Camera for a Smartphone 30 May 2017 http://gadgets.ndtv.com/mobiles/news/essentials-360-camera-is-the-worlds-sm allest-360-degree-personal-camera-for-a-smartphone-1705826 After months of teasing, Android creator Andy Rubin has finally unveiled the Essential Phone that features a near bezel-less display that tries to outdo Samsung's Galaxy S8. Essential's 360 camera, which weighs around 35 grams and is being called the world's smallest 360- degree personal camera by the company, includes a dual 12-megapixel fisheye sensors that can capture 4K 360 video at 30fps. The camera also features 4 microphones to capture sound in 3D. The 360 camera can be bought along with the Essential Phone for an additional $50, or can be bought separately which will cost you $199. @essential, Palo Alto, CA, essential.com
  • 45. 360° into smartphones: plenty of options coming #2 ProTruly's Darling https://www.theverge.com/2017/3/5/14809182/protruly-darling-360-degree-camera-smartphone The cameras found on ProTruly's devices are made by a company called HT Optical. The company said that it is working on a much smaller 360° camera module that will actually fit into a 7.6 mm thick smartphone and will be capable of capturing 16 MP photos and shooting 4K videos. What's even more interesting is that the module will only add an extra 1 mm to the overall thickness of a device. https://www.theverge.com/circuitbreaker/2017/2/22/14698026/huawei-360-degree-camera-honor-vr-smartphones http://360rumors.com/ https://www.vrfocus.com/2017/07/360-degree-video-editing-app-for-smartphones/ V360 - 360 video editor, Avincel Group Inc. 360-Degree Video Editing App For Smartphones: the V360 editing suite is already out for Android, with an iOS version coming soon.
  • 46. 360° into smartphones: convergence with AI players, of course https://www.embedded-vision.com/news/movidius-low-power-vpu-technology-delivers-4k-vr-pixel-processing-performance-motorola%E2%80%99s-newest Movidius' Myriad 2 Vision Processing Unit (VPU) technology, known for its image signal processing and computer vision capabilities with high energy efficiency, was selected by Motorola Mobility to power their newest Moto Mod: the 360 Camera. Moto Mods are unique modular accessories for Motorola smartphones that bring advanced functionality beyond traditional smartphone features. Motorola's newest Moto Mod brings users the ability to live stream 360° videos while preserving battery life. Say Hello to the moto z² Force Edition with moto mods https://www.youtube.com/watch?v=0moMnChM6Ds https://www.wsj.com/articles/intel-to-buy-semiconductor-startup-movidius-1473170441 https://www.altera.com/solutions/industry/automotive/applications/drive-assistance/surround-view-camera.html http://www.nvidia.co.uk/object/drive-px-uk.html
  • 47. 360° Video SfM: an obvious extension is to combine both. Instead of manually rotating your camera, image all angles simultaneously while going through the rooms in an apartment. https://uploadvr.com/adobe-algorithm-6dof-360-cam/ http://variety.com/2017/digital/news/adobe-6dof-vr-video-algorithms-1202394491/ Adobe Motion Parallax demo https://youtu.be/37Z4f6p1HOY https://www.roadtovr.com/adobes-new-research-aims-give-depth-monoscopic-360-video/: Other techniques to achieve 6-DoF VR video usually require light-field cameras like HypeVR's crazy 6K/60 FPS LiDAR rig or Lytro's giant Immerge camera. While these undoubtedly will produce a higher quality 3D effect, they're also custom-built and ungodly expensive. 6-DOF VR videos with a single 360-camera. Jingwei Huang; Zhili Chen; Duygu Ceylan; Hailin Jin, Virtual Reality (VR), 2017 IEEE http://dx.doi.org/10.1109/VR.2017.7892229, 18-22 March 2017 Given a 360-video captured by a single spherical panorama camera, in an offline pre-processing stage, we recover the camera motion and the scene geometry first by performing structure-from-motion (SfM) followed by dense reconstruction. Then, in real time we play back the video in a VR headset where we track the 6-DOF motion of the headset and synthesize new views by a novel warping algorithm.
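As a hedged geometric sketch of what lets standard SfM machinery run on such footage (this is not code from the cited papers): each equirectangular pixel is mapped to a bearing vector on the unit viewing sphere, and those bearings then play the role that normalized image coordinates play in an ordinary epipolar / SfM pipeline. The image size and sample pixels below are arbitrary.

```python
# Sketch: map equirectangular pixel coordinates to unit bearing vectors (rays on the
# viewing sphere). Feature matches expressed as bearings can feed standard two-view
# geometry and bundle adjustment, which is the basis of SfM on 360° video.
import numpy as np

def equirect_to_bearings(u, v, width, height):
    """u, v: arrays of pixel coordinates. Returns (N, 3) unit vectors in the camera frame."""
    lon = (u / width) * 2.0 * np.pi - np.pi        # longitude in [-pi, pi)
    lat = 0.5 * np.pi - (v / height) * np.pi       # latitude  in [-pi/2, pi/2]
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

# e.g. two matched keypoints in a 4096x2048 panorama
b = equirect_to_bearings(np.array([100.0, 3000.0]), np.array([512.0, 1024.0]), 4096, 2048)
print(np.linalg.norm(b, axis=1))                   # both ~1.0
```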
  • 48. 360° Video SfM, Korea Advanced Institute of Science and Technology (KAIST). Spherical panoramic cameras (Ricoh Theta S, Samsung Gear 360 and LG 360). Our sphere sweeping algorithm enables computing all-around dense depth maps, minimizing the loss of spatial resolution. With the estimated all-around image and depth map, we have shown practical utilities by introducing 360° stereoscopic and anaglyph images as VR contents. European Conference on Computer Vision, ECCV 2016: Computer Vision – ECCV 2016, pp. 156-172 https://doi.org/10.1007/978-3-319-46487-9_10 All-Around Depth from Small Motion with a Spherical Panoramic Camera. Sunghoon Im, Hyowon Ha, François Rameau, Hae-Gon Jeon, Gyeongmin Choe, In So Kweon
  • 50. Microsoft Kinect: democratizing structured-light scanning https://arxiv.org/abs/1505.05459 Structured light: a sequence of known patterns is sequentially projected onto an object and gets deformed by the geometric shape of the object. The object is then observed by a camera from a different direction. By analyzing the distortion of the observed pattern, i.e. the disparity from the original projected pattern, depth information can be extracted. The Time-of-Flight (ToF) technology is based on measuring the time that light emitted by an illumination unit requires to travel to an object and back to the sensor array. The Kinect ToF camera applies this CW (continuous-wave) intensity modulation approach. Due to the distance between the camera and the object (sensor and illumination are assumed to be at the same location), and the finite speed of light c, a time shift φ [s] is caused in the optical signal, which is equivalent to a phase shift in the periodic signal. This shift is detected in each sensor pixel by a so-called mixing process. The time shift can be easily transformed into the sensor-object distance, as the light has to travel the distance twice. Cited by 65 articles - see Related articles
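A small numeric sketch of the continuous-wave ToF relation described above. The factor 4π comes from the round trip (the light travels the distance twice); the modulation frequency below is an illustrative assumption, and Kinect-class sensors combine several frequencies to extend the unambiguous range.

```python
# d = c * phase / (4 * pi * f_mod): phase shift of the modulated signal -> distance.
import math

c = 299_792_458.0        # speed of light, m/s
f_mod = 16e6             # assumed modulation frequency, Hz
phase = math.pi / 2      # example measured phase shift, radians

distance = c * phase / (4 * math.pi * f_mod)
ambiguity = c / (2 * f_mod)     # distances repeat every half modulation wavelength
print(f"distance ~ {distance:.3f} m, unambiguous range ~ {ambiguity:.3f} m")
# distance ~ 2.342 m, unambiguous range ~ 9.369 m
```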
  • 51. KinectFusion: scanning with Kinect https://doi.org/10.1145/2047196.2047270 Cited by 1356 articles, see Related articles https://arxiv.org/abs/1704.01047 https://arxiv.org/abs/1612.02859 The semantic cue from the floorplan (i.e., door detection) resolves ambiguities; the figure shows the best placement based on the unary potential with or without the semantic cue. We show qualitative results on ModelNet using the TSDF encoding (Curless and Levoy, 1996) and 4 views. The same TSDF truncation threshold has been used for traditional fusion, our OctNetFusion approach and the ground-truth generation process. While the baseline approach is not able to resolve conflicting TSDF information from different viewpoints, our approach learns to produce a smooth and accurate 3D model from highly noisy input. By learning the structure of real-world 3D objects and scenes, our approach is further able to reconstruct occluded regions and to fill gaps in the reconstruction. We evaluate our approach extensively on both synthetic and real-world datasets for volumetric fusion. Further, we apply our approach to the problem of 3D shape completion from a single view, where our approach achieves state-of-the-art results.
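Below is a minimal, hedged numpy sketch of the TSDF voxel integration that KinectFusion-style pipelines (and the OctNetFusion baseline above) build on: every voxel stores a truncated signed distance and a weight, and each new depth map is folded in with a running weighted average. The camera intrinsics, grid extent, voxel size and truncation distance are illustrative assumptions; a real system adds camera tracking, GPU kernels and raycasting.

```python
# Sketch of weighted TSDF fusion in the spirit of Curless & Levoy / KinectFusion.
import numpy as np

def integrate(tsdf, weight, depth, K, cam_T_world, voxel_size, trunc=0.05):
    nx, ny, nz = tsdf.shape
    ii, jj, kk = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
    pts_w = np.stack([ii, jj, kk], -1).reshape(-1, 3) * voxel_size          # world coords
    pts_c = (cam_T_world[:3, :3] @ pts_w.T + cam_T_world[:3, 3:4]).T        # camera coords
    z = pts_c[:, 2]
    valid = z > 1e-6
    uv = (K @ pts_c.T).T
    u = np.zeros_like(z, dtype=int)
    v = np.zeros_like(z, dtype=int)
    u[valid] = np.round(uv[valid, 0] / z[valid]).astype(int)
    v[valid] = np.round(uv[valid, 1] / z[valid]).astype(int)
    h, w = depth.shape
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d_obs = np.zeros_like(z)
    d_obs[valid] = depth[v[valid], u[valid]]
    valid &= d_obs > 0
    sdf_raw = d_obs - z                      # positive in front of the observed surface
    valid &= sdf_raw > -trunc                # do not carve far behind the surface
    sdf = np.clip(sdf_raw / trunc, -1.0, 1.0)
    t, wgt = tsdf.reshape(-1), weight.reshape(-1)
    t[valid] = (t[valid] * wgt[valid] + sdf[valid]) / (wgt[valid] + 1.0)    # running average
    wgt[valid] += 1.0
    return t.reshape(tsdf.shape), wgt.reshape(weight.shape)

K = np.array([[525.0, 0, 320], [0, 525.0, 240], [0, 0, 1]])                 # assumed intrinsics
tsdf, w = np.ones((64, 64, 64)), np.zeros((64, 64, 64))
depth = np.full((480, 640), 1.5)             # stand-in depth map: flat wall 1.5 m away
tsdf, w = integrate(tsdf, w, depth, K, np.eye(4), voxel_size=0.05)
```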
  • 52. Kinect tweaks: depth resolution improvements with polarization measurement? http://news.mit.edu/2015/object-recognition-robots-0724 https://youtu.be/m6sStUk3UVk http://news.mit.edu/2015/algorithms-boost-3-d-imaging-resolution-1000-times-1201 https://doi.org/10.1007/s11263-017-1025-7 https://doi.org/10.1364/OE.25.001173
  • 53. RangeSensing PlentyofOptions http://3dscanexpert.com/photogrammetry-benchmarks-r emake-vs-photoscan-vs-realitycapture-vs-zephyr/ This post is just an example based on a single photoset from a single object. That makes it zero percent scientific. Also, RealityCapture might have won this Drag Race in terms of both speed with the Fast preset and quality with the Normal preset, but an organic object like this is very favorable to its algorithms. Read my Full RC Review to see that it can’t always handle non-organic objects well. COMMERCIAL SOFTWARE http://3dscanexpert.com/ By Nick Lievendag Entrepreneur at the intersection of Creativity × Technology. Writes, Speaks and Consults about 3D Capture (3D Scanning & Photogrammetry). Founder of 3D Scan Expert.
  • 54. Matterportdominating RealEstatescanning This $4,500 camera turns the real world into the virtual one. Today, Matterport ’s hardware is a hit with real estate agents. But fueled by the $30 million Series C it just raised, Matterport’s software and partnership with Google’s Project Tango could let you wave your phone around to create VR tours of anywhere you want. https://techcrunch.com/2015/06/25/matterport/ https://www.crunchbase.com/organization/matterport#/entity Matterport spawned out of the Xbox Kinect hacker scene in 2010. Founder Matt Bell had been working for a gesture recognition company that relied on a $50,000 camera and expert operators to produce a huge CAD file that could only be accessed through a specialized application. Bell was flabbergasted by the power of the $150 Kinect. He realized the potential for a relatively cheap device with similar technology that could let anyone map out rooms to create 3D models accessible straight from the web. https://youtu.be/HZX8RupfQls
  • 55. Matterport research on semantic indoor segmentation. We collected the data using the Matterport Camera, which combines 3 structured-light sensors to capture 18 RGB and depth images during a 360° rotation at each scan location. The output is the reconstructed 3D textured meshes of the scanned area, the raw RGB-D images, and camera metadata. We used this data as a basis to generate additional RGB-D data and make point clouds by sampling the meshes. We semantically annotated the data directly on the 3D point cloud, rather than images, and then projected the per-point labels on the 3D mesh and the image domains. https://arxiv.org/abs/1702.01105 | Cited by 3 - Related articles https://arxiv.org/abs/1702.07600 https://www.fastcompany.com/3059281/introducing-hover-an-ai-powered-indoor-safe-camera-drone + Indoor scanning with tripod-based Matterport still requires a lot of manual work, and will at some point be replaced by autonomous AI-powered indoor drones for a better user experience.
  • 56. MatterportTechnologypatents Capturing and aligning multiple 3-dimensional sceneswww.google.com/patents/US8879828Grant - Filed Jun 29, 2012 - Issued Nov 4, 2014 - Matthew Bell - Matterport, Inc. Multi-modal method for interacting with 3d models www.google.com/patents/US20130342533App. - Filed Jun 24, 2013 - Published Dec 26, 2013 - Matthew Bell - Matterport, Inc. Identifying and filling holes across multiple aligned three-dimensional scenes www.google.com/patents/US8861840Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc. Building a three-dimensional composite scene www.google.com/patents/US8861841Grant - Filed Oct 14, 2013 - Issued Oct 14, 2014 - Matthew Bell - Matterport, Inc. Processing and/or transmitting 3D data www.google.com/patents/US9396586Grant - Filed Mar 14, 2014 - Issued Jul 19, 2016 - Matthew Tschudy Bell - Matterport, Inc. Semantic understanding of 3d data www.google.com/patents/US20160055268App. - Filed Jun 6, 2014 - Published Feb 25, 2016 - Matthew Tschudy Bell - Matterport, Inc. Selecting two-dimensional imagery data for display within a three-dimensional model www.google.com/patents/EP3120329A1?cl=enApp. - Filed Mar 13, 2015 - Published Jan 25, 2017 - Matthew Tschudy BELL - Matterport, Classifying, separating and displaying individual stories of a three-dimensional model of a multi-story structure based on captured image data of the multi-story structure www.google.com/patents/US20160217225App. - Filed Jan 28, 2016 - Published Jul 28, 2016 - Matthew Tschudy Bell - Matterport, Inc. Semantic understanding of 3d data US 20160055268 A1 ABSTRACT Systems and techniques for processing three- dimensional (3D) data are presented. Captured three- dimensional (3D) data associated with a 3D model of an architectural environment is received and at least a portion of the captured 3D data associated with a flat surface is identified. Furthermore, missing data associated with the portion of the captured 3D data is identified and additional 3D data for the missing data is generated based on other data associated with the portion of the captured 3D data. REFERENCED BY US9576184 Textura Planswift Corporation Detection of a perimeter of a region of interest in a floor plan document US20130328872 Tekla Corporation Computer aided modeling US20150227644 Pictometry International Corp. Method and system for displaying room interiors on a floor plan US20160063722 Textura Planswift Corporation Detection of a perimeter of a region of interest in a floor plan document US20160379405 Jim S Baca Technologies for generating computer models, devices, systems, and methods utilizing the same
  • 57. GoogleTangoTechnology http://www.deccanchronicle.com/technology/gadgets/210717/i s-google-tango-relevant-in-2017.html https://arstechnica.co.uk/gadgets/2016/12/google- tango-phab-2-pro-review/ A Project Tango device ‘sees’ the environment around it through a combination of three core functions. First up is motion tracking, which allows the device to understand its position and orientation using a range of sensors (including accelerometer and gyroscope). Then there’s depth perception, which examines the shape of the world around you. Intel provides a vital cog in this respect with its RealSense 3D camera. With this component on board, a device can gain accurate gesture control and snappy 3D object rendering among other things. Finally, Project Tango incorporates area learning, which means that it maps out and remembers the area around it. Point Cloud Framework for Rendering 3D Models Using Google Tango Maxen Chung, Santa Clara University Julian Callin, Santa Clara University http://scholarcommons.scu.edu/cseng_senior/84 https://doi.org/10.1007/s11227-016-1891-8 Project Tango Tablet Development Kit, recently introduced by Google, Inc. Equipped with the most powerful processor available to date on a consumer-level mobile platform (i.e., NVIDIA Tegra K1 whose 192 programmable CUDA-enabled GPU cores use the same efficient Kepler architecture found in the world’s most powerful supercomputers and workstations) along with several sensors (motion tracking camera, 3D depth sensor, accelerometer, ambient light sensor, barometer, compass, GPS, gyroscope), this mobile device can readily utilize GPU computing making it an ideal platform for developing real-time contextual awareness applications for the visually impaired (VI). Moreover, being compact, lightweight, potentially wearable, relatively discreet and affordable render it aesthetically appealing, socially acceptable and accessible for VI users
  • 58. GoogleTangoExampleApplications#1 We broke the news yesterday that Google was producing a prototype 3D sensing smartphone called Project Tango. We also broke down the capabilities of the vision processor inside the device and talked about what it means for the future of phones. Now, we’ve got an exclusive look in the video below at a real 3D indoor map of a room captured with one of the prototype devices by Matterport. https://techcrunch.com/2014/02/21/heres-an-actual-3d-indoor-map-of-a-room-captured-with-googles-project-tango-phone/ https://matterport.com/mobile-3d-capture/ https://developers.google.com/tango/apis/overview Daydream is Google’s platform for virtual reality. It consists of Daydream-ready phones, Daydream-ready headsets and controllers, and Daydream apps. Daydream View is the first Daydream-ready headset and controller designed and developed by Google. It also comes with a touch-and-motion enabled controller so you can easily interact with VR apps. With the Daydream View, you will be able to explore new worlds through Google Street View and Fantastic Beasts. Kick back in your personal cinema with YouTube, Netflix, Hulu, and HBO. Get in the game with Gunjack 2, LEGO® BrickHeadz, and Need for Speed. That’s just the beginning of the VR possibilities with Daydream. http://www.techphlie.com/ 2017/07/what-is-google-ta ngo-and-daydream.html Google has notably been pushing AR/VR technologies with its latest Android OS. The most prominent introduction however, has been the ASUS ZenFone AR launch that took place at CES, 2017, earlier this year.
  • 59. GoogleTangoExampleApplications#2 Google Tango SDK examples: how to make a floor plan in 50 seconds Alexander Grau Google Tango and Revit Leonardo Manzione https://www.youtube.com/watch?v=A-4cuJ1kOQ4
  • 60. “GoogleTango”withoutdepth sensors I have always believed that bringing 3D to consumers could only work without the need for dedicated depth sensors. This pure-software approach is already being embraced for Augmented Reality with Apple’s upcoming ARKit and Google’s ARCore which was announced last week. Both can give modern smartphones AR-capabilities by just using the regular camera(s), instead of using dedicated sensors like Tango. https://3dscanexpert.com/sony-3d-creator-brings-sensor-less-3d-scanning-consumers/ But yesterday, at IFA Berlin, Sony announced its latest smartphone, the XZ1. Which has all the bells and whistles you expect from a flagship Android phone but also an app called 3D Creator . It basically does exactly what Microsoft showed last year, but is actually available — albeit exclusive for the XZ1. https://www.sonymobile.com/global-en/products/phones/xperia -xz1/3d-creator/
  • 61. Apple depth sensing. The iPhone X's notch is basically a Kinect, by Paul Miller (@futurepaul), Sep 17, 2017, 10:00am EDT https://www.theverge.com/circuitbreaker/2017/9/17/16315510/iphone-x-notch-kinect-apple-primesense-microsoft And now, in late 2017, Apple is going to sell a phone with a front-facing depth camera. Unlike the original Kinect, which was built to track motion in a whole living room, the sensor is primarily designed for scanning faces and powers Apple's Face ID feature. Apple's "TrueDepth" camera blasts "more than 30,000 invisible dots" and can create incredibly detailed scans of a human face. In fact, while Apple's Animoji feature is impressive, the developer API behind it is even wilder: Apple generates, in real time, a full animated 3D mesh of your face, while also approximating your face's lighting conditions to improve the realism of AR applications. How Apple's iPhone X TrueDepth Camera Works, by David Cardinal on September 14, 2017. Beyond the Camera: Facial Motions and Changing Features. Getting a depth estimate for portions of a scene is only the beginning of what's required for Apple's implementation of secure facial recognition and Animojis. For example, a mask could be used to hack a facial recognition system that relied solely on the shape of the face. So Apple is using processing power to learn and recognize 50 different facial motions that are much harder to forge. They also provide the basis for making Animoji figures seem to mimic the phone's owner. How Secure is Face ID? Given how willing Apple is to commit to using Face ID for financial transactions, I'm sure they have pushed the limits beyond either simple 3D models or 2D motion. It is likely they are relying on the phone's ability to recognize minute facial movements and feed them into a machine learning system on the A11 Bionic chip that will add another layer of security to the system. That piece will also be key in helping the phone decide whether you're the same person when you put on a pair of glasses, a hat, or grow a beard — all of which Apple claims Face ID will handle.
  • 63. Laser scanning: LiDAR (Light Detection And Ranging) http://dx.doi.org/10.1038/nphoton.2010.148 http://dx.doi.org/10.1080/19479832.2013.811124 3D building modeling (BIM) using images and LiDAR: a review https://techcrunch.com/2017/07/12/nyu-releases-the-largest-lidar-dataset-ever-to-help-urban-development/ http://ia.cr/2017/613 https://www.theregister.co.uk/2017/06/27/lidar_spoofed_bad_news_for_self_driving_cars/
  • 65. Riegl: a range of different laser scanners http://www.riegl.com/products/unmanned-scanning/ RIEGL VZ-400 Indoor Scanned Data by Jamis Choi, published on Apr 1, 2010 https://www.youtube.com/watch?v=hOf0hpCn92I Scanning made simple with RiSOLVE - RIEGL's new 3D Scene Capture Software, published on Oct 4, 2012 (feat. horrible lounge music) https://www.youtube.com/watch?v=lbxvzMlTWyg
  • 66. Rieglsystemin practice https://doi.org/10.1109/IROS.2016.7759501 Namely, we propose a method for the automatic selection of feature coordinate locations, and introduce the concept of localized automatic relevance determination (LARD) to the Hilbert Maps framework, in which different dimensions in the projected Hilbert space operate within independent length scale values. The proposed technique was tested against other state-of-the-art 3D scene reconstruction tools in three different datasets: a simulated indoors environment, RIEGL laser scans and dense LSD-SLAM pointclouds. The results testify to the proposed framework’s ability to model complex structures and correctly interpolate over unobserved areas of the input space while achieving real-time training and querying performances.
  • 67. HandheldScanning GeoSLAMZEB-REVO Handheld Laser Scanning - ZEB-REVO The ZEB-REVO is the latest, lightweight revolving laser scanner from GeoSLAM. Handheld, pole-mounted or attached to a mobile platform, the ZEB-REVO can record more than 40,000 measurement points per second from the survey environment. NEW ZEB-CAM The new ZEB-CAM is an optional upgrade for standard ZEB-REVO systems. Simply attach ZEB-CAM to the underside of a standard REVO and begin scanning immediately. The ZEB-CAM captures live video footage of the survey environment and adds contextual video and imagery to scan data to aid feature identification. Optical flow technology is utilised to accurately synchronise the video and scan together in GeoSLAM's Desktop software. http://www.3dlasermapping.com/zeb-revo- handheld-laser-scanning/ https://youtu.be/k8q5xr_eLgk
  • 68. GeoSLAM vs. Leica: portable scanning quality http://dx.doi.org/10.1117/12.2270761 The paper investigates the performance of two portable mobile mapping systems (MMSs), the handheld GeoSLAM ZEB-REVO and the Leica Pegasus:Backpack, in two typical use-case scenarios: an indoor two-floor building and an outdoor open city square. Note! This paper would have been even nicer with a 'gold standard' giving the "correct measurements" instead of just comparing two "good enough" scanners.
  • 69. Research scanners: sensor fusion. The Indoor Multi-sensor Acquisition System (IMAS) presented in this paper consists of a wheeled platform equipped with two 2D laser heads, RGB cameras, a thermographic camera, a thermohygrometer, and a luxmeter. One of the laser scanning sensors is foreseen to obtain the building map and the navigation information, and the other one for the 3D environment reconstruction. The thermographic and optical images, and the geometric and comfort data, are synchronized and automatically linked to trajectory positions, so that they are georeferenced in the building in terms of a relative positioning system. Software interface for virtual immersive navigation and ex situ data analysis. http://dx.doi.org/10.3390/s16060785
  • 70. AppliedPointCloud Scans Accessibility Point Clouds to Indoor/Outdoor Accessibility Diagnosis J. Balado, L. Díaz-Vilariño, P. Arias, I. Garrido https://www.isprs-ann-photogramm-remote-sens-spatial-inf-sci.net/IV-2-W4/287/2017/isprs-annals-IV-2- W4-287-2017.pdf This work presents an approach to automatically detect structural floor elements such as steps or ramps in the immediate environment of buildings, elements that may affect the accessibility to buildings. The methodology is based on Mobile Laser Scanner (MLS) point cloud and trajectory information. The methodology is tested in a real case study, consisting of 100 m of an urban street. Ground elements are correctly classified in an acceptable computation time. Steps and ramps also are exported to GIS software to enrich building models from Open Street Map with information about accessible/inaccessible entrances and their locations. http://www.wired.co.uk/article/wayfindr-app A project initiated by the Royal London Society for the Blind's (RLSB) Youth Forum has led to the prototyping of a new app called Wayfindr, which has been built especially to help blind and partially sighted people use London's transport network independently. The app relies on smartphones and iBeacons and has been developed in collaboration with global digital product design studio ustwo Our Open Standard gives you the tools to create inclusive and consistent experiences for your vision impaired customers. From transport networks and shopping centres, to hospitals and any other indoor space - we can help. Through our on-site trials and consultancy we will work together with you to understand how digital wayfinding can make your estate accessible. https://www.wayfindr.net/
  • 72. Data quality: a compromise between file size, computational time and quality. 3D model reconstruction from point cloud processed either with OpenSfM, VisualSFM or Pix4D (top row) to mesh model (middle row) to final textured 3D model (bottom row) across a series of downsampled Sky Ranger UAV imagery, including full resolution (first column), half resolution (second column) and quarter resolution (last column). Bolick and Harguess (2016), http://dx.doi.org/10.1117/12.2224677 Garbage in, garbage out holds true as always: the more high-quality images / points you have as input, the higher the reconstruction quality will obviously be. Top-left: points sampled on a sphere and corrupted with a lot of noise. Top-right: reconstructed surface mesh. Bottom-left: smoothed point set. Bottom-right: reconstructed surface mesh. Reconstruction error (mm) against number of points for the Bimba con Nastrino point set with 1.6M points as well as for simplified versions. CGAL 4.10 - Poisson Surface Reconstruction. The sensitivity of biological finite element models to the resolution of surface geometry: a case study of crocodilian crania: "Example of the simplified models. C. moreletti models composed of 20k, 30k, 90k and 300k surface (mesh) elements." https://doi.org/10.7717/peerj.988 Point cloud & mesh processing, MAY 27 2017, posted by Taylor Wang: the final goal is to get a fully editable NURBS CAD model so that it can be modified by any CAD software to improve the design or reproduce the product.
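To make the size/quality trade-off above concrete, here is a hedged numpy-only sketch of voxel-grid downsampling, the same idea behind the decimation filters in common point-cloud tools (PCL, Open3D, CGAL pipelines): the chosen voxel size directly trades point count, and hence file size and processing time, against geometric detail. The random cloud is only a stand-in for a real scan, where points concentrate on surfaces and the reduction is typically much stronger.

```python
# Voxel-grid downsampling: keep one centroid per occupied voxel.
import numpy as np

def voxel_downsample(points, voxel_size):
    """points: (N, 3) array; returns the centroid of the points in each occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)         # accumulate point coordinates per voxel
    return sums / counts[:, None]

cloud = np.random.rand(1_000_000, 3) * 10.0  # stand-in for a 10 m room scan, in metres
for vs in (0.01, 0.05, 0.20):
    print(f"voxel {vs:.2f} m -> {len(voxel_downsample(cloud, vs)):,} points")
```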
  • 73. Point Cloud Library (PCL): the most popular open-source library http://unanancyowen.com/en/pcl-with-velodyne/ https://www.youtube.com/watch?v=7BUFxkyH1r0 https://doi.org/10.1109/MRA.2012.2206675 Cited by 186 articles - see Related articles
  • 75. Drift correction for proper scan registration https://doi.org/10.1109/ROBOT.2010.5509312 Correcting for drift (distortion) between different scans or overlapping point clouds with added velocity information for the ICP (Iterative Closest Point) algorithm. (a) is a given environment. Blue points in (b) show the distortion of the scan, and red points in (b) show the compensated scan. A transformation estimated using distorted data includes inevitable errors (c). A transformation estimated from the rectified scan gives us more accurate results (d). Kaarta - Common point cloud registration issues http://www.kaarta.com/cloud-registration-issues/ Published: 8 March 2017 http://dx.doi.org/10.3390/s17030539 Keywords: LiDAR; inertial measurement unit; iterative closest point; iterated sigma point Kalman filter; time delay calibration
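For reference, a minimal point-to-point ICP sketch in numpy/scipy showing the registration step this slide is about; the velocity-based motion compensation of the cited work would rectify each scan before this step and is not shown, and this is an illustrative baseline rather than the cited authors' code.

```python
# Point-to-point ICP: alternate nearest-neighbour correspondences with a closed-form
# rigid fit (Kabsch / SVD) until the alignment settles.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares R, t aligning src to dst (row-wise corresponding points)."""
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                 # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    tree = cKDTree(dst)
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)             # nearest-neighbour correspondences
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# toy check: register a scan against a slightly rotated and shifted copy of itself
rng = np.random.default_rng(0)
scan_a = rng.random((2000, 3))
angle = np.deg2rad(5)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
scan_b = scan_a @ R_true.T + np.array([0.05, 0.00, 0.02])
R_est, t_est = icp(scan_a, scan_b)           # should approach R_true and the shift
```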
  • 76. DataReduction andsimplificationfor storage Imran Ashraf ; Soojung Hur ; Yongwan Park https://doi.org/10.1109/ACCESS.2017.2699686 LIDAR produces large point cloud, but, while generating images for limited field of view, data sparsity results in poor quality images. Moreover, 3D to 2D data transformation also involves data reduction, which further deteriorates the quality of images. http://dx.doi.org/10.1117/12.2270833 31 October 2016 https://doi.org/10.1109/TIP.2016.2623488 https://www.google.com/patents/US9582939 https://arxiv.org/abs/1609.00893 Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical Correlation Analysis (CCA) https://doi.org/10.1016/j.isprsjprs.2016.06.012 In-base point cloud management pipeline in the point cloud server (PCS).
  • 77. Data reduction: compressing point clouds. Dynamic polygon cloud compression, Eduardo Pavez; Philip A. Chou (2017) https://doi.org/10.1109/ICASSP.2017.7952694 We introduce a compressible representation of 3D geometry (including its attributes, such as color texture) intermediate between polygonal meshes and point clouds, called a polygon cloud. Polygon clouds, compared to polygonal meshes, are more robust to live capture noise and artifacts. Furthermore, dynamic polygon clouds, compared to dynamic point clouds, are easier to compress, if certain challenges are addressed. In this paper, we propose methods for compressing dynamic polygon clouds using transform coding of color and motion residuals. Real-time compression of point cloud streams, Julius Kammerl; Nico Blodow; Radu Bogdan Rusu; Suat Gedikli; Michael Beetz; Eckehard Steinbach (2012) https://doi.org/10.1109/ICRA.2012.6224647 We present a novel lossy compression approach for point cloud streams which exploits spatial and temporal redundancy within the point data. Our proposed compression framework can handle general point cloud streams of arbitrary and varying size, point order and point density. Furthermore, it allows for controlling coding complexity and coding precision. To compress the point clouds, we perform a spatial decomposition based on octree data structures. 3D Reconstruction Framework for Multiple Remote Robots on Cloud System, Phuong Minh Chu, Seoungjae Cho, Simon Fong, Yong Woon Park and Kyungeun Cho (2017) http://dx.doi.org/10.3390/sym9040055 This paper proposes a cloud-based framework that optimizes the three-dimensional (3D) reconstruction of multiple types of sensor data captured from multiple remote robots. A working environment using multiple remote robots requires massive amounts of data processing in real-time, which cannot be achieved using a single computer. In the proposed framework, reconstruction is carried out in cloud-based servers via distributed data processing.
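A hedged sketch of the octree-based spatial decomposition mentioned in the Kammerl et al. abstract: only an occupancy byte per non-empty node is stored, so empty space costs nothing and precision is set by the maximum depth. Real codecs add entropy coding, attribute coding and temporal (double-buffered) prediction; the depth and point count here are illustrative.

```python
# Octree occupancy coding: recursively split occupied cells into 8 children and emit
# one occupancy byte (8 child bits) per internal node.
import numpy as np

def encode_octree(points, center, half, depth, out):
    if depth == 0 or len(points) == 0:
        return
    mask_bits = 0
    children = []
    for child in range(8):
        offset = np.array([(child >> i) & 1 for i in range(3)]) * 2 - 1
        c_center = center + offset * half / 2.0
        inside = np.all(np.abs(points - c_center) <= half / 2.0, axis=1)
        if inside.any():
            mask_bits |= 1 << child
            children.append((points[inside], c_center))
    out.append(mask_bits)                      # one occupancy byte per non-empty node
    for pts, c_center in children:
        encode_octree(pts, c_center, half / 2.0, depth - 1, out)

cloud = np.random.rand(50_000, 3)              # stand-in scan inside a unit cube
stream = []
encode_octree(cloud, center=np.array([0.5] * 3), half=0.5, depth=6, out=stream)
blob = bytes(stream)                           # depth 6 corresponds to a 64^3 leaf grid
print(f"raw float32: {cloud.astype(np.float32).nbytes} B, occupancy stream: {len(blob)} B")
```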
  • 79. Deep learning beyond Euclidean data (non-Euclidean problems). Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst https://doi.org/10.1109/MSP.2017.2693418 https://arxiv.org/abs/1705.10819
  • 81. Deep learning: PointNet++. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas, Stanford University (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02413 Illustration of our hierarchical feature learning architecture and its application to set segmentation and classification, using points in 2D Euclidean space as an example. Single-scale point grouping is visualized here. Left: point cloud with random point dropout. Right: curve showing the advantage of our density-adaptive strategy in dealing with non-uniform density. DP means random input dropout during training; otherwise training is on uniformly dense points. ScanNet labeling results: PointNet captures the overall layout of the room correctly but fails to discover the furniture. Our approach, in contrast, is much better at segmenting objects besides the room layout.
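As a reference point, here is a hedged PyTorch sketch of the PointNet-style building block that PointNet++ applies hierarchically: a shared per-point MLP followed by a symmetric max-pool, which makes the descriptor invariant to the ordering of the unordered point set. The farthest-point sampling and multi-scale grouping layers that make PointNet++ hierarchical and density-adaptive are omitted, and the layer sizes are illustrative.

```python
# Shared per-point MLP + order-invariant max-pool (the core PointNet set function).
import torch
import torch.nn as nn

class SharedMLPMaxPool(nn.Module):
    def __init__(self, in_dim=3, dims=(64, 128, 1024)):
        super().__init__()
        layers, prev = [], in_dim
        for d in dims:
            layers += [nn.Conv1d(prev, d, 1), nn.BatchNorm1d(d), nn.ReLU()]
            prev = d
        self.mlp = nn.Sequential(*layers)

    def forward(self, pts):                    # pts: (B, N, 3), an unordered point set
        feat = self.mlp(pts.transpose(1, 2))   # 1x1 convs = the same MLP applied to every point
        return feat.max(dim=2).values          # symmetric pooling -> (B, 1024) descriptor

model = SharedMLPMaxPool().eval()
x = torch.rand(2, 2048, 3)
perm = torch.randperm(2048)
assert torch.allclose(model(x), model(x[:, perm]), atol=1e-5)   # point order does not matter
print(model(x).shape)                          # torch.Size([2, 1024])
```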
  • 82. Deep learning: 2D feature descriptors. Instead of using the old-school SIFT, SURF, ORB, etc., feature description / matching can be done with a data-driven deep learning network as well. Note: this model was trained with SfM data, which does not have strong rotation changes. Newer models work better in this case and will be released soon. In the meantime, you can also use the models in the learn-orientation and benchmark-orientation repositories. https://github.com/cvlab-epfl/LIFT https://arxiv.org/abs/1603.09114 | Cited by 23 - Related articles
  • 83. DeepLearning3DFeatureDescriptors https://arxiv.org/abs/1706.04496 We present a view-based convolutional network that produces local, point-based shape descriptors. The network is trained such that geometrically and semantically similar points across different 3D shapes are embedded close to each other in descriptor space (left). Our produced descriptors are quite generic — they can be used in a variety of shape analysis applications, including dense matching, prediction of human affordance regions, partial scan-to-shape matching, and shape segmentation (right). In contrast to findings in the image analysis community where learned 2D descriptors are ubiquitous and general (e.g. LIFT), learned 3D descriptors have not been as powerful as 2D counterparts because they (1) rely on limited training data originating from small-scale shape databases, (2) are computed at low spatial resolutions resulting in loss of detail sensitivity, and (3) are designed to operate on specific shape classes, such as deformable shapes. We generate training correspondences automatically by leveraging highly structured databases of consistently segmented shapes with labeled parts. The largest such database is the segmented ShapeNetCore dataset [ Yi et al. 2016, https://www.shapenet.org/] that includes 17K man-made shapes distributed in 16 categories
  • 84. Meshgenerativeshapeswith GAN https://arxiv.org/abs/1705.02090 Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which reflects fundamental intra-shape relationships such as adjacency and symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a flat, unlabeled, arbitrary part layout to a compact code. The code effectively captures hierarchical structures of man-made 3D objects of varying structural complexities despite being fixed-dimensional: an associated decoder maps a code back to a full hierarchy. The learned bidirectional mapping is further tuned using an adversarial setup to yield a generative model of plausible structures, from which novel structures can be sampled. It would be interesting to thoroughly investigate the effect of code length on structure encoding. Finally, it is worth exploring recent developments in GANs, e.g. Wasserstein GAN [Arjovsky et al. 2017], in our problem setting. It would also be interesting to compare with plain VAE and other generative adaptations.
  • 85. Point clouds: generative GANs for point clouds #1a https://arxiv.org/abs/1707.02392 We build an end-to-end pipeline for 3D point clouds that uses an autoencoder (AE) to create a latent representation, and a Generative Adversarial Network (GAN) to generate new samples in that latent space. Our AE is designed with a structural loss tailored to unordered point clouds. Our learned latent space, while compact, has excellent class-discriminative ability: per our classification results, it outperforms recent GAN-based representations by 4.3%. In addition, the latent space allows for vector arithmetic, which we apply in a number of shape editing scenarios, such as interpolation and structural manipulation. We argue that jointly learning the representation and training the GAN is unnecessary for our modality. We propose a workflow that first learns a representation by training an AE with a compact bottleneck layer, then trains a plain GAN in that fixed latent representation. One benefit of this approach is that AEs are a mature technology: training them is much easier and they are compatible with more architectures than GANs. We point to theory that supports this idea, and verify it empirically: we show that GANs trained in our learned AE-based latent space generate visibly improved results, even with a generator and discriminator as shallow as a single hidden layer. Within a handful of epochs, we generate geometries that are recognized in their right object class at a rate close to that of ground truth data. Importantly, we report significantly better diversity measures (10x divergence reduction) over the state of the art, establishing that we cover more of the original data distribution. In summary, we contribute: ● An effective cross-category AE-based latent representation on point clouds. ● The first (monolithic) GAN architecture operating on 3D point clouds. ● A surprisingly simpler, state-of-the-art GAN working in the AE's latent space. 1) Autoencoder for a fixed latent representation, with vector arithmetic 2) Generative Adversarial Network using the fixed latent representation. In our latent-space GAN, instead of operating on the raw point cloud input, we pass the data through our pre-trained autoencoder, trained separately for each object class with the Earth Mover's distance (EMD) loss function. Both the generator and the discriminator of the GAN then operate on the 512-dimensional bottleneck variable of the AE. Finally, once the GAN training is over, the output of the generator is decoded to a point cloud via the AE decoder. We found that very shallow designs for both the generator and discriminator (in our case, 1 hidden layer for the generator and 2 for the discriminator) are sufficient to produce realistic results.
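A hedged sketch of the two-stage latent-space GAN workflow summarized above (not the authors' code): the autoencoder is assumed to be trained and frozen beforehand, so the GAN only has to model the 512-dimensional bottleneck codes, which is why very shallow generators and discriminators are enough. The `real_codes` tensor below is a stand-in for actual encoder outputs, and `frozen_ae_decoder` is a hypothetical name for the pre-trained decoder.

```python
# Shallow GAN trained in a fixed 512-d autoencoder latent space (one training step shown).
import torch
import torch.nn as nn

latent_dim, noise_dim, batch = 512, 128, 64
G = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
D = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

real_codes = torch.randn(batch, latent_dim)      # stand-in for frozen-encoder outputs

z = torch.randn(batch, noise_dim)
fake_codes = G(z)

# discriminator step: real codes vs. detached fake codes
d_loss = (bce(D(real_codes), torch.ones(batch, 1))
          + bce(D(fake_codes.detach()), torch.zeros(batch, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: try to fool the (updated) discriminator
g_loss = bce(D(fake_codes), torch.ones(batch, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
# at sampling time: point_cloud = frozen_ae_decoder(G(torch.randn(1, noise_dim)))  # hypothetical decoder
```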
  • 86. Point clouds: generative GANs for point clouds #1b. Interpolating between different point clouds using our latent-space representation; note the interpolation between structurally and topologically different shapes. Generative results using our latent-space GAN; note the variability and fidelity of the results. For a recap on GANs, see for example: https://arxiv.org/abs/1701.07875 Cited by 106 - Related articles What do GANs for point clouds mean in practice? Point-cloud super-resolution (e.g. Ledig et al. 2016 for natural images) to improve model appearance (e.g. remove staircasing), and inpainting (e.g. Iizuka et al. 2017) to handle occlusion and gaps from indoor scans ("shape completion"). "Visual plastic surgery", in other words (Tung et al. 2017). Sung et al. (2015) Data-driven Structural Priors for Shape Completion. Mönch et al. (2010) Staircase-Aware Smoothing of Medical Surface Meshes
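The latent-space editing shown on this slide reduces to simple vector operations on the autoencoder codes. A hedged sketch follows, where `encoder` and `decoder` stand for the frozen autoencoder of the previous slide (hypothetical names, not a published API):

```python
# Linear interpolation (and analogy-style arithmetic) between 512-d shape codes.
import torch

def interpolate_codes(code_a, code_b, steps=5):
    ts = torch.linspace(0.0, 1.0, steps).view(-1, 1)
    return (1.0 - ts) * code_a + ts * code_b       # (steps, 512) intermediate codes

codes = interpolate_codes(torch.randn(1, 512), torch.randn(1, 512))
print(codes.shape)                                 # torch.Size([5, 512])
# each row would be decoded back to a point cloud:   shape_i = decoder(codes[i:i+1])
# analogy-style edits work the same way, e.g.
#   edited = decoder(encoder(chair) + (encoder(armchair) - encoder(plain_chair)))
```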
  • 87. Hardware point-cloud super-resolution from multiple scans https://doi.org/10.2312/SPBG/SPBG06/009-015 Cited by 47 articles On the left, one scan of the parrot statue, with a sample spacing of about 1 mm. Center, we combine 100 nearly identical such scans to produce the surface in the center, produced on a grid with sample spacing of about 0.3 mm. Notice the noise reduction and the improvement in the detail, for instance in the face, neck and wing feathers. On the right, a photograph of the parrot statue. Super-resolution reconstruction using only 30 input scans at the left and increasing to 140 at the right. Noise is reduced dramatically at the beginning but more slowly at the end. Surfaces were reconstructed from subsets which were pre-registered using all 140 scans. For absolute measurement accuracy (e.g. Biljecki et al. 2017), one can scan the same space multiple times. A thin strip of the super-resolved surface, and the nearby sample points from the input scans. The input is very noisy, but the points are densely and randomly distributed near the surface with few outliers, so the average gives an accurate representation of the surface. (a) One scan. (b) Final super-resolved surface from 100 scans. (c) Photo of the object (a plaster cast of a subway token). The bottom row shows some results of other kinds of processing, to evaluate the importance of the various steps of the algorithm. (d) One scan, bilinearly interpolated onto the finer grid and smoothed. Detail is missing. (e) The entire algorithm except for the final bilateral filtering step. The noise removed by the filtering seems to be residual registration error, which perhaps could be improved. (f) Just averaging 100 scans taken without moving the scanner, using the same Gaussian kernel. Noise is decreased, but there is aliasing from the lower-resolution grid obscuring detail visible in (b).
  • 88. DeepLearningSuper-Resolution Plentyofoptionsforimage/video/volumesuper-resolution https://arxiv.org/abs/1706.03142 https://arxiv.org/abs/1704.02738 https://arxiv.org/abs/1704.02470 https://arxiv.org/abs/1612.00085 Novel texture enhancement framework creates an HR style image that is rich in details, which can be used to restore high-frequency texture details back into the initial HR image via the style transfer algorithm. Four examples of SR results for nearest neighbor and cubic interpolation, the best-performing sparse coding, 3D- FSRCNN, and 3D-SRU-Net configurations. Arrows indicate regions in which at least one SR result mis- interprets a cell boundary or an ultrastructural feature. Scale bar 500 nm. Our method includes a sub-pixel motion compensation (SPMC) layer that can better handle inter-frame motion for this task. Our detail fusion (DF) network that can effectively fuse image details from multiple images after SPMC alignment
  • 89. Point-cloudsuper-resolution Upsampling‘on-the-fly’toavoid“dataexplosion”? Jason Schreier 4/17/17 12:05pm Horizon Zero Dawn, Kotaku http://kotaku.com/horizon-zero-dawn-uses-all-sorts- of-clever-tricks-to-lo-1794385026 Games like this don’t just look incredible because of ‘hyper-realism’ but because their engineers use all sorts of tricks [LOD’ing, or Level of Detail; Mipmapping; frustum culling, etc.] to save memory. The engine is designed to produce models in CityGML and does so in multiple LODs. Besides the generation of multiple geometric LODs, we implement the realisation of multiple levels of spatiosemantic coherence, geometric reference variants, and indoor representations. The datasets produced by Random3Dcity are suited for several applications, as we show in this paper with documented uses. The developed engine is available under an open-source licence at Github at http://github.com/tudelft3d/Random3Dcity http://doi.org/10.5194/isprs-annals-IV-4-W1-51-2016 Filip Biljecki, Hugo Ledoux, Jantien Stoter Level of detail texture filtering with dithering and mipmaps US 5831624 A Original Assignee 3Dfx Interactive Inc https://www.google.com/patents/US5831624 Level-of-detail rendering: colors identify different subdivision levels as stated in the top left corner. Feature-Adaptive Rendering of Loop Subdivision Surfaces on Modern GPUs November 2014 DOI: 10.1007/s11390-014-1486-x ManyLoDs: Parallel Many-View Level-of-Detail Selection for Real- Time Global Illumination Matthias Hollander, Tobias Ritschel, Elmar Eisemann, Tamy Boubekeur (2011) http://dx.doi.org/10.1111/j.1467-8659.2011.01982.x
  • 90. 3DContentgeneration VolumetricCapture Generatecontentbyscanningreal-lifescenesandobjects Kul Wadhwa's and Roddy O'Hara's Uncorporeal http://www.uncorporeal.com/ Uncorporeal: volumetric capture systems for VR & AR content creation. The team includes a technical Oscar-winner and engineering and product leadership from WETA, Google X, Lucas ILM, and Wikimedia. https://venturebeat.com/2016/10/13/pathbreaker-ventures-raises-12-milli on-to-invest-in-emerging-tech-such-as-vr-ar-and-robotics/ Ryan Gembala, founder of Pathbreaker Ventures believes connected homes and cars and autonomous vehicles will create a lot of opportunities in vertical applications for startups. And he also thinks that space technologies such as small satellites, analysis of space-captured data, consumer transport, space mining, and others are interesting. REALITYVIRTUAL.CO - A NEW ZEALAND BASED CREATIVE TECHNOLOGIES RESEARCH & DEVELOPMENT COLLECTIVE WITH AN ENTHUSIAST TOWARDS THE VISUAL REALM: ● unique post production & signal processing techniques including the development of deep learning image enhancement & automation throughout our 3D pipeline for PBR workflow ● strong emphasis on advanced robotics & autonomous operations for large data acquisition of 3D environments. 3D Scene Creation with Photogrammetry
  • 91. 3DContentgeneration Automaticphotorealism#1 Stillcanbequitelabor-intensivetocreaterealisticcontent Get to know Rense de Boer, a technical art director from Sweden, who is not only pushing the envelope of photo-real CGI environments, but he’s doing it all in a real-time engine! Art by Rens https://news.developer.nvidia.com/artist-spotlight-creating-photorealistic-cgi-environments-in-real-time/ https://www.youtube.com/watch?v=bXouFfqSfxg One Ph.D. position (supervision by Profs Niessner and Rüdiger Westermann) is available at our chair in the area of photorealistic rendering for deep learning and online reconstruction Research in this project includes the development of photorealistic realtime rendering algorithms that can be used in deep learning applications for scene understanding, and for high-quality scalable rendering of point scans from depth sensors and RGB stereo image reconstruction. If you are interested in applying, you should have a strong background in computer science, i.e., efficient algorithms and data structures, and GPU programming, have experience implementing C/C++ algorithms, and you should be excited to work on state-of-the-art research in the 3D computer graphics. https://wwwcg.in.tum.de/group/joboffers/phd-position-photorealistic-rendering-for-deep-le arning-and-online-reconstruction.html Ph.D. Position – Photorealistic Rendering for Deep Learning and Online Reconstruction
  • 92. 3DContentgeneration Automaticphotorealism#2 ConvertingLiDARscanstovisuallyhighquality3Dcontent Atom View is a new piece of software that allows content creators to translate real-world scans into assets for virtual environments. Not only does it aim to produce realistic results but also reduce the workflow for content creation. The standalone app takes files captured from volumetric cameras, offline graphics renderers, 360 lidar and more. Volumetric capture is a promising area of development that could one day allow content creators to skip over several of the more laborious steps of traditional 3D content creation with better results. With Atom View, users can even edit objects once they’ve been imported. https://youtu.be/YxRI_3gKP8g
  • 93. 3DContentgeneration Styletransfer formaps Neural Networks and The Future of 3D Procedural Content Generation by Sam Snider-Held, Creative Technologist at MediaMonks, focusing on the intersection of AR, VR, AI, UX, and Style transfer output on the left, real terrain on the right. Both are planes whose vertices are being displaced by the height map texture. Now was time to create my own style transfer light field and light field renderer. I basically reimplemented Andrew Lowndes’ WebGl light field renderer in Unity. What this post demonstrates is the idea that neural network could radically change how we generate 3D content. I went with light fields because currently my GPU is not fast enough to style transfer or any other generative network at 60 FPS. But if we do get to that point, it’s entirely possible see generative neural networks become an alternative rendering pipe line to the standard rasterization approach. In this way, neural networks could generate each frame of a game in real time, based on realtime feedback from the user. But it also potentially allows for a much more powerful creative approach, for the creator and the end user. Imagine playing Gears of War, but then telling the computer “Keep the gameplay, story, and 3d models, but make it look like Zelda: Breath of the Wild.” This is how creating or playing a future gaming experience could be, all because computers now know what things “look like” and can make other things “look like” them too.
  • 94. 3DContentgeneration from Videoto3D Production-Level Facial Performance Capture Using Deep Convolutional Neural Networks In Proceedings of SCA'17, Los Angeles, CA, USA, July 28-30, 2017 http://research.nvidia.com/publication/facial-performance-capture-deep -neural-networks Samuli Laine, Tero Karras, Timo Aila, Antti Herva (Remedy Entertainment), Shunsuke Saito (Pinscreen, University of Southern California), Ronald Yu (Pinscreen, University of Southern California), Hao Li (USC Institute for Creative Technologies, University of Southern California, Pinscreen), Jaakko Lehtinen (NVIDIA, Aalto University) NVIDIA and game developer Remedy (Alan Wake, Quantum Break) showcased their team-up solution to streamlining motion capture and animation using a deep learning neural network, running on NVIDIA’s powerful DGX-1 server. After being “trained” with information on previously produced animations, the network is able to generate sophisticated 3D facial animation from videos of live actors, greatly alleviating the time and labor burden of traditional mo-cap animation — it can even learn enough to generate facial animation from just an audio clip. The companies believe this system could eventually produce animation that’s just as good or better than traditionally produced fare. http://www.animationmagazine.net/events/siggraph-facial-animation-advances-fabri c-engine-the-french-contingent/ “We present a real-time deep learning framework for video-based facial performance capture -- the dense 3D tracking of an actor's face given a monocular video. Our pipeline begins with accurately capturing a subject using a high-end production facial capture pipeline based on multi-view stereo tracking and artist- enhanced animations. With 5-10 minutes of captured footage, we train a convolutional neural network to produce high-quality output, including self-occluded regions, from a monocular video sequence of that subject. Since this 3D facial performance capture is fully automated, our system can drastically reduce the amount of labor involved in the development of modern narrative-driven video games or films involving realistic digital doubles of actors and potentially hours of animated dialogue per character. “
  • 95. 3DContentgeneration from Video(&Audio) toVideo Face2Face: Real-time Face Capture and Reenactment of RGB Videos Justus Thies1 Michael Zollhöfer 2 Marc Stamminger 1 Christian Theobalt 2 Matthias Nießner 3 1 University of Erlangen-Nuremberg2 Max Planck Institute for Informatics 3 Stanford University http://www.graphics.stanford.edu/~niessner/thies2016face.html https://doi.org/10.1109/CVPR.2016.262 Neural Face Editing with Intrinsic Image Disentangling Zhixin Shu, Ersin Yumer, Sunil Hadap, Kalyan Sunkavalli, Eli Shechtman, Dimitris Samaras (Submitted on 13 Apr 2017) https://arxiv.org/abs/1704.04131 University of Washington researchers have developed new algorithms that solve a thorny challenge in the field of computer vision: turning audio clips into a realistic, lip-synced video of the person speaking those words. As detailed in a paper to be presented Aug. 2 at SIGGRAPH 2017, the team successfully generated highly-realistic video of former president Barack Obama talking about terrorism, fatherhood, job creation and other topics using audio clips of those speeches and existing weekly video addresses that were originally on a different topic. Synthesizing Obama: learning lip sync from audioSupasorn Suwajanakorn, Steven M. Seitz, Ira Kemelmacher-Shlizerman ACM Transactions on Graphics (TOG), Volume 36 Issue 4, July 2017, https://doi.org/10.1145/3072959.3073640 http://www.washington.edu/news/2017/07 /11/lip-syncing-obama-new-tools-turn-a udio-clips-into-realistic-video/