Tools for Creating
Next-Gen Computer Vision
Apps on Snapdragon
Judd Heape
VP Product Management for Camera,
Computer Vision and Video Technology
Computer Vision in Snapdragon
Three function levels to provide comprehensive CV solutions
1. CV hardware acceleration blocks to support and enable hardware, software and system designs in Snapdragon platforms
2. CV algorithms to demonstrate complete workflows that provide state-of-the-art solutions to certain perception problems
3. CV end-to-end applications in mobile, XR, automotive and IoT market segments to enable unique and enhanced user experiences
© 2022 Qualcomm
Engine for Visual Analytics (EVA):
Computer Vision Hardware Blocks
EVA generations: EVA1.x, EVA2.x, EVA3.x
Market segments: Mobile Camera / Video, XR, Auto, IoT
Function areas: Object/Face Detection, Optical Flow, Depth Estimation, Feature Extraction, Geometry Correction, XR & 3DR
Capabilities shown across the generations:
• Object Detection: HOG/SVM
• Feature: HCD/NCC; video encode with 30% BR reduction
• Semi-dense OF: GMO for video encode
• DFS 1080p@30: Video Bokeh
• Lens Distortion Correction
• ACF/RDF Face Detection
• Dense OF (SGM based): dense motion map for multi-frame processing, sensor alignment
• DFS (SGM based): better-quality bokeh, visual special effects, XR 3D reconstruction
• HCD/ORB: centralized ME for camera
• Flow improvement for XR 6DoF/VIO
• Exposure Compensation
• Motion and depth map warping
• LSR (in EVAa 3.5)
• XRA: DoH, DoG, FREAK, R-BRIEF (in EVAa/EVAv 3.5)
Optical Flow
Parameter           Semi-Dense OF    Dense OF
Motion Density      Every 2x2 block  Every pixel
Motion Accuracy     1/8 pixel        1/16 pixel
Motion Range (X,Y)  ±128, ±64        ±64, ±32
Max Resolution      1920x1080        1152x648
Confidence Map      8-bit            8-bit
Frames per Second   60               60
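The sub-pixel accuracy and motion range in the table above can be made concrete with a small sketch. It assumes each motion component arrives as a signed integer counting 1/8-pixel or 1/16-pixel steps. That layout, and the names `OF_MODES`, `raw_to_pixels` and `in_range`, are illustrative assumptions, not the documented EVA output format.

```python
from fractions import Fraction

# Sub-pixel step and motion range per mode, taken from the Optical Flow table.
OF_MODES = {
    "semi_dense": {"subpel": 8, "range_xy": (128, 64)},
    "dense": {"subpel": 16, "range_xy": (64, 32)},
}


def raw_to_pixels(raw_x, raw_y, mode):
    """Convert hypothetical fixed-point motion components to pixel units.

    Assumption: each raw component is a signed integer counting sub-pixel
    steps (1/8 or 1/16 pixel). The real EVA layout is not given in the deck.
    """
    steps = OF_MODES[mode]["subpel"]
    return Fraction(raw_x, steps), Fraction(raw_y, steps)


def in_range(raw_x, raw_y, mode):
    """Check a raw vector against the +/- motion range listed for the mode."""
    steps = OF_MODES[mode]["subpel"]
    max_x, max_y = OF_MODES[mode]["range_xy"]
    return abs(raw_x) <= max_x * steps and abs(raw_y) <= max_y * steps
```

With this convention a raw semi-dense vector of (12, -4) corresponds to a motion of 1.5 pixels right and 0.5 pixels up.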
Sparse Motion
• Feature Point Detection, Local and Global Motion
• Various Detectors and Descriptors (Harris, DoH, DoG, FREAK)
Dense Motion
• Semi-dense Optical Flow (sDOF)
• Dense Optical Flow (DOF)
• Hybrid Deep Learning based Motion + OF, Segmentation Enhanced
Depth from Stereo Estimation
Depth from Stereo (DFS)
• Super-pixel Segmentation on SLIC
• Feature Extraction and Matching
• Confidence Map and Post Processing
DFS Engine
Depth Density       Every pixel
Disparity Accuracy  1/16 pixel
Disparity Level     [0,63]
Max Resolution      720P@60FPS
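A disparity map becomes a depth map through the standard stereo relation Z = f · B / d. The sketch below applies it under the assumption that raw disparities are integers in 1/16-pixel steps. The function name and the calibration values (focal length in pixels, baseline in metres) are hypothetical and not taken from the EVA SDK.

```python
SUBPIXEL_STEPS = 16    # 1/16-pixel disparity accuracy, from the DFS Engine figures
MAX_DISPARITY_PX = 63  # disparity level range [0, 63], from the DFS Engine figures


def disparity_to_depth(raw_disparity, focal_px, baseline_m):
    """Return depth in metres, or None when the disparity is zero or out of range.

    Assumption: raw_disparity counts 1/16-pixel steps; focal_px and baseline_m
    come from a stereo calibration that the caller supplies.
    """
    disparity_px = raw_disparity / SUBPIXEL_STEPS
    if disparity_px <= 0 or disparity_px > MAX_DISPARITY_PX:
        return None
    return focal_px * baseline_m / disparity_px
```

Because depth is inversely proportional to disparity, the 1/16-pixel step matters most for distant objects, where a small disparity change corresponds to a large depth change.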
[Figure: DFS pipeline. Blocks shown: Input Images, SLIC (Simple Linear Iterative Clustering), SLIC Map, Flat Area Detection, Census Feature, SGM, Confidence Measure, Post Processing, Depth Map]
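The "Census Feature" and "SGM" labels suggest a census-based matching cost feeding semi-global matching. As a rough illustration, here is a 3x3 census transform with a Hamming-distance cost. The window size and the function names are assumptions, since the deck does not describe the hardware's actual census configuration.

```python
def census_3x3(image, row, col):
    """Return the 8-bit census signature of the pixel at (row, col).

    Each bit records whether a neighbour is darker than the centre pixel.
    The caller must keep (row, col) at least one pixel away from the border.
    """
    centre = image[row][col]
    signature = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            signature = (signature << 1) | (image[row + dr][col + dc] < centre)
    return signature


def hamming_cost(sig_a, sig_b):
    """Matching cost: number of differing bits between two signatures."""
    return bin(sig_a ^ sig_b).count("1")
```

The signature depends only on brightness ordering, so a uniform exposure offset between the two cameras leaves the cost unchanged.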
Geometric Correction Engine (GCE)
Low-power High-quality Warping
• ICA maps output pixels to input pixels
Effective Transformation
• Sparse grid transformation (35x27 or 67x51)
• Dense grid transformation (8-pixel grid)
• Perspective transformation (3x3 transform)
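Mapping output pixels to input pixels is the usual inverse-warping scheme: every output pixel computes where it comes from and samples the input there. The sketch below does this for the 3x3 perspective case with bilinear sampling. The function names, the row-major matrix layout and the border clamping are assumptions for illustration, not the GCE programming model.

```python
def map_output_to_input(h, x_out, y_out):
    """Apply a 3x3 perspective matrix h (row-major nested lists) to an output pixel."""
    x = h[0][0] * x_out + h[0][1] * y_out + h[0][2]
    y = h[1][0] * x_out + h[1][1] * y_out + h[1][2]
    w = h[2][0] * x_out + h[2][1] * y_out + h[2][2]
    return x / w, y / w


def bilinear_sample(image, x, y):
    """Sample image (list of rows) at a fractional position, clamping to the border."""
    height, width = len(image), len(image[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(image, h, out_width, out_height):
    """Inverse warp: every output pixel pulls its value from the input image."""
    return [
        [bilinear_sample(image, *map_output_to_input(h, x, y)) for x in range(out_width)]
        for y in range(out_height)
    ]
```

Pulling from the input guarantees that every output pixel receives exactly one value, which avoids the holes that pushing input pixels forward can leave.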
[Figure: GCE transform chain. Labels shown: Output domain, Upscale + Offset, Virtual domain, Effective Transform (Sparse Grid Transformation, Dense Grid Transformation, Perspective Transformation), Virtual domain, Offset + Downscale, Input/IFE domain]
GCE Use Cases
• Lens distortion correction
• Motion vector grid composition
• Rectification
[Figures: Rectification; Lens Distortion Correction]
Normalized Cross Correlation
NCC Supports Two Modes
• Patch to Frame Mode
• Frame to Frame Mode
Patch-to-frame mode
• 8x8 templates: prepared by the application, can come from different sources
• Reference frame
Frame-to-frame mode
• Reference frame, current frame
• Templates: all in the same frame
[Figure: Frame Matching Using Harris Corners and NCC (dimension labels in the figure: 18 and 8)]
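Template matching with normalized cross correlation can be sketched as follows. This uses the zero-mean variant of NCC and an exhaustive search over the frame. Whether the EVA NCC block subtracts the mean, and how it restricts the search area, is not stated in the deck, so both are assumptions, as are the function names.

```python
import math


def ncc(template, patch):
    """Zero-mean normalized cross-correlation of two equally sized 2-D blocks.

    Returns a score in [-1, 1]; 0.0 is returned when either block is flat.
    """
    a = [v for row in template for v in row]
    b = [v for row in patch for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def best_match(template, frame):
    """Slide the template over the frame and return (score, row, col) of the best match."""
    t_h, t_w = len(template), len(template[0])
    best = (-2.0, -1, -1)
    for r in range(len(frame) - t_h + 1):
        for c in range(len(frame[0]) - t_w + 1):
            patch = [row[c:c + t_w] for row in frame[r:r + t_h]]
            best = max(best, (ncc(template, patch), r, c))
    return best
```

In patch-to-frame terms, `template` would be one 8x8 block supplied by the application and `frame` the image region it is searched in. The code itself works for any block size.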
Face Detection
Deep Learning based Face Detection (FD)
• Min Face Size: 32x32
• Detection Accuracy: 95%
• 1080p@60FPS
• Multiple cameras supported
Under Non-Ideal Conditions
• Strong Backlight
• Full Profile
• Occlusions – Face Masks, Hats, Glasses, Sunglasses
[Figures: Strong Backlight, Full Profile, Occlusions]
EVA Architecture and Access
• The EVA APIs are exposed on both the CPU and Hexagon Processor sides
• They include both synchronous and asynchronous APIs
• Direct interrupts between the Hexagon Processor and the EVA cores provide low-latency communication
• EVA includes an embedded CPU, primarily for task scheduling and hardware pipes
• EVA hardware pipes are shared between certain functions
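The difference between the synchronous and asynchronous calling styles can be illustrated generically. The sketch below uses Python's `concurrent.futures` with a stand-in workload. None of these names (`downscale_2x`, `run_sync`, `run_async`) are EVA functions, and the real EVA signatures are not given in this deck.

```python
from concurrent.futures import ThreadPoolExecutor


def downscale_2x(image):
    """Stand-in workload: average each 2x2 block (not an EVA function)."""
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, len(image[0]) - 1, 2)
        ]
        for r in range(0, len(image) - 1, 2)
    ]


def run_sync(image):
    """Synchronous style: the caller blocks until the result is ready."""
    return downscale_2x(image)


def run_async(executor, image, on_done):
    """Asynchronous style: submit, return immediately, deliver the result via callback."""
    future = executor.submit(downscale_2x, image)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future
```

A caller would typically use the asynchronous style to queue several frames and keep the CPU free while the work completes, for example `run_async(pool, frame, results.append)` inside a `with ThreadPoolExecutor() as pool:` block.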
[Figure: EVA architecture. Labels shown: CV App, EVA API, CV Engine and EVA Driver on both the CPU and the Hexagon Processor, Firmware CPU, EVA Hardware Pipes (OF/DFS; GCE; HCD, NCC, ORB, DS), DDR, with Data and API & Control paths]
EVA Feature APIs
EVA3.0 Features                     EVA API
Image Warping                       evaWarp_Sync / evaWarp_Async
Depth from Stereo (DFS)             evaDfs_Sync / evaDfs_Async
Normalized Cross Correlation (NCC)  evaNccFrame_Sync / evaNccFrame_Async
Optical Flow (OF)                   evaOF_Sync / evaOF_Async
Feature Extraction (HCD)            evaFeaturePoint_Sync / evaFeaturePoint_Async
Feature Descriptor Calc & Matching  evaDcm_Sync / evaDcm_Async
Downscaler                          evaScaledown_Sync / evaScaledown_Async
Pyramid Image                       evaPyramidImage_Sync / evaPyramidImage_Async
EVA SDK Simulator
CV Use Case 1
Depth Map from Stereo Cameras (DFS)
Applications
• Accurate Camera/Video Bokeh effect
• Background replacement in video recording or Zoom call
• AR/VR (3D Reconstruction, Video Passthrough, Occlusion)
CV Use Case 2
Real Time Bokeh Effect using Depth Map
from Stereo Cameras (DFS)
Applications
• Accurate Camera/Video Bokeh effect
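The idea behind depth-driven bokeh is that blur strength grows with a pixel's distance from the focal plane. The sketch below uses a box filter whose radius depends on the depth map. Production pipelines use more elaborate lens-shaped kernels and edge handling, and the function name and parameters here are hypothetical.

```python
def bokeh(image, depth, focus_depth, max_radius=2, strength=1.0):
    """Blur each pixel with a box filter whose radius grows with |depth - focus_depth|.

    image and depth are equally sized lists of rows. Pixels at the focus depth
    keep radius 0 and therefore stay sharp.
    """
    height, width = len(image), len(image[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            radius = min(max_radius, int(strength * abs(depth[r][c] - focus_depth)))
            rows = range(max(r - radius, 0), min(r + radius, height - 1) + 1)
            cols = range(max(c - radius, 0), min(c + radius, width - 1) + 1)
            window = [image[rr][cc] for rr in rows for cc in cols]
            row.append(sum(window) / len(window))
        out.append(row)
    return out
```

The quality of the result depends directly on the depth map: depth errors at object boundaries show up as sharp background halos or blurred foreground edges.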
CV Use Case 3
Dense Motion Map (DMM) for Video MCTF
Key Benefits of EVA
• Register multiple frames with local motion compensation
• Remove ghosting artifacts in combined video frames
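Why a dense motion map removes ghosting can be shown with a toy motion-compensated temporal filter: the previous frame is first aligned to the current one, then the two are blended. Integer motion vectors, border clamping and a fixed blend weight are simplifying assumptions, and the function names are illustrative only.

```python
def motion_compensate(prev_frame, motion):
    """Build a prediction of the current frame from the previous one.

    motion[r][c] = (dx, dy) points from a current-frame pixel to its source in
    prev_frame. Integer vectors and border clamping are simplifying assumptions.
    """
    height, width = len(prev_frame), len(prev_frame[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            dx, dy = motion[r][c]
            src_r = min(max(r + dy, 0), height - 1)
            src_c = min(max(c + dx, 0), width - 1)
            row.append(prev_frame[src_r][src_c])
        out.append(row)
    return out


def temporal_filter(cur_frame, prev_frame, motion, weight=0.5):
    """Blend the current frame with the motion-compensated previous frame."""
    aligned = motion_compensate(prev_frame, motion)
    return [
        [(1 - weight) * cur + weight * ref for cur, ref in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(cur_frame, aligned)
    ]
```

For a bright pixel that moves one position to the right between frames, blending with the correct motion keeps a single bright pixel, while blending with zero motion produces two half-bright pixels, which is the ghosting artifact.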
CV Use Case 4
Dense Motion Map (DMM) for Video MFHDR
Key Benefits of EVA
• Estimating and compensating for motion is key to achieving high-quality HDR video
• Remove ghosting artifacts in combined video frames
• Running global motion and local motion estimation simultaneously requires a large amount of computation power
CV Use Case 5
Face Detection (FD) and
Face Landmark Detection (FLD)
Applications
• Gender/Expression/Emotion/Gaze detection
• Avatar animation
• Geometric personalization
Qualcomm Deep Learning-based 3D face landmark detection reaches high accuracy in locating 115 or 300 facial landmarks
Start Developing on Snapdragon
• Capture at higher FPS
• Extend battery life
• Tap into hardware-accelerated CV features with an SDK not previously available
Start Developing on Snapdragon
For access to the SDK contact:
Xin Zhong
Director, Product Management
xzhong@qti.qualcomm.com
Thank You



  • 1. Tools for Creating Next-Gen Computer Vision Apps on Snapdragon Judd Heape VP Product Management for Camera, Computer Vision and Video Technology
  • 2. Computer Vision in Snapdragon Three function levels to provide comprehensive CV solutions 2 CV hardware Acceleration blocks to support and enable hardware, software and system designs in Snapdragon platforms 1 CV Algorithms to demonstrate complete workflows that provide state-of-the-art solutions to certain perception problems 2 CV end-to-end Applications in mobile, XR, Automotive and IOT market segments to enable unique and enhanced user experiences 3 © 2022 Qualcomm
  • 3. Engine for Visual Analytics (EVA): Computer Vision Hardware Blocks
       Generations: EVA1.x, EVA2.x, EVA3.x
       Target segments: Mobile Camera / Video, XR, Auto, IoT
       Function categories: Object/Face Detection, Optical Flow, Depth Estimation, Feature Extraction, Geometry Correction, XR & 3DR
       Capabilities (in slide order; the grid assignment to generation and category is not recoverable from the transcript):
       • Object Detection HOG/SVM
       • Feature – video encode with 30% BR reduction
       • HCD/NCC
       • Semi-dense OF; GMO for video encode
       • DFS 1080p@30; Video Bokeh
       • Lens Distortion Correction
       • ACF/RDF Face Detection
       • Dense OF (SGM based); dense motion map for multi-frame processing, sensor alignment
       • DFS (SGM based): Bokeh better quality, visual special effect, XR 3D reconstruction
       • HCD/ORB – Centralized ME for camera; flow improvement for XR 6DoF/VIO
       • Exposure Compensation; motion and depth map warping
       • LSR (in EVAa 3.5)
       • XRA - DoH, DoG, FREAK, R-BRIEF (in EVAa/EVAv 3.5)
  • 4. Optical Flow
                             Semi-Dense OF     Dense OF
       Motion Density        Every 2x2 block   Every pixel
       Motion Accuracy       1/8 pixel         1/16 pixel
       Motion Range (X,Y)    ±128, ±64         ±64, ±32
       Max Resolution        1920x1080         1152x648
       Confidence Map        8-bit             8-bit
       Frames per Second     60                60
       Sparse Motion
       • Feature Point Detection, Local and Global Motion
       • Various Detectors and Descriptors (Harris, DoH, DoG, FREAK)
       Dense Motion
       • Semi-dense Optical Flow (sDOF)
       • Dense Optical Flow (DOF)
       • Hybrid Deep Learning based Motion + OF Segmentation Enhanced
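The optical flow figures on slide 4 (sub-pixel accuracy, motion range, motion density) can be made concrete with a small sketch. It assumes that raw motion vectors arrive as signed integers counted in sub-pixel steps (1/8 pixel for semi-dense, 1/16 pixel for dense). The slide does not state the storage format, so that encoding and all names below (`FlowMode`, `to_pixels`, `in_range`, `vectors_per_frame`) are hypothetical and are not part of the EVA SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowMode:
    name: str
    subpixel_steps: int  # fractional steps per pixel (8 means 1/8 pixel)
    max_x: int           # motion range in X, in pixels (from the slide)
    max_y: int           # motion range in Y, in pixels (from the slide)
    block: int           # one motion vector per block x block pixels


# Values copied from the slide 4 table.
SEMI_DENSE = FlowMode("semi-dense", 8, 128, 64, 2)
DENSE = FlowMode("dense", 16, 64, 32, 1)


def to_pixels(raw_x, raw_y, mode):
    """Convert raw fixed-point vector components to pixels (assumed encoding)."""
    return raw_x / mode.subpixel_steps, raw_y / mode.subpixel_steps


def in_range(dx, dy, mode):
    """Check whether a displacement in pixels fits the mode's motion range."""
    return abs(dx) <= mode.max_x and abs(dy) <= mode.max_y


def vectors_per_frame(width, height, mode):
    """Number of motion vectors produced for one frame at the given size."""
    return (width // mode.block) * (height // mode.block)
```

Under these assumptions the trade-off in the table is visible directly: semi-dense flow covers twice the motion range at half the sub-pixel precision, and at its maximum resolution it emits fewer vectors per frame (960 x 540) than dense flow does at 1152x648.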
  • 5. Depth from Stereo Estimation
       Depth from Stereo (DFS)
       • Super-pixel Segmentation on SLIC
       • Feature Extraction and Matching
       • Confidence Map and Post Processing
       DFS Engine
       Depth Density         Every pixel
       Disparity Accuracy    1/16 pixel
       Disparity Level       [0,63]
       Max Resolution        720P@60FPS
       Diagram labels: Input Images, Flat Area Detection, Depth Map, SLIC Map, Confidence Measure, Post Processing, SLIC (Simple Linear Iterative Clustering), Census Feature, SGM
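Slide 5 gives the disparity range and precision but not how disparity becomes depth. For a rectified stereo pair the standard relation is Z = f * B / d, with focal length f in pixels, baseline B in metres and disparity d in pixels. The sketch below applies it, assuming the raw disparity is an integer in 1/16-pixel steps; that encoding, the function names and the camera parameters used in the usage note are illustrative assumptions, not values from the presentation.

```python
def disparity_to_depth(raw_disp, focal_px, baseline_m, subpixel_steps=16):
    """Depth in metres from a raw fixed-point disparity (assumed 1/16 px steps)."""
    disp_px = raw_disp / subpixel_steps
    if disp_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disp_px


def nearest_depth(focal_px, baseline_m, max_disparity_px=63):
    """Closest depth representable when disparity is capped at 63 pixels."""
    return focal_px * baseline_m / max_disparity_px
```

For a hypothetical camera with a 1000-pixel focal length and a 5 cm baseline, a raw disparity of 160 (10 pixels) corresponds to about 5 m, and the [0,63] disparity range means objects closer than roughly 0.8 m cannot be resolved.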
  • 6. Geometric Correction Engine (GCE)
       Low-power High-quality Warping
       • ICA maps output pixels to input pixels
       Effective Transformation
       • Sparse grid transformation (35x27 or 67x51)
       • Dense grid transformation (8 pixel grid)
       • Perspective transformation (3x3 transform)
       GCE Use Cases
       • Lens distortion correction
       • Motion vector grid composition
       • Rectification
       Diagram labels: Output domain, Upscale + Offset, Input/IFE domain, Virtual domain, Effective transform, Offset + Downscale, Sparse Grid Transformation, Dense Grid Transformation, Perspective Transformation, Rectification, Lens Distortion Correction
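Slide 6 says the warp maps output pixels to input pixels and lists a 3x3 perspective transform as one option. A minimal software sketch of that inverse-mapping idea follows: every output pixel is pushed through a 3x3 matrix to find its source location, which is then sampled bilinearly. This is a generic textbook formulation, not the GCE implementation; the function names and the zero fill for out-of-bounds samples are assumptions.

```python
import math


def map_output_to_input(h, x_out, y_out):
    """Apply a 3x3 row-major matrix that maps output coords to input coords."""
    xh = h[0][0] * x_out + h[0][1] * y_out + h[0][2]
    yh = h[1][0] * x_out + h[1][1] * y_out + h[1][2]
    w = h[2][0] * x_out + h[2][1] * y_out + h[2][2]  # assumed non-zero
    return xh / w, yh / w


def bilinear(img, x, y, fill=0.0):
    """Bilinear sample of a 2-D list image; returns fill outside the image."""
    height, width = len(img), len(img[0])
    if x < 0 or y < 0 or x > width - 1 or y > height - 1:
        return fill
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, h, out_w, out_h):
    """Build the output image by looking up each output pixel in the input."""
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            xi, yi = map_output_to_input(h, x, y)
            row.append(bilinear(img, xi, yi))
        out.append(row)
    return out
```

Iterating over output pixels rather than input pixels guarantees that every output location receives exactly one value, which is why inverse mapping is the usual choice for warping.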
  • 7. Normalized Cross Correlation
       NCC Supports Two Modes
       • Patch to Frame Mode
       • Frame to Frame Mode
       Patch-to-frame mode: 8x8 templates, prepared by the application, can come from different sources; reference frame
       Frame-to-frame mode: templates all in the same frame; reference frame and current frame
       Figure labels: 18, 8
       Figure: Frame Matching Using Harris Corners and NCC
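To illustrate what matching an 8x8 template against a frame involves, here is a small pure-Python sketch of normalized cross correlation with an exhaustive search. It uses the zero-mean variant of NCC; slide 7 does not say which variant or search strategy the hardware uses, so treat the formula choice and the names (`ncc`, `crop`, `best_match`) as assumptions.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross correlation of two equal-size patches."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # Flat patches have zero variance; report no correlation for them.
    return num / den if den > 0 else 0.0


def crop(frame, top, left, size=8):
    """Extract a size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(template, frame, size=8):
    """Scan every patch position and return (best score, (top, left))."""
    best_score, best_pos = -2.0, None
    for top in range(len(frame) - size + 1):
        for left in range(len(frame[0]) - size + 1):
            score = ncc(template, crop(frame, top, left, size))
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_score, best_pos
```

Because the score is normalized by both patches' variance, it is unaffected by uniform brightness and contrast changes between the template and the frame, which is what makes NCC attractive for matching corners across frames.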
  • 8. Face Detection
       Deep Learning based Face Detection (FD)
       • Min Face Size: 32x32
       • Detection Accuracy: 95%
       • 1080p@60FPS
       • Multiple cameras supported
       Under Non-Ideal Conditions
       • Strong Backlight
       • Full Profile
       • Occlusions – Face Masks, Hats, Glasses, Sunglasses
       Example images: Strong Backlight, Full Profile, Occlusions
  • 9. EVA Architecture and Access
       • The EVA APIs are exposed from both the CPU and the Hexagon Processor sides
       • They include both synchronous and asynchronous APIs
       • Direct interrupts between the Hexagon Processor and the EVA cores provide low-latency communication
       • EVA includes an embedded CPU, primarily for task scheduling and hardware pipes
       • EVA hardware pipes are shared between certain functions
       Diagram labels: Hardware Pipes, EVA CPU, Data, API & Control, Hexagon Processor, EVA API, CV App, CV Engine, EVA Driver, Firmware, CPU, OF/DFS, GCE, HCD, NCC, ORB, DS, DDR
  • 10. EVA Feature APIs
        EVA3.0 Features                       EVA API
        Image Warping                         evaWarp_Sync / evaWarp_Async
        Depth from Stereo (DFS)               evaDfs_Sync / evaDfs_Async
        Normalized Cross Correlation (NCC)    evaNccFrame_Sync / evaNccFrame_Async
        Optical Flow (OF)                     evaOF_Sync / evaOF_Async
        Feature Extraction (HCD)              evaFeaturePoint_Sync / evaFeaturePoint_Async
        Feature Descriptor Calc & Matching    evaDcm_Sync / evaDcm_Async
        Downscaler                            evaScaledown_Sync / evaScaledown_Async
        Pyramid Image                         evaPyramidImage_Sync / evaPyramidImage_Async
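Each feature on slide 10 is offered as a `_Sync` and an `_Async` entry point, but the slides do not show their signatures. The sketch below therefore does not call the EVA SDK at all. It only illustrates the general difference between the two calling styles, using Python's standard `concurrent.futures` and a stand-in workload; `downscale_by_two`, `run_sync` and `run_async` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def downscale_by_two(image):
    """Stand-in workload: keep every second pixel in each direction."""
    return [row[::2] for row in image[::2]]


def run_sync(image):
    """Synchronous style: the caller blocks until the result is ready."""
    return downscale_by_two(image)


def run_async(executor, image, on_done):
    """Asynchronous style: submit the job and receive the result via callback."""
    future = executor.submit(downscale_by_two, image)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future
```

In the asynchronous style the caller can queue further work, such as preparing the next frame, while the current job is still in flight.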
  • 11. EVA SDK Simulator
  • 12. CV Use Case 1: Depth Map from Stereo Cameras (DFS)
        Applications
        • Accurate Camera/Video Bokeh effect
        • Background replacement in video recording or Zoom call
        • AR/VR (3D Reconstruction, Video Passthrough, Occlusion)
  • 13. CV Use Case 2: Real Time Bokeh Effect using Depth Map from Stereo Cameras (DFS)
        Applications
        • Accurate Camera/Video Bokeh effect
  • 14. CV Use Case 3: Dense Motion Map (DMM) for Video MCTF
        Key Benefits of EVA
        • Register multiple frames with local motion compensation
        • Remove ghosting artifacts in combined video frames
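Slide 14 names two benefits: registering frames with local motion compensation and removing ghosting when frames are combined. A toy sketch of both follows. Each pixel of the current frame is blended with the pixel its motion vector points to in the previous frame, and the blend is skipped when the two values disagree too much. The integer motion vectors, the blend weight and the threshold are illustrative assumptions, not parameters from the presentation.

```python
def mctf_blend(current, previous, motion, alpha=0.5, ghost_threshold=20):
    """Blend current with motion-compensated previous; skip likely ghosts.

    motion[y][x] is an integer (dx, dy) offset from pixel (x, y) in the
    current frame to its matching pixel in the previous frame.
    """
    height, width = len(current), len(current[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = motion[y][x]
            px, py = x + dx, y + dy
            cur = current[y][x]
            inside = 0 <= px < width and 0 <= py < height
            if inside and abs(previous[py][px] - cur) <= ghost_threshold:
                # Registered pixels agree: average them to reduce noise.
                row.append(alpha * cur + (1 - alpha) * previous[py][px])
            else:
                # Mismatch or out of frame: keep the current pixel unblended.
                row.append(float(cur))
        out.append(row)
    return out
```

Without the motion lookup, a moving object would be averaged with the background it used to cover, which is the ghosting artifact the slide refers to.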
  • 15. CV Use Case 4: Dense Motion Map (DMM) for Video MFHDR
        Key Benefits of EVA
        • Estimating and compensating for motion is key to achieving high-quality HDR video
        • Remove ghosting artifacts in combined video frames
        • Running global motion and local motion estimation simultaneously requires a large amount of computation power
  • 16. CV Use Case 5: Face Detection (FD) and Face Landmark Detection (FLD)
        Applications
        • Gender/Expression/Emotion/Gaze detection
        • Avatar animation
        • Geometric personalization
        Qualcomm Deep Learning-based 3D face landmark detection reaches high accuracy in locating 115 or 300 facial landmarks
  • 17. Start Developing on Snapdragon
        • Capture at higher FPS
        • Extend battery life
        • Tap into hardware-accelerated CV features with an SDK not previously available
  • 18. Start Developing on Snapdragon
        For access to the SDK contact: Xin Zhong, Director, Product Management, xzhong@qti.qualcomm.com