Design and develop your computer vision model with 3D data using PyTorch3D and more
Xudong Ma
Vishakh Hegde
Lilit Yolyan
BIRMINGHAM—MUMBAI
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Dinesh Chaudhary
Content Development Editor: Joseph Sunil
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Ponraj Dhandapani
Marketing Coordinator: Shifa Ansari
First published: November 2022
Production reference: 1211022
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80324-782-3
Xudong Ma is a Staff Machine Learning Engineer at Grabango Inc. in Berkeley, California. He was a Senior Machine Learning Engineer at Facebook (Meta) Oculus and worked closely with the 3D PyTorch Team on 3D facial tracking projects. He has many years of experience in computer vision, machine learning, and deep learning and holds a Ph.D. in Electrical and Computer Engineering.
Vishakh Hegde is a Machine Learning and Computer Vision researcher. He has over 7 years of experience in the field, during which he has authored multiple well-cited research papers and published patents. He holds a master's degree from Stanford University, specializing in applied mathematics and machine learning, and a BS and MS in Physics from IIT Madras. He previously worked at Schlumberger and Matroid. He is a Senior Applied Scientist at Ambient.ai, where he helped build their weapon detection system, which is deployed at several Global Fortune 500 companies. He is now leveraging his expertise and passion for solving business challenges to build a technology startup in Silicon Valley. You can learn more about him on his website.
I would like to thank the computer vision researchers whose breakthrough research I got to write about. I want to thank the reviewers for their feedback and the wonderful team at Packt Publishing for giving me the chance to be creative. Finally, I want to thank my wife and family for all their support and encouragement when I most needed it.
Lilit Yolyan is a machine learning researcher working on her Ph.D. at YSU. Her research focuses on building computer vision solutions for smart cities using remote sensing data. She has 5 years of experience in the field of computer vision and has worked on a complex driver safety solution to be deployed by many well-known car manufacturing companies.
Eya Abid is a Master of Engineering student specializing in Deep Learning and Computer Vision. She holds positions as an AI instructor at NVIDIA and in quantum machine learning at CERN.
I would like to dedicate this work first to my family, friends, and whoever helped me through this process. A special dedication to Aymen, to whom I am forever grateful.
Ramesh Sekhar is the CEO and co-founder of Dapster.ai, a company that builds affordable and easily deployable robots that perform the most arduous tasks in warehouses. Ramesh has worked at companies like Symbol, Motorola, and Zebra and specializes in building products at the intersection of computer vision, AI, and Robotics. He has a BS in Electrical Engineering and an MS in Computer Science. Ramesh founded Dapster.ai in 2020. Dapster’s mission is to build robots that positively impact human beings by performing dangerous and unhealthy tasks. Their vision is to unlock better jobs, fortify supply chains, and better negotiate the challenges arising from climate change.
Utkarsh Srivastava is an AI/ML professional, trainer, YouTuber, and blogger. He enjoys developing ML, NLP, and computer vision algorithms to solve complex problems. He started his data science career as a blogger on his blog (datamahadev.com) and YouTube channel (datamahadev), and then worked as a senior data science trainer at an institute in Gujarat. Additionally, he has trained and counseled 1,000+ working professionals and students in AI/ML. Utkarsh has completed 40+ freelance training and development projects in data science and analytics, AI/ML, Python development, and SQL. He hails from Lucknow and is currently settled in Bangalore, India, as an analyst at Deloitte USI Consulting.
I would like to thank my mother, Mrs. Rupam Srivastava, for her continuous guidance and support throughout my hardships and struggles. Thanks also to the Supreme Para-Brahman.
Mason McGough is a Sr. R&D Engineer and Computer Vision Specialist at Lowe’s Innovation Labs. He has a passion for imaging and has spent over a decade solving computer vision problems across a broad range of industrial and academic disciplines, including geology, bioinformatics, game development, and retail. Most recently, he has been exploring the use of Digital Twins and 3D scanning for retail stores.
I wish to thank Andy Lykos, Joseph Canzano, Alexander Arango, Oleg Alexander, Erin Clark, and my family for their support.