9 Game-Changing Deep Learning Topics Coming to ODSC West

Every year, the data science landscape changes as new tools, trends, frameworks, and specializations are introduced. Deep learning is no exception, as evidenced by the Deep Learning track at ODSC West 2022. We took a look at a few sessions coming to the conference this November 1st-3rd to get a better understanding of trends in deep learning.

Deep neural networks can make overconfident errors and assign high confidence predictions to inputs far away from the training data. Well-calibrated predictive uncertainty estimates are important to know when to trust a model’s predictions, especially for safe deployment of models in applications where the train and test distributions can be different. In this session, the speaker will present some concrete examples that motivate the need for uncertainty and out-of-distribution (OOD) robustness in deep learning. Next, he will present an overview of his recent work focused on building neural networks that know what they don’t know.

Session: Practical Tutorial on Uncertainty and Out-of-distribution Robustness in Deep Learning | Balaji Lakshminarayanan, PhD | Staff Research Scientist | Google Brain
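To make the idea of a model that "knows what it doesn't know" concrete, here is a minimal sketch of one common approach: averaging the predictions of several independently trained models and measuring the entropy of the result. This is an illustration under stated assumptions, not necessarily the method covered in the session. The logits, the function names, and the three-member ensemble are all hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(member_logits):
    """Average the softmax outputs of several independently trained models."""
    probs = [softmax(logits) for logits in member_logits]
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / len(probs) for k in range(n_classes)]

def predictive_entropy(probs):
    """Entropy in nats: 0 means fully confident, log(K) means uniform over K classes."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits from three ensemble members for two different inputs.
in_dist = [[4.0, 0.5, 0.1], [3.8, 0.2, 0.3], [4.2, 0.4, 0.0]]   # members agree
far_away = [[4.0, 0.5, 0.1], [0.2, 3.9, 0.3], [0.1, 0.4, 4.1]]  # members disagree

h_in = predictive_entropy(ensemble_predict(in_dist))
h_far = predictive_entropy(ensemble_predict(far_away))
print(f"entropy when members agree:    {h_in:.3f}")
print(f"entropy when members disagree: {h_far:.3f}")
```

Each member on its own is confident about the second input, but the members disagree with one another, so the averaged prediction has high entropy. A deployment could use such a score to decide when to defer to a human or a fallback system.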

As the use of deep learning and AI grows, so do the complexity and size of the models. Training large models such as GPT-2 and Megatron, among others, has been a daunting task. Several distributed computing frameworks are available to address these tasks, and the oldest and most resilient is the OpenMPI (open message passing interface) library. OpenMPI is used for high-performance computing at supercomputing centers as part of distributed computing systems.

Session: Large Scale Deep Learning using the High-Performance Computing Library OpenMPI and DeepSpeed | Jennifer Dawn Davis | Staff Field Data Scientist | Domino Data Lab

In recent years, we have seen amazing results in artificial intelligence and machine learning owing to the emergence of models such as transformers and pretrained language models. Despite the astounding results published in academic papers, there remain many ambiguities and challenges when it comes to deploying these models in industry because 1) troubleshooting, training, and maintaining these models is very time-consuming and costly due to their inherently large size and complexity, and 2) there is not yet enough clarity about when the advantages of these models outweigh their challenges relative to classical ML models. These challenges are even more severe for small and mid-sized companies that do not have access to huge compute resources and infrastructure.

Session: Transforming The Retail Industry with Transformers | Azin Asgarian | Applied Research Scientist | Georgian and Elliot Henry | Data Science Manager | SPINS

Recommender systems are now ubiquitous in e-commerce. At Wayfair, we use a collection of machine learning models to predict which content and which products to show our customers at every stage in their shopping journey. Machine learning models for recommendations are typically trained to optimize for near-term measures of customer satisfaction, such as clicks and conversion. However, there are often other business objectives that can appear to be in conflict with this idealized ranking. Thus we need to find ways to balance these competing objectives if we want to make our product recommendations “profit aware.”
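As a rough illustration of balancing competing objectives, the sketch below blends a predicted conversion probability with a normalized profit margin using a single weight. It is a simplified, hypothetical scheme and not Wayfair's actual method. The `Item` fields, the `weight` parameter, and the example products are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    p_convert: float  # hypothetical model output: predicted conversion probability
    margin: float     # hypothetical profit per sale, in dollars

def rank(items, weight):
    """Rank items by a blend of conversion and expected (normalized) profit.

    weight=0 ranks purely by conversion probability;
    weight=1 ranks purely by conversion probability times normalized margin.
    """
    max_margin = max(it.margin for it in items)

    def score(it):
        expected_profit = it.p_convert * (it.margin / max_margin)
        return (1 - weight) * it.p_convert + weight * expected_profit

    return sorted(items, key=score, reverse=True)

catalog = [
    Item("lamp", p_convert=0.10, margin=5.0),
    Item("sofa", p_convert=0.06, margin=120.0),
    Item("rug", p_convert=0.08, margin=30.0),
]

for w in (0.0, 0.5, 1.0):
    print(w, [it.name for it in rank(catalog, w)])
```

Sweeping the weight shows how the ordering shifts from the items customers are most likely to buy toward the items that contribute the most profit. In practice the weight would be tuned against both objectives rather than fixed by hand.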

Deep Reinforcement Learning equips AI agents with the ability to learn from their own trial and error. Success stories include learning to play Atari games, Go, and Dota 2, and robots learning to run, jump, and manipulate objects. This tutorial will cover the foundations of Deep Reinforcement Learning, including MDPs, DQN, Policy Gradients, TRPO, PPO, DDPG, SAC, TD3, and model-based RL, as well as current research frontiers.

Reinforcement learning (RL) has achieved remarkable success in various tasks, such as defeating all-human teams in MMP (massive multi-player) games, advancing robotics, and producing astonishing results on the protein folding problem in chemistry. Expertise in RL requires strong knowledge of machine learning, statistics, and several areas of mathematics. Moreover, RL contains many concepts that seem “fuzzy” and hence can be challenging for beginners. This session builds intuition for various RL concepts, such as exploit/explore and maximization of expected reward, along with real-life examples of these concepts.
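The exploit/explore trade-off and the maximization of expected reward can be sketched with an epsilon-greedy agent on a multi-armed bandit. This is a standard textbook illustration, not material taken from the session. The arm payout probabilities, the step count, and the value of `epsilon` are assumptions chosen for the example.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit, exploring with probability epsilon."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

# Hypothetical arms that pay out 20%, 50%, and 80% of the time.
estimates, counts, total_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated payout per arm:", [round(e, 2) for e in estimates])
print("pulls per arm:", counts)
print("arm believed best:", best_arm)
```

The agent never sees the true payout rates. It learns them only from the rewards it collects, and the small amount of random exploration keeps it from locking onto a poor arm too early.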

The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing. Better performance, however, comes with larger model sizes, which run up against the memory wall of current accelerator hardware such as GPUs. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine, so there is an urgent demand to train models in a distributed environment. However, distributed training, especially model parallelism, often requires domain expertise in computer systems and architecture. It remains a challenge for AI researchers to implement complex distributed training solutions for their models.

Session: Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training | James Demmel, PhD | Professor of Mathematics and Computer Science | UC Berkeley and Yang You, PhD | Presidential Young Professor | National University of Singapore

To learn a new task, we humans need not always start afresh; rather, we apply previously learned knowledge to perform the new task. In the same way, “transfer learning” allows a machine learning model to port the knowledge it has acquired while training on one task to a new task. Transfer learning has lately shown much promise and is a very active area of research. In this tutorial session, we will discuss the basic theory of transfer learning and a few applications, with a hands-on Python coding session using TensorFlow 2.0. We will also briefly discuss some of the latest applications of transfer learning, such as privacy-preserving ML.

Is your generative adversarial network (GAN) not producing representative synthetic data? If so, that is no surprise, because training a GAN to produce quality data representative of the natural distributions is more complex than traditional predictive modeling. Ensuring the data is representative often requires an analysis of the covariate relationships and a comparison of the moments in the synthetic and natural (actual) distributions. This presentation will detail how a genetic algorithm can be combined with a set of pseudo-discriminators to automate constructing a better GAN.
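The comparison of moments mentioned above can be sketched in a few lines. The example below computes the first four moments of a real and a synthetic sample and reports the gap for each. It covers only the moment comparison, not the genetic algorithm or the pseudo-discriminators, and the function names and data are hypothetical.

```python
import statistics

def standardized_moment(values, order):
    """Average of z-scores raised to `order` (3 = skewness, 4 = kurtosis)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return 0.0
    return sum(((v - mean) / std) ** order for v in values) / len(values)

def moment_report(values):
    """First four moments of a one-dimensional sample."""
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "skewness": standardized_moment(values, 3),
        "kurtosis": standardized_moment(values, 4),
    }

def moment_gaps(real, synthetic):
    """Absolute difference between real and synthetic data for each moment."""
    r, s = moment_report(real), moment_report(synthetic)
    return {name: abs(r[name] - s[name]) for name in r}

# Hypothetical data: the real feature is right-skewed, the generated one is symmetric.
real = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 9.0]
synthetic = [2.0, 2.5, 3.0, 3.0, 3.5, 3.5, 4.0, 4.5]

gaps = moment_gaps(real, synthetic)
for name, gap in gaps.items():
    print(f"{name:>8} gap: {gap:.3f}")
```

Here the means are close, yet the skewness gap is large, which is the kind of mismatch that a check on the mean alone would miss. A fuller analysis would also compare relationships between covariates, for example through correlation matrices.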

Learn Methods For Better Deep Learning at ODSC West 2022

To dive deeper into these topics, join us at ODSC West 2022 this November 1st to 3rd, either in-person or virtually. The conference will also feature hands-on training sessions in focus areas, such as machine learning, deep learning, MLOps and data engineering, responsible AI, and more. What’s more, you can extend your immersive training to 4 days with a Mini-Bootcamp Pass. Check out all of our types of passes here.
