Welcome to the auxiliary page for the Deep Learning course!

Textbook #1: Deep Learning with Python (2nd edition), by François Chollet. Like Chollet's Keras library, this book aims to help democratize deep learning through hands-on practice.

Textbook #2: Deep Learning Illustrated, by Jon Krohn, Grant Beyleveld, and Aglaé Bassens. This book provides a broad (though not necessarily deep) overview of a range of topics, including recent developments such as the Mask R-CNN model and the transformer architecture.

Textbook #3: The Science of Deep Learning, by Iddo Drori. This will serve as an occasional reference for us.

The current course materials include ...

 Introduction:
 Fundamentals:
 ConvNets (Part 1):
 ConvNets (Part 2):
 Embeddings, Recurrent Neural Networks, and Sequences (Part 1):
 Embeddings, Recurrent Neural Networks, and Sequences (Part 2):
 Generative Models:
 Reinforcement Learning:
  • Slides
  • Variations on the DQN ... (minimal sketches of the dueling head, the double-DQN target, and the prioritized-replay weights follow this list)
    • https://github.com/mimoralea/gdrl/blob/master/notebooks/chapter_10/chapter-10.ipynb
    • class FCDuelingQ(nn.Module): # fully connected: the state value and action advantage are dueling
    • q = v + a - a.mean(1, keepdim=True).expand_as(a)
    • class DuelingDDQN():
    • argmax_a_q_sp = self.online_model(next_states).max(1)[1] # online model selects action
    • q_sp = self.target_model(next_states).detach() # target model estimates action value
    • mixed_weights = target_ratio + online_ratio # Polyak averaging: soft update of the target network toward the online network
    • class PrioritizedReplayBuffer():
    • self.memory[idxs, self.td_error_index] = np.abs(td_errors) # refresh stored priorities with the latest |TD errors|
    • sorted_arg = self.memory[:self.n_entries, self.td_error_index].argsort()[::-1] # sorted by magnitude of TD error 
    • if self.rank_based:
    •     priorities = 1/(np.arange(self.n_entries) + 1)
    • else: # proportional
    •     priorities = entries[:, self.td_error_index] + EPS
    • scaled_priorities = priorities**self.alpha
    • probs = np.array(scaled_priorities/np.sum(scaled_priorities), dtype=np.float64)
    • weights = (self.n_entries * probs)**-self.beta # importance-sampling weights
    • normalized_weights = weights/weights.max()
    • What is the role of weighted importance sampling when replay is prioritized by TD error?
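
    A minimal sketch of the dueling head quoted above, assuming toy sizes (the state dimension, action count, and hidden width here are illustrative, not the notebook's exact architecture):

      import torch
      import torch.nn as nn

      class FCDuelingQ(nn.Module):  # fully connected: separate state-value and advantage streams
          def __init__(self, state_dim=4, n_actions=2, hidden=64):
              super().__init__()
              self.features = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
              self.value = nn.Linear(hidden, 1)              # V(s): one scalar per state
              self.advantage = nn.Linear(hidden, n_actions)  # A(s, a): one value per action

          def forward(self, state):
              x = self.features(state)
              v = self.value(x)
              a = self.advantage(x)
              # subtracting the mean advantage keeps V and A identifiable
              return v + a - a.mean(1, keepdim=True).expand_as(a)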
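
    The double-DQN target and the soft (Polyak) target-network update, sketched with stand-in linear networks and illustrative data; only the lines mirroring the excerpt above come from the notebook:

      import torch
      import torch.nn as nn

      online_model, target_model = nn.Linear(4, 2), nn.Linear(4, 2)  # toy stand-ins
      next_states, rewards = torch.randn(8, 4), torch.zeros(8)
      is_terminals, gamma = torch.zeros(8), 0.99

      argmax_a_q_sp = online_model(next_states).max(1)[1]  # online model selects the action
      q_sp = target_model(next_states).detach()            # target model estimates its value
      max_a_q_sp = q_sp[torch.arange(q_sp.size(0)), argmax_a_q_sp]
      target_q_sa = rewards + gamma * max_a_q_sp * (1 - is_terminals)

      tau = 0.005  # illustrative mixing rate
      for target, online in zip(target_model.parameters(), online_model.parameters()):
          target_ratio = (1.0 - tau) * target.data
          online_ratio = tau * online.data
          mixed_weights = target_ratio + online_ratio  # target slowly tracks online
          target.data.copy_(mixed_weights)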
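
    The prioritized-replay math from the excerpt, as a small NumPy sketch over a flat array of |TD errors| (alpha, beta, and the data are illustrative). It also hints at the question above: the importance-sampling weights undo the bias that non-uniform, TD-error-based sampling would otherwise introduce into the value updates:

      import numpy as np

      EPS, alpha, beta = 1e-6, 0.6, 0.4  # illustrative hyperparameters
      td_errors = np.abs(np.array([0.5, 0.1, 2.0, 0.0]))
      n_entries, rank_based = len(td_errors), False

      if rank_based:  # rank-based: priority is 1/rank after sorting by |TD error|
          priorities = 1 / (np.arange(n_entries) + 1)
      else:           # proportional: priority is |TD error| itself (EPS keeps it nonzero)
          priorities = td_errors + EPS

      scaled_priorities = priorities**alpha
      probs = scaled_priorities / np.sum(scaled_priorities)

      weights = (n_entries * probs)**-beta          # importance-sampling correction
      normalized_weights = weights / weights.max()  # bounds the update magnitudes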
  • Vanilla Policy Gradient (aka REINFORCE with baseline) ... (a minimal sketch follows this list)
    • https://github.com/mimoralea/gdrl/blob/master/notebooks/chapter_11/chapter-11.ipynb
    • class FCDAP(nn.Module): # fully connected discrete-action policy
    • dist = torch.distributions.Categorical(logits=logits)
    • action = dist.sample()
    • class VPG():
    • value_error = returns - self.values # advantage estimate: G_t - V(s_t)
    • policy_loss = -(discounts * value_error.detach() * self.logpas).mean()
    • entropy_loss = -self.entropies.mean()
    • loss = policy_loss + self.entropy_loss_weight * entropy_loss # updates the policy model parameters
    • value_loss = value_error.pow(2).mul(0.5).mean() # updates the state value model parameters
    • How does the return (the trajectory's discounted rewards) affect the parameter updates of the policy?
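
    A self-contained sketch of the REINFORCE-with-baseline update quoted above; the network, the episode data, and the entropy weight are illustrative stand-ins for the notebook's setup. (In the notebook, the separate value_loss line updates the baseline's value model; here the baseline is just a plain tensor.)

      import torch
      import torch.nn as nn

      class FCDAP(nn.Module):  # fully connected discrete-action policy
          def __init__(self, state_dim=4, n_actions=2, hidden=64):
              super().__init__()
              self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                       nn.Linear(hidden, n_actions))
          def forward(self, state):
              return self.net(state)  # logits over actions

      policy = FCDAP()
      states = torch.randn(5, 4)        # a 5-step toy trajectory
      returns = torch.randn(5)          # discounted returns G_t (illustrative)
      values = torch.randn(5)           # baseline V(s_t) from a separate value model
      discounts = 0.99 ** torch.arange(5.0)

      dist = torch.distributions.Categorical(logits=policy(states))
      actions = dist.sample()
      logpas, entropies = dist.log_prob(actions), dist.entropy()

      value_error = returns - values    # advantage estimate: G_t - V(s_t)
      policy_loss = -(discounts * value_error.detach() * logpas).mean()
      entropy_loss = -entropies.mean()  # higher entropy encourages exploration
      loss = policy_loss + 0.001 * entropy_loss
      loss.backward()                   # gradients reach only the policy parameters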
 Misc:
 Homework Questions:

Deep Learning Word Cloud

about me