Chapter 7: Meta-SGD and Reptile Algorithms

  1. Unlike MAML, in Meta-SGD, along with finding the optimal parameter value $\theta$, we also learn the optimal learning rate $\alpha$ and the update direction.
  2. The learning rate is implicitly implemented in the adaptation term. So, in Meta-SGD, we don't initialize the learning rate with a small scalar value. Instead, we initialize it with random values of the same shape as $\theta$ and learn it along with $\theta$.
  3. The learning rate is updated jointly with the model parameters. With task-adapted parameters $\theta_i' = \theta - \alpha \circ \nabla_{\theta} L_{T_i}(\theta)$, the meta-update can be expressed as $(\theta, \alpha) \leftarrow (\theta, \alpha) - \beta \nabla_{(\theta, \alpha)} \sum_{T_i \sim p(T)} L_{T_i}(\theta_i')$, so $\alpha$ receives gradients just like $\theta$ (see the Meta-SGD sketch after this list).
  4. In Reptile, we sample n tasks, run SGD for a few iterations on each of the sampled tasks, and then update our model parameters in a direction that is common to all the tasks.
  5. The Reptile update equation can be expressed as $\theta \leftarrow \theta + \epsilon \frac{1}{n} \sum_{i=1}^{n} (\theta_i' - \theta)$, where $\theta_i'$ is the parameter vector after running SGD on the $i$-th sampled task and $\epsilon$ is the meta step size (see the Reptile sketch after this list).
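
To make point 3 concrete, here is a minimal Meta-SGD sketch in NumPy. It is an illustration under assumptions, not the chapter's implementation: the sine-regression tasks, the linear model, the helper names (`sample_task`, `grad_mse`), and the hyperparameters are all hypothetical, and the $\theta$ meta-gradient uses a first-order approximation for brevity.

```python
import numpy as np

def sample_task(k=10):
    """Sample a sine task y = a*sin(x + b) and return k (x, y) pairs."""
    a, b = np.random.uniform(0.1, 5.0), np.random.uniform(0.0, np.pi)
    x = np.random.uniform(-5.0, 5.0, (k, 1))
    return x, a * np.sin(x + b)

def grad_mse(theta, x, y):
    """Gradient of the mean squared error for the linear model x @ theta."""
    return 2.0 * x.T @ (x @ theta - y) / len(x)

theta = np.random.randn(1, 1) * 0.1                 # model parameters
alpha = np.random.uniform(0.001, 0.1, theta.shape)  # per-parameter learning rate, same shape as theta
beta, n_tasks = 0.001, 10                           # meta learning rate and tasks per meta-step (assumed values)

for step in range(1000):
    meta_grad_theta = np.zeros_like(theta)
    meta_grad_alpha = np.zeros_like(alpha)
    for _ in range(n_tasks):
        x_tr, y_tr = sample_task()                   # support set for adaptation
        x_te, y_te = sample_task()                   # query set for the meta-objective
        g_tr = grad_mse(theta, x_tr, y_tr)
        theta_i = theta - alpha * g_tr               # adaptation: theta_i' = theta - alpha ∘ grad
        g_te = grad_mse(theta_i, x_te, y_te)
        meta_grad_theta += g_te                      # first-order approximation (second-order terms dropped)
        meta_grad_alpha += g_te * (-g_tr)            # chain rule: d(theta_i')/d(alpha) = -g_tr
    theta -= beta * meta_grad_theta / n_tasks        # joint meta-update of theta ...
    alpha -= beta * meta_grad_alpha / n_tasks        # ... and of the learned learning rate alpha
```

Note the design point this sketch highlights: $\alpha$ is an array of the same shape as $\theta$, so each parameter effectively learns its own step size and sign, which is what "learning the update direction" means in point 1.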
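
A matching minimal Reptile sketch, again under illustrative assumptions (the same toy sine task, plain SGD in the inner loop, and hypothetical hyperparameters `epsilon` and `n_tasks`), implements the update from point 5 directly:

```python
import numpy as np

def sample_task(k=10):
    """Sample a sine task y = a*sin(x + b) and return k (x, y) pairs."""
    a, b = np.random.uniform(0.1, 5.0), np.random.uniform(0.0, np.pi)
    x = np.random.uniform(-5.0, 5.0, (k, 1))
    return x, a * np.sin(x + b)

def sgd_on_task(theta, x, y, lr=0.01, steps=5):
    """Run a few plain SGD steps on one task, starting from theta."""
    for _ in range(steps):
        theta = theta - lr * (2.0 * x.T @ (x @ theta - y) / len(x))
    return theta

theta = np.random.randn(1, 1) * 0.1   # shared initialization we are meta-learning
epsilon, n_tasks = 0.1, 10            # meta step size and tasks per meta-step (assumed values)

for step in range(1000):
    direction = np.zeros_like(theta)
    for _ in range(n_tasks):
        x, y = sample_task()
        theta_i = sgd_on_task(theta, x, y)   # task-adapted parameters theta_i'
        direction += theta_i - theta         # move toward each adapted solution
    theta += epsilon * direction / n_tasks   # theta <- theta + eps * (1/n) * sum(theta_i' - theta)
```

Unlike Meta-SGD, Reptile never differentiates through the inner loop; the difference $\theta_i' - \theta$ itself serves as the update direction, which is why it is so cheap to run.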