Structured Prediction with autoregressive models and reinforcement learning: Slides

Structured Prediction

POS tagging

Protein folding

Spelling correction: "I 2pv* a Uoin8r to prTpa@e" → "I have a poster to prepare"

Betty Fabre (Ph.D. student)

Tanguy Urvoy

Orange Labs

Damien Lolive

Jonathan Chevelu

HPC Summer School, July 2019

Encoder-Decoder / Autoregressive models

Input features x → Encoder → fixed-length representation of x → Decoder → structured output y

y = [START, y_1, y_2, ..., y_n, STOP]

ref: [2014] Sequence to Sequence Learning with Neural Networks, Ilya Sutskever, Oriol Vinyals, Quoc V. Le
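The encoder-decoder view above corresponds to a chain-rule factorization of the output distribution. The following is a sketch of that factorization; the names c and enc are assumptions introduced here for the fixed-length representation and the encoder, and the later slides drop the conditioning on x for brevity.

```latex
p(y \mid x) = \prod_{t=1}^{n+1} p\bigl(y_t \mid y_0, y_1, \dots, y_{t-1}, c\bigr),
\qquad c = \mathrm{enc}(x), \quad y_0 = \text{START}, \quad y_{n+1} = \text{STOP}
```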

[Figure: the decoder unrolled over time. Starting from START, the model emits ŷ_1, ŷ_2, ..., STOP, which together form the structured output y.]

Training vs. Decoding

Training: likelihood optimization & teacher forcing

[Figure: the model unrolled over START, ŷ_1, ŷ_2, ..., STOP]

Teacher forcing: at each step t, the model is fed the ground-truth prefix and trained to maximize the likelihood of the next ground-truth token:

max p(y_t | y_1, y_2, ..., y_{t-1})
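The teacher-forced likelihood objective can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the vocabulary, the `LOGITS` table and the function names are all hypothetical, and the toy model conditions only on the previous token (a bigram simplification of conditioning on the full prefix).

```python
import math

# Hypothetical toy "model": LOGITS[prev][next] scores the next token given
# the previous one (a bigram simplification of p(y_t | y_1, ..., y_{t-1})).
LOGITS = {
    "START": {"a": 2.0, "b": 0.5, "STOP": -1.0},
    "a": {"a": 0.1, "b": 1.5, "STOP": 0.3},
    "b": {"a": 0.2, "b": 0.1, "STOP": 2.0},
}

def next_token_probs(prev):
    """Softmax over the logits of the tokens that may follow `prev`."""
    logits = LOGITS[prev]
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

def teacher_forced_nll(target):
    """Negative log-likelihood of `target`.

    Teacher forcing: at every step the model is conditioned on the
    ground-truth previous token, never on its own prediction.
    """
    nll = 0.0
    for prev, gold in zip(target[:-1], target[1:]):
        nll -= math.log(next_token_probs(prev)[gold])
    return nll

likely = teacher_forced_nll(["START", "a", "b", "STOP"])
unlikely = teacher_forced_nll(["START", "b", "a", "STOP"])
print(f"NLL of likely sequence:   {likely:.3f}")
print(f"NLL of unlikely sequence: {unlikely:.3f}")
```

Minimizing this quantity over the model parameters is the likelihood optimization named on the slide; here the table is fixed, so the example only evaluates the loss.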

Decoding: autoregressive

ŷ_t = argmax_{y'} p(y' | ŷ_1, ..., ŷ_{t-1})

→ decoding is local while evaluation is global

→ exposure bias: during training the model only sees ground-truth prefixes, but during decoding it conditions on its own, possibly wrong, predictions
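Greedy autoregressive decoding, and the gap between local choices and a global criterion, can be illustrated with a small sketch. The probability table below is hypothetical and was chosen so that the locally best first token leads to a less probable full sequence. Total sequence probability stands in here for a global evaluation measure, which is an assumption of this example.

```python
# Hypothetical toy model: PROBS[prev][next] = p(next | prev).
PROBS = {
    "START": {"a": 0.6, "b": 0.4},
    "a": {"a": 0.45, "STOP": 0.55},
    "b": {"a": 0.1, "STOP": 0.9},
}

def greedy_decode(max_len=10):
    """Pick the locally most probable token at each step and feed it back in."""
    seq = ["START"]
    while seq[-1] != "STOP" and len(seq) < max_len:
        dist = PROBS[seq[-1]]
        seq.append(max(dist, key=dist.get))
    return seq

def sequence_prob(seq):
    """Probability of a whole sequence under the toy model (a global score)."""
    p = 1.0
    for prev, tok in zip(seq[:-1], seq[1:]):
        p *= PROBS[prev][tok]
    return p

decoded = greedy_decode()
alternative = ["START", "b", "STOP"]
print(decoded, sequence_prob(decoded))
print(alternative, sequence_prob(alternative))
```

The greedy path starts with the more probable first token, yet the alternative sequence has the higher overall probability, which is one way local decoding and global evaluation can disagree.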

Reinforcement Learning Approaches

REINFORCE Algorithm

Likelihood training: training with exposure bias

REINFORCE: training with an expectation over sequences sampled from the model
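A minimal REINFORCE sketch on a toy sequence task may help make the idea concrete. Everything here is an assumption for illustration: the vocabulary, the target sequence, the 0/1 exact-match reward, the bigram policy and the hyperparameters. No baseline is used. The policy samples its own sequences (no teacher forcing) and is updated with the score-function estimate, reward times the gradient of the log-probability of the sampled sequence.

```python
import math
import random

TOKENS = ["a", "b", "STOP"]            # tokens the toy policy can emit
TARGET = ["START", "a", "b", "STOP"]   # hypothetical reference sequence

def softmax(logits):
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def sample_sequence(theta, rng, max_len=6):
    """Sample autoregressively from the policy, feeding back its own tokens."""
    seq = ["START"]
    while seq[-1] != "STOP" and len(seq) < max_len:
        probs = softmax(theta[seq[-1]])
        seq.append(rng.choices(TOKENS, weights=[probs[t] for t in TOKENS])[0])
    return seq

def sequence_prob(theta, seq):
    p = 1.0
    for prev, tok in zip(seq[:-1], seq[1:]):
        p *= softmax(theta[prev])[tok]
    return p

def reward(seq):
    """Sequence-level (global) reward: 1 for an exact match, else 0."""
    return 1.0 if seq == TARGET else 0.0

def reinforce(steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    # Bigram policy: one row of logits per previous token, initially uniform.
    theta = {prev: {t: 0.0 for t in TOKENS} for prev in ["START", "a", "b"]}
    for _ in range(steps):
        seq = sample_sequence(theta, rng)
        r = reward(seq)
        # Gradient of log p(seq) w.r.t. the logits, accumulated over steps.
        grads = {prev: {t: 0.0 for t in TOKENS} for prev in theta}
        for prev, tok in zip(seq[:-1], seq[1:]):
            probs = softmax(theta[prev])
            for t in TOKENS:
                grads[prev][t] += (1.0 if t == tok else 0.0) - probs[t]
        # Score-function update: reward * grad log p(seq).
        for prev in theta:
            for t in TOKENS:
                theta[prev][t] += lr * r * grads[prev][t]
    return theta

theta = reinforce()
print(f"p(target) before training: {1 / 27:.3f}")
print(f"p(target) after training:  {sequence_prob(theta, TARGET):.3f}")
```

Because the reward is computed on whole sampled sequences, the training signal matches how the model is used at decoding time, at the cost of a high-variance gradient estimate.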

Actor-Critic Algorithm
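As a rough illustration of the actor-critic idea, the sketch below adds a learned critic to a toy sequence task. It is a generic one-step, tabular actor-critic and not the specific method of the 2017 paper cited below. The task, the exact-match reward, the state definition (the full prefix) and all hyperparameters are assumptions made for this example.

```python
import math
import random
from collections import defaultdict

TOKENS = ["a", "b", "STOP"]
TARGET = ("START", "a", "b", "STOP")   # hypothetical reference sequence
MAX_LEN = 5                            # hypothetical length cap, START included

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

def target_prob(actor):
    """Probability the actor assigns to the full target sequence."""
    p = 1.0
    for t in range(1, len(TARGET)):
        probs = softmax(actor[TARGET[:t]])
        p *= probs[TOKENS.index(TARGET[t])]
    return p

def actor_critic(episodes=3000, lr_actor=0.5, lr_critic=0.2, seed=0):
    rng = random.Random(seed)
    actor = defaultdict(lambda: [0.0] * len(TOKENS))  # logits per prefix
    critic = defaultdict(float)                       # V(prefix), starts at 0
    for _ in range(episodes):
        prefix = ("START",)
        while True:
            probs = softmax(actor[prefix])
            k = rng.choices(range(len(TOKENS)), weights=probs)[0]
            nxt = prefix + (TOKENS[k],)
            done = TOKENS[k] == "STOP" or len(nxt) >= MAX_LEN
            # Reward arrives only at the end: 1 for an exact match, else 0.
            r = 1.0 if (done and nxt == TARGET) else 0.0
            v_next = 0.0 if done else critic[nxt]
            td = r + v_next - critic[prefix]   # TD error, no discounting
            critic[prefix] += lr_critic * td   # critic moves toward the TD target
            # Actor: policy-gradient step weighted by the critic's TD error.
            for j in range(len(TOKENS)):
                indicator = 1.0 if j == k else 0.0
                actor[prefix][j] += lr_actor * td * (indicator - probs[j])
            if done:
                break
            prefix = nxt
    return actor, critic

actor, critic = actor_critic()
print(f"p(target) before training: {1 / 27:.3f}")
print(f"p(target) after training:  {target_prob(actor):.3f}")
```

Compared with plain REINFORCE, the critic supplies a per-step learning signal instead of a single sequence-level reward, which is the usual motivation for actor-critic methods.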

ref:

- [1992] Simple statistical gradient-following algorithms for connectionist reinforcement learning, Ronald J. Williams

- [2015] Sequence Level Training with Recurrent Neural Networks, Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba

- [2017] An Actor-Critic Algorithm for Sequence Prediction, Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio