Deep Learning Basics Lecture 7: Recurrent Neural Networks

Sequential Model

input xt-2 -> xt-1 -> xt

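p(xt | xt-1, xt-2, ...) <- The number of inputs varies: the number of conditioning terms grows with t. Fix the past timespan: condition only on a fixed number of past steps. In the simplest case only one past step is used, so the present depends only on the immediately preceding input.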

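The length of sequential data is not known in advance, so there is no way to tell how many syllables or images will be given.
The most basic sequential model takes the inputs seen so far and predicts the next one.
With 10 inputs, the first is modeled on its own, the second is conditioned on the first, the third on the first and second, and so on.
- Markov model (first-order autoregressive model): each step depends only on the one before it. Easy to express the joint distribution! -> The cost is that a lot of past information has to be thrown away.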
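
As a rough illustration of why the joint distribution is easy to express under the first-order Markov assumption, the joint factorizes as p(x1) * p(x2|x1) * ... * p(xT|xT-1). The sketch below assumes a discrete chain with two states; the `initial` and `transition` values and the function name `markov_joint_log_prob` are made up for this example and do not come from the lecture.

```python
import numpy as np

def markov_joint_log_prob(seq, initial, transition):
    """Log of p(x1) * prod_t p(xt | xt-1) for a sequence of state indices."""
    log_p = np.log(initial[seq[0]])
    # Each later step is conditioned only on the step right before it.
    for prev, cur in zip(seq[:-1], seq[1:]):
        log_p += np.log(transition[prev, cur])
    return log_p

# Hypothetical two-state chain (values chosen only for illustration).
initial = np.array([0.5, 0.5])
transition = np.array([[0.9, 0.1],   # row = previous state
                       [0.2, 0.8]])  # column = current state

seq = [0, 0, 1, 1]
joint = np.exp(markov_joint_log_prob(seq, initial, transition))
print(joint)  # product 0.5 * 0.9 * 0.1 * 0.8
```

Only the immediately preceding state enters each factor, which is exactly where the information about the earlier past is lost.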
Latent autoregressive model

Output yt-2 yt-1 yt

Hidden state ht-2 -> ht-1 -> ht

Input xt-2 xt-1 xt

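The hidden state in the middle summarizes the past. The next step depends on this hidden state: not directly on the earlier inputs themselves, but on a hidden state that summarizes them.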

xt ~ p(xt | ht) <- ht: summary of the past
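
A minimal sketch of how a hidden state can act as a running summary of the past. It assumes the update has the form ht = g(ht-1, xt-1), with g chosen here as an affine map followed by tanh (the form commonly used in a vanilla RNN). The function name `step`, the sizes, and the random weights are all assumptions for illustration, not something specified in the lecture.

```python
import numpy as np

def step(h_prev, x_prev, W_h, W_x, b):
    """One update of the summary: h_t = tanh(W_h h_{t-1} + W_x x_{t-1} + b)."""
    return np.tanh(W_h @ h_prev + W_x @ x_prev + b)

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3  # hypothetical sizes

# Randomly initialized parameters, shared across all time steps.
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))
b = np.zeros(hidden_size)

sequence = rng.normal(size=(10, input_size))  # 10 inputs; any length works
h = np.zeros(hidden_size)
for x in sequence:
    # The hidden state has a fixed size no matter how many inputs were seen.
    h = step(h, x, W_h, W_x, b)

print(h.shape)
```

The same `step` is applied at every position, so the model handles sequences of any length while conditioning on a fixed-size summary instead of a growing list of past inputs.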

Recurrent Neural Network

Long Short-Term Memory

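Inputs: input, previous hidden state, previous cell state -> Outputs: next hidden state, next cell state, output (hidden state)
Core idea: the cell state summarizes the information seen so far. Think of it as a conveyor belt: the gates decide which information is useful and which is not, and control what is put onto the belt and what is taken off.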
Forget Gate: Decide which information to throw away
Input Gate: Decide which information to store in the cell state

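Update: the previous cell state and the current (candidate) cell state are combined to form the updated cell state.

In summary, the current input decides whether to erase the information accumulated so far (Forget), whether to write new information (Input), how to combine the two (Update), and how much of the combined result to emit (Output).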
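
The four steps above (forget, input, update, output) can be sketched as a single LSTM step in NumPy. This follows the standard LSTM equations rather than any code from the lecture. The stacked weight layout `W` of shape (4*H, H+D), the gate ordering inside it, and the names `lstm_step` and `sigmoid` are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W: (4*H, H+D), b: (4*H,), gates stacked as [f, i, g, o]."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:H])          # forget gate: what to throw away
    i = sigmoid(z[H:2 * H])      # input gate: what to store
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate: what to emit
    c = f * c_prev + i * g       # update: combine old and candidate cell state
    h = o * np.tanh(c)           # output: next hidden state
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4  # hypothetical input and hidden sizes
W = rng.normal(scale=0.1, size=(4 * H, H + D))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstm_step(x, h, c, W, b)

print(h.shape, c.shape)
```

The cell state `c` is only ever scaled by the forget gate and added to, which is what lets it carry information along the "conveyor belt" across many steps.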

Gated Recurrent Unit

These days, Transformers are widely used instead.
