
Figure 4.17 Explanation (4.7 Generating text) #308

Closed Answered by rasbt
labdmitriy asked this question in Q&A

These are good points, and it sounds like there are two related questions. Let's talk about inference first ("generate"), which you mentioned at the top of this thread. Here, we take the last token only, because we already have the other input tokens via the provided input. E.g., consider the input

"Sunday is my favorite day of the week, because"

In this case, it would be wasteful (and error-prone) to have the model regenerate the input shifted by +1 token, as it does during training. Instead, we are only interested in the token that comes after "because".
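To make this concrete, here is a minimal greedy-decoding sketch with a hypothetical toy model (this is illustrative code, not the book's `GPTModel`): at each step the full context is fed in, but only the logits for the *last* position are used to pick the next token, since the earlier positions just re-predict tokens we already have.

```python
def toy_model(token_ids):
    """Stand-in for a language model: returns one logit vector per input
    position over a tiny 5-token vocabulary (hypothetical, for illustration)."""
    vocab_size = 5
    logits = []
    for tok in token_ids:
        # Deterministic fake logits that favor (tok + 1) mod vocab_size.
        row = [0.0] * vocab_size
        row[(tok + 1) % vocab_size] = 1.0
        logits.append(row)
    return logits

def generate(token_ids, max_new_tokens):
    ids = list(token_ids)
    for _ in range(max_new_tokens):
        logits = toy_model(ids)
        last = logits[-1]  # only the last position's logits matter at inference
        next_id = max(range(len(last)), key=last.__getitem__)  # greedy argmax
        ids.append(next_id)
    return ids

print(generate([0, 1], 3))  # -> [0, 1, 2, 3, 4]
```

The logits the model produces for the earlier positions are simply discarded here; they are only needed during training, where every position contributes to the loss.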

Then, you mentioned

For example, as I understand, for the first token in the training sample we have corresponding target (next token), b…
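The training-time setup alluded to in that quote (targets are the inputs shifted by one position) can be sketched generically as follows; the token IDs are made up for illustration:

```python
# Generic next-token training pair construction (not code from the book):
# the target for each position is simply the following token.
tokens = [10, 11, 12, 13, 14]   # hypothetical token IDs for one sample
inputs = tokens[:-1]            # [10, 11, 12, 13]
targets = tokens[1:]            # [11, 12, 13, 14]
for x, y in zip(inputs, targets):
    print(f"input {x} -> target {y}")
```

So during training every position has a target and contributes to the loss, whereas during inference only the last position's prediction is needed.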

Answer selected by labdmitriy
