
As most of us know, attention focuses on the specific parts of the input sequence that are most relevant for generating the output sequence.

Ex: The driver could not drive the car fast because it had a problem.
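
For context, each token's query vector is compared against every token's key vector; the resulting scores are turned into weights with a softmax, and those weights mix the value vectors. Below is a minimal sketch of single-head scaled dot-product attention (the toy shapes and random inputs are just placeholders):

    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(q, k, v):
        # q, k, v: (seq_len, d) projections of the same token embeddings
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5  # how well each query matches every key
        weights = F.softmax(scores, dim=-1)          # each row sums to 1: one token's attention distribution
        return weights @ v, weights                  # context vectors and the attention map

    # toy example: 5 tokens with 8-dimensional projections
    q = k = v = torch.randn(5, 8)
    context, attn_map = scaled_dot_product_attention(q, k, v)
    print(attn_map)  # row i shows how much token i attends to every token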

  1. How does attention find the specific parts of the input (here, the token 'it'), and how does it assign a score to that token?

  2. Is attention a context-based model?

  3. How do I obtain attention maps (query, key, value)?

  4. On what basis does attention assign higher weights to some input tokens?

tovijayak

1 Answer

  1. Neural networks are considered black boxes because they are not interpretable: we don't know why they compute the results they do.

  2. Yes.

  3. To get attention maps, you can use the BERTViz library (see the usage sketch after this list), e.g.:

    [BERTViz attention-map visualization]

  4. Same answer as 1).
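
Regarding 3), a minimal usage sketch with BERTViz (assuming a Jupyter notebook plus the transformers and bertviz packages; bert-base-uncased and the example sentence are just placeholders):

    from transformers import AutoTokenizer, AutoModel
    from bertviz import head_view

    model_name = "bert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_attentions=True)

    sentence = "The driver could not drive the car fast because it had a problem."
    inputs = tokenizer.encode(sentence, return_tensors="pt")
    attention = model(inputs).attentions                # one tensor per layer: (batch, heads, seq, seq)
    tokens = tokenizer.convert_ids_to_tokens(inputs[0])

    head_view(attention, tokens)  # renders an interactive attention map in the notebook

In the resulting map you can inspect, layer by layer and head by head, which tokens 'it' attends to most strongly.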

noe