Abstract
Large language models can produce powerful contextual representations that lead to improvements across many NLP tasks. Since these models are typically guided by a sequence of learned self-attention mechanisms and may encode undesired inductive biases, it is paramount to be able to explore what the attention has learned. While static analyses of these models lead to targeted insights, interactive tools are more dynamic and can help humans gain a better intuition for the model-internal reasoning process. We present exBERT, an interactive tool named after the popular BERT language model, that provides insights into the meaning of the contextual representations by matching a human-specified input to similar contexts in a large annotated dataset. By aggregating the annotations of the matching contexts, exBERT helps intuitively explain what each attention head has learned.
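The matching-and-aggregation idea in the abstract can be illustrated with a small sketch. It assumes cosine similarity as the matching criterion and part-of-speech tags as the annotations; the toy vectors, the labels, and the names `cosine` and `summarize_neighbors` are hypothetical and are not taken from exBERT's implementation.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def summarize_neighbors(query, corpus, k=3):
    """Find the k corpus entries most similar to `query` and count their annotations.

    `corpus` is a list of (embedding, annotation) pairs. In a real setting the
    embeddings would come from a language model; here they are toy 2-D vectors.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query, item[0]), reverse=True)
    return Counter(label for _, label in ranked[:k])


# Hypothetical annotated corpus: (embedding, part-of-speech tag).
corpus = [
    ([1.0, 0.0], "NOUN"),
    ([0.9, 0.1], "NOUN"),
    ([0.0, 1.0], "VERB"),
    ([0.1, 0.9], "VERB"),
    ([0.7, 0.3], "ADJ"),
]

summary = summarize_neighbors([1.0, 0.05], corpus, k=3)
print(summary.most_common())
```

In this toy example the aggregated counts suggest that the query representation sits mostly among nouns. A sorted linear scan is used only for brevity; a large annotated corpus would call for an approximate nearest-neighbor index instead.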
Related Publications
Understanding the Behaviors of BERT in Ranking
This paper studies the performances and behaviors of BERT in ranking tasks. We explore several different ways to leverage the pre-trained BERT and fine-tune it on two ranking ta...
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Motivation: Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processi...
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increas...
Composition in Distributional Models of Semantics
Vector‐based models of word meaning have become increasingly popular in cognitive science. The appeal of these models lies in their ability to represent meaning simply ...
Deep Contextualized Word Representations
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses ...
Publication Info
- Year: 2019
- Type: preprint
- Citations: 47
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.1910.05276