Abstract
Distributed word representations (word embeddings) have recently contributed to competitive performance in language modeling and several NLP tasks. In this work, we train word embeddings for more than 100 languages using their corresponding Wikipedias. We quantitatively demonstrate the utility of our word embeddings by using them as the sole features for training a part-of-speech tagger for a subset of these languages. We find their performance to be competitive with near state-of-the-art methods in English, Danish, and Swedish. Moreover, we investigate the semantic features captured by these embeddings through the proximity of word groupings. We will release these embeddings publicly to help researchers in the development and enhancement of multilingual applications.
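As a concrete illustration of the "proximity of word groupings" the abstract refers to, the sketch below ranks a word's nearest neighbors by cosine similarity over an embedding matrix. The vocabulary, the random placeholder vectors, and the `nearest_neighbors` helper are illustrative assumptions only, not the released embeddings or the authors' code.

```python
import numpy as np

# Hypothetical inputs: a vocabulary list and its embedding matrix
# (one row per word), e.g. loaded from a released embeddings file.
vocab = ["king", "queen", "paris", "france", "sweden", "stockholm"]
word_index = {w: i for i, w in enumerate(vocab)}
embeddings = np.random.rand(len(vocab), 64)  # placeholder vectors for the sketch

def nearest_neighbors(word, k=3):
    """Return the k words whose vectors are closest to `word` by cosine similarity."""
    v = embeddings[word_index[word]]
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(v)
    sims = embeddings @ v / norms          # cosine similarity to every vocabulary word
    order = np.argsort(-sims)              # most similar first
    return [vocab[i] for i in order if vocab[i] != word][:k]

print(nearest_neighbors("paris"))
```

With trained embeddings in place of the random matrix, the returned neighbors would reflect the semantic groupings the paper examines (e.g. capitals clustering with their countries).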
Publication Info
- Year: 2013
- Type: preprint
- Pages: 183-192
- Citations: 307
- Access: Closed
Identifiers
- DOI: 10.48550/arxiv.1307.1662