Abstract

We present a simple, but surprisingly effective, method of self-training a two-phase parser-reranker system using readily available unlabeled data. We show that this type of bootstrapping is possible for parsing when the bootstrapped parses are processed by a discriminative reranker. Our improved model achieves an f-score of 92.1%, an absolute 1.1% improvement (12% error reduction) over the previous best result for Wall Street Journal parsing. Finally, we provide some analysis to better understand the phenomenon.
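The abstract describes the method only at a high level; the sketch below spells out one plausible reading of the self-training loop it refers to: parse unlabeled text with the first-stage parser, let the discriminative reranker pick the best parse for each sentence, and add those parses to the training data before retraining. This is a minimal illustration, not the authors' implementation; the parse_nbest, best, and retrain names are hypothetical stand-ins introduced here for exposition.

# A minimal sketch of the self-training loop, assuming hypothetical
# parse_nbest / best / retrain interfaces; not the authors' actual code.

def self_train(parser, reranker, labeled_treebank, unlabeled_sentences):
    """One round of self-training for a two-phase parser-reranker system."""
    # First stage: the generative parser produces n-best candidate parses
    # for each unlabeled sentence.
    nbest_lists = [parser.parse_nbest(sentence) for sentence in unlabeled_sentences]

    # Second stage: the discriminative reranker selects the best candidate
    # from each n-best list; these become automatically labeled trees.
    auto_trees = [reranker.best(candidates) for candidates in nbest_lists]

    # Add the automatically parsed sentences to the hand-labeled treebank
    # and retrain the first-stage parser on the combined data.
    parser.retrain(labeled_treebank + auto_trees)
    return parser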

Keywords

Parsing, Computer science, Bootstrapping (finance), Discriminative model, Artificial intelligence, Natural language processing, Bottom-up parsing, Training set, Simple (philosophy), Top-down parsing, Machine learning, Mathematics

Publication Info

Year: 2006
Type: Article
Pages: 152-159
Citations: 615
Access: Closed

Citation Metrics

615 (OpenAlex)

Cite This

David McClosky, Eugene Charniak, and Mark Johnson (2006). Effective self-training for parsing. Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, 152-159. https://doi.org/10.3115/1220835.1220855

Identifiers

DOI
10.3115/1220835.1220855