Abstract

Working with large amounts of text data has become hectic and time-consuming. To reduce human effort and costs and to make the process more efficient, companies and organizations resort to intelligent algorithms that automate and assist manual work. This problem also arises in the toxicological analysis of chemical substances, where information must be retrieved from multiple documents. We therefore propose an approach that relies on Question Answering to acquire information from unstructured data, in our case English PDF documents containing information about the physicochemical and toxicological properties of chemical substances. Experimental results indicate that our approach achieves promising results that are applicable in a business scenario, especially if the output is further revised by humans.

Keywords

Computer science, Security token, Language model, Generator (circuit theory), Task (project management), Benchmark (surveying), Discriminative model, Encoder, Artificial intelligence, Sample (material), Masking (illustration), Natural language processing, Speech recognition, Machine learning, Power (physics)

Publication Info

Year: 2022
Type: preprint
Citations: 1550
Access: Closed

Citation Metrics

OpenAlex: 1550
Influential: 0

Cite This

Kevin Clark, Minh-Thang Luong, Quoc V. Le et al. (2022). Question Answering For Toxicological Information Extraction. Leibniz-Zentrum für Informatik (Schloss Dagstuhl). https://doi.org/10.4230/oasics.slate.2022.3

Identifiers

DOI: 10.4230/oasics.slate.2022.3
