UTHealth@BioCreativeVII: domain-specific transformer models for drug-protein relation extraction

Abstract

Automatically extracting relations between drugs and proteins from the ever-growing biomedical literature is important for building up-to-date knowledge bases in biomedicine. For the DRUGPROT track at BioCreative VII, we developed automated methods to recognize drug-protein entity relations in PubMed abstracts. In this short system description paper, we describe our submitted systems, which leverage multiple transformer models pre-trained on biomedical data. The outputs of some of the systems were combined by majority voting. On the main track, our best system obtained 80.44% precision and 74.96% recall for an F1-score of 77.60%, demonstrating the effectiveness of deep learning-based approaches for automatic relation extraction from biomedical literature. We also participated in the LargeScale Track, where our best system achieved a micro-averaged precision, recall, and F1-score of 79.49%, 75.27%, and 77.32%, respectively.
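
As a rough illustration of the two ideas the abstract relies on, the sketch below shows one possible form of majority voting over per-model relation labels and recomputes the reported F1-scores from precision and recall. The abstract does not describe the voting rule in detail, so the strict-majority rule, the `fallback` label, the function names, and the example labels are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions, fallback="NONE"):
    """Return the label chosen by a strict majority of models.

    `fallback` is a hypothetical "no relation" label, returned when the
    list is empty or no label wins more than half of the votes.
    """
    if not predictions:
        return fallback
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else fallback


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical outputs of three models for a single drug-protein pair.
votes = ["INHIBITOR", "INHIBITOR", "NONE"]
print(majority_vote(votes))

# Recompute the F1-scores reported in the abstract (values in percent).
print(round(f1_score(80.44, 74.96), 2))  # main track
print(round(f1_score(79.49, 75.27), 2))  # LargeScale Track
```

Both recomputed values agree with the reported F1-scores of 77.60% and 77.32% to two decimal places.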

Publication
In Proceedings of the BioCreative VII challenge evaluation workshop

Avisha Das
Research Fellow

My research interests include natural language understanding and generation with a focus on Biomedical NLP and AI Security.