To translate or not to translate? Exploring machine translation and multilingual models for mental health text classification
Version
Published
Date Issued
2023-06-14
Type
Conference Paper
Language
English
Abstract
It is often difficult to obtain a sufficient amount of training data for natural language processing methods when working with local languages. This challenge is even more present in the context of sensitive topics related to the detection of mental illnesses such as burnout. In this paper we explore the impact of machine translation and the use of multilingual models to mitigate this limitation. Specifically, we are interested in the potential for cross-lingual transfer learning, i.e., attempting to improve model performance by adding training data sourced from other languages. We compare different setups using monolingual BERT and multilingual BERT, applying different methods such as zero-shot transfer learning and joint training for a multilingual dataset consisting of English, German, French and Arabic examples. Our results suggest that low-resource languages may in some circumstances benefit from cross-lingual transfer learning.
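The setups named in the abstract differ mainly in which languages contribute training data for a given target language. The following is a minimal sketch of that distinction only, not the authors' pipeline: the `Example` type, the `select_training_data` helper, the setup names and the placeholder texts are all assumptions made for illustration, and the abstract does not describe how the data were actually split or how the models were fine-tuned.

```python
from typing import List, NamedTuple


class Example(NamedTuple):
    """One labelled text. Hypothetical structure, not taken from the paper."""
    lang: str   # language code, e.g. "en", "de", "fr", "ar"
    text: str
    label: int  # 1 = positive class, 0 = negative class


def select_training_data(pool: List[Example], target_lang: str, setup: str) -> List[Example]:
    """Pick the training examples for one target language under a given setup.

    Setup names are illustrative assumptions:
      "monolingual": train only on the target language
      "zero_shot":   train only on the other languages, evaluate on the target
      "joint":       train on all languages together, including the target
    """
    if setup == "monolingual":
        return [ex for ex in pool if ex.lang == target_lang]
    if setup == "zero_shot":
        return [ex for ex in pool if ex.lang != target_lang]
    if setup == "joint":
        return list(pool)
    raise ValueError(f"unknown setup: {setup!r}")


# Placeholder pool; the texts and labels are dummies, not real data.
pool = [
    Example("en", "placeholder text 1", 1),
    Example("en", "placeholder text 2", 0),
    Example("de", "placeholder text 3", 1),
    Example("fr", "placeholder text 4", 0),
    Example("ar", "placeholder text 5", 1),
]

for setup in ("monolingual", "zero_shot", "joint"):
    train = select_training_data(pool, target_lang="de", setup=setup)
    print(setup, sorted({ex.lang for ex in train}), len(train))
```

In each case evaluation would use held-out examples in the target language. In the zero-shot case the model never sees target-language training examples, which is why a multilingual model would be needed there.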
Subjects
BF Psychology
QA75 Electronic computers. Computer science
QA76 Computer software
Conference
SwissText 2023
Submitter
Kurpicz-Briki, Mascha
Citation apa
Puttick, A. R., Merhbene, G., & Kurpicz-Briki, M. (2023). To translate or not to translate? Exploring machine translation and multilingual models for mental health text classification. SwissText 2023. https://arbor.bfh.ch/handle/arbor/36342
