A World Full of Stereotypes? Further Investigation on Origin and Gender Bias in Multi-Lingual Word Embeddings
Version
Published
Date Issued
2021-06-03
Type
Article
Language
English
Abstract
Publicly available off-the-shelf word embeddings that are often used in productive applications for natural language processing have been proven to be biased. We have previously shown that this bias can come in a different form, depending on the language and the cultural context. In this work we extend our previous work and further investigate how bias varies in different languages. We examine Italian and Swedish word embeddings for gender and origin bias, and demonstrate how an origin bias concerning local migration groups in Switzerland is included in German word embeddings. We propose BiasWords, a method to automatically detect new forms of bias. Finally, we discuss how cultural and language aspects are relevant to the impact of bias on the application, and to potential mitigation measures.
Subjects
QA75 Electronic computers. Computer science
QA76 Computer software
Publisher DOI
Journal or Series
Frontiers in Big Data
ISSN
2624-909X
Volume
4
Issue
625290
Publisher
Frontiers
Submitter
Kurpicz-Briki, Mascha
Citation (APA)
Kurpicz-Briki, M., & Leoni, T. A. D. (2021). A World Full of Stereotypes? Further Investigation on Origin and Gender Bias in Multi-Lingual Word Embeddings. In Frontiers in Big Data (Vol. 4, Issue 625290). Frontiers. https://doi.org/10.24451/arbor.14815
File(s)
open access
Name
fdata-04-625290.pdf
License
Attribution 4.0 International
Version
published
Size
297.77 KB
Format
Adobe PDF
Checksum (MD5)
f23c58c9f1cecc421d794703af1ead7f
