MARB
https://doi.org/10.23695/V3WP-6C64
Reporting bias (the human tendency to not mention obvious or redundant information) and social bias (societal attitudes toward specific demographic groups) have both been shown to propagate from human text data to language models trained on such data. However, the two phenomena have not previously been studied in combination. The MARB dataset was developed to begin to fill this gap by studying the interaction between social biases and reporting bias in language models. Unlike many existing benchmark datasets, MARB does not rely on artificially constructed templates or crowdworkers to create contrasting examples. Instead, the templates used in MARB are based on naturally occurring written language from the 2021 version of the enTenTen corpus (Jakubíček et al., 2013).
Citation and access
Creator/Principal investigator:
- Södahl Bladsjö, Tom
- Muñoz Sánchez, Ricardo
Research principal:
Citation:
Language:
Administrative information
Subject area and keywords
Metadata
