Optimal Fair Aggregation of Crowdsourced Noisy Labels using Demographic Parity Constraints

Generative AI & LLMs
Published: arXiv: 2601.23221v1
Authors

Gabriel Singer Samuel Gruffaz Olivier Vo Van Nicolas Vayatis Argyris Kalogeratos

Abstract

As acquiring reliable ground-truth labels is usually costly or infeasible, crowdsourcing and aggregating noisy human annotations is the typical resort. Aggregating subjective labels, though, may amplify individual biases, particularly regarding sensitive features, raising fairness concerns. Nonetheless, fairness in crowdsourced aggregation remains largely unexplored, with no existing convergence guarantees and only limited post-processing approaches for enforcing $\varepsilon$-fairness under demographic parity. We address this gap by analyzing the fairness of crowdsourced aggregation methods within the $\varepsilon$-fairness framework, for Majority Vote and Optimal Bayesian aggregation. In the small-crowd regime, we derive an upper bound on the fairness gap of Majority Vote in terms of the fairness gaps of the individual annotators. We further show that the fairness gap of the aggregated consensus converges exponentially fast to that of the ground truth under interpretable conditions. Since the ground truth itself may still be unfair, we generalize a state-of-the-art multiclass fairness post-processing algorithm from the continuous to the discrete setting, enforcing strict demographic parity constraints on any aggregation rule. Experiments on synthetic and real datasets demonstrate the effectiveness of our approach and corroborate the theoretical insights.
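To make the fairness notion concrete, the demographic-parity fairness gap of a decision rule $h$ with respect to a sensitive attribute $S$ is usually defined as below. This is the standard binary formulation; the paper works with a multiclass variant of it.

```latex
% Demographic-parity fairness gap of a decision rule h (binary case,
% two groups S = a and S = b):
\Delta_{\mathrm{DP}}(h) \;=\; \bigl|\, \Pr[h(X) = 1 \mid S = a] \;-\; \Pr[h(X) = 1 \mid S = b] \,\bigr| ,
% and h is said to be \varepsilon-fair when
\Delta_{\mathrm{DP}}(h) \;\le\; \varepsilon .
```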

Paper Summary

Problem
The main challenge addressed in this paper is that crowdsourced aggregation can amplify individual annotator biases, particularly with respect to sensitive features, raising fairness concerns. This matters because crowdsourcing and aggregating noisy human annotations is common practice in domains where ground truth is inherently subjective or prohibitively expensive to obtain.
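As a toy illustration of the amplification effect (hypothetical numbers, not from the paper): suppose every annotator labels members of group A positive with probability 0.6 and members of group B with probability 0.4. Majority voting over a growing crowd pushes each group's positive rate toward 1 or 0, so the demographic parity gap of the consensus exceeds that of any single annotator:

```python
from math import comb

def majority_positive_prob(p: float, m: int) -> float:
    """Probability that a majority of m independent annotators,
    each labeling positive with probability p, outputs positive.
    m is assumed odd so there are no ties."""
    k_min = m // 2 + 1
    return sum(comb(m, k) * p**k * (1 - p) ** (m - k)
               for k in range(k_min, m + 1))

# Hypothetical per-annotator positive rates for the two groups.
p_group_a, p_group_b = 0.6, 0.4

for m in (1, 11, 51):
    gap = abs(majority_positive_prob(p_group_a, m)
              - majority_positive_prob(p_group_b, m))
    print(f"crowd size {m:2d}: demographic parity gap = {gap:.3f}")
# m = 1  -> gap 0.200 (a single annotator's own gap)
# m = 11 -> gap ~0.507, and the gap keeps growing with crowd size
```

This is exactly the small-crowd regime the paper bounds: the consensus gap is controlled by, but can greatly exceed, the individual annotators' gaps.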
Key Innovation
The key innovation of this research paper is the development of a post-processing algorithm called FairCrowd, which regularizes any label aggregation rule to enforce strict ε-fairness constraints. This is a novel solution to the problem of fairness in crowdsourced aggregation, as existing approaches only provide limited post-processing methods for enforcing ε-fairness under demographic parity.
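FairCrowd itself is not reproduced here; the sketch below only conveys the post-processing idea in its simplest binary form, using hypothetical per-group thresholding of aggregated scores so that every group receives the same positive-label rate:

```python
import numpy as np

def equalize_positive_rates(scores, groups, target_rate):
    """Toy demographic-parity post-processor: pick a per-group
    threshold on the aggregated scores so each group's positive-label
    rate matches target_rate. Illustrative only -- FairCrowd handles
    the multiclass, discrete setting with strict epsilon constraints.
    """
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels = np.zeros(len(scores), dtype=int)
    for g in np.unique(groups):
        mask = groups == g
        # Positives are the top target_rate fraction of this group's scores.
        thr = np.quantile(scores[mask], 1.0 - target_rate)
        labels[mask] = (scores[mask] > thr).astype(int)
    return labels
```

A group-dependent threshold is the classic way to trade a little accuracy for parity: each group keeps its own ranking by score, but the cut-off is moved per group until positive rates coincide.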
Practical Impact
This research has practical implications for domains that rely on aggregating noisy human annotations, such as medical diagnosis, content moderation, and sentiment analysis. By enforcing strict demographic parity constraints on any aggregation rule, FairCrowd reduces the amplification of individual biases in the aggregated labels.
Analogy / Intuitive Explanation
Imagine a group of people trying to guess the price of a house. Each person has their own opinion, but some people might be more biased towards overestimating or underestimating the price. In crowdsourced aggregation, these biases can be amplified, leading to an unfair result. FairCrowd is like a filter that removes these biases and ensures that the final result is fair and representative of the ground-truth.
Paper Information
Categories:
cs.LG
arXiv ID:

2601.23221v1
