🤖 AI Summary
Jointly evaluating fairness and relevance in recommender systems has long been unreliable: separate per-aspect assessments yield two different "best" models, and ad-hoc joint metrics correlate poorly with classical measures (e.g., NDCG).
Method: We propose Distance to Pareto Frontier (DPFR), the first recommendation evaluation metric to incorporate Pareto optimality: it constructs a two-dimensional Pareto frontier for a pair of fairness and relevance metrics (e.g., statistical parity and NDCG), then quantifies joint performance as the Euclidean distance to this frontier. DPFR is modular and intuitive, as it can be computed with existing standard metrics.
Results: Experiments across four models, three re-ranking strategies, and six datasets show that existing joint metrics have inconsistent associations with the Pareto-optimal solution, with most deviating substantially from the frontier, making DPFR a more robust and theoretically well-founded joint measure of fairness and relevance.
📝 Abstract
Fairness and relevance are two important aspects of recommender systems (RSs). Typically, they are evaluated either (i) separately by individual measures of fairness and relevance, or (ii) jointly using a single measure that accounts for fairness with respect to relevance. However, approach (i) often does not provide a reliable joint estimate of the goodness of the models, as it has two different best models: one for fairness and another for relevance. Approach (ii) is also problematic because these measures tend to be ad-hoc and do not relate well to traditional relevance measures, like NDCG. Motivated by this, we present a new approach for jointly evaluating fairness and relevance in RSs: Distance to Pareto Frontier (DPFR). Given some user-item interaction data, we compute their Pareto frontier for a pair of existing relevance and fairness measures, and then use the distance from the frontier as a measure of the jointly achievable fairness and relevance. Our approach is modular and intuitive as it can be computed with existing measures. Experiments with 4 RS models, 3 re-ranking strategies, and 6 datasets show that existing metrics have inconsistent associations with our Pareto-optimal solution, making DPFR a more robust and theoretically well-founded joint measure for assessing fairness and relevance. Our code: https://github.com/theresiavr/DPFR-recsys-evaluation