Courses are under-recommended by state-of-the-art models unless their teachers belong to the country that offers the most courses and attracts the most ratings. Regulating how recommendations are distributed with respect to the teachers' country of provenance enables equitable and effective recommendations (cross-continent provider fairness).
In a paper published in Future Generation Computer Systems (Elsevier), with Elizabeth Gómez, Carlos Shui Zhang, Maia Salamó, and Guilherme Ramos, we characterize provider unfairness in the recommendation of online courses, based on the teachers' country of provenance. We also mitigate these disparities with a novel post-processing approach that accounts for both the predicted relevance of a course and the country of provenance of its teacher.
In our SIGIR 2021 paper, we analyzed provider unfairness in online-course recommendation from a binary perspective, with a majority-versus-rest split of the groups in which the USA represented the majority (i.e., the group providing the most courses and attracting the most ratings). In this study, we show that a mitigation designed for binary groups is not enough: to provide equity, we need to account for the representation of each demographic group.
Disparate impact assessment
To assess provider unfairness, we consider the COCO dataset, which includes the interactions of learners with online courses. To generate the recommendations, we considered seven models, namely MostPop, Random Guess, UserKNN, ItemKNN, BPR, BiasedMF, and SVD++. Specifically, we evaluated both the behavior of the original models and the results obtained with our mitigation for binary groups.
We characterize the disparate impact that occurs in the presence of multiple demographic groups by considering two metrics:
- Disparate visibility, which measures the difference between the share of recommendations for items of a demographic group and the representation of that group (where representation is either the percentage of items or ratings associated with that group);
- Disparate exposure, which measures the difference between the exposure obtained by a demographic group in the recommendation lists (i.e., in which positions the items of that group appear) and the representation of that group.
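The two metrics above can be sketched in code. The following is a minimal Python illustration, not the paper's exact formulation: the data structures are ours, and we assume a logarithmic positional discount for exposure (the paper defines the precise metrics).

```python
import math
from collections import defaultdict

def disparities(rec_lists, item_group, representation):
    """Sketch of disparate visibility and exposure per demographic group.

    rec_lists: dict mapping each user to their ranked list of item ids
    item_group: dict mapping each item id to its demographic group
    representation: dict mapping each group to its target share
                    (course-based or rating-based representation)
    """
    vis = defaultdict(float)   # recommendation counts per group
    exp = defaultdict(float)   # position-discounted exposure per group
    total_vis = total_exp = 0.0
    for items in rec_lists.values():
        for pos, item in enumerate(items, start=1):
            g = item_group[item]
            w = 1.0 / math.log2(pos + 1)  # assumed positional discount
            vis[g] += 1
            exp[g] += w
            total_vis += 1
            total_exp += w
    # Difference between the share each group obtains and its representation:
    # negative values indicate the group is under-recommended.
    return {
        g: {
            "disparate_visibility": vis[g] / total_vis - representation[g],
            "disparate_exposure": exp[g] / total_exp - representation[g],
        }
        for g in representation
    }
```

Note that a group can have zero disparate visibility (it appears in the lists often enough) while still having negative disparate exposure, if its items are consistently placed in the lower positions.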
The paper contains the detailed results, but here we report a few takeaways from our assessment:
- North America represents the majority group, offering over 50% of the courses. These courses attract an even larger share of interactions from learners, which increases the group's rating-based representation. All the other groups, except Oceania, have a rating-based representation lower than their course-based one. Hence, when courses are offered in English, a group attracts a share of ratings higher than the share of courses it offers. The same does not hold for courses in Spanish and Portuguese, which learners mainly follow when organized in their own country.
- Ranking effectiveness is associated with good visibility and exposure when considering the rating-based representation of the groups. The ratings given by learners help to produce good recommendations and to adapt to the preferences (in terms of ratings) that each demographic group attracted.
- If an algorithm provides a group with equitable visibility and exposure with respect to its representation in terms of offered courses, then its effectiveness is very low.
- Popularity-based recommendation exacerbates disparities, favoring the largest group at the expense of the smallest ones.
Mitigating disparate impact
To mitigate disparate impact, we propose a re-ranking algorithm that introduces courses of the disadvantaged groups into the recommendation lists, until their visibility and exposure are proportional to their representation.
Our mitigation algorithm is based on the idea of moving up, in the recommendation lists, the course that causes the minimum loss in predicted relevance for the learners, until the target visibility or exposure is reached. Among re-ranking approaches to introducing fairness, ours is the only one that guarantees equity of visibility and exposure whenever it is achievable, since we keep adjusting the recommendation lists until equity from both perspectives is reached. State-of-the-art approaches, based on Maximal Marginal Relevance, intervene on the predicted relevance of the items, thus neither directly optimizing, nor offering guarantees on, the final visibility and exposure goals.
To achieve this goal, our re-ranking approach works in two passes:
- We start by targeting the desired visibility, to make sure the items of the disadvantaged groups (i.e., those that are currently under-recommended) are recommended enough times.
- Once the target visibility is reached (i.e., the items of the minority groups are recommended enough times), we move items up inside the recommendation list to reach the target exposure. This allows the items of the minority groups to appear in higher positions in the list.
The paper contains the details of our approach, including its pseudo-code.
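As an illustration only, here is a minimal Python sketch of the two passes. This is not the paper's pseudo-code: the data structures (per-user candidate lists with predicted scores) and the logarithmic exposure discount are our assumptions.

```python
import math

def rerank(candidates, item_group, target, k):
    """Two-pass re-ranking sketch: reach target visibility, then exposure.

    candidates: dict user -> list of (item, predicted_score), descending
    item_group: dict item -> demographic group
    target: dict group -> target share for the (under-recommended) groups
    k: length of each recommendation list
    """
    lists = {u: [it for it, _ in cand[:k]] for u, cand in candidates.items()}
    score = {u: dict(cand) for u, cand in candidates.items()}
    total = k * len(lists)

    def vis_share(g):
        return sum(item_group[i] == g for l in lists.values() for i in l) / total

    # Pass 1: visibility. While a group is under target, apply the swap with
    # the minimum relevance loss: replace a list's last item with the group's
    # best out-of-list candidate for that user.
    for g, t in target.items():
        while vis_share(g) < t:
            best = None  # (loss, user, new_item)
            for u, l in lists.items():
                out = [it for it, _ in candidates[u]
                       if it not in l and item_group[it] == g]
                if not out or item_group[l[-1]] == g:
                    continue
                loss = score[u][l[-1]] - score[u][out[0]]
                if best is None or loss < best[0]:
                    best = (loss, u, out[0])
            if best is None:
                break  # no further swap can help
            _, u, new_item = best
            lists[u][-1] = new_item

    # Pass 2: exposure. Bubble the group's items up, one adjacent swap at a
    # time (smallest score gap first), until the position-discounted share
    # of exposure meets the target.
    weights = [1 / math.log2(p + 2) for p in range(k)]  # assumed discount
    total_w = sum(weights) * len(lists)

    def exp_share(g):
        return sum(weights[p] for l in lists.values()
                   for p, i in enumerate(l) if item_group[i] == g) / total_w

    for g, t in target.items():
        while exp_share(g) < t:
            best = None  # (score_gap, user, position)
            for u, l in lists.items():
                for p in range(1, len(l)):
                    if item_group[l[p]] == g and item_group[l[p - 1]] != g:
                        gap = score[u][l[p - 1]] - score[u][l[p]]
                        if best is None or gap < best[0]:
                            best = (gap, u, p)
            if best is None:
                break
            _, u, p = best
            lists[u][p - 1], lists[u][p] = lists[u][p], lists[u][p - 1]
    return lists
```

Choosing, at each step, the swap with the smallest loss in predicted relevance is what keeps the effectiveness cost of the intervention low.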
Impact of mitigation
The full paper contains the detailed results of each algorithm. Here are the main outcomes that can be observed:
- Cross-continent provider fairness for demographic groups of teachers can be achieved without a negative impact on recommendation effectiveness. Thanks to our approach, we can distribute recommendations equitably among the different groups, without affecting the learners.
- Regulating the visibility given to a group does not, by itself, provide the group with enough exposure. Disparities in terms of exposure are attenuated, but not fully mitigated; specific interventions to regulate the given exposure are needed.
- Introducing provider fairness requires interventions at the recommendation-list level. Mitigating by boosting the predicted relevance of the disadvantaged groups (as the state-of-the-art approaches do) does not guarantee equity of visibility and exposure; disparities are only partially mitigated.