Data imbalances related to an item's country of production lead to the under-recommendation of items produced in smaller (less represented) countries. Re-ranking the recommendation lists to balance item relevance with the promotion of items from smaller countries can introduce equity in visibility and exposure without affecting recommendation effectiveness.
In an ECIR 2021 paper with Elizabeth Gómez and Maria Salamó, we characterize the disparities generated by state-of-the-art recommendation models under a majority-vs.-rest split of the demographic groups, where the USA is the majority country in the datasets we considered (i.e., the country where most of the items are produced and that attracted most of the user ratings). We mitigate these disparities with a novel re-ranking approach.
Disparate impact assessment
To characterize disparate impact, we consider two metrics:
- Disparate visibility, which measures the difference between the share of recommendations for items of a demographic group and the representation of that group (where representation is either the percentage of items or ratings associated with that group);
- Disparate exposure, which measures the difference between the exposure obtained by a demographic group in the recommendation lists (i.e., in which positions the items of that group appear) and the representation of that group.
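The two definitions above can be illustrated with a minimal Python sketch. It is not the paper's exact formulation: the function names, the toy data, and in particular the logarithmic position discount used to weight exposure are assumptions made for this example.

```python
import math

def disparate_visibility(rec_lists, group_of, group, representation):
    """Share of recommended slots filled by `group`, minus its representation."""
    total = sum(len(items) for items in rec_lists.values())
    hits = sum(1 for items in rec_lists.values()
               for item in items if group_of[item] == group)
    return hits / total - representation

def disparate_exposure(rec_lists, group_of, group, representation):
    """Share of position-weighted exposure obtained by `group`, minus its representation.

    Assumption: position p (1-based) is weighted by 1 / log2(p + 1).
    """
    group_exposure = 0.0
    total_exposure = 0.0
    for items in rec_lists.values():
        for pos, item in enumerate(items, start=1):
            weight = 1.0 / math.log2(pos + 1)
            total_exposure += weight
            if group_of[item] == group:
                group_exposure += weight
    return group_exposure / total_exposure - representation

# Hypothetical toy data: two users, top-2 lists, one minority item ("b").
rec_lists = {"u1": ["a", "b"], "u2": ["a", "c"]}
group_of = {"a": "majority", "b": "minority", "c": "majority"}
representation = 0.5  # assumed share of items (or ratings) of the minority

dv = disparate_visibility(rec_lists, group_of, "minority", representation)
de = disparate_exposure(rec_lists, group_of, "minority", representation)
print(dv, de)
```

In this toy case both values are negative, meaning the minority group receives less than its representation; the exposure gap is larger than the visibility gap because the only minority item sits in the last position.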
We assessed the behavior of these two metrics in the movie and book domains, by considering the MovieLens-1M and Book Crossing datasets, and by studying seven models, namely MostPop, Random, UserKNN, ItemKNN, BPR, BiasedMF, and SVD++.
The paper contains the detailed results, but here we report a few takeaways from our assessment:
- Both datasets exhibit a large geographic imbalance in the representation of each group, in terms of offered items.
- The majority group (the USA) usually attracts more ratings, thus increasing the existing imbalance. However, users do not perceive the minority items as being of lower quality, since the average rating for both groups is the same in both datasets.
- Geographic imbalance almost always penalizes the minority group, since the algorithms are fed many more instances of the majority group than of the minority.
- Matrix Factorization-based approaches can help the minority receive more visibility and exposure, thanks to latent factors that also capture the preferences for the minority. However, if the imbalance is too severe, the minority is always affected by disparate impact.
Mitigating disparate impact
The idea behind our mitigation algorithm is to move up in the recommendation list the item that causes the minimum loss in prediction across all the users. To achieve this goal, our re-ranking approach works in two passes:
- We start by targeting the desired visibility, to make sure the items of the disadvantaged group are recommended enough times.
- Once the target visibility is reached (i.e., the items of the minority group are recommended enough times), we move items up inside the recommendation list to reach the target exposure. This allows the items of the minority to appear in higher positions in the list.
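The two passes can be sketched in Python as follows. This is a simplified, brute-force illustration rather than the algorithm from the paper: the function names, the toy data, the logarithmic position discount, and the definition of loss as the predicted-score difference between the two swapped items are all assumptions made for this example.

```python
import math

def visibility(lists, k, minority):
    """Share of top-k slots, over all users, filled by minority items."""
    top = [item for l in lists.values() for item, _ in l[:k]]
    return sum(item in minority for item in top) / len(top)

def exposure(lists, k, minority):
    """Share of position-weighted exposure (assumed 1 / log2(pos + 2), pos 0-based)."""
    total = group = 0.0
    for l in lists.values():
        for pos, (item, _) in enumerate(l[:k]):
            weight = 1.0 / math.log2(pos + 2)
            total += weight
            if item in minority:
                group += weight
    return group / total

def best_swap(lists, k, minority, inside):
    """Cheapest swap, over all users, of a majority item at index i < k with a
    minority item at index j > i. If `inside`, j is within the top-k (pass 2);
    otherwise j is below the top-k (pass 1). Returns (loss, user, i, j) or None."""
    best = None
    for user, l in lists.items():
        for i in range(k):
            if l[i][0] in minority:
                continue
            candidates = range(i + 1, k) if inside else range(k, len(l))
            for j in candidates:
                if l[j][0] in minority:
                    loss = l[i][1] - l[j][1]  # predicted score given up
                    if best is None or loss < best[0]:
                        best = (loss, user, i, j)
    return best

def rerank(lists, k, minority, target_vis, target_exp):
    """Pass 1 raises visibility to its target; pass 2 raises exposure to its target."""
    lists = {u: list(l) for u, l in lists.items()}  # leave the input untouched
    passes = ((False, visibility, target_vis), (True, exposure, target_exp))
    for inside, metric, target in passes:
        while metric(lists, k, minority) < target:
            swap = best_swap(lists, k, minority, inside)
            if swap is None:  # no further swap available
                break
            _, user, i, j = swap
            lists[user][i], lists[user][j] = lists[user][j], lists[user][i]
    return {u: [item for item, _ in l[:k]] for u, l in lists.items()}

# Hypothetical toy data: (item, predicted score), sorted by score per user.
lists = {
    "u1": [("a", 0.9), ("b", 0.8), ("x", 0.7), ("y", 0.2)],
    "u2": [("a", 0.9), ("c", 0.6), ("y", 0.3), ("x", 0.1)],
}
result = rerank(lists, k=2, minority={"x", "y"}, target_vis=0.25, target_exp=0.3)
print(result)
```

In this toy run, the first pass brings item "x" into the top-2 of user u1 (the cheapest swap over both users), and the second pass moves it to the first position, while the list of u2 is left unchanged.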
The paper contains the details of our approach, including its pseudo-code.
Impact of mitigation
The full paper contains the detailed results for each algorithm. These results show a general pattern: when the re-ranking is based on minimal predicted loss, effectiveness remains stable, while disparate visibility and disparate exposure are mitigated.