Bias, awareness, and ignorance in deep-learning-based face recognition
Wehrli, Samuel, Hertweck, Corinna, Amirian, Mohammadreza, Glüge, Stefan, and Stadelmann, Thilo
AI and Ethics (2021)
Face Recognition (FR) is increasingly influencing our lives: we use it to unlock our phones; the police use it to identify suspects. Two main concerns are associated with this increasing use of FR: (1) the fact that these systems are typically less accurate for marginalized groups, which can be described as “bias”, and (2) the increased surveillance these systems enable. Our paper is concerned with the first issue. Specifically, we explore an intuitive technique for reducing this bias, namely “blinding” models to sensitive features such as gender or race, and show why this cannot be equated with reducing bias. Even when not designed for this task, FR models can deduce sensitive features such as gender or race from pictures of faces, simply because they are trained to determine the “similarity” of pictures. This means that people with similar skin tones, similar hair length, etc. will be seen as similar by FR models. When confronted with biased decision-making by humans, one approach taken in job application screening is to “blind” the human decision-makers to sensitive attributes such as gender and race by not showing them pictures of the applicants. Based on a similar idea, one might expect that if FR models were less aware of these sensitive features, the difference in accuracy between groups would decrease. We evaluate this assumption, which has already entered the scientific literature as a valid de-biasing method, by measuring how “aware” models are of sensitive features and correlating this awareness with differences in accuracy between groups. In particular, we blind pre-trained models to make them less aware of sensitive attributes. We find that awareness and accuracy differences between groups do not positively correlate, i.e., that bias ≠ awareness. In fact, blinding barely affects accuracy in our experiments. The seemingly simple solution of decreasing bias in FR by reducing awareness of sensitive features thus does not work in practice: trying to ignore sensitive attributes is not a viable concept for less biased FR.
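The abstract does not spell out how awareness is measured or how blinding is implemented, so the following is a minimal, hypothetical sketch of the general idea: it treats the accuracy of a linear probe on face embeddings as a proxy for awareness, and “blinds” the embeddings by projecting out the probe’s weight direction (a single nullspace-projection step). The synthetic embeddings, the `probe_accuracy` helper, and all dimensions are illustrative assumptions, not the paper’s actual procedure.

```python
# Hypothetical sketch: probe-based "awareness" and linear "blinding" of embeddings.
# Not the paper's method; a minimal illustration under stated assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for embeddings from a pre-trained FR model (assumed 512-dimensional).
n, d = 2000, 512
attr = rng.integers(0, 2, size=n)   # sensitive attribute labels (e.g., binary gender)
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * attr               # inject one attribute-correlated direction

def probe_accuracy(X, y):
    """Awareness proxy: held-out accuracy of a linear probe predicting the attribute."""
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    return clf.score(Xte, yte), clf

acc_before, clf = probe_accuracy(X, attr)

# "Blinding": project embeddings onto the nullspace of the probe's weight vector,
# removing the linear direction most predictive of the sensitive attribute.
w = clf.coef_[0] / np.linalg.norm(clf.coef_[0])
X_blind = X - np.outer(X @ w, w)

acc_after, _ = probe_accuracy(X_blind, attr)
print(f"probe accuracy before blinding: {acc_before:.3f}")
print(f"probe accuracy after blinding:  {acc_after:.3f}")
# Per the paper's finding, lowering awareness this way need not reduce the
# accuracy gap between groups on the actual recognition task (bias ≠ awareness).
```

In this sketch, a drop in probe accuracy after projection indicates reduced awareness; the paper’s point is that such a drop does not translate into a smaller accuracy gap between demographic groups on the recognition task itself.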