Johnnatan Messias Thinks I Don't Exist

By Os Keyes

Twitter has been replete lately with pretty graphs from a scientific study of its users. The study, “White, Man, and Highly Followed: Gender and Race Inequalities in Twitter”, investigates an incredibly important problem: the way attention and interaction levels are influenced by race and gender.

There’s a problem with those pretty graphs, however: the study takes pretty much the worst approach possible, ignoring user consent and using a methodology that is essentially inaccurate racial profiling with a healthy dose of systemic erasure.

Inequalities in Twitter

Twitter is a critical platform. At least in theory, a user can build a reputation and following on the site divorced from their standing in the wider world. Users who have no other way to be heard can get their voice out there and focus attention on the problems they see. People have become professional commentators, made millions, and forced issues into the traditional media that would never have made an appearance 5, 10 or 20 years ago. Twitter is vital to a lot of how the modern gestalt works.

The crucial phrase there is “in theory”. People bring their biases, bigotry and conditioning onto the internet, and have since the internet’s beginning. This influences how people rise and fall on Twitter just as it does in the real world. In “White, Man, and Highly Followed”, Johnnatan Messias, Pantelis Vikatos and Fabricio Benevenuto study US Twitter users to investigate how gender and race play roles in who interacts and who becomes prominent–things that, of course, feed into one another.

What they find is that US Twitter largely replicates the stratification and bigotry of the US as a whole. There’s a large amount of racial segregation. There’s a lot of bias against women, Asian-Americans and African-Americans. Your gender and race play a big role not only in the backgrounds of who follows you, but in how many people do. As the paper title suggests, if you want eyeballs, it’s best to be white and a man. Quelle surprise: quantitative work confirms what we already know through lived experience.

Except then we get into the actual methodology and everything rapidly turns to shit.

To identify race and gender, the researchers used the profile pictures of United States-based Twitter users: 50,270,310 of them. There’s no indication in the paper that users were ever asked whether they wanted to be included in the study: the researchers just decided that because it was public, it was theirs. This is an ongoing pattern in online research–researchers deciding they just want to take content and so they will, thbbbt–and one which looks decidedly murkier when we explore their actual race/gender ID system.

There’s no indication the researchers asked Twitter, either. The API they reference doesn’t return profile photos; it returns links to profile photos. Or to put it another way, the researchers bombarded Twitter with 50.3 million unannounced HTTP requests to grab your media files and then walked away without saying anything to anyone. They could do it, so they did do it–an attitude that has never once ended badly in science.
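To make the mechanics concrete, here’s a minimal sketch of the two-step pattern being described–not the authors’ code. The profile_image_url_https field is part of Twitter’s standard v1.1 user object; the bearer token and everything else here are illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of why collecting photos means
# hammering Twitter directly: the API hands back a *link*, not an image.
import requests

BEARER_TOKEN = "..."  # hypothetical app credential

def fetch_profile_photo(screen_name: str) -> bytes:
    # Request 1: the v1.1 user object, which contains
    # profile_image_url_https -- a URL, not image data.
    user = requests.get(
        "https://api.twitter.com/1.1/users/show.json",
        params={"screen_name": screen_name},
        headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    ).json()

    # Request 2: the image itself, served from Twitter's CDN.
    # Multiply by 50,270,310 users and you have the paper's crawl.
    return requests.get(user["profile_image_url_https"]).content
```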

Erasure in Machine Learning

The bigger issue, and the source of this writeup’s title, is how they IDed gender and race, and how those categories were defined. Quite simply, they used facial recognition through Face++, a service which purports to reliably ID gender and race from photos.

This is glorified racial and gender profiling, and inaccurate to boot. Face++ cannot reliably ID gender and race from photos because gender and race aren’t unambiguously visually distinguishable. Gender, particularly, is based on the self, not external viewpoints. Race is a thing that shifts depending on location and community, and comes with enough variation even within shitty white-imposed categorisations that a hell of a lot of people either don’t fall neatly into exclusive categories or spend their lives with white people squinching their faces up and asking everyone’s favourite racist question, “what are you?”

Even in a magical universe where race and gender were visually identifiable, Face++ would still be one hell of a terrible way of doing this, because Face++ recognises exactly five categories:

  1. Men
  2. Women
  3. African-Americans
  4. Asian-Americans
  5. White people

And that’s it–which is where the title of this post comes from. According to the researchers, I, as a non-binary person, simply do not exist. Neither do agender or genderqueer people; neither do any Native, Latinx or Middle Eastern people. We are all erased from this study and from many others because Face++ literally can’t see us. Your “state of the art” machine learning system is incapable of recognising more than 25% of the US population–Latinx people alone are roughly 18% of it.
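For the morbidly curious, the classification step looks roughly like the sketch below. The endpoint URL and parameter names reflect Face++’s public v3 detect API as best I can tell; the keys are hypothetical. The point is the response schema itself: gender only ever comes back as Male or Female, and ethnicity only ever as one of three values.

```python
# A sketch (assumed API shape, hypothetical keys) of turning a profile
# photo into gender/race labels. Whatever the photo actually shows, the
# response schema can only say Male/Female and Asian/White/Black.
import requests

API_KEY = "..."     # hypothetical Face++ credentials
API_SECRET = "..."

def classify_photo(image_url: str) -> dict:
    resp = requests.post(
        "https://api-us.faceplusplus.com/facepp/v3/detect",
        data={
            "api_key": API_KEY,
            "api_secret": API_SECRET,
            "image_url": image_url,
            "return_attributes": "gender,ethnicity",  # the only axes asked for
        },
    ).json()
    # One attributes dict per detected face. Nothing in the schema can
    # describe a non-binary person, or a Native, Latinx or Middle
    # Eastern person: those answers simply don't exist here.
    return resp["faces"][0]["attributes"]
```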

The paper–which has been accepted for publication!–makes no real mention of this aside from noting in the intro that race IDing is ‘limited’ to 3 categories and moving swiftly on. There’s no awareness of the flux and greyscale of gender, and nothing in the concluding discussion about even trying to correct this. There’s no awareness of how excluding Native populations in particular–given the long-standing history of quantitative work that treats Native people as a dismissible aside–feeds into wider discriminatory narratives that put them in the past tense. Instead the researchers simply sign off by saying they’d like to see systems which “promote more diversity and less inequality in user bases”.

Such a system will not come from this paper. It can’t, because this paper contains and perpetuates said inequality and lack of diversity. The computational phrenology, the utter dependence on the gender binary, and the authors’ and reviewers’ apparent complete ignorance of why any of this is a problem mean that only poisoned fruit can come from this particular tree.

Scientists talk a lot about how we work for the betterment of humanity. That’s why a lot of us got into this. Well, if that’s true, it starts at home: first, in tackling harassment and bigotry in academia itself, and second, in applying those same lessons–that science should be inclusive, that science needs to consider the long-term consequences of seemingly small choices, that we are more accurate when we have more representative voices–to the research we conduct. We must insist on user consent, work with an awareness of how science has participated and can still participate in legitimising erasure and discrimination, listen to marginalised voices, and create writing and review processes that are actively inclusive of them.

This paper does not help with any of that. Instead, it actively perpetuates the issues people have with science. The fact that it was published without any reviewer raising these concerns reinforces that it’s just a drop in the goddamn ocean–but then, what do I know? After all, apparently I don’t exist in the first place.