Counting the Countless

By Os Keyes

Reconstructed transcript of a talk I gave at Seattle University earlier this year

Good evening everyone! My name is Os, and I’m a PhD student at the University of Washington. According to my website I study gender, data, technology and control; it also says that I’m an inaugural Ada Lovelace Fellow. And I’m here for a variety of reasons, but one of the big ones is that I really enjoy giving talks. Particularly community-oriented talks; remaining grounded in my communities is important to me and for my work to be effective. So I was really pleased when, as a result of my last talk here, the Seattle Non-Binary Collective reached out. And they said: “we hear you’re a data scientist. Could you do a talk on how trans &/ non-binary people can get involved in data science?”

And I replied: well, to be perfectly honest, I think data science is a profound threat to queer existences. And then for some reason they stopped replying! Who can say why? So when Jodi asked me what I’d like to talk about in this lecture series, I figured I’d do a talk on that. Why do I think data science is a profound threat for queer people?

The difficulty of definitions

Let’s start off with some definitions. What does it mean to say someone is queer? What does it mean to say someone is trans? In both cases, there really isn’t a fixed definition that holds everywhere. Trans identity is contextual, and fluid; it is also autonomous. There’s no test that you give someone to determine they’re “actually” trans, unless you’re a doctor or a neuroscience researcher or a bigot (but I, often, repeat myself).

But one constant, to some degree or another, is that living as a trans person is frequently miserable. Not because we’re trans, but because we exist under what bell hooks refers to as a “white supremacist capitalist patriarchy”. We live in an environment that is fundamentally racist; fundamentally built around capitalism; fundamentally based on rigid and oppressive gender roles. And as trans people, we suffer under all of these facets of society, both collectively and individually. Trans people of colour experience racism and transphobia, the latter induced by rigid patriarchal norms. Poor trans people - which is most of us, given how poverty correlates with social ostracisation - suffer under both transphobia and capitalism. Those of us who are disabled, and so don’t fit norms of a “productive” worker, experience that poverty tenfold.

Administrative violence

These norms and forms of harm do not exist “just because”: they exist as a self-reinforcing system in which we are coerced to fit the mould of what people “should be”. Those who can are pressured to; those who can’t, or refuse, are punished. Dean Spade, an amazing thinker on trans issues and the law, has coined the term “administrative violence” to refer to the way that administrative systems such as the law - run by the state, that white supremacist capitalist patriarchy - “create narrow categories of gender and force people into them in order to get their basic needs met”. It is a common form of this kind of violence and normalisation.

Let’s look at an example; suppose you want to update the name and gender associated with your mobile phone, right?

  1. You go in and they say that you need a legal ID which matches the new name and gender.
  2. So you go off to the government and say: hey, can I have a new ID? And they say: well, only if you’re officially trans.
  3. So you go off to a doctor and say: hey, can I have a letter confirming I’m trans? And the doctor says: well, you need symptoms X, Y and Z.

…and then when you do this, and jump through all that gatekeeping, everything breaks because suddenly the name your bank account is associated with no longer exists. You attempted to conform, and you still got screwed.

This case study nicely demonstrates what “administrative violence” looks like and what it does. It reinforces the gender binary (good luck getting an ID that doesn’t have a binary gender on it); it reinforces the medicalised model of trans lives; it communicates that gender is not contextual, that you can only be one thing, everywhere; it enables control and surveillance, because now, even aside from all of the rigid gatekeeping, a load of people have a note somewhere that you’re trans.

So: could we reform this? Spade would say: nope. This - the rigid maintenance of hierarchy and norms - is really what the state is for. Moreover, attempts to reform this often leave the most marginalised amongst us out; those of us who are multiply marginalised, least listened to. Efforts to get the gender binary expanded in ID forms are great, if you can afford a new ID, and if you can afford to come to the attention of the state, and in some ways they’re counterproductive even then. We are attempting to negotiate with a system that is fundamentally out to constrain us.

Defining Data Science

That might appear as a massive deviation, but I’d like you to keep it in mind as we veer back towards data science. And what is data science, anyway? There are a lot of definitions but the one I quite like is:

The quantitative analysis of large amounts of data for the purpose of decisionmaking

There’s a lot that’s packed into that, so let’s break it down.

First: quantitative analysis. A field based on quantitative analysis, on numbers, raises a lot of questions. For example: what can be counted? Who can be counted? If we’ve decided to take a quantitative approach to the universe, then by definition we have to exclude any factors or variables that can’t be neatly tidied into numbers - and we have to constrain and standardise those that can, to make sure tidying them is convenient.

Second: large amounts of data, that “big data” you’ve been hearing so much about. A data science approach encourages the collection of as much data as possible (all the better to measure you with). This data is vast across time: we should have as much of your history as possible. It’s vast across space: we should be able to measure as much of the world as possible, in this tidied, standardised way. It’s vast across subjects: we should be able to measure as many people as possible, in this tidied, standardised way. Our data should be collected ubiquitously, it should be collected consistently and perpetually, and any variation that complicates our data collection should be eliminated. The ideal data science system is one optimised to capture and consume as much of the world, and as much of your life, as possible.

And then finally we have “for the purpose of decisionmaking”, which is the bit proponents of data science really seem to drool over. We can use datalogical systems for efficiency gains, for consistency gains; we can remove that fallible, inconsistent “human factor” in how we make decisions, working more consistently and a million times faster. Which is, you know, fine, sort of, but by definition a removal of humanity makes a system - well. Inhumane!

So perhaps a more accurate definition of data science would be:

The inhumane reduction of humanity down to what can be counted

Data violence

This sounds resonant with Spade’s work. And so Anna Lauren Hoffmann, one of my favourite scholars and human beings and inspirations (and I’m not just saying that because she’s the person who gets to decide if I graduate) has coined the term “data violence”. Without being too reductive, or stealing too much of her thunder, think of it as the perpetuation of violence through datalogical systems, in the same way that administrative violence refers to the perpetuation of violence through administrative systems.

It’s different from administrative violence in a few ways: first, the ubiquity with which it operates (the state is not the only entity that can perpetuate data violence) and second, the scale at which it operates, and the fluidity of it. Data systems are mutable, ubiquitous, and constantly pointed to as the future: the direction in which we should be going. They can capture a lot more of your life than the DMV. This is not accidental; this is the point.

So let’s look at that same case study again; name changes!

  1. You go in and they say that you need a legal ID which matches the new name and gender.
  2. So you go off to the government and say: hey, can I have a new ID? And they say: well, only if you’re officially trans.
  3. So you go off to a doctor and say: hey, can I have a letter confirming I’m trans? And the doctor says: well, you need symptoms X, Y and Z.

…and then when you do this, and jump through all that gatekeeping, everything breaks because suddenly the name your bank account is associated with no longer exists. You attempted to conform, and you still got screwed.

…and then when you resolve that, you out yourself, and mark yourself, forever. Everyone and their pet dog has a record that you’re trans; every data system you have to interact with is oriented to keep that known, for as long as possible, in case it becomes relevant to their model. And even if it somehow doesn’t, one of those background check websites, using outdated data, lists your deadname with that data, ensuring that you’re outed every time someone googles your number. Your administrative transition is a boon to you, but it’s also a boon to the vast number of systems dedicated to tracking the course of your life, for their own (not necessarily benevolent) purposes. And this tracking is going to be reductive; it will only track the things that matter, in the ways that are acceptable, punishing you when you deviate.

Marking & Punishing

An example of that punishment can be seen in a less state-oriented (and less trans-oriented) example I ran into in Forbes recently: health insurance companies tracking the food you buy. If you eat healthily, you get lower premiums; if you eat unhealthily, you get higher premiums.

Let’s set aside for a second the really (really) obvious issues with quantifying “food healthiness”. You’re tracking food purchases through smartphone apps, through supermarket and store purchases. Who gets tracked, and who doesn’t? Presumably if I go to somewhere like Trader Joe’s, the system is all neatly and nicely integrated and my health insurance company gets all my data. But I don’t go to Trader Joe’s: I live in the Central District, and there’s precisely fuck-all there except an Ezell’s fried chicken, where I pig out once a month, and a bodega, where I do my shopping. And if that bodega doesn’t have integration with my insurance company’s new fancy data science system for determining premiums, then according to their systems, I subsist off fried chicken approximately once a month, and nothing else. I’m guessing my premiums are going to be pretty high!
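The gap between what someone eats and what such a system records can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the store names, the `Purchase` record, the `INTEGRATED_STORES` set and the "healthy share" score are all hypothetical, and the purchases are made-up toy data, not anything drawn from a real insurer's system.

```python
from dataclasses import dataclass


@dataclass
class Purchase:
    store: str
    item: str
    healthy: bool  # the contested "food healthiness" label, taken as given here


# Hypothetical: the only stores whose sales data reaches the insurer.
INTEGRATED_STORES = {"chain_supermarket", "fried_chicken_chain"}


def observed(purchases):
    """Keep only the purchases the insurer's system can actually see."""
    return [p for p in purchases if p.store in INTEGRATED_STORES]


def healthy_share(purchases):
    """Fraction of purchases labelled healthy; None when nothing was recorded."""
    if not purchases:
        return None
    return sum(p.healthy for p in purchases) / len(purchases)


# A made-up month: regular shopping at an untracked bodega,
# plus one tracked fried chicken meal.
month = [Purchase("bodega", "vegetables", True) for _ in range(20)]
month += [Purchase("bodega", "snacks", False) for _ in range(5)]
month.append(Purchase("fried_chicken_chain", "fried chicken", False))

actual = healthy_share(month)                # what was really bought
recorded = healthy_share(observed(month))    # what the system believes

print(f"actual healthy share:   {actual:.2f}")
print(f"recorded healthy share: {recorded:.2f}")
```

In this toy month the score is computed from a single recorded purchase out of twenty-six. Nothing about the diet changed; only the coverage of the tracking did.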

And as with administrative violence, we have to ask: who does this harm the most? The Central District is the historically black district of Seattle; the people caught in this trap are disproportionately likely to be already-marginalised, already-marked. The system’s integration with things like smartphone apps for discounts and coupons further invisibilises people without smartphones - which, again, is disproportionately the poor, immigrants, people of colour. The distribution of costs of this reduction to the quantifiable and the countable is not even. And the response that the people running this program are likely to have to critiques is: you’re right! We should make sure there’s a sensor in the bodega, too. Now: I don’t know about you, but my idea of a solution to being othered by ubiquitous tracking is not “track me better”.

So the long and the short of it is that, as currently constituted, data science is fundamentally premised on taking a reductive view of humanity - and using that view to control and standardise the paths our lives can take, responding to critique only by expanding the degree to which it surveils us.

Reforming Data Science?

So: can we reform these systems? Tinker with the variables, the accountability mechanisms, make them humane? I’d argue: no. With administrative violence, Spade notes how “reform” often benefits only the least-marginalised, while legitimising the system and giving cover for it to continue its violence towards those more silenced. The same is true here, and a nice example can be found in attempts to reform facial recognition systems.

In 2018, a researcher at MIT published an intersectional analysis of gender recognition algorithms: facial recognition systems that identify gender, and are used for decisions from the small (demographic analytics) to the vast (who can access bathrooms). She found that these systems - which she notes are premised on an essentialist, biological view of gender - are biased against dark-skinned female-coded people. Her recommendation, developed in a subsequent paper, was that the people building facial recognition systems use more diverse datasets. In other words, it was a fundamentally reformist approach.

There are two really obvious problems with this. The first is that gender recognition systems are fundamentally controlling and dangerous to trans people, and simply cannot be reformed to not be violent against us. Normalising, reducing views of gender are what they’re for. Curiously, the second paper - the one about how great expanding data diversity was - did not mention the constrained view of gender these systems take. The other problem is that incorporating more people into facial recognition systems isn’t really a good thing even for them. As Zoé Samudzi notes, facial recognition systems are designed for control, primarily by law enforcement. That they do not recognise black people too well is not the problem with them - and efforts to improve them by making them more efficient at recognising black people really just increase the efficacy of a dragnet aimed at people already targeted by law enforcement for harassment, violence and other harms.

So, this reformist approach to facial recognition - making the system more “inclusive” - did not really reduce harm for the people actually, well, harmed. This control and normalisation is part of the point of data science. It is a requirement for data science’s logics to work. All reform-based approaches did was make violent systems more efficiently violent, under the guise of ethics and inclusion, and in doing so throw people such as trans people of colour entirely under the bus.

Radical Data Science

So in summary, data science as currently constituted:

  1. Provides new tools for state and corporate control and surveillance
  2. Discursively (and recursively) demands more participation in those tools when it fails
  3. Desires to, through that control, communicate universalised views of what humans can be, and lock us into those views.

Those don’t sound compatible with queerness, to me. Quite the opposite: they sound like a framework that fundamentally results in the elimination of queerness; the destruction of autonomy, contextuality and fluidity, all of which make us what we are and are often necessary to keep us safe.

Now: if you’re a trans person or otherwise queer person interested in data science, I’m not saying “don’t become a data scientist under any circumstances”: I’m not your mother, and I get that people need to eat to survive. I’m just explaining why I refuse to teach people, or train people, to be data scientists. Why I think that reformist approaches to data science are, while helpful, insufficient, and that co-option into data science, even to fix the system, is fundamentally inimical unless your primary question is asking who is left out of those fixes. You need to make the decision that is right for your ethics of care.

For me, my ethics of care says that we should be working for a radical data science; a data science that is not controlling, eliminationist, assimilatory. A data science premised on enabling autonomous control of data, on enabling plural ways of being; a data science that preserves context and does not punish those who do not participate in the system.

How we get there is a thing I’m still working out. But what you can do right now is build counterpower; alternate ways of being, living and knowing. You can refuse participation in these systems, whenever possible, to undercut their legitimacy. And you can remember that you are not the consumer, but the consumed - you can choose to never forget that the harm these systems do is part of the point.