A conversation with AI researcher and writer Kate Crawford, author of Atlas of AI. We discuss her new book, emerging forms of collective dissent, her collaborations with artists, and her ongoing work as a composer.
Hello, this is Informer. The show that reveals the latest ideas from artists, thinkers, and technologists, and invites you behind the screen to meet the people sketching, hacking, and imagining the next versions of our world. I'm Roddy Schrock, your host. In each episode, I spotlight creative minds grappling with a changing world through art, technology, or often both. And I hope you'll subscribe to this podcast at informerpodcast.com, where you can also find show notes, links, and more information on all of the artists and projects that we discuss.
So on today's inaugural episode, it is an absolute pleasure to reconnect with Kate Crawford. I've known Kate for quite a few years. I've invited her to be a part of numerous panels and projects throughout that time. And I'm always captivated by her eloquence in laying out emerging challenges and our relationships to technology.
Kate is a principal researcher at Microsoft Research, the co-founder and director of research at the AI Now Institute at NYU, a visiting professor at the MIT Center for Civic Media, a senior fellow at the Information Law Institute at NYU, and an associate professor in the Journalism and Media Research Centre at the University of New South Wales. She recently published a book called Atlas of AI.
This book really opened my mind to the ways that society's current adoption of machine learning is really just the latest moment in which humans have been blinded by science and allowed our tools to guide our ethics rather than the other way around. So I asked her about this, and about whether or not we've passed the tipping point—whether it's too late for us to take collective action and adopt AI in a way that's equitable for the public good, and not just for a few.
I also asked her about the role of art in opening up the possibility for collective action, and more about her recent collaboration with Trevor Paglen, a project that went viral. So it's the middle of the night here in New York as I'm talking to Kate, but I was already up and excited to catch up with her. It was the middle of the day for her in Sydney. And I began by asking her about her book: what it's about and why she wrote it.
So Atlas of AI has been a very long project in the making—five years of writing it and many more years of researching it. And for me, it really came from the perspective of having been a researcher of artificial intelligence for such a long time and noticing just how dominant this idea is that somehow AI is a spectral, immaterial technology. That it is sort of code in the cloud, that it has no real connection to the earth or the people who make it—this idea of disembodied ones and zeros.
I think that has very serious social and political ramifications. Certainly, it keeps us at arm's length from how these technologies are truly made. So part of the motivation of writing Atlas of AI was to really understand the full supply chains of artificial intelligence and in that sense to move away from the abstract algorithmic space and to ground it quite literally in the specific locations and institutions where people are designing these systems.
So, you know, to do a project like that really meant moving away from the more traditional academic approaches of being in libraries and reading papers, and actually putting myself in the locations where artificial intelligence is constructed—all the way back to the mineralogical layer, to the mines where the components are being pulled out of the earth to construct everything from consumer devices to data centers to all the parts of planetary computation. To go into the places of work, all of the hidden labor along the supply chain that is kept out of view behind the shiny picture we're given of automation.
And then also to look at the ways in which datasets themselves are constructed. To go into the labs where they're being made, to look at the archival practices of classification over hundreds of years, and to see how they are now re-emerging in sort of zombie forms inside machine learning.
I love the way you're describing that. And when I was reading your book, one of the things that really struck me was just the—I'm thinking topographically or almost texturally—just the way that you saw this world that so many of us are not exposed to. When you went to the houses of AI, which I just think is a brilliant description, and you sort of visited all of these locations from the mines to Silicon Valley.
I particularly remember your description of flying into the San Jose airport. At times it almost felt as though you were looking at this strange American topography from the perspective of, like, the woman who fell to earth. Almost like everything felt so alien, with such strange rules and unexplained behaviors. And I just wonder, could you talk a little bit about how you felt as you were going through that process? Did you feel like you were visiting an alien planet?
I love the way you put that, Roddy. I mean, in that sense, I wonder if it has to do with the way in which immigrants will always see with a different perspective. You know, I moved to America over a decade ago, but I've always been, in that sense, a stranger in a strange land. And certainly, the history of the way in which immigrants have [written] about the United States has been very influential on me, and on the way that I think about and study the world. I almost go all the way back to Tocqueville and the fact that you needed someone coming from France to do his PhD to really look at the ways in which democracy, as it was being constructed in the US, was actually running the risk of creating new forms of tyranny.
And certainly, we've had a lot of experience of that in recent years. But for me, I think part of that denaturalization that you're referring to is a very conscious attempt to really pull back the curtain, to look at us sitting there pulling the levers.
I feel like some of the things you revealed are so strange and, in some ways, disturbing. At one point in your book, you talk about how AI, and this move to the data-driven world we're in, really shifts the idea of the image and our relationality to images of other humans. A move from image to infrastructure, I think is the way you put it, in particular when you're describing the use of mugshots as fodder for AI.
And what do you think are some of the effects of this move? I guess I'm just asking, like, what does it mean to start viewing other people as infrastructure rather than people?
Mm, look, you know, it's a really important question. And certainly in the book, I look at it in terms of the way in which we train machine learning systems, particularly in the field of computer vision. And in computer vision, if you want to create a system that can, for example, tell the difference between an image of a cat and an image of a dog, you feed it thousands upon thousands, sometimes millions of images of cats and dogs, such that when it's presented with a new image, it can detect a certain pattern in that image to say, "Okay, this is more likely to be a cat than a dog." But when you start to see the way in which this works with images of people, that's where I think some of the deeply problematic underlying logics of computer vision become very stark.
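To make that concrete, here is a minimal sketch in PyTorch of the kind of training loop Kate is describing. Everything in it is an illustrative assumption rather than anything from her research: the hypothetical "data/train" folder of labeled cat and dog images, the choice of ResNet-18, and the hyperparameters.

```python
# A minimal sketch, assuming PyTorch, of the training process Kate describes:
# feed thousands of labeled cat/dog images to a model so it can classify a
# new image. The "data/train" folder (with cat/ and dog/ subfolders), the
# ResNet-18 backbone, and all hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# "Thousands upon thousands" of labeled examples: ImageFolder infers the
# label (cat or dog) from each image's subfolder name.
train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

# A small off-the-shelf network with a two-class output head.
model = models.resnet18(num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)  # penalty for wrong guesses
        loss.backward()   # compute how to nudge the weights toward the labels
        optimizer.step()  # apply the nudge

# After training, softmax over the logits for a new image yields the
# "more likely to be a cat than a dog" judgment she mentions.
```

The design point mirrors hers: the model never knows what a cat is; it only absorbs statistical patterns from whatever labeled images it is fed.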
And, you know, this is one of the things that I researched: looking at the early training sets that were constructed, certainly in order to create facial recognition. For example, one of the most common training datasets, which is maintained by NIST, the US National Institute of Standards and Technology, is a set of mugshots: mugshots of people who had been arrested multiple times over their lives.
It's a really disturbing training set. It's images of people at their most vulnerable—often people who are injured, people who are distressed and crying. And people, of course, who had absolutely no ability to refuse having this image taken; it's completely outside the framework of consent. And we have no idea if these people were ultimately imprisoned. They're just completely decontextualized images that are used as an infrastructure: used to train systems and to test algorithms to see how they perform over time.
And it really, it shocked me. I have to say it's one of the most profound realizations for me as a researcher to see the ways in which large-scale training sets over many decades have just been used in this completely depersonalized way, where any meaning or care that might be given to the image of an individual person or to the context behind a scene is presumed to be erased at the moment it becomes part of this aggregate mass that we call training data.
So in that sense, all data is just seen as material that can be ingested to improve technical performance. And that is the core premise in the ideology of data extraction right across AI. And that, to me, is a type of original sin. Once you make that shift, once engineers no longer even have to look at the images—because in many ways these are training sets used to train machines, so they're not necessarily assumed to be for human eyes—that is the minute we stop caring about the personal histories, the structural inequalities, and all of the injustices that come with these collections of images. That is when I think we build systems on forms of injustice. So that shift, where all of the care, or at least the sense of multiplicity of meaning, in an image gets erased when it becomes this kind of infrastructural mass, is a really important feature of how machine learning works today, and certainly of where it fails us.
Yeah, I hear you. And I think you at one point described that as being a shift from what previously were thought of as human subjects—as a way of talking about participants and consent in research—to what you described as "data subjects," where everything is, quote-unquote, an agglomeration of data points without subjectivity or context or even clearly defined rights. And I wonder, do you ever speculate on how that's impacting kind of personal relationships or societal relationships?
Absolutely. And it's interesting, certainly one of the ways in which I look at this in Atlas of AI is in terms of this issue of classification. We are being classified all the time, obviously, according to the sort of logics of ad tech and capital, as well as policing and the military. But so many of the ways in which we are currently classified by AI systems follow these profoundly narrow, and I think deeply concerning and outdated, frameworks.
So, for example, many systems still will try to automatically classify people into two genders, male and female, despite how completely preposterous that is in this day and age. Many use ridiculously narrow categorizations of race, having four or five racial categories. Sometimes they include "other" as a kind of gesture toward the idea that they've captured everything.
And then some have these profoundly worrying ways of assessing people's morality, or character, or effectiveness as an employee, or emotion. And here we can think of one of the chapters in the book that looks in detail at the many failures of emotion recognition, at why the very idea of emotion recognition is itself deeply pseudoscientific. But all along the way, we are seeing machine learning systems engage in the act of what I call in the book "making up people," which is a reference, of course, to an Ian Hacking essay from several decades ago now.
He looks at the way in which classification systems by states have often sort of made up human categories as ways to try and track everything from productivity to forms of surveillance. But we're doing it again, certainly in these machine learning systems that are primarily opaque to us, yet nonetheless, we're being classified and categorized in these ways thousands of times per day, if not more.
So the question then is, does that change how we see ourselves and see other people? And so Ian Hacking argued that it did, because the minute you create these classifications and people are classified, they in many cases begin to enact those kinds of classifications. They are told that they have a particular type of illness in the Diagnostic and Statistical Manual, so they will be treated in particular sorts of ways. Or in other cases, people will be told if they have particular sorts of features, they will receive particular kinds of benefits. And again, you change the social structures around behavior very, very quickly.
And there are questions about whether people are able to sort of ingest, if you will, these forms of machine subjectivity, or whether they are actively resisting them. And I think it differs really in kind depending on how much of it you can actually see and engage with directly, which is really just such a small amount of what's going on, because so much of it is, of course, internal to the logic of systems that we don't get to look into.
But certainly your broader question is: does it begin to change how we see ourselves and others? I would say absolutely, yes. The question of how, in a fine-grained way, you would track and understand that is, I think, something we're all still figuring out.
It really made me wonder, Kate. It really made me just wonder: have we passed the tipping point? I guess I just wonder, given where you sit and the broad perspective that you have on this whole economy of extraction, do you have a sense of where we are on the spectrum between being able to make real positive change as a society and the genie simply being out of the bottle, where we'll just see what happens?
I do think there's an enormous urgency in finding ways to very seriously rebalance how much power we give both to the broader political economy of tech companies and to the systems themselves. And that needs to be done, in some cases, in extremely strenuous ways, by very strict forms of regulation, not just the lighter forms of public pushback. I think it's time; we really do need strong regulation.
But that said, you know, I think part of the problem that you're pointing to is actually deeper than that. It's a sense of almost technological inevitability. And I think certainly one of the things that is deeply problematic is when we believe that we can't change anything—that certainly these tools and these systems and these forms of power are inevitable, and we simply have to accept them as they are.
In that sense, it's certainly one of the things that I write about at length, which is: how do we create politics of refusal? How do you oppose these narratives of technological inevitability that posit that if it can be done, it will be? So in that sense, rather than just assuming that these are the systems of power as they are and the sort of technical organs through which they will continue to centralize that power, we should be thinking about: well, how do we actually carve out spaces of refusal? How do we create zones where these tools and systems simply can't be used, to in some ways push against what Donna Haraway called the "informatics of domination"?
I certainly think we've got some really good signs just in the last couple of years. Some of them are very localized. We could look at the local bans of facial recognition, or protests against algorithmic grading, or the various cities and in some cases countries that are moving against predictive policing. We could look to the fact that the EU just recently drafted the first-ever omnibus regulation for AI. But we've got a long way to go, in the sense that all of these moves are highly localized and don't necessarily help people outside of those generally quite wealthy and technically literate cities.
But it also indicates to me that we have a real task on our hands in terms of really designing what good safety rails will look like around these systems, what strong regulation will look like, and what forms of public dissent are going to be most effective.
One of the things that gives me a lot of hope and optimism is when I see creative responses and artist-driven public dissent that not only help raise awareness—though I think that's sometimes too glib a way to describe it—but actually encourage people who may simply not be aware of what's happening to better understand their relationship to these machines that are becoming more and more embedded, literally, in our lives.
And one of the things that you did was work with the artist Trevor Paglen to create an art project that anyone could use, and that would help them understand what these big datasets perceived each of us, individually, to be. And sometimes it was hilarious, and sometimes it was shocking. And other times it was terrifying, I might say. And that project really went—I guess it went viral. I heard people responding to that project and seeing it really wake them up to these hidden systems. And I wonder if you could talk a little bit about your relationship to art and creative practice as a means of creating the kind of public dissent, or at least the public understanding, that you talk about in your book?
Well look, for me, some of the most rewarding projects have been collaborations and visual investigations and creative projects working alongside artists who really approach these questions from different perspectives than the so-called classical academic approaches. And I've had the great privilege of working alongside artists like Trevor, but also Vladan Joler and Heather Dewey-Hagborg, and to be inspired by so many of the artists that have certainly been part of the Eyebeam community but also part of the wider New York community of artists who look at these sorts of technological concerns. I'm thinking of Luke [inaudible], I'm thinking of Jeff, I'm thinking of Tega Brain and Sam Lavigne, Ingrid Burrington... it's a long and fantastic list of humans who are addressing these sorts of questions.
And certainly, the project with Trevor Paglen was many years in the making. We call it "Excavating AI" because it really felt like years in the trenches of digging through the training datasets that are used to build AI systems. Looking, in some cases, at training sets like ImageNet, which has 14 million images that, in many cases, I just don't think people have looked at at all, which explains why there were so many horrific things in there.
And in other cases, looking at the systems behind them—how did these training sets change over time? And it was really through working with Leif Ryge, who is an incredibly talented programmer in Trevor's studio, that ImageNet Roulette was constructed. I think that's the project you're referring to that went really unexpectedly viral.
The reason to make something like ImageNet Roulette is really to give people the ability to see into the logics of a training dataset. Because you can certainly—and we have—given talks about ImageNet, we've shown pictures, we've written papers. But the ability to train an AI model purely on what's called the "people category" of ImageNet—that's the category that has images of people and classifies them with labels that go anywhere from "boy scout" to "kleptomaniac"—and to allow people to upload their images and to see how they would be classified? You know, what sorts of labels would be applied to different images of themselves and their friends? That's when you really get to see the deep politics of these systems. That's where you get to see how young people are labeled differently to old people, and Black people labeled differently to white people. And you get to see this sort of racialized, gendered, and ableist logic underneath these kinds of systems.
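For readers curious about the mechanics, here is a hedged sketch of the inference step behind a tool like ImageNet Roulette: run an uploaded photo through a classifier and return the label it assigns. The file names ("person_model.pt", "person_labels.txt") and the classify_person helper are hypothetical stand-ins; the actual project used its own model trained on ImageNet's person categories.

```python
# A hedged sketch, again in PyTorch, of classifying an uploaded photo
# against a vocabulary of person labels, as a tool like ImageNet Roulette
# does. The artifacts ("person_model.pt", "person_labels.txt") and the
# classify_person helper are hypothetical stand-ins, not the project's code.
import torch
from torchvision import transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical: a fine-tuned network and its label list, which for
# ImageNet's person categories runs from "boy scout" to "kleptomaniac".
model = torch.load("person_model.pt")
model.eval()
labels = open("person_labels.txt").read().splitlines()

def classify_person(path: str) -> str:
    """Return the label the model pins on the person in this photo."""
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(image), dim=1)
    return labels[int(probs.argmax())]

# e.g. classify_person("selfie.jpg") surfaces whatever category the
# training data's politics assign you -- that exposure was the point.
```

Note how indifferent the code itself is: it will return "boy scout" or "kleptomaniac" with equal mechanical confidence, which is exactly the politics of the label set that the project set out to surface.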
And, you know, for us, I can remember the first time we gave a talk, Trevor, Leif, and I, showing ImageNet Roulette. A few people thought, "Oh, that was interesting," but there wasn't really any kind of moment where it caught fire. It was actually six months later, when somebody tweeted it out, that it just took off. It was one of those really strange moments where suddenly it went from just a few people engaging with it to a million people a day uploading images. We had a real problem on our hands in terms of maintaining that infrastructure. It was just crazy for a brief moment there.
But the beauty of it was allowing people to see the systems, while at the same time being sensitive to the fact that many of these labels are truly offensive, triggering, and harmful in themselves. So we also wanted to be very careful about how long we left ImageNet Roulette out in the wild for people to engage with, and we kept it to a brief moment. You can still see it occasionally in gallery shows that Trevor puts on, but as a public moment, we think it made its point.
And certainly, a few months after we did the ImageNet Roulette project, the makers of ImageNet released a paper saying that they were going to, quote-unquote, "mitigate bias" in ImageNet by deleting over 600,000 images and many of the categories that we critiqued. The question is: is that enough? Does that actually resolve the fact that ImageNet has been out there for over a decade and has informed so many of the production-level AI systems that are around us every day? Those logics have already entered the water supply that's all around us, if you will. So how do we think about the afterlives of training datasets in that context? That's certainly something that we're continuing to work on to this day.
I have to say, one of the thoughts that I kept having while reading the book was that, you know, I'm in a very privileged position. I can call you up and have a conversation. If I've got a question about data, I can read this book. But that's so different from the vast majority of people on this planet. And certainly in the United States, it seems to me that one of the issues is really how there can be collective action in pushing for positive change, given how urgent this issue is for the way that we live and what the future holds for all of us on this planet. Are we simply overwhelmed by the responsibilities of what citizenship means right now?
I don't think so, in the sense that you're absolutely right: late-stage capitalism is designed to keep us at a remove from seeing how the sausage is made, if you will—from understanding the many forms of extraction that go into how consumer devices are constructed, and the profoundly disturbing political logics that can be driving large-scale AI systems in both the private and public sectors. I think, however, that people are aware when these systems are touching their lives and producing serious forms of harm.
In that sense, I'm thinking here of the many kinds of activism that we've seen in local contexts against particular types of algorithmic and AI systems. I'm also thinking of what's happened since the pandemic. It's been such a horrifying time for so many reasons, but certainly one of those reasons has been seeing the way the types of political structures that I write about in Atlas of AI have only gained traction. We've seen an increased centralization of power. We've seen even the richest people become even richer.
I can remember a few years ago, when Vladan Joler and I did "Anatomy of an AI System," we looked at how much money Jeff Bezos makes in a day compared to a cobalt miner who is actually producing the cobalt for his systems. And honestly, the miner would have to work 700,000 years to earn the same amount that Bezos made in a day. And that was back then; he's of course made so much more money now. We're looking at such extremes of wealth and such extremes of inequality that it is incumbent on us, I think, to say that one of the great responsibilities of citizenship is to do what we can to address this preposterous asymmetry of wealth and power, certainly in the West, but we can look more broadly.
So I think there is a different consciousness emerging around things like climate change, around things like racial inequality, around certainly the ways in which we have responsibilities to each other and to our ecologies. It's been brought into relief certainly by the last 18 months.
That is such a key point and something that I think we can all think about more. And I'm sure it's going to be very resonant with our listeners. To change topics pretty drastically, one thing that I definitely didn't want to leave out of this conversation was to just learn more about whether or not you're still making music.
The most recent record we released, just before the pandemic began, is with an outfit called Metric Systems. We released an album called People in the Dark, which has only become a more relevant title, and you can find it on Bandcamp and your various other forms of hegemonic music networks.
But in terms of making music, you know, it is one of the things I just wish I had more time for. And certainly in the last year, with all of the uncertainty that's come with it, my priority has been much more on being present for my people and my community and getting the book out there, rather than making music. But it's lovely that you mention it, because one of the things that really reconnects me, and makes me feel more optimistic about where we are in the world, is actually writing music.
So getting back into analog synthesizer mode, and actually building modular synths, is something that I also love doing, which has become a lot easier than it used to be, because there are shops where you can buy components and solder them together yourself. It's very exciting. So in some ways, little things like that are always going to be part of my life, and certainly something I'm looking forward to making more time for.
Oh, wow. I love this image of you actually soldering modular synths together. I'm just thrilled to know that that's something that you're working on.
You've got a background in noise bands, right?
Yes. That was sort of my early exposure to creative technology—somehow taking a weird but totally amazing sojourn to Japan and working in a noise band, and having a very naive but enjoyable perspective on what technology was like in the early 2000s. That's changed a lot over the years, but I still hold on to, just as you're mentioning, the belief that thinking creatively and artistically, and utilizing the best promise of technology as a kind of creative tool, is something that can really recharge your batteries. And in my case, it reminds me that there's a version of these tools that can still be used for beautiful outcomes.
Yeah, I'm going to totally check out your music. And besides music, what are you working on now and what do you see as your next sort of project?
Well, there are quite a lot of exciting things in the works at the moment, including a new project with Vladan Joler. We've taken "Anatomy of an AI System," from back in 2018, as a starting point, and we're now creating a project that I have to say is even more massive and complex. Instead of just thinking about space, we've added in the fourth dimension, and we're looking at time as well, just to make our lives particularly hard. So I can't say too much about that, but I will tell you all about it when it is finished and public. It's been really exciting to be working together again.
And I'm also doing a multi-year project looking at the deeper foundations of machine learning—the conceptual foundations as well as the practical making, the epistemological layer as well as the material layer, if you will. So that is something that has been such an intellectually confronting, but also engaging, project: to think about whether other ways of creating these systems are possible.