Maxwell's Demon is a famous thought experiment in which a mischievous imp uses knowledge of the velocities of gas molecules in a box to decrease the entropy of the gas, which could then be used to do useful work such as pushing a piston. This is a classic example of converting information (what the gas molecules are doing) into work. But of course that kind of phenomenon is much more widespread -- it happens any time a company or organization hires someone in order to take advantage of their know-how. César Hidalgo has become an expert in this relationship between information and work, both at the level of physics and how it bubbles up into economies and societies. Looking at the world through the lens of information brings new insights into how we learn things, how economies are structured, and how novel uses of data will transform how we live.
César Hidalgo received his Ph.D. in physics from the University of Notre Dame. He currently holds an ANITI Chair at the University of Toulouse, an Honorary Professorship at the University of Manchester, and a Visiting Professorship at Harvard's School of Engineering and Applied Sciences. From 2010 to 2019, he led MIT’s Collective Learning group. He is the author of Why Information Grows and co-author of The Atlas of Economic Complexity. He is a co-founder of Datawheel, a data visualization company whose products include the Observatory of Economic Complexity.
Sean Carroll: 00:00:00 Hello, everyone, and welcome to the Mindscape Podcast. I'm your host, Sean Carroll, and by now, in this society we live in, everyone understands that information is something very important, whether it's the kind of information we get over the internet in terms of news and what's happening, what the weather's going to be like the next day, but also information in a more technical sense. If you have a company, it needs information about what sales are going on, where the customers are, what products are available, and what you should be selling. If you're a scientist, information is the data that you have about the universe. Information is another way of thinking about what we know about the world, so it's an extremely general concept, but there is so much information around us right now that it becomes a subject in its own right to understand what information is and how best to harness it.
Sean Carroll: 00:00:52 There's really no better person to talk to than today's guest César Hidalgo. César was trained as a physicist, but he quickly got into the idea of statistical mechanics of information, which led him, believe it or not, into economics, and he started studying not just economics in its own right, but how data flows through economic channels and how it becomes actual physical products.
Sean Carroll: 00:01:18 Now, after spending a long time as the lead of MIT's Collective Learning Group, César is newly a chair at the University of Toulouse in France, but he's also the head of a startup company called Datawheel. He's been very involved in data visualization and how that can help us understand what's happening in different places around the world. I can really recommend his book, Why Information Grows, which starts much like the conversation you're about to listen to does, from the basics of what we mean by information at the level of physics or even philosophy, and moves into how information moves around, whether it's through a biological organism, through a society, through a culture, through an economy. It's a different lens. It's a different way of thinking about what's going on all around us. César's a very charismatic proponent of thinking about information in interesting ways. I think you're going to like this. Let's go.
Sean Carroll: 00:02:29 César Hidalgo, welcome to the Mindscape Podcast.
César Hidalgo: 00:02:30 Thank you.
Sean Carroll: 00:02:32 You started as a physicist, is that right?
César Hidalgo: 00:02:34 That's right.
Sean Carroll: 00:02:35 And you moved through... at any point were you officially an economist?
César Hidalgo: 00:02:39 No, I never got a degree in economics, but I've been working on topics related to economics I would say now for 14, 15 years.
Sean Carroll: 00:02:47 Okay, so you know the lingo, you're pretty [crosstalk 00:02:49]-
César Hidalgo: 00:02:49 Exactly, and I have a lot of colleagues and enemies in-
Sean Carroll: 00:02:52 Colleagues and enemies is a way to put it. But now you're also very interested in data visualization and things like that. To those who are not experts in any of these areas, what is the 30,000-foot view of all this? How do you think of your project of putting these things together?
César Hidalgo: 00:03:08 The way that I define my work is I say that I focus on collective learning. What I try to do is to understand how teams, cities, nations, and countries learn, how they acquire new knowledge, and how they put that new knowledge to use. To do that, I do a lot of things. I study the creation, diffusion, and valuation of knowledge, and I've contributed a lot to that literature in economic geography and innovation, but I also have created lots of platforms to integrate and distribute large volumes of public and private data as a way to improve the way that we see our world.
Sean Carroll: 00:03:38 Does the physics background help you here?
César Hidalgo: 00:03:40 I think so because at the end of the day, what you rescue from an education like that of physics is that, to understand the world, you need to always have an interplay between theories and experience. That duality is useful in physics, but it's useful in economics and it's useful in most other fields, so I do think that my work still is always at the boundary between what the data is telling us and how we interpret it.
Sean Carroll: 00:04:06 But even more than that you have... so I read your wonderful book, Why Information Grows.
César Hidalgo: 00:04:11 Yes.
Sean Carroll: 00:04:13 The word "information" is obviously playing a large role here. Information has different definitions in different contexts. It's closely related to things like entropy that physicists care about. How do you think about information? What is your idea since you say that word?
César Hidalgo: 00:04:28 Of course, it depends on who I am talking to, but in a more technical sense, I like to think of information as the sort of third thing that is very basic and important to understand. In the universe we have things, think of matter, things that we can touch or that have some sort of embodiment, but we also have the movement of those things, which we can think of in terms of energy and momentum and so forth, but there's a third quantity that we need to consider, which is not things nor how they're moving, but how they're arranged or ordered. To me, that's the basic idea of information. It's like the sequence of things, the way in which you stack a deck of cards. If you shuffle a deck of cards, you don't change the mass, you don't change the energy, but you're changing something. That order is information.
Sean Carroll: 00:05:16 I guess probably people, when you say the word "information," they think that that information is about something, that it contains meaning, not just data. When you shuffle a deck of cards, the meaning doesn't really change, or maybe it does. I'm not quite sure how you think about it.
César Hidalgo: 00:05:32 Maybe a better analogy than a deck of cards is to think of DNA. If I change the sequence of nucleic acids on DNA, I can transform one piece of DNA from encoding one protein to encoding a different protein. That protein and what it does and the context in which it's being used, you can think of that as the meaning, but the little piece of DNA that is a certain sequence doesn't really know about that meaning. That meaning is beyond it. It's part of the environment and the way that that ordered sequence interacts with the rest of the environment.
César Hidalgo: 00:06:06 I try to think of information, when I think about it in fundamental terms, as those sequences, but of course, when I'm talking with someone for example from the field of communication or media studies, I understand that information there is much more related to meaning, and you can have concepts like misinformation, which in the DNA example would be a little bit harder to build.
Sean Carroll: 00:06:27 I guess that's what I'm trying to get at. For the DNA molecule, or just for a set of letters on a page, does any arrangement contain the same amount of information, or do they contain more information if the context they're in cares about what they say in some way?
César Hidalgo: 00:06:45 Well, it depends on which definition we're using again. If we're thinking from a pure Shannon perspective, basically a random sequence is going to be the one that contains more information because-
Sean Carroll: 00:06:56 More information, yeah.
César Hidalgo: 00:06:57 ... it's the hardest to predict.
Sean Carroll: 00:06:58 You know it means nothing.
César Hidalgo: 00:07:00 Exactly. But let's say now we're in the context of communication, and I'm trying to communicate something to you. The words that are going to contain more information are the ones that reduce your uncertainty more about what I'm trying to say. Maybe a more useful way to think about information in that context is not simply how many bits do I need to encode something, but how much do I reduce your uncertainty with each bit that I provide to you.
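As a rough illustration of the two views of information in this exchange, here is a minimal Python sketch. It was not part of the conversation. It assumes a simple frequency-based estimate of Shannon entropy, which ignores correlations between neighboring symbols, and the function names (`entropy_bits`, `sequence_entropy_bits`, `uncertainty_reduction_bits`) and the toy inputs are hypothetical choices made for the example.

```python
import math
from collections import Counter


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def sequence_entropy_bits(sequence):
    """Entropy per symbol, estimated from symbol frequencies alone.

    Assumption: this is a zeroth-order estimate. It treats symbols as
    independent draws, so it does not see order or repetition patterns.
    """
    counts = Counter(sequence)
    total = len(sequence)
    return entropy_bits(count / total for count in counts.values())


def uncertainty_reduction_bits(prior, posterior):
    """Bits of uncertainty removed when a message turns prior into posterior."""
    return entropy_bits(prior) - entropy_bits(posterior)


# A sequence that uses four symbols equally often is the hardest to guess
# symbol by symbol; a sequence with one repeated symbol has no surprise left.
balanced = sequence_entropy_bits("ACGT" * 25)
repetitive = sequence_entropy_bits("A" * 100)

# Communication view: a listener starts with 8 equally likely meanings,
# and a hint narrows them down to 2 equally likely meanings.
gained = uncertainty_reduction_bits([1 / 8] * 8, [0.5, 0.5])

print(f"balanced sequence:  {balanced:.2f} bits per symbol")
print(f"repetitive sequence: {repetitive:.2f} bits per symbol")
print(f"hint removed:        {gained:.2f} bits of uncertainty")
```

The first two numbers correspond to the "pure Shannon" reading, where unpredictability counts as information. The last number corresponds to the communication reading, where what matters is how far a message narrows down what the listener still has to guess.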
Sean Carroll: 00:07:25 Okay, good. This is sounding nice and physicsy, and I like this. We're going to get to the role of information in economies and firms and networks and things like that, but let's stick with the physics angle here. I mean, how do you think about the origin of information [inaudible 00:07:40] all the way back to the evolution of the universe or the evolution of life or something like that?
César Hidalgo: 00:07:45 That's a good question. Because in some way, information and order and complexity are conspicuous on our planet. We marvel at it every time we go and see a landscape or walk around the city, but at the same time, if we were to take a spaceship and travel across our solar system or beyond, we would see a universe that is quite barren. Complexity's not everywhere. Complexity actually concentrates in places like our planet, and it leads us to ask, "Why? Why is our planet so rich in complexity whereas other places are so barren," and-
Sean Carroll: 00:08:21 Like the moon is not-
César Hidalgo: 00:08:22 Exactly-
Sean Carroll: 00:08:23 ... that rich in complexity.
César Hidalgo: 00:08:23 ... the moon is not complex, or even if you go to the Atacama Desert here on our planet, you're going to find places that actually don't have much structure beyond the geological formations that you can find there.
Sean Carroll: 00:08:34 And some telescopes these days, by the way.
César Hidalgo: 00:08:36 Yeah. Yeah. Indeed. To understand the origin of this complexity and this order, I think the best solution that I've found is work like that of Ilya Prigogine. Ilya, as you know very well, is a very famous statistical physicist and chemist who studied physical systems that were out of equilibrium, near equilibrium, but out of it. He found that those systems that were out of equilibrium tended to self-organize into what he called dissipative structures. Think of the little whirlpool that forms when you take the plunger out of a bathtub. That's a structure that is not haphazard. There are correlations there. There is a certain order. There's structure. That structure is something that emerges when that system is going through a state in which it's flowing, in which it is going from one energy state to another.
César Hidalgo: 00:09:32 He says when the systems are out of equilibrium, when they're moving from one state to another, they kind of organize. That is an important clue because if you think about it, a lot of the structure that we observe in nature is in life, and life is an out-of-equilibrium system that has to be sustained by energy flows. We have to eat many times a day. We're breathing, and we do get energy also out of the oxygen that we breathe. There's a lot of energy consumption that we need to stay out of equilibrium because the only way to maintain our level of organization is to stay out of that equilibrium. I think Prigogine is the one that gives us a first good clue of why there's structure in the universe.
Sean Carroll: 00:10:14 Yeah, so in other words, from the perspective of someone like me, if you just let the system go all by itself, it would go to a maximum entropy state. It'd be boring equilibrium. Who would... The bathtub would just become flat, the water on top of it. But you're saying that following Prigogine that, in the right circumstances, if you feed it some energy, in fact, some energy in a low-entropy form, then it obtains this orderly configuration, I guess.
César Hidalgo: 00:10:44 Exactly. That's conspicuous not only in biology. Now our lives revolve around electronics that become useless the moment that the battery runs out. They're able to process information. They're able to show information on their screens, encoded as pixels or whatever you're using as a display type of technology, because they're consuming energy. That energy consumption now is essential to kind of like keep order because, otherwise, entropy, that's what it does.
Sean Carroll: 00:11:14 What should we say that differentiates, really, the earth from the moon in this case? I mean, we're at the same distance from the sun, right?
César Hidalgo: 00:11:20 Yeah, but I think here, there's a lot of things that have happened that allow us to preserve information effectively. There's also like a chemical complexity that maybe might be missing in places like the moon. On the one hand, we do have an atmosphere, that makes a big difference, and we do have oceans, which also make a big difference. Those structures together with others have been able to create path dependencies for replicators like DNA and RNA to create order, sustain it, and reproduce it. In some way, the complexity that we observe today is not the result of like an instantaneous event or a condition that is present today on Earth and not present on the moon, but also of a long path-dependent process in which this complexity has grown over time because we're able to generate more of it per unit of time than we lose, you know?
Sean Carroll: 00:12:15 Right. Yeah, I mean, I guess if I... I haven't actually thought about this long. I'm going to say things that could be disastrously wrong, but probably Venus, even though it has an interesting looking atmosphere, the surface of it from the few pictures that we have doesn't look that different from the moon. I mean, it's like there are some rocks lying around and nothing more organized than that, and probably there are conditions specific to Earth related to the existence of both solids and liquids and things like that that allow for these channels to open up and complexity to develop. Is that the right way to think about it?
César Hidalgo: 00:12:46 I think so, yeah. Of course, this is a tough question. If we would have a succinct answer, we would have had like the soundbite that solves the problem of the origins of life.
Sean Carroll: 00:12:57 I know, yeah.
César Hidalgo: 00:12:57 But I do think that even though we might not have that, we do have clues of some of the conditions that lead to the creation of complex structures such as life, and one of those is the need to have energy flows, but also the need to not lose them that quickly and to be able to preserve the structure, and here on planet Earth, we do that by having certain solids and crystals that support the structure. For instance, DNA is a very stable molecule that is able to preserve a lot of complexity and information over long periods of time, and that, with the ability to replicate, allows us to have the conditions that we need not only to preserve information but then to make it grow.
Sean Carroll: 00:13:39 It's not even perfectly stable, right? I mean, if DNA were absolutely stable, never changed, wouldn't do the job. You need this kind of flexibility, I guess.
César Hidalgo: 00:13:47 Exactly. You need to be able to explore those spaces of configurations so that you can actually then grow in complexity because complexity requires diversity.
Sean Carroll: 00:13:57 There's some simple version of information, which is just that when entropy is low, there's secret... in some sense, there's a lot of information because you know a lot about the system, but you're making the point that it's this complexity that gives us the ability to really make use of that information? Is that the right way to say it?
César Hidalgo: 00:14:15 Yeah, I do think that complexity in many ways is kind of like a better term, maybe because it's a little bit more loosely defined for most people, but I do think that when you talk about the planet Earth as a planet that has a lot of information, maybe people think about the media, the libraries. When you think about it as a place that is very complex, maybe people get a better idea that we're talking about like that complexity that is involved in ecosystems and our society that is absent from the moon, and it's also absent from libraries.
Sean Carroll: 00:14:44 Yeah. Yeah, so there's this interplay. I wish I understood it better, maybe because no one does, between complexity and information. In some sense, you're making the point that complexity makes use of information, and information makes complexity possible so they're... I'm not sure they're the same thing, but they're at least symbiotic.
César Hidalgo: 00:15:00 Yeah, I do think that they're related and part of kind of like a set of symbiotic relationships. The other aspect that I do think is important in that relationship is what I call the capacity to compute. If we go back to the DNA analogy, you can think of the DNA, and you can think of the cell. DNA by itself is quite useless. It cannot reproduce by itself. It requires all of this machinery. The same is true for a lot of information that we have. It's like a recipe without a kitchen and without a cook. It cannot transform itself into a dish.
César Hidalgo: 00:15:31 We do have also that ability to then grab a piece of encoded information as a set of instructions and transform it into something. That ability to transform information into new information or to reproduce it or to recombine it is that computational capacity that we observe in biology, that we observe in society, that I think is the true mystery, so I call that knowledge, and my separation between knowledge and information is that information is what is encoded, and knowledge is this ability to make. You can make things by making a car. A car is information. It's an organized structure, just like DNA, or you can have knowledge when you are making a new cell type, when cells differentiate, and that knowledge, that ability to make, is ultimately what is hard to accumulate both at the biological level and at the social level.
Sean Carroll: 00:16:23 You said the word before, compute, the ability to compute. Now you're saying the ability to make. Are those the same thing?
César Hidalgo: 00:16:29 In a lax language, when we're 30-
Sean Carroll: 00:16:32 Yeah. We can relax. Don't worry.
César Hidalgo: 00:16:33 Exactly, at 30,000 feet away or more, yes, I would say you have these ordered structures and you have the ability to make those ordered structures. That ability to make, we can call it the ability to compute, the ability to transform a string of bits into another string of bits. Whether those bits are encoded on a magnetic tape or on a piece of DNA, from this perspective, would be irrelevant. Of course, there are other situations in which you want to make those distinctions.
Sean Carroll: 00:17:01 Right. Presumably, there are also phase transitions, or at least transformations along the way where the system becomes better and better at accumulating and using information.
César Hidalgo: 00:17:12 Yep. Yep.
Sean Carroll: 00:17:13 So life would be one, multicellular life. I'm thinking of all these things. I just had a podcast a little while ago with Kate Jeffery, who is a neuroscientist, and she... I had given a talk saying how complexity can evolve, and she wanted to say, "Yes, but also, it goes away sometimes because there are disastrous events," and-
César Hidalgo: 00:17:29 Exactly.
Sean Carroll: 00:17:30 ... and it's not at all guaranteed that it comes in and just grows monotonically.
César Hidalgo: 00:17:34 Yeah, indeed. I think that's true for economies, societies, and ecosystems. We've seen the collapse of ecosystems. We might be risking a big ecosystem collapse now with climate change, and we do see it in social processes, like processes of social unrest. There are countries where sometimes everybody thinks that they're going fine, and things turn very quickly, like what is happening in Chile this week, what has happened in many places in the past.
Sean Carroll: 00:18:01 Yeah, no, worldwide, this is definitely going on. Let's make that transition. Let's presume that after the last 15 minutes everyone understands the origin of life and how it takes in low-entropy energy. At what point does life become economics? At what point do we talk about trading information back and forth like that?
César Hidalgo: 00:18:21 What I try to communicate in Why Information Grows, I don't know if I succeed, but what I try to communicate is, at the end of the day, you have these systems that have a finite ability to accumulate knowledge, to accumulate that capacity to make. The only way that those systems can transcend that limited capacity is by developing collective phenomena, collective systems that include multiple units. You go from single-celled organisms to multicellular organisms because you could never achieve the level of complexity of a multicellular organism with a single-celled organism, but multicellular organisms, they peak at the human, let's say-
Sean Carroll: 00:18:59 So far, as far as we know, yeah.
César Hidalgo: 00:19:00 As far as we know. Humans also have limited capacity. The older you get, the more that you realize that you know very, very, very, very little about everything that could eventually be known. Humans transcend that capacity by forming teams, teams transcend those capacities by forming organizations, organizations transcend those capacities by belonging to industries and also by having cities, and you have this like Russian doll structure of organizations that then grows all the way to our planet, in which we're able to accumulate more complexity and more knowledge, more of the ability to make, by always renormalizing ourselves into groups of the units that bumped into a ceiling.
Sean Carroll: 00:19:42 So division of labor in some sense-
César Hidalgo: 00:19:44 Exactly.
Sean Carroll: 00:19:44 ... is really what's getting us all this knowledge.
César Hidalgo: 00:19:45 But division of knowledge, which I would say is different than the division of labor because I can have a division of labor in which we're all doing the same thing, and we'll have more of us doing the same thing, but the division of knowledge is quite different because we're doing different things, and we're passing on those inputs to each other as a way to create things that would be impossible for each one of us to do.
César Hidalgo: 00:20:09 When you're making an aircraft, it's not that you have a 100,000-person company in which everybody's making an aircraft by themselves with a hammer and with some metal. It's that everybody's doing something different, and that allows them to create a few aircraft a year that are the product. Now, if you are in a lower complexity, let's say, activity, like the production of T-shirts, in that case, you might have more of a division of labor and less of a division of knowledge. You might have warehouses full of seamstresses who are all doing the same task, doing the same shirt, and in that case, you have more of a division of labor, you have economies of scale, but you don't have that division of knowledge that you would observe in a complex industry like pharmaceuticals or aircraft manufacturing.
Sean Carroll: 00:20:52 I presume that for a complicated aircraft, there's literally nobody who has all the knowledge required to do it.
César Hidalgo: 00:20:59 Of course. Yeah, that's the whole point, and that's the idea that basically you can think of the complexity of an industry as the number of what I call person bytes of knowledge.
Sean Carroll: 00:21:07 Person bytes.
César Hidalgo: 00:21:08 Yeah.
Sean Carroll: 00:21:09 B-Y-T-E-S.
César Hidalgo: 00:21:10 Exactly, of knowledge that you would need to be able to create a product.
Sean Carroll: 00:21:15 Yeah. Okay. But was that... did we skip ahead? When I think of economics in general, I mean, I think of trade and barter and maybe currency and value, and those can precede this sort of division of knowledge, or not. I'm not sure.
César Hidalgo: 00:21:33 Yeah, so in some way, my way to enter economics has been through a gap in the literature that was there in which products were very much considered like some sort of epiphenomenon that was not very differentiated in the economics literature. Economics is a field that comes more from traditional, like bankers and merchants, so it's kind of more about trade and interest rates-
Sean Carroll: 00:21:59 Money.
César Hidalgo: 00:21:59 ... prices and money and the cost of employment and those things, but whether you're producing ladders or you're producing apples or you're producing cars or you're producing T-shirts are things that were not differentiated too much in the models and the empirical work that was traditionally in the literature. I think I was lucky to enter the field at a moment in which more fine-grained data was becoming available, and that allowed us to start characterizing products and industries in a more fine-grained manner, not just by talking about how much labor or capital they need, but by actually looking at the patterns of production to be able to infer how much knowledge they need to be produced or other properties that differentiate them in ways that the traditional literature had not maybe paid attention to.
Sean Carroll: 00:22:54 Do you think that kind of differentiation was there from the start, from the first primitive economies, or is it something that allowed economies to take off later in the game?
César Hidalgo: 00:23:03 No, that differentiation is actually very hard to achieve. One thing that is quite conspicuous when you look at international trade data, or you look at data on the geography of industrial, economic, or innovative activities in a country, is that as you move from the simple economic unit of the towns to the more complex, the cities, or you move from countries that are relatively down in the development ladder to the ones that are at the top, you have this subset structure in which diversity grows with the level of development and complexity, so that the places that do few things, they not only do few things, they do things that are common to the places that do many things.
César Hidalgo: 00:23:46 You have this world in which creating that diversity is difficult, and those that don't have it are stuck not only with a primitive and limited offer, but one that is very redundant with everyone else. It's very uncompetitive too. These are also mechanisms that would actually help us explain part of inequality because if what I can do is something that everybody can do, and it's a small set of things, it's very hard for me to be competitive and have a decent income. If I can do a lot of things that nobody else can do and people want, I think I have it made.
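The subset structure described above can be made concrete with a small Python sketch. It was not part of the conversation and it is only an illustration under stated assumptions: the `production` table is invented toy data, not real trade data, and the helper names (`diversity`, `ubiquity`, `average_ubiquity`, `is_perfectly_nested`) are hypothetical. Real data would be only approximately nested, so a strict subset check like this one is a simplification.

```python
from collections import Counter

# Hypothetical toy data: which places make which products.
production = {
    "place_a": {"t-shirts", "apples", "cars", "aircraft"},
    "place_b": {"t-shirts", "apples", "cars"},
    "place_c": {"t-shirts", "apples"},
    "place_d": {"t-shirts"},
}


def diversity(production):
    """Number of different products each place makes."""
    return {place: len(products) for place, products in production.items()}


def ubiquity(production):
    """Number of places that make each product."""
    return dict(Counter(p for products in production.values() for p in products))


def average_ubiquity(production):
    """Mean ubiquity of the products each place makes.

    Assumes every place makes at least one product.
    """
    ubi = ubiquity(production)
    return {
        place: sum(ubi[p] for p in products) / len(products)
        for place, products in production.items()
    }


def is_perfectly_nested(production):
    """True if each less diverse basket is a subset of the next more diverse one."""
    baskets = sorted(production.values(), key=len)
    return all(small <= large for small, large in zip(baskets, baskets[1:]))


print(diversity(production))
print(average_ubiquity(production))
print(is_perfectly_nested(production))
```

In this toy table the least diverse place makes only the product that every other place also makes, so its average ubiquity is the highest, while the most diverse place also makes the products that few others can.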
Sean Carroll: 00:24:24 When I think about division of labor, and maybe division of knowledge goes the same way, I think about the Industrial Revolution, I think about Henry Ford building the Model T in an assembly line, and that, so... there's too many questions to ask. That's why I'm hesitating here. How did that get started? Who invented that? Was that sort of a side product of other technological information innovations like the printing press, or is there a story to tell about how information brought upon itself?
César Hidalgo: 00:24:55 Now we have more of like a big history question here, kind of like the... in that big history perspective, I do think that there's kind of like a long line of events, but there are also some transitions that I think are very important. The first one, which someone like Yuval Harari emphasizes quite a bit in Sapiens, is the cognitive revolution. It's like the development of language, and there are some particular properties of language that we have because human languages are not like animal communication systems. We have the ability to talk about hypothetical things. An ape can have the ability to maybe tell another ape that there's danger or that there's a lion or there's a snake, but they cannot tell them about their idea for their new blockbuster film.
César Hidalgo: 00:25:39 That ability to talk about fiction was a big cognitive revolution that happened 80,000 years ago, 100,000 years ago and that is expressed in things like the development of advanced tools and an acceleration of our ability to accumulate knowledge and so forth. Then we have the Agricultural Revolution, which people know about, that's much more recent, about 10,000 years ago.
César Hidalgo: 00:26:00 With that we finally start accumulating people, slowly, into cities. That incorporation is really important because at the end of the day, just like biology accumulates knowledge in DNA and that knowledge has to be preserved and passed on to the next generation, we humans accumulate cultural knowledge. And if we are all dispersed into small groups, our ability to accumulate knowledge and ideas through generations is limited.
César Hidalgo: 00:26:29 But once we start forming cities we start living together, we're going to accumulate more knowledge. We're going to have maybe also people that are going to be dedicated to those aspects of life. In the beginning maybe religious leaders and political leaders and so forth. And the next revolution, I would say, in the sequence of events, at least the way that I understand them, is the writing revolution. And that is very important in ancient Greece, even though writing is older than the ancient Greeks. It was not that prevalent and it was not that advanced. It was much more based on accounting systems or even on administration and religious events.
César Hidalgo: 00:27:11 But in ancient Greece writing kind of explodes as a form of expression that is used to communicate and document complex ideas. And they go through something that could be considered similar to the Renaissance that happened later. They actually get close to maybe what could have been an industrial revolution, maybe, if history had worked differently. That writing revolution produces a lot of knowledge.
César Hidalgo: 00:27:37 And then after the writing revolution, which I would date loosely around 700 BC, to me the next big revolution and what is the starting point of everything that is modern is the printing press. Because when Gutenberg adapts the printing press that existed in Asia and develops this movable type printing press, what happens are a lot of things. First, and this is all from Elizabeth Eisenstein's work, The Printing Press as an Agent of Change, for the first time scientists and scholars have access to multiple books.
César Hidalgo: 00:28:16 Before printing only kings had little libraries, books were transcribed by hand, and everybody who has written a book quickly realizes that no matter how cool your life is, you don't have enough stories. You have to read and share other people's stories to fill up your books. Second, what happens also is that printing is the first economic activity that is urban and scalable. Think before printing. What was the way in which you could make money, that you could make it big? You had to have access to resources, whether it is a lot of land and a lot of serfs that would farm that land. Or you might have access to maybe mineral resources or you have looted next door's town or something like that.
César Hidalgo: 00:28:58 But printing was something that you could produce in a city with a relatively small team of people. It was sort of IP intensive, very IP intensive. But if you have a book that sold and you could sell copies, you could become rich really quickly. And evidence of that is that the number of printers per capita in Europe stabilized after only 50 years.
Sean Carroll: 00:29:21 So it went from zero to 60 very quickly.
César Hidalgo: 00:29:23 Yes. Yeah. It was a huge economic boom. In 50 years you stabilized that number because you have an activity that basically was very profitable, was very urban. So it was the first time that someone in the city can really start making it big by manufacturing something at scale. It's really the first type of mass production is printing.
Sean Carroll: 00:29:43 It was content as we would call it today, right?
César Hidalgo: 00:29:47 Yes.
Sean Carroll: 00:29:47 That was the point. You had to monetize your content for the first time ever.
César Hidalgo: 00:29:51 Yeah. So that I think is huge and it gives rise to a new cognitive revolution that we call the Enlightenment. Eventually that involves a lot of the most important discoveries of science in the middle of the last millennium. And after that as society chugs along and the 16th/17th century comes along, printing accelerates. It takes 200 years for people to discover that you can print short formats.
Sean Carroll: 00:30:20 Okay.
César Hidalgo: 00:30:21 Like magazines and pamphlets and so forth. Eventually those lead to a change of institutions. It's hard to understand the transition from monarchies to democracy without printing. After those changes of institutions and with the acceleration of science and technology, we develop new forms of communication like film and radio. Then television, now the internet. And I like to think of the history of our planet or of our last thousands of years as a history of changes in communication. Technologies that have reconfigured the way that we create and process information and that we generate and produce knowledge. Those are the eras that to me have contributed to this big history.
Sean Carroll: 00:31:09 Yeah, no. That's very helpful because I guess we glossed over it a little bit. But in your book you certainly emphasize the fact that not only there's this thing called information but it's embodied, it's crystallized.
César Hidalgo: 00:31:18 Yeah.
Sean Carroll: 00:31:19 There's some stuff that carries the information and all these changes in communication technology which sound sort of mundane when you put it that way, they're new information flow technologies, right? And if you make it cheaper for information to flow you enable a lot of new things.
César Hidalgo: 00:31:35 They're huge and they're always laughed at when they first emerge. Nobody took Twitter seriously and probably now it's the first thing that politicians take seriously.
Sean Carroll: 00:31:45 You still complain about it, but yeah it's important.
César Hidalgo: 00:31:49 Yeah, it's quick. For instance, think of the music industry. A lot of people are surprised that musicians don't make money anymore. But if you think about it from the perspective of communication technologies, what is curious is that there was a short period of time in which they could make money. Because there were musicians at the time of the ancient Greeks, there are musicians still today. But for most of history musicians only could perform live and live performances were hard to monetize, they couldn't spread.
César Hidalgo: 00:32:21 But as technology evolved and you had radio and then more than radio you had then records. Those records allowed you to constrain the diffusion of music in such a way that you could monetize it because music was trapped on those discs. It was trapped on those magnetic tapes and you could make a killing because the marginal cost of producing a new tape or a disc was very small. You could sell them for $10/$15 and all of the musicians that were big in the 60s, 70s, and 80s, they're loaded. Nowadays it's impossible to make money that way because music cannot be trapped in a physical medium. And what was curious is that musicians could make money for a while.
Sean Carroll: 00:33:04 So that's an interesting way to put it. It had to be sharable but not too sharable, right?
César Hidalgo: 00:33:09 Exactly, yeah.
Sean Carroll: 00:33:10 There's that sweet spot. I wonder what lessons there are for other things. I wonder if books are the same way. Certainly as someone who writes books, you probably feel the same way. The first thing that happens when you write a book is pirated versions appear on the internet.
César Hidalgo: 00:33:22 Exactly.
Sean Carroll: 00:33:22 They're not that better than the physical books, so I don't think it's doing a huge amount of damage. That's interesting. Okay. So that flow of information through different media obviously affects what people can do. But then where do we get to things like Henry Ford? And you have this wonderful example in the book of this giant plant, I'd never heard of it before but there was this, what was it called?
César Hidalgo: 00:33:47 The River Rouge.
Sean Carroll: 00:33:48 River Rouge, yeah. I didn't know about this before. But to this day that was the largest integrated facility?
César Hidalgo: 00:33:55 I don't know because I've been to China a lot and they have amazing stuff there.
Sean Carroll: 00:33:57 But it didn't grow monotonically. There was a peak at that moment.
César Hidalgo: 00:34:02 Yeah, exactly. So the River Rouge was this fantastic plant that Ford created that literally took soy beans and metal on one end and produced cars out of the other because it was a complete vertical integration. I would say today we don't manufacture anything with that level of vertical integration. Value chains are very distributed and international. That's for instance why [Trumptards 00:34:27] are stupid when it involves Mexico because the US value chain and automotive sector, you know which is the number one country in the exports of beer is also Mexico. Because a lot of the [inaudible 00:34:40] beers manufacture, so a lot of the value chains are integrated and disaggregated between different countries. And we don't have those monolithic models.
César Hidalgo: 00:34:48 But at the time of Henry Ford where transportation was not that good. That's why those slow cars were such a good business. You needed maybe to produce that level of vertical integration to be able to get the economies of scale that he needed to produce affordable vehicles. Nowadays things are different but it's impressive what they were able to do at some point.
Sean Carroll: 00:35:12 Again, maybe it's a version of a sweet spot in that they had the differentiation of knowledge enough to do the economies of scale but it was still expensive to do things all over the place. So bringing it together in the same geographical location was a useful thing. But I think in the book you had this example, I'm going to mangle it, but in Chile you can mine copper and other raw materials and you would like to have a battery and you can in principle make them. But it's actually cheapest to just send your raw materials to Korea, have them make the batteries, and then buy them back.
César Hidalgo: 00:35:42 Yes, so that's a good point. Because at the end of the day, for us to produce things, you need to combine materials, technology, and knowledge. The way that the world works is that the world is kind of lazy. Basically we try to minimize cost, that's a more economized way of putting it. But because of that you have to ask yourself the question, what is easier to move? Is it easier to move the knowledge? Is it easier to move the materials? Is it easier to move the technology? What are the factors that are easy to move? And the hardest factor to move is knowledge.
Sean Carroll: 00:36:15 That seems weird. Knowledge is not very heavy but it's hard to move it from brain to brain, I suppose.
César Hidalgo: 00:36:21 It's super heavy because people tend to confuse knowledge with encoded pieces of information. Would you trust a brain surgeon that the only thing that has done is read Wikipedia pages about brain surgery and has never been there? Knowledge is very experiential.
Sean Carroll: 00:36:38 Or even read a textbook.
César Hidalgo: 00:36:40 Exactly. Yeah. Someone that had just read a textbook on brain surgery, I wouldn't allow them to operate. And they don't. That's why you have residencies and you have all of these practice systems. But when it comes to complex industries, the knowledge is embedded in large teams and that makes it very hard to move. Knowledge has these temporary monopolies.
César Hidalgo: 00:37:01 For example, when Ford figures out how to build a car and he's able to put all of that together on the River Rouge and he's producing cars. And there's other people producing cars also not that far from him. They're not producing cars in San Diego, they're producing cars there in the Midwest.
Sean Carroll: 00:37:16 Still Detroit, yeah.
César Hidalgo: 00:37:17 Exactly, in Detroit and it expands from there. Because knowledge is hard to move so it's easier to bring the steel there, it's easier to bring the coal and all of the other materials. Nowadays I think it's the same. Silicon Valley has monopolies over markets that they discovered. China is now getting there because it's a country that is very technologically advanced and sophisticated. The products are easier to move than the knowledge that you need to make them.
César Hidalgo: 00:37:44 If you figure out something that people want, you're going to have that monopoly because the product is going to diffuse very quickly. But the ability to make it is going to diffuse very slowly. And until it does, there's going to be only a few people that are going to be able to supply that and therefore they're going to have their day.
Sean Carroll: 00:38:01 And this is related to that distinction you draw between knowledge and know how or explicit knowledge I guess and know how. In some sense there are things you can put in a book but there are things that are just in the human brain or just in the individual people. And that's why it's heavy because moving people is really hard.
César Hidalgo: 00:38:17 Exactly. So in the literature on knowledge that is used traditionally in business schools and knowledge economics, information economics, people make a strong distinction between tacit and explicit knowledge. Explicit knowledge is all of the knowledge that I can communicate through an act of communication, I can codify that knowledge. Like the recipe that I can put in a cookbook is something that I could communicate through a page and therefore is explicit knowledge.
César Hidalgo: 00:38:44 The experience of having cooked with the chefs from El Bulli or any famous restaurant is something that is much harder to communicate, so it would be considered tacit knowledge. For the best examples of tacit knowledge, think of sports. Imagine Michael Jordan. He's extremely talented, he really knows how to handle a ball on a basketball court. But imagine you have a seminar and he speaks and tells you about basketball for three days. How much better of a player are you going to be after the third day?
Sean Carroll: 00:39:15 Epsilon, yeah.
César Hidalgo: 00:39:17 Not that much. Because that knowledge is tacit and it requires practice. And the world is full of tacit knowledge and it is the one that is hardest to see because it's not as obvious as the knowledge that we can codify.
Sean Carroll: 00:39:29 And closer to our experience. I don't know if you've had a professional basketball career in your past, but graduate school is the same way. Students come in, they know a lot of equations or whatever, but they don't know what it means to be a scientist. They haven't seen it in action. They haven't done it.
César Hidalgo: 00:39:43 I agree. I came to the US for grad school and I had a very good advisor. Laszlo Barabasi is a very famous physicist that works on networks and so forth. I remember soon I learned that it was a waste of time to go to him to talk to him about technical details. That you should figure out by yourself, but what I wanted to get out of him is to understand when a problem is relevant. How should I communicate to people? How are different people going to interpret this? All of those types of things. How to think about a career and to connect the different things that you are doing, and those are things that are hard to learn if you don't have a model of someone that knows how to do them and you're learning this as an apprentice.
César Hidalgo: 00:40:31 I think a lot of people sometimes don't understand that. I think in your 20s you better get someone that is where you want to be and be a very humble apprentice of them.
Sean Carroll: 00:40:42 If we take this on board and we appreciate the importance of tacit knowledge and know how and how difficult it is to move around, what are the lessons that we get from that for either building an organization or organizing economy or trade barriers and things like that?
César Hidalgo: 00:40:58 There's a lot of lessons because there's some big literature on this. There's the literature that looks at the geographic diffusion of knowledge and as you can imagine because knowledge is sticky, there's a lot of barriers to the diffusion. It's hard for knowledge to travel long distances. Now the reason why it's hard for knowledge to travel long distances is because it's socially embedded. It's actually that social networks are geographically circumscribed, and that limits the diffusion of knowledge.
César Hidalgo: 00:41:26 Then the question is, what are things that limit the ability of people to create links? You have language barriers, you have cultural barriers. All of those have been shown extensively to limit knowledge diffusion, to limit trade, to limit other things. The question then is, now if you're a country or a city or a region that wants to develop their economy, what you're trying to do is accumulate knowledge. And the question is how do you do it by following the laws of knowledge diffusion?
Sean Carroll: 00:41:53 Okay.
César Hidalgo: 00:41:54 And there are examples of people that did it wrong and examples of people that did things better. Examples of doing it wrong, I don't know if you've ever heard of the University of Yachay in Ecuador?
Sean Carroll: 00:42:08 No.
César Hidalgo: 00:42:10 Earlier this decade, Rafael Correa, who was the president of Ecuador, decided to put a billion dollars into the creation of a new city of science and technology that he hoped would compete with knowledge production centers across the world. Very idealistic plan, a billion dollars in Ecuador is 1% of GDP so that's a lot of cake. And basically what they did is they grabbed a piece of agricultural land two hours north of Quito and they tried to start building this university and city and industrial park and so forth.
César Hidalgo: 00:42:48 But if you walk around a place like Kendall Square in Manhattan you know that you don't build a lot for a billion dollars. And if you have to bring every brick and every person to lay bricks to the place, you build even less. Quickly the plan ran into trouble, that university has gone through six or seven university presidents by now. They were able to attract a few scientists to move there. That didn't go well. There was an article in Science talking about their experience and how some of them were leaving, some of them got fired. They got a few students there. At the beginning I think it was just a thousand that became very radicalized because they believed in the dream.
César Hidalgo: 00:43:31 But a thousand dollars, sorry. A thousand students at a billion dollars, that's a million dollars per student. So you could have put all of these kids in Stanford for life. There was ... Basically there was this idea that if you had enough money you could create knowledge anywhere. And that's not true because there are processes of diffusion that constrain the creation of knowledge. In the case of Ecuador they had two chances. One was called Guayaquil, the other one was called Quito, which are the two centers where they have accumulated knowledge.
César Hidalgo: 00:44:05 What are the channels that can promote knowledge diffusion? One of them which is really important and it's also well documented is migration. And migration is one channel that is also very biased toward the most talented people in the world. There's a book by Bill Kerr, who is a professor at HBS, that is called The Gift of Global Talent and in that book you find a lot of interesting facts. One of them is people without a college degree, about 1% of them migrate. People with a college degree, it's about 5%. Inventors that have filed a patent, about 10%. Nobel Prize winners, about 31%. Nobel Prize winners in the US since the 70s is 60 something percent.
Sean Carroll: 00:44:52 Yeah, we get them all from everywhere.
César Hidalgo: 00:44:53 Exactly. So you do have this thing that there is a tail of talent that is extremely global and that is the one that helps create innovation, the ones that help create jobs. It's similar, you get similar numbers if you focus not only on formal education and academic credentials. If you look at people that have formed Fortune 500 companies or unicorns, they're super biased towards foreigners. So then one of the lessons that you need to learn is how do you attract that global talent? Because at the end of the day it's a game of global talent because also these talented famous individuals are the ones that are going to help you attract other ones.
César Hidalgo: 00:45:35 The US for a long time has had a huge advantage in that space. It receives about 50% of all of the PhDs that migrate in the world come only to one destination which is the US. And now that-
Sean Carroll: 00:45:47 A falling percentage I presume now.
César Hidalgo: 00:45:49 They changed.
Sean Carroll: 00:45:49 Yeah.
César Hidalgo: 00:45:50 Yeah.
Sean Carroll: 00:45:51 Well that's a very believable story. I had two previous podcasts which made similar points. One with Geoffrey West about how things scale and the ingenuity and creativity certainly scales with population density in some way. And then another one with Will Wilkinson where we talked about the political aspects of these and the openness to new ideas that was associated with cities versus more conservative rural divide.
Sean Carroll: 00:46:18 But then so that raises the question you brought up the word inequality earlier. It's great to attract the best talent and they want to be in these concentrations of brilliance and productivity. But then how do you spread the rewards of all that productivity and brilliance widely to everyone?
César Hidalgo: 00:46:37 So that's tough but I do think that we have found out a few things quite recently actually. Without going into the details, we might go into this later, but there are ways of measuring the knowledge intensity of economies. Like how much knowledge is there in New York vis-à-vis San Diego, Tokyo, Moscow, whatever. And when you use those measures to look at knowledge concentrations in cities or in countries, you do find relationships with inequality as well.
César Hidalgo: 00:47:06 For instance, economies that are more complex tend to be less unequal. Economies that are more attractive and less complex, sorry, economies that are more complex are less unequal.
Sean Carroll: 00:47:19 More equal.
César Hidalgo: 00:47:19 More equal, exactly. And economies that are less complex are more unequal.
Sean Carroll: 00:47:24 Okay. That's not obvious to me.
César Hidalgo: 00:47:27 No, it's not obvious. But it's really hard to have the level of inequality of Switzerland with the industrial structure of Peru. So if your exports are based on three or four sectors, mainly when those sectors are about mineral resource extraction, those are industries that are very regimented and hierarchical because they're about safety, about production. They're not about creativity. You don't want a miner to start digging anywhere and getting creative.
Sean Carroll: 00:47:54 You don't want creativity down there.
César Hidalgo: 00:47:55 Exactly. So you have production structures that are not geared toward innovation and they're geared towards a structure that's more hierarchical and unequal. So at international scale, as economies become more complex, they become more equal. And as economies become less complex, they become more unequal. For countries like Chile and Peru and Angola, all of these countries that depend a lot on mineral resource extraction, whether these are fossil fuels or other forms of minerals, one thing that they need to do to reduce their inequality is they need to sophisticate the structure so they can generate those middle income jobs. Because those middle income jobs are not going to happen just through redistributive policies.
César Hidalgo: 00:48:41 You do need policies and social safety nets, but you do need to change the productive structure. Now, if you look inside countries, the relationship flips. Which is kind of cool from a [statistical 00:48:51] perspective. Why might that be? Our hypothesis right now is that because within countries you do have more spatial [inaudible 00:48:59].
Sean Carroll: 00:48:58 Okay.
César Hidalgo: 00:49:01 A lot of the facts that Geoffrey West talks about are the facts that are not just about the size of cities but also about the migration into cities. Peter [Hextrom 00:49:08] has a paper in Science Advances that finds that a lot of those superlinear scaling relationships described by West and [inaudible 00:49:14] are actually quite explained by migrants coming to cities and those migrants being the most talented people. So the most talented people from the [inaudible 00:49:22] are the ones that migrate and help provide that extra punch of productivity that cities have.
César Hidalgo: 00:49:26 A city like New York is very unequal not because it's only based on New Yorkers. It's because a lot of poor people migrate to those places, similar to San Francisco. They're big magnets and attractors of people and at the country scale, the more complex a region or city, the more unequal it is; the less complex, the more equal it is. So you have a Simpson's paradox. Not from Homer Simpson, but the statistical Simpson's paradox, that the correlation that you observe at one scale inverts when you disaggregate the data into the next scale.
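As an illustrative aside, the reversal described here can be reproduced with a small synthetic example. The sketch below is not the analysis discussed in the conversation: the three countries, their complexity scores, the inequality values, and the within-country slope are all invented for illustration. It only shows how a correlation that is negative across countries can be positive inside every country.

```python
# Synthetic illustration only: every number below is invented,
# not taken from any real dataset or from the papers mentioned.
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical countries: (mean complexity, mean inequality).
# Across countries, higher complexity goes with lower inequality.
countries = {"A": (1.0, 0.60), "B": (2.0, 0.45), "C": (3.0, 0.30)}

# Within each country, regions deviate from the national mean, and the
# more complex regions are the more unequal ones (positive slope).
offsets = [-0.3, -0.1, 0.1, 0.3]
within_slope = 0.2  # assumed value, chosen only to make the effect visible

regions = {
    name: [(c + d, g + within_slope * d) for d in offsets]
    for name, (c, g) in countries.items()
}

# Country scale: one point per country.
between = pearson(
    [c for c, _ in countries.values()],
    [g for _, g in countries.values()],
)

# All regions pooled together, ignoring which country they belong to.
all_points = [p for pts in regions.values() for p in pts]
pooled = pearson([x for x, _ in all_points], [y for _, y in all_points])

# Region scale: correlation computed separately inside each country.
within = {
    name: pearson([x for x, _ in pts], [y for _, y in pts])
    for name, pts in regions.items()
}

print(f"between countries: {between:+.2f}")
print(f"all regions pooled: {pooled:+.2f}")
for name, r in within.items():
    print(f"within country {name}: {r:+.2f}")
```

In this toy setup the sign of the correlation depends on the scale at which the data is grouped, which is the pattern the speaker calls a Simpson's paradox. Real data would of course be noisier than these perfectly linear invented points.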
Sean Carroll: 00:50:01 Right, okay. So then what's the, is there an immediate policy implication for this? What are we going to do with this knowledge?
César Hidalgo: 00:50:07 I do think that there are a few things. One thing that we've found also related to this is that the more complex economic activities concentrate in space much more than the least complex economic activities. That paper is coming out in Nature Human Behaviour in a few weeks and what this tells us is that if at the end of the day wealth is going to be increasingly generated in cities, we do need to have better ways of including more people into cities. Because I think one of the pain points of the United States right now is that the major cities of the US are failing to accommodate more people because of problems with infrastructure, problems in the ability to build and so forth.
César Hidalgo: 00:50:48 If you think about it, San Francisco-
Sean Carroll: 00:50:50 San Francisco is the worst, obviously.
César Hidalgo: 00:50:52 That city in China would be a 20 million people city given what it's able to make. But for that it would have 27 lines of subway that would be really fast and modern and autonomous. It would need to have another type of infrastructure. I don't know if you've been to Shenzhen?
Sean Carroll: 00:51:07 I have not been there but I know about it, yeah.
César Hidalgo: 00:51:07 Right, Shenzhen is-
Sean Carroll: 00:51:10 It just came into existence from nothing.
César Hidalgo: 00:51:12 Yeah, exactly. It's 23 million people ... The US hasn't been able to produce 21st century mega cities. It produced 20th century mega cities but nowadays I think you're going to need to include more people in the centers of production. And if you don't get together good infrastructure for transportation, you don't get together good ways of densifying neighborhoods in a way that make them livable but at the same time are able to include maybe twice the number of people that they had before, maybe three times, you're going to start having more and more social pressure because it's not sustainable to commute for three hours. People are getting excluded because cities have the need to include more people than the one that they're able to right now.
Sean Carroll: 00:51:55 Yeah, just to play devil's advocate here, I'm probably mostly on your side, but these are subtle questions. We're having this conversation in the middle of a charming neighborhood of Cambridge, Massachusetts with a lot of small, individual-sized houses, or at least a few unit houses. The character of the place would utterly change if the density were that of a large Chinese megalopolis, right? Is that entirely good?
César Hidalgo: 00:52:21 Well, in some ways. I live two blocks away from here. That's a huge privilege.
Sean Carroll: 00:52:26 Yeah, it is.
César Hidalgo: 00:52:27 That's a huge privilege.
Sean Carroll: 00:52:28 People don't want to give it up.
César Hidalgo: 00:52:30 Yeah. Many people in my company have to commute much longer and that commute is part of kind of like their unhappiness. If you think about it, a lot of the social unrest that is happening in many countries, it's being triggered when the price of gas goes up, when the price of transportation goes up, because commuters are unhappy. Commuters are the ones that are kind of like excluded and they have to kind of like make a journey every day to the places where people like me that have the privilege to be able to afford a home here live. I do think that if they could make Cambridge more affordable to more people, I think it probably would be healthier. Maybe we'll be able to hang out at night more and get better ideas for our businesses and for creative activities that we want to do. I do think that if you get along with your neighbors, it's good to have more of them.
Sean Carroll: 00:53:24 It is. I was impressed to learn, not impressed, but interested to learn that the average commute time goes up and up the more you're in a big city.
César Hidalgo: 00:53:34 Yes, yes.
Sean Carroll: 00:53:34 Despite the fact that things are more dense and nearby, it takes longer to get there, just because it's even more-
César Hidalgo: 00:53:39 Yes. Yeah, yeah. New York has huge commute times.
Sean Carroll: 00:53:40 Yeah, and LA [crosstalk 00:53:42].
César Hidalgo: 00:53:43 We have all of that data. We have Data USA, we have the main tool to distribute and visualize US public data that we launched in 2016, and there you can create maps of like commute times. You immediately see that when you create the average travel time map, boom, all of the cities light up. When you create the commuting alone map, then it's the opposite. Commuting alone is people driving cars in more rural areas. Average travel time goes up in places like DC and New York and so forth.
Sean Carroll: 00:54:12 Well, that's another side of what you do. If information flowing around in different forms and in different media is driving the economy in various ways, we're living in an era now where there's so much information that just dealing with it all is a big issue. You've been involved in data visualization and projects that try to sort of make sense of all this information we're getting. What drives you to do that?
César Hidalgo: 00:54:36 It was a little bit of a, let's say, lucky path. When I started doing this work on economic complexity and relatedness, there were like two big papers that we produced, one in 2007 that came out in Science, and another one in [inaudible 00:54:53] in 2009. Those papers became very popular, but one of the things that those papers had was that they had some nontraditional techniques that we had invented that you could use to predict the products that a country was going to export in the future, that you could use to predict the future economic growth potential of countries based on their economic structures. I started to get a lot of demand from people to generate reports on that type of topic. As a scientist, like you always want to be working on the next thing.
Sean Carroll: 00:55:27 Yes.
César Hidalgo: 00:55:28 They say, "Oh yeah, we want to do like a relatedness and complexity analysis for this region of Brazil," or for this or that. That was kind of like boring, repetitive work because-
Sean Carroll: 00:55:38 Sure, you've done that already.
César Hidalgo: 00:55:38 Yeah, you've done it. When I started my lab at MIT, the first grad student that I hired was Alex [Simoes 00:03:47]. The job that I gave him is we're going to do like a self service tool for this demand. I had done something similar before for a paper in which we looked at correlations between diseases using the hospitalization records, and Alex had started building this tool, which became The Observatory of Economic Complexity. It's now the number one tool to distribute international trade data in the world. Then we found out that maybe like what people were more interested in was the platform and the tool rather than the analysis that the tool made.
Sean Carroll: 00:56:20 [crosstalk 00:56:18].
César Hidalgo: 00:56:20 That was re-deployable. Then we created a tool together with the government of Minas Gerais in Brazil that integrated data for more than 50 million workers, all of the basically formal sector economy of Brazil, data on trade, on industries and employment and also education for all of Brazil. Then, together with a colleague from Deloitte, we started working on the creation of a similar platform for the US, but one of the things that we did there is that we realized that we needed to go beyond economic data and also to include data on demographics, on health, on insurance, on commute times, on you name it, and that's Data USA that was launched in 2016. That really became very popular and it became some sort of like the dream thing that people in statistics departments of many governments in the world wanted to have. That started to generate a demand to create more projects like that. There's Data Korea now. There is Data Chile. We're releasing Data Mexico in January. We've been creating these tools that what they do is something that is very simple, but it's not easy to do.
César Hidalgo: 00:57:27 It's easy to use, but it's hard to do, is to integrate 15, 20, 30 different datasets, but not just provide files, but integrate them into narratives. What we do is we transform data into stories. That's the main form of integration that we provide. That allows people to find the data on the web. The stories have text that is generated semi-algorithmically using the data, visualizations also. Imagine for the US, the US has like 70,000 census designated places. Each one of them has a complete profile with more than 70 visualizations and a lot of text and information. Even if you had a little town, you have all of your census data and your BEA data and your BLS data in Data USA. Then we have more advanced tools like ways to integrate data and download it, but integrating data from multiple sources, ways of creating custom visualizations. That has been something that has done very well. Also, we've done similar solutions for private sector companies in which we integrate the data from their marketing departments and logistics and so forth to create platforms that people can use in strategic decision making.
Sean Carroll: 00:58:36 Are these also useful for academics doing studies of demography or whatever?
César Hidalgo: 00:58:42 Yeah. They use it a lot. Just to give you an idea, right now on our online properties, we get over a million people a month. We run surveys to try to figure out who they are and how we can serve them better. A platform like Data USA, 35 to 40% of the people that visit it, it's like academic of some form, whether it is like a high school student doing a homework, or whether it is a university professor using it in some report. If you go to Google Scholar and you search for the URL, it will be a relatively well-cited paper.
Sean Carroll: 00:59:15 Right.
César Hidalgo: 00:59:16 Unfortunately you don't get credit for those ones. Maybe one day you're going to be able to put your websites on Google Scholar as well, but that's another story. We do get people from academia. We do get a lot of people from local governments.
Sean Carroll: 00:59:29 Can people just go to the website and use it? Is it a fee-for-service or ...
César Hidalgo: 00:59:33 No, it's totally free right now. We're thinking maybe in the future to add some premium features for like people that are using this ... Some people use it, for example, to do market analysis. We could provide like premium features in that case, but so far it's completely free, open source. It's a very open project.
Sean Carroll: 00:59:51 Is that similar to ... You also mentioned, offhandedly, the urban perception, the idea of using actual photographs of different places in the city to sort of ... but not just show them in a slideshow but learn about them from the computer.
César Hidalgo: 01:00:06 Yeah. It's related because if you see what I'm interested in, I'm interested on kind of like these applications of science and technology to society. One of the things that I tried to do over the last decade is to find alternative ways to collect data and hopefully data about aspects of society that have been hard to quantify before. In 2010, I got the idea that we could use Google Street View images to quantify evaluative aspects of cities, which place looks safe, which place look lively, which place look depressing, beautiful and so forth. At that time, a lot of people were like, "Oh, this is crazy. These are all subjective things. You are never going to be able to do it. It's all meaningless," but I discovered that on the one hand, there was a literature [inaudible 01:00:52] of people that have attempted and had done that many times, but with very, very small sample sizes of both images and people.
César Hidalgo: 01:01:00 What I did is I did a crowdsourcing study. That quickly became the largest dataset ever of visual perception surveys. That was called Place Pulse. That allows us to classify 4,000 images, and we got over 100,000 people rating them. We discovered a few things, first, that people's preferences were very transitive.
Sean Carroll: 01:01:23 Oh, okay.
César Hidalgo: 01:01:23 Yeah, so it's not that all over the place. If I show you like a picture of a really nice, like beautiful, well-kept neighborhood, and I show you like a picture of like a very sketchy favela, industrial, like people tend to, let's say, answer that the first one is safer than the second one. That tends to be quite a universal preference.
Sean Carroll: 01:01:48 Right.
César Hidalgo: 01:01:48 The difference between images is so large that it overwhelms the differences between people. We could use those scores to like measure like the segregation-
Sean Carroll: 01:02:01 But you don't know if it's actually safe. All you know is that everyone perceives it to be safe or unsafe.
César Hidalgo: 01:02:06 Yeah. We don't expect it to be because like what we wanted to measure was the perception, because perception can also have an effect that is independent of whether that location is actually safe or not.
Sean Carroll: 01:02:16 Right. Yeah.
César Hidalgo: 01:02:16 They're different things. They don't have to be the same actually-
Sean Carroll: 01:02:18 I mean, we've all been in cities, in neighborhoods where we've said, "Oh, this looks safe," or, "Oh, this doesn't look safe." Articulating why we're saying that might be difficult.
César Hidalgo: 01:02:26 Exactly. What we discovered though is that crowdsourcing was very limited for the amount of data that we needed to collect. A city like New York, Manhattan alone has about 80,000 street segments. That's a lot of street segments. Let's say you want to get one image per street segment and you want to evaluate that image, and let's say that you want to compare that image with only 10 others to be able to get a decent score. Imagine it's like college football, but with like 80,000 teams and 10 games.
Sean Carroll: 01:03:03 You can't play every team, right, yeah.
César Hidalgo: 01:03:03 Exactly. It's super under-determined matrix. Still, you need a lot of traffic just to evaluate all of those images for one city and one dimension. What we decided to do is to say, let's grab the data that we have, and let's train computer vision algorithms to like do the clicking for us. We find that actually those computer algorithms work very well. They allowed us to scale to create [inaudible 01:03:26] perceptual maps with hundreds of thousands of images. Then we could use that to study how perception affected behavior.
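To put rough numbers on the sparsity described here, the following is a minimal Python sketch. The 80,000 street segments and 10 comparisons per image come from the conversation; the win-fraction score and the function names `comparison_budget` and `win_fraction_scores` are illustrative assumptions, not necessarily how Place Pulse scored its images.

```python
from collections import defaultdict


def comparison_budget(n_images, games_per_image):
    """Return (comparisons needed, all possible pairs, fraction of pairs covered)."""
    # Each comparison involves two images, so it counts toward both budgets.
    needed = n_images * games_per_image // 2
    all_pairs = n_images * (n_images - 1) // 2
    return needed, all_pairs, needed / all_pairs


def win_fraction_scores(comparisons):
    """Score each image by the share of its comparisons that it won.

    `comparisons` is an iterable of (winner, loser) pairs.
    This simple score is an assumption made for illustration.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {image: wins[image] / games[image] for image in games}


needed, all_pairs, coverage = comparison_budget(80_000, 10)
print(needed, all_pairs, f"{coverage:.6f}")

# Hypothetical toy votes: "a" beat "b", "a" beat "c", "b" beat "c".
toy = [("a", "b"), ("a", "c"), ("b", "c")]
print(win_fraction_scores(toy))
```

Under these assumptions, 400,000 human votes would fill only about one in eight thousand cells of the full comparison matrix for a single city and a single dimension, which is why training a model on the existing votes is attractive.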
Sean Carroll: 01:03:34 Okay. What did we find?
César Hidalgo: 01:03:37 We teamed up with a team of Italian colleagues that had mobile phone data, and they could see the activity of people in cities as a function of the time of the day. Then we could see if people tended to avoid unsafe looking places, controlling for distance to the subway, for distance to the central business district, for other things like the density of jobs, the density of population and so forth. We did find very interesting effects. First, we find that like people tend to avoid unsafe looking places. After controlling for all those things, you find less people in those places than you would expect. Those effects are modulated by demographics. It's stronger for women and for the elderly, but it's reversed for people below 30. Young people tend to like hang out in the unsafe-looking places.
Sean Carroll: 01:04:38 They hang out in those areas, yeah.
César Hidalgo: 01:04:38 Young male, especially, but like elderly women tend to avoid those places. It kind of like makes sense but it helped formalize something that I think we all had an intuition for, but now we actually have like hardcore data to be able to show it.
Sean Carroll: 01:04:52 I mean, it's another example of what we were talking about, a different way of thinking about conceptualizing, a different kind of information in some sense, or at least a different way of sharing it and thinking about it. That's the other ... The last thing I wanted to ask you about was how this relates to your interest in collective memory. I did have one podcast guest, Lynne Kelly, who studies memory palaces, ancient memory palaces, the idea of remembering things by associating them with physical, geographical locations. She thinks that Stonehenge, for example, was used as a way of remembering what you would call know-how, [crosstalk 01:05:28] knowledge that was ... because they didn't have writing, they couldn't pass it down that easily, but nowadays we have the internet, we have books, we have TV. It's a very different kind of thing. How is our collective memory of who we are and what kinds of things we pay attention to been affected by these technological changes?
César Hidalgo: 01:05:45 Yeah. I was studying how collective memory is affected by technology, language and time. We did a paper with Steve Pinker and other people in which we looked at the network of global languages. That's a network in which each node is a language, and language are connected if they're likely to be translated or spoken by the same people. Let's say English might be connected to German if a lot of books get translated from one language to another and a lot of people that speak German also speak English and so forth. We mapped that network using three datasets, a dataset with over two million book translations, Twitter, we detected the language of tweets, and then if you tweet in English and you tweet in Spanish, I can connect those two languages because you are expressing knowledge of both, and Wikipedia edits. It's not reading Wikipedia, but if you edited the page of Einstein in German and you edited the page of Einstein in English, you probably know how to write in German, so you probably know both languages.
César Hidalgo: 01:06:46 Interestingly enough, when we compare that network with a dataset that we created of globally famous people, we found that the centrality of a language in that network explained the number of famous people produced by the language better than the population of the language, better than the wealth of that language. That tells us, hey, like fame or global fame is something that is very much dependent on like this network, because if you think about it, the network of languages is like the most aggregate version of the global social network that you can have. You cannot have a social relationship if you don't speak a language. You cannot make a friend just by nodding. In that context, we learned that much of what the world knows is going to be modulated by this network. There's a lot of implications about that.
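The construction described above, where two languages are linked when the same person expresses themselves in both, can be sketched in a few lines of Python. The toy user data is made up, the function names are hypothetical, and weighted degree is used only as a simple stand-in for centrality; the conversation does not say which centrality measure the study used.

```python
from collections import Counter
from itertools import combinations


def build_language_network(users_languages):
    """Count, for every pair of languages, how many users expressed both."""
    edge_weights = Counter()
    for languages in users_languages:
        # Sort so that each pair has one canonical key, e.g. ("en", "es").
        for pair in combinations(sorted(set(languages)), 2):
            edge_weights[pair] += 1
    return edge_weights


def weighted_degree(edge_weights):
    """Sum the edge weights touching each language (a crude centrality)."""
    degree = Counter()
    for (lang_a, lang_b), weight in edge_weights.items():
        degree[lang_a] += weight
        degree[lang_b] += weight
    return degree


# Made-up toy data: each entry is the set of languages one user wrote in.
toy_users = [
    {"en", "de"},
    {"en", "es"},
    {"en", "es"},
    {"es", "qu"},
    {"en"},
]
network = build_language_network(toy_users)
centrality = weighted_degree(network)
print(network)
print(centrality.most_common())
```

A user who writes in only one language adds no edge, which matches the idea that a link requires evidence of knowing both languages.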
Sean Carroll: 01:07:38 Just to be clear, does the network take the form of saying things like, if you know English, you're more likely to know French than Chinese, whereas if you know Chinese, you're more likely to know Korean than to know English, things like that?
César Hidalgo: 01:07:50 Exactly. Yeah. The network gathers all those things. There are paths between every language and every language, but a language like English, for instance, is a global hub. Someone from Portugal and someone from Vietnam, probably they're going to speak something. The most likely thing is they're going to speak English.
Sean Carroll: 01:08:07 Yeah, that's a bad example.
César Hidalgo: 01:08:08 But there's also like regional hubs. For example, French connects a lot with African languages. Also, Spanish connects with like Quechua and Mapudungun and other languages from native South America. Then you have also Arabic as being a regional hub, Chinese. Then you have a language like Russian, it's a very important regional hub in all like eastern Europe and in northeast Asia. Then you have very peripheral languages. It's a very hierarchical network with like English is at the center, then a ring of regional hubs, and then like smaller languages that are kind of like on the periphery, connected to some of those regional hubs.
Sean Carroll: 01:08:52 I would love to see a picture of this. Is there an image out there on the internet somewhere?
César Hidalgo: 01:08:54 Yeah, yeah, yeah, language.media.mit.edu.
Sean Carroll: 01:08:56 All right, very good. Thanks.
César Hidalgo: 01:08:59 Then we also studied how communication technologies affected our collective memory. We used this dataset that we've created of over 70,000 famous biographies to look at how changes in technology change the number of globally famous people that were born each year and the occupations associated to those people. For instance, when you think of Einstein, he would be a physicist, when you think of Michelangelo, he would be a painter. What you find is that when new communication technologies were introduced, the composition of our collective memory changed, not only the size. Before printing, you look at the matrix of our collective memory, and it's mostly political and religious leaders. After printing-
Sean Carroll: 01:09:48 Okay, bosses of hierarchies, one way or the other.
César Hidalgo: 01:09:50 Exactly. After printing, you get famous artists and famous scientists for the first time. You get a lot of painters. You get composers. You start getting astronomers.
Sean Carroll: 01:10:01 Writers?
César Hidalgo: 01:10:03 Writers, they're also there, but then there is the second printing era, which we talked before, when people started doing like shorter formats and so forth. There you have an explosion on the sciences. Then now you start differentiating natural philosophy into more sciences. You get a lot of famous writers and so forth and composers. Then you have the invention of film and radio. That generates a huge shift on the arts, because before that it was the painter and the composer. Then it's about the performer.
Sean Carroll: 01:10:32 Right.
César Hidalgo: 01:10:32 It's the musician. It's the singer. Composers disappeared. Musicians and singers are the ones that now become famous. Actors become famous, not playwrights anymore. There's kind of like that shift. Then you have the introduction of television. With television, you create the fame of sportsmen, because sportsmen were not that famous yet before television.
Sean Carroll: 01:10:49 [crosstalk 01:10:49] performer, yeah.
César Hidalgo: 01:10:49 It's a live performance. It has to be at the right time when you watch the game.
Sean Carroll: 01:10:55 And visible and 3D and you can experience it there, yeah. Okay.
César Hidalgo: 01:11:00 Exactly. You do have kind of like that change in the composition of our collective memory with introduction of communication technologies. It's very clear. To me, now when I think about like history, I don't think about like the modern times and the Renaissance and the things that I learned in school. I think of it in terms of like, okay, what was the dominant communication technology at the time? That's the era that I set myself in.
Sean Carroll: 01:11:22 What's going to come next?
César Hidalgo: 01:11:24 The next thing that I-
Sean Carroll: 01:11:25 Instagram influencers?
César Hidalgo: 01:11:27 Next on that ... What is ... Do you know TikTok?
Sean Carroll: 01:11:30 I know of it. I've never used it yet. I'm far behind.
César Hidalgo: 01:11:34 Oh, dude. You don't need an account to watch.
Sean Carroll: 01:11:36 Oh, okay.
César Hidalgo: 01:11:37 That already tells you something. It's massive, I think, in different ways. First is like the quality of the performance that you observe there, it's very good. The amount of attention also that is on TikTok, it's amazing. There's like all of these videos getting like millions and millions and millions of likes. It's not like a marginal traffic.
Sean Carroll: 01:11:57 Right.
César Hidalgo: 01:11:57 It's huge. It's the first global social media, because it's a Chinese company. It's the only social media that is popular in both China and the West, which makes it quite interesting as a global phenomenon. It's not based on peer communication. It's actually broadcasting.
Sean Carroll: 01:12:20 Oh, okay.
César Hidalgo: 01:12:20 In TikTok, you have like two channels, the for you, which is like a TV feed in which you just scroll, and the following, which is the people that you're following and you have a feed of who you follow. That's it. You can search. The search is horrible. It doesn't work. It's like it doesn't provide good results. It's very passive like television. They reinvented television for the phone era, in a zapping world in which like content is 15, 30 seconds long. Those 15 to 30 seconds, there's a lot of good quality content at that range of size.
Sean Carroll: 01:12:53 I mean, Twitter and podcasts are as cutting edge as I'm going to get, I think, technologically, but they keep coming along very quickly. Yeah, I guess the point is that how we think about ourselves changes when these media change, how we remember ourselves.
César Hidalgo: 01:13:10 Indeed. What I want to say about TikTok though that I like, which is interesting, I do think that it's more social and more family. When I use Twitter, I use it by myself. When I use Facebook, it's by myself. When I watch TikTok, I'm in the couch with my wife and my daughter and the three of us are looking at the same screen at the same time and talking about what we're watching. I do think that that's important because we do have a lot of interactions right now in which people are interacting with screens independently. I grew up, probably you grew up, with people sitting in front of a television and having that more of as a collective experience in which you have to negotiate what you watch, you have to talk about what you watch. I do find that that's a good thing compared to like the social media that has been more peer-to-peer, but isolating.
Sean Carroll: 01:13:55 Well, I like to end on optimistic notes. Is that your optimistic note? What do you think about the future? What should our optimism be? Where should it be located?
César Hidalgo: 01:14:03 I'm an optimist. I don't know that's a good thing or a bad thing.
Sean Carroll: 01:14:07 Yeah, I know.
César Hidalgo: 01:14:07 I think it's maybe genetics, but I tend to be optimistic even though like today Chile is going through a very difficult moment with all of the things that have happened there during the last four or five days, but I do think that at the end of the day, the positive things in life tend to add up and build on each other better than the negative things in life. Going back to the beginning, we talked about where information and order grows, and that's because our ability to create order and complexity is larger than the rate at which it's getting destroyed. I think that's also true for many of the positive things. I am an optimist because I do think that at the end of the day, things add up better as we move along.
Sean Carroll: 01:14:58 All right, César Hidalgo, thanks for some good information there. Thanks for being on the podcast.
César Hidalgo: 01:15:01 Hey, my pleasure.
Sean,
Get some help with the sound quality! I have not-great internet, so I download first, but even that way it is so choppy can hardly get through it. It seems worse this week (Hidalgo) but it has been a problem for a while.
Thanks,
Peter
I am curious if César has read Wolfram’s book “A New Kind of Science”
excellent guest!
Can someone explain:
1) Whether there is a direct correlation between the information associated with entropy (early universe low entropy and high information) and the information associated with quantum mechanics (particles and qubits). Are they the same?
2) Whether information can only exist when it is known, when it is knowledge.
Thanks.
Gerry
“Shen gen” should be “Shenzhen” in the transcripts.