Biological organisms are paradigmatic emergent systems. The atoms of which they are made mindlessly obey the local laws of physics; even cells and organs do their individual jobs without explicitly understanding the larger whole of which they are a part. And yet the system as a whole functions beautifully, with apparent purpose and function. How do the small parts come together to form the greater whole? I talk with biophysicist Rosemary Braun about what we're learning about collective behavior within organisms from the modern era of huge biological datasets, especially crucial aspects like timekeeping (with bonus implications for dealing with jet lag).
Rosemary Braun received her Ph.D. in physics from the University of Illinois at Urbana-Champaign, and an M.P.H. in biostatistics from Johns Hopkins. She is currently an associate professor of molecular biosciences, applied math, and physics at Northwestern University and external faculty at the Santa Fe Institute.
0:00:00.2 Sean Carroll: Hello everyone. Welcome to the Mindscape Podcast. I'm your host Sean Carroll. We all know about circadian rhythms, right? We get tired at night, we go to sleep, we wake up, we're refreshed. Hopefully maybe there's caffeine involved, but there's a cycle that we go through roughly 24 hours. Our bodies and our brains which are connected to our bodies are very much influenced by internal clocks that let us know what time of day it is. This is why when we travel to a different time zone we get jet lagged because there's a mismatch between the activities of the day and the position of the sun and the sky versus those internal clocks. So here's something that maybe I knew before but had forgotten but I learned during this podcast. Essentially every cell in our body has its own circadian clock to let the cell know what time of day it is.
0:00:52.8 SC: Of course there's also systems in our bodies 'cause the cells talk to each other. There's the nervous system and there's the metabolic system and so forth. But it kind of bubbles up from the individual cells. So when you get jet lagged or even just when you get tired at night 'cause you know that it's bedtime, it's not just some large scale impression that your body has, it's every cell in your body is thinking this way which is kind of a remarkable fact. You know, biology is in a great position these days. It's a very exciting time. There's enormous amounts of data coming in. Of course, we can analyze genes and molecular biology questions in much greater detail than we ever could before. And there's also tools to analyze that with great computational power, machine learning and AI, all that kind of stuff. Plus of course there are questions to which we don't know the answers.
0:01:44.0 SC: In physics, if you listen to my long solo podcast, one of the challenges is that we have theories that fit the data really well in the regimes where we can experimentally access. In biology, don't worry that is not the case. It's very easy to ask questions that are very important but we don't know the answer but it's plausible that we can find the answer. So today we're talking to Rosemary Braun who is a biologist, Molecular Biologist/Engineer at Northwestern who works at the systems level of biology. Rather than looking at this particular organism and its whole life cycle or for that matter this particular organ in an organism, she looks at the various complexes, the various networks, the various systems that hook together inside biological organisms. The circadian rhythm is one of them but there's many different ways that this paradigm plays out.
0:02:35.0 SC: It's very different than in physics where you can find a simple spherical cow, abstract away all the complications and just look at the harmonic oscillator or whatever. In biology, very often we have these intricate complex networks that are talking to each other and so that's what we're gonna explore today. Before going onto that, let me just mention, I haven't mentioned in a while, we're still doing the Mindscape Big Picture Scholarship. So this is something where people who are going to college, going to universities can apply for this scholarship. It's $10,000 if you win. So it just goes right to your tuition. It is aimed at youngsters who are going to go to college, who are interested in the big picture. So who are interested in not necessarily doing something applied or primarily financially rewarding, but thinking about the big ideas.
0:03:26.2 SC: People who want to concentrate in fundamental physics or philosophy or mathematics or for that matter History or Literature or Economics or what have you. As long as it's that thinking deeply about the fundamentals kind of aspect, those are the applicants we want. So if you're interested in applying or if you're interested in donating to the Big Picture Scholarship Fund go to bold.org/scholarships/Mindscape. Bold.org is a wonderful organization that helps people crowdsource these kind of scholarships. Last year we gave out two scholarships. It's just very heartwarming that you can contribute to the education of some young student especially ones that might not come from academic backgrounds, etcetera. So it's a very nice thing you can do. There's also a link on our Mindscape homepage which is just preposterousuniverse.com/podcast if you wanna go directly from there. So thanks for everyone who has been donating. We've been getting donations all year. We gave out two scholarships last year. Let's hope we could do even better this year. So with that, let's go.
[music]
0:04:50.7 SC: Rosemary Braun, welcome to the Mindscape Podcast.
0:04:53.1 Rosemary Braun: Thank you so much for having me.
0:04:55.0 SC: Let me ask about the big picture here. That's my favorite thing to ask about. As an outsider, it seems to me that there's a shift in the last few decades or whatever in biology from talking largely about organisms or species to now thinking about systems and networks and genomes and so forth. Is that an accurate perception as an outsider?
0:05:19.0 RB: I think it is. Yeah. I think in large part, it's been driven by the revolution in micro... In molecular biology profiling technologies. So we're now able to assay gene expression in elaborate detail. We can look at genetic variations in extraordinary detail and we need a way to make sense of all of that information. And so trying to relate what we measure in this 20,000 dimensional feature space of gene expression, to what we can see under the microscope or in a patient who has just walked into a clinic, relating those two things to each other is a major challenge that we have now. And one way to approach that problem is to look at it from a systems point of view and to think about what the network of interactions between those microscopic elements are that produce the macroscopic features that we observe.
0:06:13.5 SC: So when I just out of my head came up with systems and networks, are those the right words? [chuckle]
0:06:20.4 RB: Yeah, I would say so. Yeah.
0:06:21.9 SC: Okay, good. And it's largely because, more data isn't, I don't wanna say it's not always better, more data is always better, but it's not useful by itself, right? We need to get the right way to squeeze the information out of the data.
0:06:34.6 RB: Yeah, it's not useful by itself and it's also not necessarily clear to me that it's necessarily... That what we are able to measure easily is what the system actually cares about.
0:06:47.9 SC: Sure.
0:06:49.4 RB: Right. And so, one can imagine measuring many, many things and at the end of the day it's really a low dimensional set of interactions that govern what a cell is going to do. And everything else is a consequence of that, that we're measuring. And so trying to identify which are the state variables that the system actually cares about, I think is a massive challenge now.
0:07:16.9 SC: Yeah. And so in my language...
0:07:17.9 RB: We're able to measure a lot of things, but what should we be paying attention to?
0:07:22.6 SC: Exactly. So in my language, it's all about coarse graining and emergence. I mean in a physics system, my favorite example is in the earth there's 10 to the 50th atoms. And to figure out how the earth orbits the sun, we don't need to know what those atoms are doing. We just need to know a small number of variables, the center of mass and the velocity. But the point is that, in physics it's kind of obvious what we need to know and that happens to coincide with what we can easily measure. In biology, everything is always much more complicated.
0:07:53.4 RB: Yeah. [laughter]
0:07:55.9 SC: So your lab, if you were to explain your lab to a stranger on the street, what exactly are you trying to figure out?
0:08:03.9 RB: Ah, everything. No. So my interest is really in how living systems self-organize and I approach this problem from a purely computational point of view. So we collaborate extensively with experimentalists who ask interesting biological questions and produce data and can then test our models, but we do not do that directly. But that gives me a certain amount of flexibility in the types of things that I study. So if you think about living systems, they exhibit these beautiful, self-organized, structures and processes at every single scale, right. At the molecular level, proteins assemble into macromolecular complexes that carry out certain functions. The cells are arranged into tissues with very reliable patterning, those tissues interact in a multicellular organism to enable it to live and move and do all of the things that it does. The organisms themselves interact in ecological networks and in societies. And so what gives rise to all of those dynamics is really... It's a mystery that I'm absolutely passionate about solving. And I've made very little headway, but I am struck by the fact that we're now at a point where data collection is easy, relatively speaking, and computational power is plentiful, and so we can start to scratch the surface of those questions.
0:09:47.0 SC: You're young, you have plenty of time to answer these questions. [laughter] We have great expectations here, but am I getting the impression that you are a more or less theoretical biologist in the same sense that I'm a theoretical physicist?
0:10:00.0 RB: Yeah. I was actually trained as a theoretical physicist and I drifted into biology because I was interested in complex systems and living systems were the most compelling example of complex systems that I could find. So, I slowly got drawn over to questions in biology, but my training is purely theoretical.
0:10:24.9 SC: How far, all the way up to the PhD?
0:10:28.0 RB: So yeah, my PhD is in theoretical physics.
0:10:30.6 SC: This is hilarious. So I have to promise the audience I did not know that because again and again, I have guests on Mindscape and they reveal halfway through that, "Oh yeah, I was a physics undergraduate or whatever," and they're studying political science or sociology or economics. So, okay, I get it. It makes sense to me now why you think that way? But it's funny because as a biology professor, you still have a lab, even though you're a theorist.
0:10:57.9 RB: Yeah, I mean, it's a dry lab. So all of our experiments are in silico and we work really closely with the experimentalists, who I collaborate with. You know, students who work with me often spend time in the wet lab of my collaborators. I definitely did that as well, when I was making my transition into theoretical biology. And it's been a wonderful journey and it's been a lot of fun to be working at that interface. It has its challenges, of course. We're trained to think in different ways, we use different language to describe phenomena, but at the end of the day we're kind of, both me and my experimentalist collaborators are interested in the same fundamental questions. And so we're kind of able to come together from those two perspectives.
0:11:52.5 SC: In physics, of course, it's a well established idea that we have theorists, we have experimentalists, we hire both or whatever. Do your biology colleagues look at you slightly askance, as someone who does not ever get their hands wet? [laughter]
0:12:11.4 RB: Yes, but I think that's changing. I think in part because of this revolution in molecular biology, there has been a realization that the ways in which we think about biological systems and try to make the links between the microscopic and the macroscopic, it requires different perspectives and so I think, I see an increase in the diversity of approaches that biology departments take, and mine is certainly one of them. So I think that sort of looking askance, that skepticism of non-wet biology research, is diminishing.
0:12:57.6 SC: That's good.
0:12:58.5 RB: As its power is revealed.
0:13:00.5 SC: Good to know. And, I was amused on your website to see the word omics used as a, I don't know, back-construction: there are things called genomics and proteomics and metabolomics and so forth. So we just, we'd lump them together and call them all omics. Maybe explain to the audience what that is. [chuckle]
0:13:24.5 RB: Yeah. So if you think back to what we're all taught as the central dogma of molecular biology, right? You have a genome, it's encoded in DNA, those genes are comprised of some sequence of bases that code for a protein in the end. But there's an intermediate step where the gene is transcribed into mRNA, and then the mRNA is then translated into protein and about 20 or more years ago now, there was a development of microarray technology. And what this technology was designed to do was to simultaneously assay genome wide, the abundances of all of the mRNAs in a given sample. So the idea is this, you have a gene, it codes for a protein, but how much of that protein is produced depends on how many copies of that gene are made in mRNA.
0:14:31.7 SC: Okay.
0:14:32.0 RB: And so if you can measure how many copies of mRNA you have for that... For that particular gene, you can use that as a proxy for how abundant that protein is, and maybe then as a proxy for how much of the function that that protein carries out is actually being carried out in your sample. Previously the way that mRNA abundance was assayed was one gene at a time and microarrays enabled us to do this for 10,000 genes at a time. And so now you get this incredibly rich snapshot of gene expression for nearly every gene in the genome. And so that was referred to as genomics and transcriptomics, since we're looking at the transcript level. And you can also look at gene sequences.
0:15:31.5 RB: And so that's really, when we talk about, sometimes genomics is used as a catchall term, but if we're being really, really specific about it, the genomics refers to the sequence of the DNA and how much yours might vary from mine or two different organisms from one another, or cancer cell from a healthy tumor cell. The transcriptomics is the next layer up in the regulation. How much of the mRNA is produced? Proteomics looks directly at the protein abundances and so on and so forth. And we can do all of these things now in high throughput, right? So it's not just mRNA, we can do it at the genome level, we can do it at the protein level. We have now next generation sequencing technology. So nobody is using microarray anymore. They're all doing sequencing which means that we can really look in detail across the entire genome of what a given sample is doing.
0:16:22.8 SC: I guess this is the point where I ask my favorite question to ask biologists, which is, what is a gene [laughter] in DNA language, right? I mean, it seems to be some functional unit of the DNA, but maybe the boundaries are a little fuzzy and people disagree.
0:16:39.3 RB: The boundaries can be a little fuzzy. So the, a given region of the genome can code for different isoforms of the gene, right? So it could be truncated in different places.
0:16:51.4 SC: Okay.
0:16:51.7 RB: It could be spliced in different ways and which of those isoforms is actually expressed by a cell can change under different circumstances. So for example, we know of particular genes where depending on the temperature, a different isoform of that same gene is expressed. And because you have a different isoform, you have a slightly different protein, or perhaps you have the same protein, but it has different regulatory properties. So it might be easier to interact with by some other regulatory element. So, it is fuzzy in that sort of way, right? There's, we're taught to think of genes and proteins and enzymes as these highly specific entities, right?
0:17:45.1 SC: Yeah.
0:17:45.5 RB: Like, when you're in your first biology class in high school or in college, you're taught this lock-and-key model for how proteins interact with one another. And it turns out that that's not really the case, there's a lot of non specificity and promiscuity in the binding of proteins to one another. And what regulatory elements are targeted by which regulatory proteins. And so nailing down what a given gene does is...
[laughter]
0:18:26.8 RB: It can be challenging. And I think to be honest, it's something that I'm actually a little bit opposed to in the sense that I think that making this assumption of specificity and of this lock and key mechanism, I don't think it's borne out by the data. And I think that if you do an analysis, with that assumption, you limit the types of things that you're able to discover.
0:19:00.0 SC: It's interesting because we've long ago gotten rid of the idea of intelligent design. We know that these biological organisms evolve through natural selection, but also again and again, we come across a situation just like when you said, where we have an explanation or an account of something and then we realize it's not quite that simple or it doesn't work in quite that way. And I wonder how much of that is because we sort of have an intuition that grows up from designed systems where if there's a part, it has a purpose and there's a clear thing that it does. But because all the stuff that we see in biological organisms have just sort of come about because the whole system was successful the boundaries are not always so clear in biology.
0:19:45.4 RB: Yeah, absolutely. And I think that one could make energetic arguments for proteins having multiple functions. Right? Then you only need to make one of them to carry out multiple things. I think that one can also make probabilistic arguments for why that may be the case in the sense that having a multifunctional protein allows you to do a little bit of bet hedging. And I'm saying this kind of as a conjecture when it comes to proteins. We had this pair of papers with my former student, Gary Wilke, where we looked at the role of microRNAs, which are small non-coding RNA molecules.
0:20:33.1 SC: Okay.
0:20:33.9 RB: And there we know that their role in binding and regulating genes is pretty promiscuous. So if you'd like, I can elaborate on that a little bit more.
0:20:48.9 SC: Please.
0:20:50.7 RB: So microRNAs are short non-coding RNA molecules six to eight bases long, right? So very, very short. And they bind to mRNA molecules, which code for specific genes and proteins, and they either inhibit the production of the protein by being bound to the mRNA molecule or they trigger the degradation of the mRNA. In either case, the end result of the microRNA binding to the mRNA is that it downregulates the production of the targeted protein.
0:21:33.2 RB: But I told you that that microRNA is very, very small, only six to eight bases. And so it could bind many, many, many genes, right? That's a very short sequence, which it might bind to. And so a given microRNA may have many mRNA targets and a given mRNA, because it's hundreds, possibly thousands of bases long, might be targeted by multiple microRNAs. So there's a many to many relationship between the microRNAs and their targets. And so this kind of flies in the face of that assumption that we're kind of taught early on, which is this sort of one-to-one, it does one thing...
0:22:21.1 SC: Yeah.
0:22:21.2 RB: Very, very mechanistically. And so we wanted to ask the question, well why would you produce a regulatory molecule, a microRNA that is completely non-specific in its binding. Right?
0:22:40.8 RB: Where it might have possibly hundreds of targets? And we thought, well, one reason you might want to do such a thing or evolve such a mechanism is as a way to exert systems level control over an entire process by redundantly targeting different elements of that system, right? So that if the microRNA doesn't bind to one particular mRNA, maybe it binds to the mRNA of a different gene, and it doesn't matter which one it's hit. The whole system as a whole is downregulated. And so what Gary and I did was to look for statistical evidence of this type of systems level control in TCGA, which is The Cancer Genome Atlas, and it's a repository of one of those giant genomics projects, thousands upon thousands of samples, and each of them has DNA data, mRNA data, microRNA data, proteomic data, et cetera.
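[Editor's sketch] The "possibly hundreds of targets" remark can be sanity-checked with a back-of-envelope calculation. The Python below is a rough illustration only: it assumes uniformly random bases, a hypothetical typical transcript length of 2,000 bases, and the roughly 20,000 transcripts mentioned elsewhere in the conversation. It ignores real sequence composition and everything else that governs actual microRNA targeting, and the function name is invented for this example.

```python
# Back-of-envelope estimate: how many transcripts would contain a chance
# match to a short sequence of 6 to 8 bases?
# Assumptions (illustrative, not from the conversation): uniformly random
# bases, 2,000-base transcripts, 20,000 transcripts in total.

def expected_transcripts_hit(seed_len, transcript_len=2000, n_transcripts=20000):
    """Expected number of transcripts with at least one chance match."""
    positions = transcript_len - seed_len + 1        # possible start sites
    p_site = 0.25 ** seed_len                        # one site matches exactly
    p_hit = 1.0 - (1.0 - p_site) ** positions        # at least one site matches
    return n_transcripts * p_hit

for k in (6, 7, 8):
    print(k, round(expected_transcripts_hit(k)))
```

Under these assumptions an eight-base sequence is expected to match hundreds of transcripts by chance, and a six-base sequence thousands, which is consistent with the many-to-many picture described above.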
0:23:50.4 RB: So it's a wonderful wealth of data. And so what we wanted to know was do we see evidence of the systems level control if we look in this data? The way that we did that was to say, okay, if a microRNA targets a particular gene, then we expect that when the microRNA abundance goes up, the abundance of the target RNA will go down. We know that if you look at particular microRNA gene pairs, that's not really borne out. So you don't always see that at the single gene level. But what if we were to take all of the genes that we know to play a role in a particular biological function in a particular pathway, and we summarize in some statistical way, the expression level of all of the genes in that pathway, and use that as a coarse-grained measure of how active that pathway is.
0:24:49.4 RB: Can we then identify particular microRNA pathway pairs, where as the microRNA goes up, the pathway activity summarized through nonlinear dimension reduction of the gene expression goes down? And in fact, we were able to identify exactly that.
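[Editor's sketch] The idea of summarizing a pathway and correlating the summary with a microRNA can be illustrated on synthetic data. This is not the method from the papers discussed: the conversation mentions nonlinear dimension reduction, whereas the sketch below substitutes the first principal component (a linear summary) to stay short. All sizes, effect strengths, and variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_genes = 200, 30          # hypothetical cohort and pathway size

# Synthetic microRNA abundance and a hidden pathway activity it represses.
mirna = rng.normal(size=n_samples)
activity = -0.8 * mirna + rng.normal(scale=0.6, size=n_samples)

# Each gene reflects the pathway activity only weakly, plus its own noise.
loadings = rng.uniform(0.3, 0.7, size=n_genes)
genes = np.outer(activity, loadings) + rng.normal(size=(n_samples, n_genes))

def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# Single-gene view: correlation of the microRNA with each gene separately.
per_gene = np.array([pearson(mirna, genes[:, j]) for j in range(n_genes)])

# Pathway view: first principal component of the standardized genes.
z = (genes - genes.mean(axis=0)) / genes.std(axis=0)
u, s, _ = np.linalg.svd(z, full_matrices=False)
pc1 = u[:, 0] * s[0]
if pearson(pc1, z.mean(axis=1)) < 0:  # orient so higher means more expression
    pc1 = -pc1
pathway_r = pearson(mirna, pc1)

print("mean single-gene correlation:", per_gene.mean())
print("pathway-level correlation:   ", pathway_r)
```

In this toy setup the pathway-level summary shows a clearly stronger negative correlation with the microRNA than the typical individual gene does, because averaging across genes suppresses gene-specific noise.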
0:25:07.6 SC: And is this an example of the circuits that you talk about? We use this language of circuitry, right? Of nodes and networks and things connecting them, so like rather than saying, you're putting your finger on a certain mRNA and saying here's what it does, you say, here's all the flow of different things. You talked about channels, I guess, and here's how it all adds up to some collective behavior.
0:25:38.2 RB: Yeah. So in that particular paper we just treated the pathway, not as a network but just as a collection of genes. But now in kind of work that's grown out of that as well as some other studies, now we're starting to take into account what we think is the wiring diagram. And I'm making air quotes, although your listeners cannot... They cannot see them. But the wiring diagram of these pathways. So that's a network of putative interactions between the genes or the gene products really, that carry out some particular biological function. And the reason that I put air quotes around it is that what we know of those wiring diagrams is incomplete, right? We don't know all of the interactions just because we haven't discovered them all yet. We obtain those wiring diagrams, those pathway networks, from databases which inevitably will have some errors.
0:26:40.5 RB: And again, there's this issue of probably any given element has more than one role. Maybe we haven't discovered all of them yet. So it may be that those networks are sparser in the database than they are in reality. But it gives us a starting point for trying to understand what is connected to what and what the flow of information or the flow of a signal through this collection of genes might be.
0:27:14.0 SC: And is there some hope that, or maybe it's already true, I don't know, that we achieve a complete list of all the circuits, [laughter] of all the networks that are relevant, or are there completely different kinds of circuits we're talking about and one person's circuit is another person's noise? I'm not sure.
0:27:34.6 RB: That's a really great question. [laughter] Whenever somebody asks, is there hope? [laughter] My impulse is to say yes, of course, of course there is.
0:27:45.3 SC: Fair enough.
0:27:46.1 RB: But I think it's a, I think it's actually quite a challenging problem, right? So there's this big problem of how do you discover those wiring diagrams or those networks? And now because we have so much omics data, one could say, okay, well, what we'll do is we'll go to that omics data and we will look for patterns of correlation in that data. And from those correlation patterns we will infer some sort of network structure. And we've made a little tiny bit of progress on that, but it's, but not very much. And not very much because at the end of the day it's at this point, a pretty grossly under determined problem. Right? In that the number of interactions that one could have amongst the things that we are measuring is much larger than the number of observations that we've collected.
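[Editor's sketch] The sense in which network inference is under-determined can be made concrete by counting. The snippet below uses the approximate figure of 20,000 genes quoted earlier in the conversation; the sample count is a hypothetical number chosen only for comparison, and only undirected pairwise interactions are counted.

```python
from math import comb

n_genes = 20_000      # approximate number of measured genes, as quoted earlier
n_samples = 10_000    # hypothetical number of profiled samples (illustrative)

# Number of distinct undirected gene-gene pairs that could interact.
possible_edges = comb(n_genes, 2)

print("candidate pairwise interactions:", possible_edges)
print("candidates per observation:     ", possible_edges / n_samples)
```

Even with a cohort of this hypothetical size there are roughly twenty thousand candidate interactions per observation, before considering directed or higher-order interactions.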
0:28:42.4 SC: Gotcha. So where, as far as hope is concerned, sure. But we're not at the level now where everyone agrees on what the circuits are in the same way they agree on what organs we have in our body?
0:28:56.0 RB: Yeah.
0:28:56.9 SC: Good. And it reminds me a little bit, I hadn't thought of this until just now, of, we had Judea Pearl on the podcast talking about causal networks, right? He's more thinking of social science or medicine, what kind of interventions can have a good outcome? When you draw your little diagrams, do you imagine asking questions about what causes what, what node in the diagram has a causal influence on the behavior of what other nodes?
0:29:31.1 RB: Yeah, I think that's, at the end of the day that's what we'd like to understand from these systems. From my perspective, I'm interested less in what causal effect does one node have on what other node and more about what causal effect does one node have on the behavior of the system as a whole. So if a particular node were subject to a mutation, what would that mean for the dynamics of the entire network, right? How robust is the network to damage in particular regions? If I recover some coarse grain statistical property of that network, is that enough to recover its functional behavior or do I need to recover every node and every edge in detail?
0:30:18.5 SC: Right. And maybe this is a slightly askew question, but is there a sharp bright line between biological networks of this form and other possible chemical reaction networks? Are there things that we see in the biological context that we just don't see elsewhere?
0:30:38.0 RB: I mean, these networks are intended to represent possible biochemical or biophysical interactions between molecules. And so in that sense I think they're, you can think of them as a chemical interaction network as well.
0:30:55.4 SC: Okay. But do we in nature, but not in biology and things that happen outside nature, do we see reaction networks of this richness and depth or is biology special if only because it's more complex than other things?
0:31:16.8 RB: I mean, biology is special in terms of its complexity, in terms of its complicatedness, in terms also in the fact that it is far from equilibrium.
0:31:27.3 SC: Of course, yes.
0:31:28.3 RB: Right. And it's adaptive, right? So those networks are not necessarily constant in time.
0:31:39.2 SC: Good. I mean, it's... Biology is simultaneously not in equilibrium, but almost in a steady state for some purposes, right? I mean, in stat mech we talk about non-equilibrium steady states sometimes.
0:31:53.5 RB: Right? Yeah. And yeah I would say that the analogy works in biology and I think that's one of the great mysteries, right? We have incredibly reliable phenotypes. If you think about organismal development, it's astonishingly reliable, despite environmental variation, despite genetic variation. Like occasionally things can go wrong but by and large, despite all of these sources of variation in fluctuations, it's extremely reliable.
[laughter]
0:32:29.6 RB: To me that says that there is, that there's a wide basin of attraction. Right? And so I think that's part of why I'm very motivated to take these systems and network views in that, to have a wide basin of attraction, it means that the system is not fine tuned. Right? Fine tuned would suggest that there's a very, very narrow range of the state space that leads to a particular outcome and if it's not fine tuned then I think it has to all be systems level effects.
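[Editor's sketch] A wide basin of attraction can be pictured with a toy dynamical system. The example below is not a biological model: it integrates the one-dimensional equation dx/dt = x - x^3, chosen here purely for illustration, from many different positive starting points and checks where they end up.

```python
import numpy as np

def simulate(x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = x - x**3 from initial state(s) x0."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x += dt * (x - x**3)
    return x

# Widely varying positive initial conditions.
starts = np.linspace(0.1, 3.0, 30)
finals = simulate(starts)

print("spread of initial states:", starts.min(), "to", starts.max())
print("spread of final states:  ", finals.min(), "to", finals.max())
```

Every positive starting point settles at the same state, x = 1, so the outcome does not depend on fine tuning of the initial condition. Negative starting points would settle at x = -1 instead, which is a second basin in this toy system.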
0:33:07.8 SC: Well, it goes hand in hand with what we said before. Mechanical things tend to be brittle, right? You break a single wire and then your computer doesn't work, but biology has to be a little resilient.
0:33:19.3 RB: Yeah.
0:33:20.7 SC: Good. Okay. So let's, that was, thank you for indulging me those more, semi philosophical questions. Let's get back down to the nitty gritty of what you do because like we said before, these networks are complicated, a lot of data in here, a lot of things going on. Computers are going to certainly play a role in your analysis, but also machine learning or what these days we're supposed to call AI, I'm sure plays an important role.
0:33:47.4 RB: Yeah. So we do quite a lot of that and I think we've seen a huge revolution in machine learning and statistical learning techniques are incredibly powerful. The computers are now incredibly powerful. And so you can do things now that were impossible five, 10 years ago. The data sets are very large, albeit still not large enough for deep learning to be as effective as I think we would like it to be. But I think also that there is a little bit of a danger in just taking the wealth of data that we've collected and generic, in some sense, machine learning algorithms and pushing the data through and crossing your fingers and hoping that something of value is discovered in a naive way.
0:34:45.1 RB: We already know a lot about biological systems. So for example, let's think about this problem of inferring interaction networks, right, or pathways. We already have some information about which genes code for proteins that act as transcription factors and would therefore regulate by binding to the DNA, the expression of other genes. So we know something about the biochemistry of these elements. Simply taking the correlation structure in the data, pushing it into a machine learning model and saying, okay, can you give me a minimal network, without taking into account that biochemical information, seems to me to be a little bit naive, right? We're not using what we already have. And so one of the things that my group is very interested in is, are there ways to integrate the information that we already have, even though it might be incomplete or somewhat erroneous into the machine learning methods that we are using to make new discoveries.
0:35:51.5 SC: So that's very interesting, but let me bring up the fact that it's also very interesting that when we build chess or Go playing machine learning systems one of the great discoveries was don't ruin it by giving it anything that a human being thinks is a good chess playing strategy. It's much better to just let the computer play games against itself a gajillion times and figure things out for itself than to give it any hints. So you're saying that maybe in biology it's different or that's a hypothesis that we might shoot down?
0:36:28.7 RB: I think... Well, first of all, I see no reason not to do both, right? There's absolutely no reason not to take both approaches. I do think that trying to understand what a cell or what a tissue is doing is perhaps a bit more complicated than learning how to play Go or how to play chess simply because the number of components in this particular game is much larger.
0:36:55.1 SC: Bigger, yeah.
0:36:56.7 RB: And so if you're, if the goal is to search a space of possible states to find the ones that correspond to a winning strategy in Go or chess or life [laughter] it might help to constrain your search space a little bit if the number of players is very large.
0:37:19.1 SC: Sure.
0:37:19.3 RB: So are there intelligent ways to constrain that space?
0:37:23.0 SC: You did bring up the number 20,000. What did that refer to? What are the 20,000 of?
0:37:28.1 RB: So that's the approximate number of genes that we obtain when we do gene expression profiling in a human subject. It's different for different organisms. So some have more, some have less. But in general, when you're doing transcriptomics for any given sample, and now we can do it at the single cell level. So for any given cell, you're looking at 20,000 measurements.
0:37:56.6 SC: Is that 20,000 genes?
0:38:00.2 RB: Yeah, that's 20,000 distinct microRNAs, mRNAs, excuse me. Not micro.
0:38:04.3 SC: But is that...
0:38:04.9 RB: mRNAs.
0:38:05.4 SC: Forgive me. My complete ignorance here. Is that saying that the human genome contains 20,000 genes? Or is this a statement about our ability to measure?
0:38:13.9 RB: It's both. Right.
0:38:14.7 SC: Okay. Good, then now we can put it all together. Right? So we have 20,000 genes or mRNAs, and we collect a lot of data. You have an overwhelming amount of data, and as we said at the very start, we're interested in this sort of emergent collective behavior, right? There's sort of, it's not just 20,000, it's 20,000 in some combinatorial array, right? Like different genes play with each other in different ways. So a huge number of possible things going on, but a relatively small number of things actually do go on. And that's what the machine learning for example, is trying to help us with.
0:38:52.6 RB: Yeah, exactly. So, I would say, you know from a mathematical perspective, the way I would put it is that we have observations in this 20,000 dimensional feature space, and we think that those observations lie on a very low dimensional manifold within that 20,000 dimensional space.
0:39:13.4 SC: Right.
0:39:13.7 RB: That not all locations in that 20,000 dimensional space correspond to a system that can live, right? And so the idea is, can we discover the geometry of that manifold? Can we then use it to say what it means to have variation along a particular direction of that manifold? What does it mean to deviate from that manifold? And can we use the location of an observation on that manifold to say something about its behavior as a whole?
0:39:47.1 SC: Good. And I mean, you just asked a bunch of rhetorical questions. How much progress has been made? Is this like brand new frontier or do we have our handles on a bunch of answers to some of the questions you just asked?
0:40:00.1 RB: I mean, yes and no, right?
0:40:01.4 SC: Yeah.
0:40:01.8 RB: So non-linear dimension reduction and manifold learning have been a hot topic for a while now.
0:40:10.4 SC: Okay.
0:40:11.2 RB: And so we have, you know, there have been many algorithms proposed, some of which can be demonstrated to be equivalent to one another mathematically. In general, the idea behind them is to use the local geometry of your observations to try to decide what is the surface on which those observations lie. And then you go to the next local neighborhood and you kind of stitch them together like this patchwork. So that's the handwavy description of them. There are still some challenges, right? So one question that one can ask is, well, if you're using the local geometry to decide the, what the, where the manifold is in some region of this 20,000 dimensional space, how big of a local neighborhood should you take? Right? And that question of scale, is one that is still in some sense unanswered. So how do you tune that scale? Should you be looking at the same scale in every region of the data?
0:41:17.4 SC: Probably not, right?
0:41:18.7 RB: If not, how do you tune it adaptively? So, yeah.
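To make the scale question concrete, here is a minimal sketch, not taken from Braun's work: points on a circle (a one-dimensional manifold) are embedded in a 50-dimensional space with a little noise, and a local principal component analysis estimates the dimension seen in neighborhoods of different sizes. The function name `local_dimension`, the 95% variance threshold, the noise level, and the neighborhood sizes are all illustrative assumptions.

```python
import numpy as np

def local_dimension(X, k, var_threshold=0.95):
    """Median number of principal components needed to explain
    `var_threshold` of the variance in each point's k-nearest neighborhood."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # pairwise squared distances
    dims = []
    for i in range(X.shape[0]):
        idx = np.argsort(d2[i])[:k]                  # k nearest points (including i)
        nbrs = X[idx] - X[idx].mean(axis=0)          # center the neighborhood
        s = np.linalg.svd(nbrs, compute_uv=False)    # singular values of the patch
        frac = np.cumsum(s ** 2) / np.sum(s ** 2)    # cumulative variance explained
        dims.append(int(np.searchsorted(frac, var_threshold)) + 1)
    return int(np.median(dims))

rng = np.random.default_rng(0)
n, D = 400, 50                                       # toy sizes, far below 20,000 genes
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
Q, _ = np.linalg.qr(rng.normal(size=(D, 2)))         # random orthonormal embedding
X = circle @ Q.T + 0.001 * rng.normal(size=(n, D))   # small measurement noise

dim_small = local_dimension(X, k=20)   # small patches look like a line
dim_all = local_dimension(X, k=n)      # the whole circle needs a plane
print("local dimension with k=20:", dim_small)
print("local dimension with k=n: ", dim_all)
```

The same data give different answers at different scales: small patches follow the curve, while very large ones pick up its curvature. Very small patches would instead be dominated by noise, which is one way to see why choosing the neighborhood size, possibly differently in different regions, is not straightforward.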
0:41:21.6 SC: Okay, good. Yeah. Okay. So we see it coming. We're gonna be able to figure out what are the relevant variables, the emergent coarse grained way of talking about all these things. What good will it do us? How does this help us understand biology or cure disease or whatever it is you wanna do?
0:41:41.3 RB: I mean, I think the ultimate goal is, if you have some understanding of the relationship between proteins and genes and gene products, you might be able to make a guess as to what you could target therapeutically. You might be able to develop better diagnostics. So I think that's the ultimate goal, is to have an understanding that would lead to the generation of new hypotheses that would ultimately lead to better health outcomes. You know, one area in which we've been somewhat successful, it's a pretty lofty goal, right? But one area in which we've made a little bit of headway on that front has been in the work that I've done on circadian rhythms, and, you know, there's abundant epidemiological evidence that suggests that the time of day that your body thinks it is, is absolutely crucial for how you metabolize food.
0:42:44.3 RB: How you metabolize drugs, whether it's aligned with your environment, turns out to be crucial in your risk for various diseases. And we don't really understand fully the links between all of those things. But if we're able to measure physiological time, then maybe we can use that measurement to inform things like when you should take a blood pressure drug or when you should take, when chemotherapy should be delivered. And so using multi-gene expression profiles to obtain a measure of physiological time was one of the things that we worked on and there we were somewhat successful with it. So that might actually have some benefit within the near term, I'm hopeful.
0:43:36.8 SC: Let's actually dig into the circadian rhythm stuff, because I will confess as we are recording this podcast, yesterday I flew home from England and I'm totally jet lagged. So this idea that, you know, what time of day I eat things is gonna, I mean, let me phrase it as a question. Are you saying that where I am in my circadian rhythm, i.e., what time my body thinks it is, affects how I metabolize food?
0:44:09.2 RB: Yeah. It affects how you metabolize food and when you eat also affects your circadian phase, right? So there, so the circadian rhythm itself is a self-sustained 24 hour oscillation.
0:44:24.1 RB: And so even in the absence of light and dark or other time giving cues, you'll still roughly have a 24 hour day. It's a, in humans it's a little bit longer than 24 hours but it's still an approximate 24 hour rhythm. But as you are now experiencing when those cues...
[laughter]
0:44:43.9 RB: Occur your circadian phase will shift. That takes some time, as you are now very abundantly aware.
[laughter]
0:44:54.1 RB: And it can be influenced by light as well as by feeding behaviors. So when you eat can have a powerful effect on how quickly you recover from jet lag. Eat in the morning is what I've been told by my... [laughter]
0:45:06.2 SC: Oh, that was the next question, okay.
0:45:08.0 RB: Clinical and experimentalist collaborators. Yeah.
0:45:10.7 SC: Eat in the morning even if you don't usually eat in the morning?
0:45:15.1 RB: Apparently that is... So now this is not my direct line of research, but what I have been told by my colleagues who study this is that, yeah. So high calorie intake in the morning tends to be a better phase resetting signal than eating at other times of day. Light in the morning is well understood to be a powerful phase resetting signal.
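The idea of a phase that shifts gradually under a resetting signal can be sketched with a single phase oscillator coupled to a 24 hour cue. This is a generic textbook-style toy model, not a model from Braun's group or her collaborators. The free-running period of 24.2 hours, the coupling strengths, and the function name `hours_to_reentrain` are all assumptions chosen for illustration.

```python
import math

def hours_to_reentrain(shift_hours, coupling, free_period_h=24.2,
                       day_h=24.0, tol_hours=1.0, dt=0.01,
                       max_hours=24.0 * 60):
    """Hours until the body clock is back within `tol_hours` of its usual
    alignment after the external day is abruptly shifted by `shift_hours`.

    Toy model: psi is the phase of the external cue minus the phase of the
    body clock, and d(psi)/dt = detuning - coupling * sin(psi).
    """
    omega_clock = 2.0 * math.pi / free_period_h
    omega_day = 2.0 * math.pi / day_h
    detuning = omega_day - omega_clock          # clock runs slightly slow
    psi_star = math.asin(detuning / coupling)   # entrained phase offset
    psi = psi_star + omega_day * shift_hours    # abrupt time-zone change
    tol = omega_day * tol_hours
    t = 0.0
    # math.remainder wraps the phase error into [-pi, pi]
    while abs(math.remainder(psi - psi_star, 2.0 * math.pi)) > tol and t < max_hours:
        psi += dt * (detuning - coupling * math.sin(psi))   # forward Euler step
        t += dt
    return t

weak = hours_to_reentrain(8.0, coupling=0.03)    # weak resetting cue
strong = hours_to_reentrain(8.0, coupling=0.06)  # stronger resetting cue
print("days to re-entrain, weak cue:  ", weak / 24.0)
print("days to re-entrain, strong cue:", strong / 24.0)
```

In this sketch a stronger cue shortens recovery, which is the qualitative point of the advice about morning light and morning meals. The actual numbers depend entirely on the assumed coupling and should not be read as predictions.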
0:45:37.4 SC: I'll warn you ahead of time, we're gonna have this lovely conversation about networks and signals and gene expression and so forth. And what people will remember is eat in the morning to get over jet lag. That's what people think about.
0:45:50.6 RB: Oh, yeah.
[laughter]
0:45:51.5 SC: News you can use, I like that. But let's connect it back to what you are doing. So we have this, 'cause this is fascinating to me, we have all this molecular activity going on in our bodies, and as a physicist I'm interested how the timescales connect to each other. I mean, ultimately it's all chemistry in our bodies and chemical reactions happen relatively quickly. Where in our bodies does it know that the day is 24 hours long and that there should be cycles and rhythms of that duration?
0:46:23.5 RB: So this is really, really fascinating. So nearly every cell in your body, and almost every living cell on earth, it's highly evolutionarily conserved, has a molecular oscillator with an approximate 24 hour period. In mammals and fruit flies, it's a transcription translation feedback loop. So essentially you have genes that are transcribed, they're then translated into proteins. Those proteins then regulate the transcription of other genes that are part of this feedback loop. And it's a negative feedback loop with time delays because all of these steps take time, and anytime you have negative feedback with time delays, you can sustain oscillations. And so there is this core clock circuit in nearly every cell. And this was a somewhat surprising discovery, the fact that it existed in every cell because...
0:47:20.2 SC: Yeah.
0:47:20.4 RB: One could conjecture, well, you only really need it in the brain and then the brain can control everything else. And so the discovery that there are peripheral clocks was quite exciting. And it raises some interesting questions like, why would you produce a clock in every single cell?
[laughter]
0:47:44.5 RB: And I have some hypotheses about that, one of which is that it's one way to allow processes to be coordinated and orchestrated across a spatially extended system like you or me without direct communication between...
0:48:03.9 SC: Right.
0:48:04.0 RB: The elements of that system. Right? If you've got a watch and I've got a watch, it's enough to say that we're gonna meet to record a podcast. We don't need to be in constant communication up until that point.
0:48:14.0 SC: And when you say oscillator, just, 'cause again, physics training in me, I'm thinking of a pendulum going back and forth, but you are thinking of a little collection of chemical concentrations going up and down. Is that true?
0:48:25.1 RB: Yeah, that's correct.
0:48:26.1 SC: So it's a chemical oscillator, not a mechanical one?
0:48:28.5 RB: Correct. Yeah.
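The claim that negative feedback with a time delay can sustain oscillations in a chemical concentration can be checked in a toy model. The sketch below is a generic delayed-repression equation, dx/dt = beta / (1 + x(t - delay)^n) - gamma * x(t), integrated with a simple Euler scheme. It is not the actual mammalian or fruit fly clock circuit, and every parameter value and function name is an assumption chosen for illustration; time is in arbitrary units rather than hours.

```python
def simulate_delayed_feedback(delay, beta=2.0, n=4, gamma=1.0,
                              dt=0.01, t_end=200.0, x0=0.5):
    """Concentration x(t) of a product that represses its own synthesis
    after a time delay. Returns the list of x values at each time step."""
    steps = round(t_end / dt)
    lag = round(delay / dt)
    x = [x0] * (steps + 1)
    for i in range(steps):
        # The production rate depends on the concentration `delay` time
        # units ago; before t = delay, use the constant initial history x0.
        x_past = x[i - lag] if i >= lag else x0
        production = beta / (1.0 + x_past ** n)   # repression (negative feedback)
        x[i + 1] = x[i] + dt * (production - gamma * x[i])
    return x

def late_amplitude(x):
    """Peak-to-trough range over the last quarter of the simulation."""
    tail = x[len(x) * 3 // 4:]
    return max(tail) - min(tail)

amp_with_delay = late_amplitude(simulate_delayed_feedback(delay=3.0))
amp_no_delay = late_amplitude(simulate_delayed_feedback(delay=0.0))
print("late amplitude with delay:   ", amp_with_delay)
print("late amplitude without delay:", amp_no_delay)
```

With the delay set to zero the concentration settles to a steady value, while with a sufficiently long delay it keeps rising and falling. That is the sense in which the delay, and not only the feedback, is what makes the chemical oscillator tick.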
0:48:29.6 SC: And every cell within our bodies, I know you just said that and you even said that it's amazing, but it's amazing. [laughter]
0:48:35.9 RB: Yeah, very nearly. There was a beautiful, beautiful study done by John Hogenesch's lab a number of years ago, and it was called the Mouse Circadian Atlas. And basically what they did was they took every single tissue in the mouse and they did time series gene expression measurements for each of those tissues. And they discovered genes that were under circadian control in every one of those...
0:49:10.7 SC: Wow.
0:49:10.9 RB: Those tissues and those genes included the core clock genes.
0:49:16.0 SC: Let me home in on this particular question, where does the number 24 hours come from, right? I mean, it must be that this particular oscillator is pretty darn tuned. Someone has to design the clock in some way, and chemistry, biology has figured out how to get this rather long timescale to pop out of some very quick interactions.
0:49:38.7 RB: Right. Yeah. So this is obviously something that has evolved on earth, right? With its 24 hour day. So there's reason to believe that it's favorable to have a clock that would enable you to anticipate time of day, right?
0:49:54.9 SC: Sure.
0:49:56.1 RB: But you raised something really interesting, and that's that it has this pretty robust 24 hour period. What's curious about it is that it has this robust approximate 24 hour period across a very wide range of temperatures. And that's pretty surprising, because we would expect that chemical reactions are going to run faster at higher temperatures. And so the clock should speed up, but it doesn't, it's temperature compensated. This is obviously very useful for a cold-blooded organism, but also for us, if you consider the fact that the temperature on your skin is very different from the temperature in the core of your body.
0:50:39.7 SC: Yeah.
0:50:40.2 RB: It's important to have them be at approximately the same period. How temperature compensation is achieved is still not well understood.
0:50:49.7 SC: Okay. [laughter]
0:50:52.3 RB: And there's another wrinkle too. It's temperature compensated but it's not temperature insensitive. So fluctuations in environmental temperature can shift the phase of the clock. So it's not just completely isolated somehow from thermal information, but the period is compensated.
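The expectation that a chemical clock should speed up when warmed can be put in numbers with the Q10 coefficient, the factor by which a rate changes for a 10 °C rise in temperature. A Q10 of about 2 is a common rule of thumb for biochemical reactions, not a measured value for any clock component, and the near-1 value used for the compensated case is likewise an assumption for illustration. The function name `period_at_temperature` is hypothetical.

```python
def period_at_temperature(period_ref_h, q10, delta_t_c):
    """Period after a temperature change of `delta_t_c` degrees Celsius,
    assuming the underlying rate scales by a factor `q10` per 10 degrees
    and the period is inversely proportional to the rate."""
    return period_ref_h / (q10 ** (delta_t_c / 10.0))

# An uncompensated process with Q10 = 2, warmed by 10 degrees
uncompensated = period_at_temperature(24.0, q10=2.0, delta_t_c=10.0)
# A nearly compensated clock, assumed Q10 = 1.05 for illustration
compensated = period_at_temperature(24.0, q10=1.05, delta_t_c=10.0)
print("uncompensated period (h):", uncompensated)
print("compensated period (h):  ", compensated)
```

In this arithmetic an uncompensated 24 hour cycle would shrink to 12 hours after a 10 °C rise, whereas a clock with Q10 close to 1 barely changes. How cells achieve that is, as RB says, still not well understood.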
0:51:15.9 SC: It's very similar in mechanical wristwatches, right? It's very hard to make a watch that keeps the same amount of time no matter what temperature it is or other conditions.
0:51:25.3 RB: Absolutely. And in fact that was, when Harrison was designing his chronometers for the British Royal Navy that was, achieving temperature compensation was a major challenge, right?
0:51:37.5 SC: Yeah.
0:51:37.7 RB: Because if the clock onboard your ship starts running faster or slower, you're gonna go off course.
[laughter]
0:51:44.5 SC: And then how does this...
0:51:45.5 RB: Could be disastrous.
0:51:46.2 SC: Does this little chemical oscillator in every one of our cells communicate with the rest of the world for, one... Well, actually, let me back up. Cells are pretty tiny [laughter] how is there room in a cell for this? And then how does it then connect to the rest of what's going on in the cell and outside?
0:52:03.1 RB: I mean, cells are pretty tiny but molecules are more tiny.
0:52:06.6 SC: Also tiny.
0:52:07.0 RB: Right? So the cell is a pretty densely packed environment. It's big enough to allow for the expression of the core clock components.
0:52:19.2 SC: Sorry. Does the clock literally have a size or is it sort of diffuse?
0:52:23.6 RB: So these are chemical reactions, right?
0:52:26.2 SC: Yeah.
0:52:26.2 RB: It's a chemical oscillator, so we expect that those things are going to be diffusing and binding and unbinding, et cetera.
0:52:33.2 SC: Good. Okay. And then how does it communicate elsewhere?
0:52:38.2 RB: Okay, that's a really good question. I don't have an answer. I don't think we have an answer. So there are certain signaling molecules that we know exist. So melatonin, for example, is produced by the brain a couple of hours before you go to sleep. It serves as a signal to other cells, and so that's one form of that communication. There's obviously the nervous system, but the coupling between the various oscillators in the various cells, we don't necessarily know what achieves that. There was some really interesting work from Karyn Esser's group in Florida, where they looked at phase resetting in mouse muscle tissue when they changed when the mice were allowed to exercise. So the mice will run on a wheel given the opportunity. And so basically they deprived the mice of the wheel at particular times of day to see what that would do to the circadian phase of the muscle tissue.
0:53:49.8 SC: To give them jet lag, basically. [laughter]
0:53:52.5 RB: But nothing else was jet-lagged.
0:53:54.1 SC: Okay, just...
0:53:54.5 RB: So the light cues were the same, feeding was the same. The only thing that they perturbed was when the mice could exercise. And what they discovered was that exercise was a very powerful zeitgeber, a very powerful phase resetting cue for the muscle clock. And so this raises the question of to what degree is it actually the physical coupling between cells that might mediate the cell coupling of the oscillators? I do not know. I think those experiments still need to be done.
0:54:32.6 SC: That's so much fun. It's so easy in biology, just to bump up against fundamental things that are easy to ask and we don't know the answer to. [laughter] So the oscillators are keeping track. It sounds like a fairly simple thing. An oscillator is literally the simplest possible physics system. But before we were talking about these very complicated chemical networks, is there a connection there or is the oscillator part of the network, or is an oscillator secretly much more complicated than I think it is?
0:55:05.8 RB: So the oscillator is part of the network in the sense that there are many genes that are under circadian control. So those core clock components are transcription factors that then regulate the expression of hundreds, maybe even thousands of genes downstream. So that Mouse Circadian Atlas paper that I mentioned from Hogenesch's group earlier, what they found was that nearly half of the mouse genome was under circadian control in some tissue. So it wasn't nearly half in any given specific tissue, but if you look across the mouse as a whole, nearly half. And so that suggests that the clock is playing a significant role in a variety of biological processes. What we've seen in our research in collaboration with Ravi Allada's group here at Northwestern is that the clock is crucial to metabolism and to lifespan and to egg laying in fruit flies. So you can do these experiments that Ravi has done where you take a fruit fly and you subject it to lower temperatures or to dietary restriction. So you deprive it of some calories without starving it, and a normal fly when subjected to these slightly unfavorable conditions actually does something a little bit unexpected. It lives longer.
[chuckle]
0:56:43.0 RB: So dietary restriction usually results in lifespan elongation.
0:56:48.1 SC: Yes.
0:56:48.7 RB: Lower temperature also results in lifespan elongation in the fruit fly. And you can come up with all sorts of theories and just-so stories about why that might be, like total energy consumed over the lifespan or something like that. It turns out that in order to observe that lifespan elongation, the fruit fly needs an intact clock. If you knock out one of the core clock components by mutating a gene helpfully known as Clock, you get an arrhythmic fruit fly. And that fruit fly has the same approximate lifespan regardless of the calories that it has available to it. And so it seems that there's not only hundreds, possibly thousands, of genes that are under circadian control, but which genes are under circadian control has a big influence on these macroscopic outcomes that we're able to observe.
0:57:54.5 SC: You know, one of the very first podcasts I did was with Coleen Murphy of Princeton who studies aging. And it's interesting like, yeah, you knock out some genes things like that. Our life expectancy changes or not ours but, you know, they're using nematodes. Does this kind of thing open up prospects for dramatically changing human lifespans if we were to understand these networks better? Go ahead. It's late in the podcast you're allowed to speculate a little bit.
[laughter]
0:58:25.3 RB: I mean, I would love to say yes. I think that that's a hard question to answer yes or no to. I will say this, something that has long been observed is that there are sleep and circadian disruptions that are associated with neurodegeneration and circadian patterns to symptoms in neurodegenerative diseases like Alzheimer's. And so one of the things that we are looking at now is what are the patterns of circadian gene expression that are associated with healthy aging versus people who have preclinical and clinical Alzheimer's disease. And so hopefully that research will give us some insights as to the links between circadian rhythms and at least healthy aging. I don't know about lifespan that much.
0:59:21.0 SC: Okay. Fair enough. I would like to be healthy for a long time too, that's also a goal. So let me dramatically oversimplify here so that you can tell me if I've forgotten the gist of it. The circadian rhythm we all know that we get tired at a certain time of day, we get hungry at a certain time of day, jet lag affects us, et cetera. But the impression I'm getting from what you're saying is that in some sense like all or many some large fraction of our internal biological processes are just different at different times of the day. They're being regulated differently. Like goes back to what you said how we metabolize food is different during different times of the day and I'm sure that when we're asleep versus when we're awake. So we're kind of like we're... Every human being, every organism is a complicated chemical reaction but we're different chemical reactions depending on what time it is.
1:00:13.0 RB: Yeah. And that's precisely why when we were talking earlier about what makes biological networks special in some way, I mentioned that these networks aren't necessarily static over time. Which components of those networks might be expressed can differ over the course of the day. And so the behavior of those networks might differ over the course of the day.
1:00:35.6 SC: So it's kind of like... I'm just coming up with trying simple analogies here. Traffic flow patterns in a city are very different at 8:00 AM and 5:00 PM.
1:00:44.9 RB: Absolutely. Yeah.
1:00:46.9 SC: And it's all... And this gives one hope that there really is something called complex systems science because these systems do have some commonalities in there or is that going too far? Are there laws of complexity? Is the fact that a system is complex, whether it's a city or a human being give us any shared insight there or do complex systems have the features that every complex system is different, so, you know, have fun but don't expect to share insight between different modes.
1:01:16.1 RB: I think that the insight that can be shared across modes is the approach that we use to investigate them. So one thing that is in common amongst all of these complex systems is that a reductionist point of view, where the elements that you are reducing to are defined at the very outset, like, here are my cars, here are my genes, going to look at specific elements, is maybe not fruitful. You need to consider the behavior of the system as a whole and coarse-grain it. So, I think that those types of insights are the same whether you're talking about traffic flow through a city or information flow through a cell.
1:02:01.5 SC: And in biology in particular, there's a temptation to be a little bit too reductionist. I mean, not as a professional biologist but in the popular discourse about biology, oh we found the gene that explains x.
1:02:16.3 RB: I think it's a temptation just because of the way that we're kind of taught to think about these things. But I think there's also a reason that we tend to go in that direction and that's, that we know how to do experiments on particular genes. We know how to target particular molecules. When I go back to my experimentalist collaborators and I say, good news, I've found a pathway that is associated with a phenotype you're interested in and this pathway contains 600 genes they tell me to get out of their lab, right, like there's nothing they can do with that. You need to be able to narrow it down to something that is targetable. And I think that there's a tension there between acknowledging that what is driving the outcomes of a particular cell or a particular organism is a systems level property versus what specific element of that system can I target to produce an effect that I want to see.
1:03:20.5 SC: Good. So...
1:03:21.6 RB: And... Yeah.
1:03:23.2 SC: So maybe in other words part of what you've emphasized is this sort of coarse-graining looking for a low dimensional manifold, looking for a small number of collective coordinates that help explain what's going on, and then the next thing is to be interventionist about it, to say, okay, you know, how do you poke at it to get some outcome that you desire? And is that... I mean, it... I'm not quite sure whether that's supposed to be just a cautionary tale, don't expect too much or, oh, yes. That's what we're figuring out. We're gonna figure out exactly how to poke at things and make us live forever and cure cancer. I don't wanna...
[laughter]
1:04:02.9 SC: Put words in your mouth.
1:04:03.4 RB: I mean, yes. No, I think that there is hope there. One of the things that we're really interested in now is... So it... Suppose that you have some sort of wiring diagram or some sort of network and you have measured the abundances of the mRNAs for the nodes in that network, or you've measured the correlation between the gene expression for all of the edges in your network.
1:04:31.1 RB: And so now you can use that network picture to start to make statistical summaries of the network. So I can coarse-grain that data on that network and I can do that sort of coarse-graining with different data sets. So you can imagine I do this coarse-graining with tumor and I do it with normal samples. And so now I can identify statistical differences at the whole network level between my cancer samples and my normal samples. And then one can ask the following question. So those coarse-grained statistical differences are obviously due to local differences in particular nodes, in particular edges. I could recover the statistical properties of the normal network by changing all of the altered nodes and edges that were altered in my cancer sample, but also there may be other ways to do so.
1:05:32.9 RB: So one can ask what is the minimal number of edges that I would need to alter in my cancer sample in order to recover the statistical properties of the normal sample, but without recovering that network in detail. And now I have maybe a smaller number of nodes and edges that I can then go and target experimentally. So I think there are ways to bridge that gap. I think also that one can ask a slightly different question, which is, is recovering the statistical, those coarse-grained statistical properties enough? And those statistical properties of the graph, I can look at the entire eigenspectrum of the graph Laplacian.
1:06:23.9 RB: Chances are that entire eigenspectrum does not actually capture all the biological information of consequence. Probably it's just the leading eigenvalues. So how much of those leading eigenvalues would I need to recover in order to recover the function of the network?
1:06:44.1 SC: I know it's getting late, but I will ask whether or not you think it's possible to explain to the audience what in the world you're talking about, about eigenvalues of a graph and looking at the leading, the eigenspectrum of a graph and looking at the leading eigenvalues. I think I know, but I don't want to presume.
1:06:58.9 RB: Yes.
1:07:00.8 SC: And maybe the answer is they should read a linear algebra book. I'm not sure.
1:07:04.4 RB: [laughter] No, so it's a way to describe the coarse-grained geometry of the graph. So the first eigenvalue and its associated eigenvector describes the smoothest pattern of variation across the network. So if you have a network with two large communities that are only slightly, very tenuously connected to one another, then that smooth vector on the graph would have one community having very high values, the other one having low values. And then you can imagine increasingly less smooth patterns on the graph. And chances are, at least I would say, that those coarse geometric descriptions of the graph probably represent the flow of information in some coarse-grained way that is biologically meaningful. So maybe we only need to look at the first few, the smoothest patterns on the graph, in order to understand what the graph itself is doing.
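For readers without the visual aids, the two-community picture can be reproduced in a few lines. This is a minimal sketch on a made-up ten-node graph, not a biological network or code from Braun's group; the helper `two_community_graph` is hypothetical. It uses the unnormalized graph Laplacian L = D - A, whose smallest eigenvalue is always zero with a constant eigenvector, so the smoothest non-constant pattern is the eigenvector of the second-smallest eigenvalue (often called the Fiedler vector).

```python
import numpy as np

def two_community_graph(size=5):
    """Adjacency matrix of two fully connected groups of `size` nodes
    joined by a single bridging edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0              # first community, all-to-all
    A[size:, size:] = 1.0              # second community, all-to-all
    np.fill_diagonal(A, 0.0)           # no self-loops
    A[size - 1, size] = A[size, size - 1] = 1.0   # one tenuous connection
    return A

A = two_community_graph()
L = np.diag(A.sum(axis=1)) - A         # graph Laplacian: degree matrix minus adjacency
eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = eigvecs[:, 1]                # smoothest non-constant pattern on the graph

print("smallest eigenvalues:", np.round(eigvals[:3], 3))
print("signs of the smooth pattern:", np.sign(fiedler).astype(int))
```

The smooth pattern takes one sign on the first community and the opposite sign on the second, so a single vector already captures the coarse geometry of the graph. Eigenvectors further up the spectrum vary within each community and encode progressively finer detail.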
1:08:13.5 SC: Good. We did it. I'm very...
1:08:15.5 RB: So hard to do this without the visual aids. [laughter]
1:08:17.7 SC: Very impressive. Yeah, I know it is hard, but I mean, it's basically... In some sort of overly literal sense, the graph is every single node and every single edge between them, but what you're saying is that there are these coarse features of what the graph looks like that you can mathematically characterize, and maybe that's all you need to know without all the details being relevant.
1:08:40.0 RB: Yeah.
1:08:41.0 SC: So my last question, and this might be a completely unfair one, you can be honest with me here. When we were talking about how to focus the conversation before we started, I brought up the phrase rules of life, because it appeared on your web page or something like that.
1:08:58.5 RB: Yeah.
1:08:58.5 SC: And your response was, that's not my phrase, that's the National Science Foundation's phrase. [laughter] And so, is it, which is fine, we all need to be nice to the National Science Foundation.
[laughter]
1:09:08.0 RB: Indeed.
1:09:08.5 SC: And... But is it just a kind of sexy thing that we talk about rules of life so that we can get funding or is the concept of rules of life a reasonable one to help us contemplate future research directions?
1:09:26.5 RB: No, I think it is a useful concept. So the way that NSF thinks about it and the way that I think about it as well is we need some way to map between these microscopic things we are able to measure, all of that omics data, and the macroscopic things that we observe, like whether or not a cell is cancerous. Or whether an organism is a mouse or a frog. And so the idea behind rules of life is that there are some constraints or some organizing principles that enable you to go from that microscopic description to that macroscopic description, and the problem is discovering what those are. So what exactly dictates the relationship between the microscopic and the macroscopic and what constrains it? So for me, when I am thinking about, well, I want to discover this low-dimensional manifold that, on which my observations lie. One of the questions that one can then ask is, well, why are things constrained to that manifold? And you can think of that as being one of those rules of life.
1:10:43.0 SC: It'll never quite be laws of physics, but it will give us some insight that we can tell ourselves about.
1:10:47.5 RB: Yes.
1:10:49.2 SC: That's great. What more can we ask for than that? Rosemary Braun, thanks very much for being on the Mindscape Podcast.
1:10:53.5 RB: Thank you so much for having me. It's been fun.
[music]
If the pathways driving circadian cycles of diverse animals are homologous they must have evolved when the day length was about 20 hours long. They must be sufficiently plastic that all the branches of life can independently adapt to a longer day.
A recently published paper includes a graph of day length through geologic time:
Mitchell, R.N., Kirscher, U. Mid-Proterozoic day length stalled by tidal resonance. Nat. Geosci. 16, 567–569 (2023). https://doi.org/10.1038/s41561-023-01202-6
I listened to just over thirty minutes of this conversation. Professor Braun’s discussion about the definition of “gene,” beginning at about 16:22 started off acceptably but then went awry by confusing this issue with the nature of specificity in interactions among biomolecules. It is not news that proteins are not absolutely specific or that they are frequently associated with multiple functions. These points have been clear for years if not decades. However, that does not mean that any protein can bind any target at reasonable receptor and ligand concentrations. In any case, none of this is especially germane to clarifying the multiple meanings of “gene.”
Dr. Braun also failed to adequately explain most of the reasons that defining “gene” is not straightforward. Over the course of the history of genetic research, genes have been identified by associations with inheritance of defined phenotypes, units of mutation, units of recombination, entities that play defined roles in ontogeny (i.e., development from embryo to adult), and DNA sequences (RNA sequences for some viruses) of defined structure with specified sections that determine the structures of gene products (proteins or RNA molecules that perform cellular functions) or that regulate transcription. Beyond these different attributes of genes, more recent studies have revealed numerous phenomena that complicate the relationships between transcriptional units and functional gene products (e.g., fusion of transcripts from different transcriptional units or “encrypted” genes for which portions are non-contiguous in the genome).
Here are links to two informative sources on the topic of why “gene” has multiple senses and has become more challenging to characterize simply:
—-
https://pubmed.ncbi.nlm.nih.gov/28360126/
https://plato.stanford.edu/entries/gene/#PlurGeneConcContBiol
—-
I would emphasize that “gene” is like “life,” “species,” “epitope,” “inflammation,” and numerous other fundamental biological and biomedical terms that have multiple useful and context-sensitive definitions (what I believe could reasonably be called the semantic multiverse). Therefore, in any discussion it is wise to clarify which definition is being used or if it is being used differently than previously.
There is a term that has been in use among philosophers of biology for years to describe categories like those encompassed by “gene” or “life.” It is: “polythetic.” I have elsewhere explained (see link below; 2009) that such categories are to be expected in biology given the evolutionary origins of organisms and the components of organisms, a point that the host seems to acknowledge at 19:00.
—
https://evmedreview.com/boundaries-of-categories-categories-of-boundaries-and-evolution/
—
At about 20:50 Professor Braun states that microRNAs are 6-8 nucleotides (nt). This assertion is simply wrong. They are generally described as 22 nt (or 21-23 nt). A recent review (see citation directly below) confirms the point about microRNA size. In some cases, a miRNA molecule binds to the 3’-UTR of a target mRNA using 6-8 nucleotides.
Shang, R., Lee, S., Senavirathne, G. et al. microRNAs in action: biogenesis, function and regulation. Nat Rev Genet (2023). https://doi.org/10.1038/s41576-023-00611-y.