General relativity, Einstein’s theory of gravity and spacetime, has been pretty successful over the years. It’s passed numerous tests in the Solar System, scored a Nobel-worthy victory with the binary pulsar, and gets the right answer even when extrapolated back to one second after the Big Bang. But no scientific theory is sacred. Even though GR is both aesthetically compelling and an unquestioned empirical success, it’s our job as scientists to keep probing it in different ways. Especially when it comes to astrophysics, where we need dark matter and dark energy to explain what we see, it makes sense to put Einstein to the most stringent tests we can devise.
So here is a new such test, courtesy of Rachel Bean of Cornell. She combines a suite of cosmological data, especially measurements of weak gravitational lensing from the Hubble Space Telescope, to see whether GR correctly describes the behavior of large-scale structure in the universe. And the surprising thing is — it doesn’t. At the 98% confidence level, Rachel finds that general relativity is inconsistent with the data. I’m not sure why we haven’t been reading about this in the science media or even on other blogs — it’s certainly a newsworthy result. Admittedly, the smart money is still that there is some tricky thing that hasn’t yet been noticed and Einstein will eventually come through the victor, but this is serious work by a respected cosmologist. Either the result is wrong, and we should be working hard to find out why, or it’s right, and we’re on the cusp of a revolution.
Here is the abstract:
A weak lensing detection of a deviation from General Relativity on cosmic scales
Authors: Rachel Bean
Abstract: We consider evidence for deviations from General Relativity (GR) in the growth of large scale structure, using two parameters, γ and η, to quantify the modification. We consider the Integrated Sachs-Wolfe effect (ISW) in the WMAP Cosmic Microwave Background data, the cross-correlation between the ISW and galaxy distributions from 2MASS and SDSS surveys, and the weak lensing shear field from the Hubble Space Telescope’s COSMOS survey along with measurements of the cosmic expansion history. We find current data, driven by the COSMOS weak lensing measurements, disfavors GR on cosmic scales, preferring η < 1 at 1 < z < 2 at the 98% significance level.
Let’s see if we can’t unpack the basic idea. The real problem in testing GR in cosmology is that any particular kind of spacetime curvature can be a solution to Einstein’s theory — all you need are the right sources of matter and energy. So in order to do a real test, you need to have some confidence that you understand what is creating the gravitational field — in the Solar System it’s the Sun and planets, in the binary pulsar it’s two neutron stars, and in the early universe it’s radiation. For large-scale structure things are a bit less clear — there’s ordinary matter, and dark matter, and of course dark energy.
Nevertheless, even though there are some things we don’t know about dark matter and dark energy, there are some things we think we do know. One of those things is that they don’t create any “anisotropic stress” — basically, a force that pulls different sides of things in different directions. Given that extremely reasonable assumption, GR makes a powerful prediction: there is a certain amount of curvature associated with space, and a certain amount of curvature associated with time, and those two things should be equal. (The space-space and time-time potentials φ and ψ of Newtonian gauge, for you experts.) The curvature of space tells you how meter sticks are distorted relative to each other as they move from place to place, while the curvature of time tells you how clocks at different locations seem to run at different rates. The prediction that they are equal is testable: you can try to measure both forms of curvature and divide one by the other. The parameter η in the abstract is the ratio of the space curvature to the time curvature; if GR is right, the answer should be one.
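For the experts just mentioned, here is the standard bookkeeping written schematically (sign and symbol conventions vary from paper to paper, so treat this as an illustration of the idea rather than as Bean’s exact definitions). The perturbed metric in Newtonian gauge is

$$ ds^2 = -(1 + 2\psi)\,dt^2 + a^2(t)\,(1 - 2\phi)\,d\vec{x}^2 , $$

where ψ plays the role of the curvature of time and φ the curvature of space. The parameter quoted in the abstract is the ratio

$$ \eta \equiv \frac{\phi}{\psi} , $$

and in GR, with no anisotropic stress, the Einstein equations force φ = ψ, so η = 1.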
There is a straightforward way, in principle, to measure these two types of curvature. A slowly-moving object (like a planet moving around the Sun) is influenced by the curvature of time, but not by the curvature of space. (That sounds backwards, but keep in mind that “slowly-moving” is equivalent to “moves more through time than through space,” so the curvature of time is more important.) But light, which moves as fast as anything can, is pushed around equally by the two types of curvature. So all you have to do is, for example, compare the gravitational field felt by slowly-moving objects to that felt by a passing light ray. GR predicts that they should, in a well-defined sense, be the same.
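Schematically, and again glossing over convention-dependent factors, the two probes respond to different combinations of the potentials:

$$ \ddot{\vec{x}} \simeq -\vec{\nabla}\psi \quad \text{(slowly-moving bodies)}, \qquad \text{light deflection} \propto \vec{\nabla}(\phi + \psi). $$

So a mass estimated from the motions of stars or galaxies probes ψ, while a mass estimated from gravitational lensing probes φ + ψ; if GR holds the two estimates agree, and if η ≠ 1 they do not.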
We’ve done this in the Solar System, of course, and everything is fine. But it’s always possible that some deviation from Einstein shows up at much larger distance and weaker gravitational fields than we have access to in our local neighborhood. That’s basically what Rachel’s paper does, considering different measures of the statistical properties of large-scale structure and comparing them to the predictions of a phenomenological model of the gravitational field. A crucial role is played by gravitational lensing, since that’s where the deflection of light comes in.
And here is the answer: the likelihood, given the data, for different values of 1/η, the ratio of the time curvature to the space curvature. The GR prediction is at 1, but the data show a pronounced peak between 3 and 4, and strongly disfavor the GR prediction. If both the data and the analysis are okay, there would be less than a 2% chance of obtaining a result this discrepant if GR were correct. Not as good as 0.01%, but still pretty good.
So what are we supposed to make of this? Don’t get me wrong: I’m not ready to bet against Einstein, at least not yet. Mostly my pro-Einstein prejudice comes from long experience trying to come up with alternative theories of gravity that are simultaneously logically sensible and observationally consistent; it’s just very hard to do. But more generally, good scientists naturally have a strong suspicion of any claimed observational result that purports to overthrow an extremely well-established theory. That’s just common sense, not hidebound establishmentarianism; most such anomalies eventually go away.
But that doesn’t mean that you ignore anomalies; you just treat them with caution. In this case, there could be an unrecognized systematic error in the data set, or a subtle error in the analysis. Given 1:1 odds, that’s certainly where the smart money would bet right now. It’s also possible that the fault lies with dark matter or dark energy, not with gravity — but it’s hard to see how that could work, to be honest. Happily, it’s an empirical question — more data and more analysis will either reinforce the result, or make it go away. After all, some anomalies turn out to be frighteningly real. This one is worth taking seriously, to say the least.
The paper says that most of the power of the result comes from COSMOS 1 < z < 2 data, while the statistics come from COSMOMC runs which include the COSMOS data (through a library written by J. Lesgourgues).
Given that, is the basic point of the paper that allowing eta to vary produces COSMOMC runs which fit the COSMOS data better than regular eta=1 runs?
Okay, so 98% isn’t great, but assuming this indication holds up in further analysis, it suggests to me there’s information in the lensing data that’s hard to come by with LambdaCDM, while the other data sets aren’t so sensitive to it.
Incidentally, since there’s some lensing folks around, a question I think I asked at this blog previously: is it true you’re assuming rho is positive definite for the data analysis?
I think people have pointed out for quite a while, though less quantitatively, that LambdaCDM doesn’t work very well with large-scale structure; see e.g. http://arxiv.org/abs/0811.4684
I’m not a cosmologist, so for me it’s hard to tell how seriously to take these “puzzles” (esp. the one with the voids). I’ve talked to several people in the field, and they commonly think it’s a lack of understanding of astrophysical effects, or some think it’s a numerical weakness, and that given more effort the data would fit the model.
Well, this sure is stunning if it pans out. No one here seems to have asked: what other tests would show such an asymmetry between space and time curvature? My impression is that this would show up in simple things like radar delay tests in the solar system. And whither the equivalence principle? But AFAIK they all work OK.
Bee –
Skimming off the rails here, but definitely don’t take the “void” issue seriously. When we’ve tried to carefully measure the mass distribution of dwarf galaxies in empty regions using homogeneous surveys, we’ve found no discrepancy with CDM, certainly not a statistically significant one.
I would like to see some more of the fitted parameters published. What are the joint confidence intervals of all six fitted parameters (note: not the marginal intervals)? A marginalized statistic at the 98% confidence level may actually be in a volume of the joint parameter space that does not have as much statistical significance.
Also did they fit the (six total) parameters of the three Gaussian nuisance variables or did they marginalize over all possible values of the parameters, and in that case what was the prior?
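To make the marginal-versus-joint worry concrete, here is a toy sketch in Python using a completely made-up six-parameter Gaussian posterior (nothing below uses Bean’s actual chains or numbers): a single parameter can sit 2.5 posterior sigma from its GR value in its marginal distribution while the full GR point is still comfortably inside the joint 98% region.

```python
import numpy as np
from scipy import stats

# Illustrative (made-up) Gaussian posterior over 6 parameters: unit variances,
# no correlations, with the first parameter (standing in for 1/eta) offset
# from its "GR" value of 0 by 2.5 posterior sigma.
dim = 6
mean = np.zeros(dim)
mean[0] = 2.5
cov = np.eye(dim)

# Marginal 98% interval for the 1/eta-like parameter: it excludes the GR value.
lo, hi = stats.norm.interval(0.98, loc=mean[0], scale=np.sqrt(cov[0, 0]))
print(f"marginal 98% interval: [{lo:.2f}, {hi:.2f}], excludes 0: {not lo <= 0 <= hi}")

# Joint test: the Mahalanobis distance of the GR point (all parameters at 0)
# follows a chi^2 distribution with 6 degrees of freedom under this posterior.
gr_point = np.zeros(dim)
d2 = (gr_point - mean) @ np.linalg.inv(cov) @ (gr_point - mean)
print(f"posterior fraction farther from the mean than GR: {stats.chi2.sf(d2, df=dim):.2f}")
```

With these illustrative numbers the marginal interval is roughly [0.2, 4.8] and excludes zero, yet about 40% of the joint posterior lies farther from the mean than the GR point, so the GR point is nowhere near the joint 98% boundary. That is exactly why the joint intervals would be useful to see.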
Sorry one more point of note on the statistical techniques:
A properly configured MCMC is driven by a Central Limit Theorem. So regardless of whether the hypothesized distributions contain the correct sampling distribution of the data, the MCMC process will converge in distribution to a single answer, as long as the first two moments of the physical data are finite. This single answer may very well be wrong if you have chosen the wrong family of hypothetical distributions.
@18 (PTM) That sounds a damned clever idea, and I bet it’s the explanation (assuming their calculations haven’t already taken these ratios into account).
Bee – the assumption that there’s no such thing as negative mass (and thus negative rho) is fairly common, but usually doesn’t apply to weak lensing studies. This is because one measures shear (in the weak lensing limit at least), which is one set of second derivatives of the surface potential, and rho is related to the convergence, which is another set of second derivatives of the surface potential.
So to get to rho, you effectively combine various derivatives of the shear to get derivatives of rho, then integrate, leaving an unknown constant of integration. This constant is basically the mean rho at the edge of your data field, so you’re only measuring mass fluctuations relative to some mean level.
As weak lensing is also very noisy, you often see large negative signals in the density maps (much larger than any reasonable value for the mean rho at the edge of your data), which are normally interpreted as noise fluctuations caused by the intrinsic ellipticity of the background galaxies rather than an actual region of negative mass. Estimating the noise by various tricks (such as rotating all of your galaxy ellipticities by 45 degrees and redoing the measurement) usually agrees that the large negative regions are most likely noise.
Cosmic shear is usually measured using 2 point correlation functions, which are combined to give various statistics. Most of the statistics wind up being effectively compensated filters, designed to give 0 signal in a region of flat density distribution regardless of what the actual value of the density is (again, in the weak limit).
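For readers following the lensing details, the relations Doug is describing can be written schematically as follows, using $\psi_{\rm lens}$ for the two-dimensional lensing potential so as not to confuse it with the metric potential above. The convergence $\kappa$ (which traces the projected density) and the two shear components are all second derivatives of the same potential:

$$ \kappa = \tfrac{1}{2}\left(\partial_1^2 + \partial_2^2\right)\psi_{\rm lens}, \qquad \gamma_1 = \tfrac{1}{2}\left(\partial_1^2 - \partial_2^2\right)\psi_{\rm lens}, \qquad \gamma_2 = \partial_1 \partial_2\,\psi_{\rm lens} , $$

so in Fourier space the convergence can be reconstructed from the measured shear,

$$ \hat{\kappa}(\vec{\ell}) = \frac{\left(\ell_1^2 - \ell_2^2\right)\hat{\gamma}_1 + 2\,\ell_1 \ell_2\,\hat{\gamma}_2}{\ell_1^2 + \ell_2^2} , $$

except at ℓ = 0; that missing mode is precisely the unknown constant of integration (the mean density level) mentioned above. The 45-degree rotation trick works because lensing produces only the gradient-type (E-mode) pattern, so the rotated (B-mode) signal should be pure noise.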
Okay, I have to interrupt again to give a simple example of the dangerous waters of statistical significance, and to show that MCMC is not a silver bullet.
Consider sampling data from a simple mono-exponential distribution, then fitting the data to a bi-exponential distribution using MCMC. This will yield a posterior on the three parameters of the bi-exponential distribution. But the posterior won’t be maximized around a point where the parameters of one of the exponentials are zero (as would be hoped for in the logic of the analysis of the cited paper). Rather, the mean and variance of the fitted bi-exponential distribution will match those of the mono-exponential distribution, and worse, you can drive the significance arbitrarily close to 1 by adding more sample data.
This is a general property that can be proven for all cases where a sufficient statistic of the hypothesized distributions has a well defined Fisher Information Matrix in the sampling distribution. It is a significant danger when adding more parameters to a model in the hopes of statistically testing their triviality.
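Here is a minimal numerical sketch of that degeneracy (my own toy, assuming nothing about the actual analysis in the paper): draw data from a single exponential and sample the posterior of a two-component exponential mixture with a plain Metropolis walker. Because any mixture weight fits equally well once the two rates coincide, the marginal posterior of the weight comes out broad rather than piled up at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data actually drawn from a single exponential with rate 1.
data = rng.exponential(scale=1.0, size=500)

def log_post(theta):
    """Log posterior for a two-component exponential mixture.

    Parameters: mixture weight w in [0, 1] and the log-rates of the two
    components, with flat priors on a broad box.
    """
    w, log_l1, log_l2 = theta
    if not 0.0 <= w <= 1.0 or abs(log_l1) > 5.0 or abs(log_l2) > 5.0:
        return -np.inf
    l1, l2 = np.exp(log_l1), np.exp(log_l2)
    like = w * l1 * np.exp(-l1 * data) + (1.0 - w) * l2 * np.exp(-l2 * data)
    return np.sum(np.log(like))

# Plain Metropolis sampler (deliberately simple: no tuning, no convergence checks).
theta = np.array([0.5, 0.0, 0.0])
lp = log_post(theta)
chain = []
for _ in range(20000):
    proposal = theta + rng.normal(scale=[0.05, 0.1, 0.1])
    lp_prop = log_post(proposal)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = proposal, lp_prop
    chain.append(theta.copy())
samples = np.array(chain[5000:])  # drop burn-in

w = samples[:, 0]
print("posterior mean of the mixture weight:", round(w.mean(), 2))
print("posterior mass with weight below 0.05:", round(np.mean(w < 0.05), 2))
```

The point is not the specific numbers but the shape: the weight of the “extra” component is poorly constrained even though the simpler model generated the data, which is the trap one has to rule out before reading a nonzero modification parameter as evidence against the simpler theory.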
Blanton,
Thanks! Is there a reference you could point me to?
Doug,
Yes, that’s exactly what I was referring to. So what happens to the “noise” in the further analysis of the data? Is it weeded out? “Estimating the noise by various tricks … usually agrees that the large negative regions are most likely noise.” sounds reasonable at first glance, but suspiciously like a case of confirmation bias at second glance.
Best,
B.
How might a mere plebeian access the body of the paper, rather than just a header or précis?
Click “PDF” under “Download” up in the right corner at http://arxiv.org/abs/0909.3853
I’m looking at this from a writer’s perspective. I know this is speculative, I get that data sets affect results, and I understand that GR weighs in as the theory with best odds of being right.
However, if the curvature of space and time might be unequal, what effect would this variance have on the universe? What part of this effect might intersect with humans?
Sean,
In the framework of your effort to unpack the basic idea, is it possible to suggest that Rachel Bean’s result corresponds to something like Horava gravity, in the sense that time and space scale differently?
Can we read off from her result a different dynamical exponent for time and space?
Another challenge to general relativity
It is worth starting by remarking that binary pulsar tests also challenge general relativity. First, the same tests are also passed by alternative theories like nonlinear field theory and the recent relational theory (2008: Grav. and Cosm. 14(1), 41–52).
Second, recent works —also presented in conference PPC-08— point out the possibility that the discrepancy of 0.85% between the general relativity prediction and observation can be explained by a nonlinear field theory, which predicts extra radiation of 0.735% thanks to novel radiation mechanisms not available in general relativity.
I do not want to discuss whether GR is aesthetically compelling or not. Some people think GR is the most beautiful of the theories of physics. Others strongly disagree and consider the nongeometrical formulations more beautiful; particle physicists such as Feynman and Weinberg have stated their preference for a nongeometrical formulation of gravity. I agree that the nongeometrical formulations are more beautiful. But the important question is: are they more useful as well?
This important question has been addressed in a report presented a few days ago, which rigorously analyzes the geometrical formulation of general relativity and compares it, as has never been done before, with another five theories of gravity from mainstream journals. The results are somewhat surprising: (i) the geometrical formulation performs worse than the nongeometrical ones (the myth of the equivalence of the two formulations is exposed), and (ii) the deficiencies of the geometrical formulation are the cause of some observational discrepancies (there is a section in the report specifically devoted to cosmological discrepancies).
The implications for quantum gravity are deep. Beyond the profound cultural divide between the relativity and particle physics communities in dealing with spacetime, this report argues that field-theoretic approaches to gravity over a flat background are more correct than attempts like loop quantum gravity, which is deeply rooted in the geometrical language of general relativity.
It is not strange that experts such as M. Pavsic (author of the book “The Landscape of Theoretical Physics: A Global View; From Point Particles to the Brane World and Beyond, in Search of a Unifying Principle”) have praised this work, as reported in the news:
http://www.canonicalscience.org/en/publicationzone/canonicalsciencereports/20092.html
http://www.geskka.com/articles/categories/Space-science/
http://digg.com/d316f5a
Sean writes:
“But more generally, good scientists naturally have a strong suspicion of any claimed observational result that purports to overthrow an extremely well-established theory. That’s just common sense, not hidebound establishmentarianism; most such anomalies eventually go away.”
What am I missing? Why am I alone in considering the flat rotation curves of galaxies to represent a serious anomaly? If dark matter is detected by some means other than gravitationally, then the anomaly of flat rotation curves should rightfully “go away”.
I believe that a multi-team, multi-million-dollar effort has been underway at least since 1988 looking for non-gravitational evidence of dark matter. Even though the dark energy concept has only been around for ten years, it raises more serious theoretical difficulties than the dark matter concept does.
So when do good scientists start to consider the fact of flat rotation curves and the fact of cosmic acceleration a serious anomaly?
I have a hard time thinking that scientists are really “good scientists” if they do not hold the view, at this late stage, that the need for dark matter and dark energy represents, in some sense, a serious anomaly.
“Second, recent works —also presented in conference PPC-08— point out the possibility that the discrepancy of 0.85% between the general relativity prediction and observation can be explained by a nonlinear field theory, which predicts extra radiation of 0.735% thanks to novel radiation mechanisms not available in general relativity.”
This assertion has no support in the literature.
Plus I greatly enjoy your presence here given your opinions about Carroll.
To Eric Gisse,
Your ill-informed assertions about PPC-08 and your ad hominems were already replied to in sci.physics.research, until the moderators there blocked you from posting further in the thread:
http://groups.google.com/group/sci.physics.research/msg/0bca9684bc5e0a3e
http://groups.google.com/group/sci.physics.research/msg/bee924193a0a5b68
http://groups.google.com/group/sci.physics.research/msg/b6c4269a42ddea97
http://groups.google.com/group/sci.physics.research/msg/b2384376b4c5b02c
(…)