A New Challenge to Einstein?

General relativity, Einstein’s theory of gravity and spacetime, has been pretty successful over the years. It’s passed numerous tests in the Solar System, scored a Nobel-worthy victory with the binary pulsar, and gets the right answer even when extrapolated back to the first one second after the Big Bang. But no scientific theory is sacred. Even though GR is both aesthetically compelling and an unquestioned empirical success, it’s our job as scientists to keep probing it in different ways. Especially when it comes to astrophysics, where we need dark matter and dark energy to explain what we see, it makes sense to put Einstein to the most stringent tests we can devise.

So here is a new such test, courtesy of Rachel Bean of Cornell. She combines a suite of cosmological data, especially measurements of weak gravitational lensing from the Hubble Space Telescope, to see whether GR correctly describes the behavior of large-scale structure in the universe. And the surprising thing is — it doesn’t. At the 98% confidence level, Rachel finds that general relativity is inconsistent with the data. I’m not sure why we haven’t been reading about this in the science media or even on other blogs — it’s certainly a newsworthy result. Admittedly, the smart money is still that there is some tricky thing that hasn’t yet been noticed and Einstein will eventually come through the victor, but this is serious work by a respected cosmologist. Either the result is wrong, and we should be working hard to find out why, or it’s right, and we’re on the cusp of a revolution.

Here is the abstract:

A weak lensing detection of a deviation from General Relativity on cosmic scales
Authors: Rachel Bean

Abstract: We consider evidence for deviations from General Relativity (GR) in the growth of large scale structure, using two parameters, γ and η, to quantify the modification. We consider the Integrated Sachs-Wolfe effect (ISW) in the WMAP Cosmic Microwave Background data, the cross-correlation between the ISW and galaxy distributions from 2MASS and SDSS surveys, and the weak lensing shear field from the Hubble Space Telescope’s COSMOS survey along with measurements of the cosmic expansion history. We find current data, driven by the COSMOS weak lensing measurements, disfavors GR on cosmic scales, preferring η < 1 at 1 < z < 2 at the 98% significance level.

Let’s see if we can’t unpack the basic idea. The real problem in testing GR in cosmology is that any particular kind of spacetime curvature can be a solution to Einstein’s theory — all you need are the right sources of matter and energy. So in order to do a real test, you need to have some confidence that you understand what is creating the gravitational field — in the Solar System it’s the Sun and planets, in the binary pulsar it’s two neutron stars, and in the early universe it’s radiation. For large-scale structure things are a bit less clear — there’s ordinary matter, and dark matter, and of course dark energy.

Nevertheless, even though there are some things we don’t know about dark matter and dark energy, there are some things we think we do know. One of those things is that they don’t create any “anisotropic stress” — basically, a force that pulls different sides of things in different directions. Given that extremely reasonable assumption, GR makes a powerful prediction: there is a certain amount of curvature associated with space, and a certain amount of curvature associated with time, and those two things should be equal. (The space-space and time-time potentials φ and ψ of Newtonian gauge, for you experts.) The curvature of space tells you how meter sticks are distorted relative to each other as they move from place to place, while the curvature of time tells you how clocks at different locations seem to run at different rates. The prediction that they are equal is testable: you can try to measure both forms of curvature and divide one by the other. The parameter η in the abstract is the ratio of the space curvature to the time curvature; if GR is right, the answer should be one.
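For those who want the equations: in one common convention for Newtonian gauge (signs and symbols vary in the literature; this sketch uses φ for the space-space potential and ψ for the time-time potential, as in the parenthetical above), the perturbed metric reads

```latex
ds^2 = -(1 + 2\psi)\,dt^2 + a^2(t)\,(1 - 2\phi)\,\delta_{ij}\,dx^i\,dx^j ,
\qquad \eta \equiv \frac{\phi}{\psi} .
```

With no anisotropic stress, Einstein's equations force φ = ψ, so GR predicts η = 1; a robust measurement of η ≠ 1 would point to either an exotic anisotropic stress or a modification of gravity.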

There is a straightforward way, in principle, to measure these two types of curvature. A slowly-moving object (like a planet moving around the Sun) is influenced by the curvature of time, but not by the curvature of space. (That sounds backwards, but keep in mind that “slowly-moving” is equivalent to “moves more through time than through space,” so the curvature of time is more important.) But light, which moves as fast as anything can, is pushed around equally by the two types of curvature. So all you have to do is, for example, compare the gravitational field felt by slowly-moving objects to that felt by a passing light ray. GR predicts that they should, in a well-defined sense, be the same.
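Schematically (this is the standard weak-field result, not anything specific to Bean's paper): a slowly-moving body accelerates according to the time-time potential alone, while a light ray is deflected by the sum of the two potentials,

```latex
\ddot{\vec{x}} \simeq -\nabla \psi \quad (v \ll c),
\qquad
\hat{\alpha} = \int \nabla_{\!\perp}\,(\phi + \psi)\; d\ell \quad (\text{light}).
```

Comparing dynamical masses (which probe ψ) with lensing masses (which probe φ + ψ) therefore measures the combination (1 + η)/2, which should equal 1 if GR is right.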

We’ve done this in the Solar System, of course, and everything is fine. But it’s always possible that some deviation from Einstein shows up at much larger distance and weaker gravitational fields than we have access to in our local neighborhood. That’s basically what Rachel’s paper does, considering different measures of the statistical properties of large-scale structure and comparing them to the predictions of a phenomenological model of the gravitational field. A crucial role is played by gravitational lensing, since that’s where the deflection of light comes in.

And here is the answer: the likelihood, given the data, for different values of 1/η, the ratio of the time curvature to the space curvature. The GR prediction is at 1, but the data show a pronounced peak between 3 and 4, and strongly disfavor the GR prediction. If GR were correct, and both the data and the analysis are sound, there would be less than a 2% chance of obtaining this result. That falls well short of the standard for a definitive discovery, but it's still pretty striking.

[Figure: likelihood for 1/η, the ratio of time curvature to space curvature, peaking between 3 and 4 rather than at the GR value of 1.]
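As a rough back-of-the-envelope translation of what a “98% confidence level” means in Gaussian-sigma terms (this conversion is purely illustrative and not part of Bean's analysis; the function name is my own):

```python
from statistics import NormalDist

def two_sided_sigma(p_value: float) -> float:
    """Convert a two-sided p-value into the equivalent number of
    Gaussian standard deviations ("sigmas")."""
    return NormalDist().inv_cdf(1 - p_value / 2)

# 98% confidence corresponds to p ~ 0.02, i.e. roughly 2.3 sigma --
# suggestive, but well short of the 5-sigma discovery convention.
print(round(two_sided_sigma(0.02), 2))  # → 2.33
```

A particle physicist would call this a mild excess, not a detection, which is consistent with the cautious tone taken below.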

So what are we supposed to make of this? Don’t get me wrong: I’m not ready to bet against Einstein, at least not yet. Mostly my pro-Einstein prejudice comes from long experience trying to come up with alternative theories of gravity that are simultaneously logically sensible and observationally consistent; it’s just very hard to do. But more generally, good scientists naturally have a strong suspicion of any claimed observational result that purports to overthrow an extremely well-established theory. That’s just common sense, not hidebound establishmentarianism; most such anomalies eventually go away.

But that doesn’t mean that you ignore anomalies; you just treat them with caution. In this case, there could be an unrecognized systematic error in the data set, or a subtle error in the analysis. Given 1:1 odds, that’s certainly where the smart money would bet right now. It’s also possible that the fault lies with dark matter or dark energy, not with gravity — but it’s hard to see how that could work, to be honest. Happily, it’s an empirical question — more data and more analysis will either reinforce the result, or make it go away. After all, some anomalies turn out to be frighteningly real. This one is worth taking seriously, to say the least.

64 Comments

  1. Too bad the peak isn’t a little to the left. It would have been cool if it had been a factor of π.

  2. Pingback: IanHuston.net — Latest From FriendFeed this week

  3. I’m afraid that all appearances suggest that this result is wrong in at least one important respect. The paper has a number of obvious problems in its use of statistics. The most salient (in terms of whether the paper’s claims are valid) concerns the chi squared values. The chi squared for pure GR is greater than 3000. The number of degrees of freedom is not given, but either this fit is extremely bad, or it has so many free parameters that one cannot learn anything useful from varying only one parameter at a time. If the fit is really bad, that’s quite interesting, and it does pose a problem for GR. Adding possible variation of eta changes the goodness of fit by an amount that is tiny in comparison, but when the fit is already this bad, it doesn’t tell you anything useful. You can often improve the chi squared per degree of freedom by adding a new fit parameter, but that’s not meaningful if neither fit is anywhere close to the data.

  4. Hi – I am working on the weak lensing dataset that drives these results (the COSMOS weak lensing data). I just wanted to mention that Rachel Bean’s results are based on a paper that we published in 2007 (Massey et al. 2007). Since 2007, there have been many changes in our dataset. Firstly, and perhaps most importantly, our source redshift distribution has changed with the addition of deep near-infrared and U-band data. The high-redshift distribution has changed quite a bit, and this will obviously impact our tomography results. Secondly, we have been working since 2007 to reduce systematic effects in the data (shear calibration, PSF correction, and a new and improved way of dealing with charge transfer efficiency). Therefore, I would not be surprised if these results were to change with our new, improved weak lensing data. We have not yet calculated whether the new data will go in the same direction as Rachel Bean’s result or not, but we are working towards publishing our new results as soon as possible – so keep an eye on astro-ph for an update on this result!

  5. To follow up on Alexie’s comment – it appears that most of the power in this result comes from the high-redshift bin in COSMOS, which will be most susceptible to the systematic changes mentioned in that post.

    Also, there is another, older paper from Daniel et al. 2009 [http://adsabs.harvard.edu/abs/2009PhRvD..80b3532D] that uses the same datasets, except substituting CFHTLS for COSMOS, and apparently the same code (CosmoMC), but finds everything perfectly consistent with GR.

  6. Rachel Mandelbaum

    An addendum to Alexie’s comment, for the non-lensers:
    The significance of a change in the redshift distribution at the high end is that the GR predictions for each redshift slice will increase if the new results suggest that the galaxies are actually at higher redshift than was originally assumed. So, if the change in the “high redshift distribution” that Alexie mentions is in the direction of putting the sources at higher redshift, then this change could reduce/eliminate the current tension between the data and GR.

    Another point, in case anyone wonders why the COSMOS team did not find this tension with GR:
    My take is that this relates to the inclusion of the other data in Rachel Bean’s analysis. In the COSMOS Massey et al. (2007) paper, the best-fit power spectrum amplitude sigma_8 was found to be fairly high. If the other data that Rachel Bean includes tend to pull sigma_8 to lower values, then the lensing signal in this highest redshift slice will appear to be too high, and modifications of the theory of gravity are one way to reconcile the inconsistency with theoretical predictions. This is just my take on it; perhaps someone who is more familiar with the data and/or analysis can comment.

  7. When I first saw Rachel’s paper on the arXiv, my initial reaction was to scan through it for any discussion of how she treats nonlinearities in the growth of structure. Unless I’m missing something, she only does one correction, mentioned briefly at the beginning of Sec. 2.

    For ISW measurements, she is probably safe doing so. Smith et al. estimated that for l < 100, nonlinearities should contribute less than a 10% effect. Likewise for cosmic shear, depending on the scale. For galaxy auto-correlation functions, I'm a bit more skeptical, especially since the power spectrum model is explicitly based on simulations. The beauty of her approach is that this is meant to be a clean measure of spatial versus temporal components of the metric. These terms can only be determined cleanly in the linear regime, and it's not obvious that this is completely applicable here.

    Still, a very interesting (and provocative) paper.

  8. It’s always fascinating when people think of new ways of looking at the data. It sounds as though this method is likely to give an indication if there is something there.

    It’ll be interesting to see the analysis done on the new and improved datasets.

    As an ignorant layman I’m not happy about the apparent shoulder at 1. It may be weak, but it makes it look like there are two signals in there.

  9. Thanks to Sean for posting about the paper. I thought I’d make some quick replies to these few posts. Rachel Mandelbaum is spot on about the tension between the weak lensing data and the other datasets having been seen before as a difference in the preferred values of sigma_8; the difference between the two potentials just allows that tension to relax, and the lensing data to be well fit by the best-fit set of parameters that the other datasets prefer. DougA is also correct that Daniel et al. looked at a similar effect; however, they modeled it as evolving as (1+z)^(-3), i.e. looking for an effect that became more important with decreasing redshift. My analysis finds no evidence for a deviation from GR at z < 1, consistent with their results. Dave Goldberg mentioned the modeling of non-linearities, which can be a big source of systematic uncertainty on smaller scales. I used the Smith et al. fit for the lensing data; however, because the data here are on reasonably large scales, the non-linear correction doesn’t affect the result. As Sean alludes to, I’ve taken the systematic errors in the datasets at face value; unmodeled systematics would, no doubt, have an impact on the result.

  10. That’s not a very statistically significant result. People use the words “98% confidence level” in order to sound authoritative, but the way they are calculated, they do NOT mean “98% posterior probability”. It looks like in this example, what they meant is “p-value of 0.02”, which is not very strong evidence at all (assuming p-values are the relevant statistic, which we all know they’re not, really). Testing GR is important, but I doubt very much that this is a detection of a GR violation.

  11. Interesting results!

    Nevertheless, could this deviation from GR on the largest scales (if it is real) be related to the findings of Kashlinsky et al. 2008, who found some evidence for large-scale bulk flows via the kinetic SZ effect?

  12. I should also reply to the chi^2 comment of Malo Juevo. The paper quotes -2 ln(likelihood) = chi^2 + 2 ln(det(Cov)), so it’s not the chi^2 per se (because it includes the normalization from the determinant of the covariance matrix). So while you can’t use -2 ln(likelihood) per degree of freedom as a measure of fit, the change in -2 ln(likelihood) does give a measure of the improvement in fit.

  13. Perhaps I’m betraying my personal biases here, but if this result *is* real, do the theorists have an idea which (if any) of the “well-established” modifications to GR would agree with these data? Rachel’s paper cites a lot of the relevant literature but it doesn’t look like there’s a direct comparison to any specific theories.

    Mike, I’m no expert on this (my senior project is on the SZ effect though, so ask me in a few months ;)) but I believe Kashlinsky’s paper claims to find evidence for a specific anisotropy, rather than some effect of gravity in general (if we believe the Kashlinsky result!).

  14. Adam, I don’t think there are any obvious candidates. As far as I can tell, this effect wasn’t specifically predicted by any of the models I’m familiar with. I’m sure we’ll hear otherwise if the effect persists.

  15. Interesting – if this effect is real, could it mean that dark energy is the temporal part, and dark matter the spatial part, of a single field that fills the Universe?

    Dark energy and dark matter dominate on large scales; they constitute approximately 72% and 23% of all mass-energy in the Universe. So dark energy is about 3.13 times more abundant than dark matter; if one were related to temporal curvature and the other to spatial curvature, the result would match well with this distribution. Is such an idea reasonable, and if not, why?

    Though in that case, shouldn’t the temporal curvature be exactly 3 times larger than the spatial one, since the spatial energy is distributed among three dimensions while the temporal occupies only one?

  16. Pingback: Link roundup to watch out for (13th October, 2009) | Geek Feminism Blog

  17. Exciting result! But as Alexie mentioned, there have been changes to the COSMOS dataset since 2007, including better correction of systematics. It will definitely be interesting to see the same analysis with the updated data.

  18. If these results were to hold up under the scrutiny of analysis and additional data, what effect, if any, would this have upon the quest for a quantum gravity theory? That is, would such a result be more indicative of there being more dimensions (degrees of freedom) than 3 + 1, or perhaps fewer – or is it simply not relevant at all? I ask since it seems to me that if nature is shown to have a temporal bias, then it should have some implication in this regard.

  19. Pingback: Tales from the Tubes — 13/​10/​09 | Young Australian Skeptics

  20. Pingback: Interesting Reading #345 – The Blogs at HowStuffWorks

  21. My proposal is that eta = 1/3 could be understood if the perturbations are not those of matter (visible or dark) but of the dark energy density. This would imply that four-volume is conserved in the perturbation, implying eta = 1/3 for scalar perturbations.

Comments are closed.
