Monday, August 25, 2014

A Supervoid cannot explain the Cold Spot

In my last post, I mentioned the claim that the Cold Spot in the cosmic microwave background is caused by a very large void — a "supervoid" — lying between us and the last scattering surface, distorting our vision of the CMB, and I promised to say a bit more about it soon. Well, my colleagues (Mikko, Shaun and Syksy) and I have just written a paper about this idea; it came out on the arXiv last week, and in this post I'll try to describe the main ideas in it.

First, a little bit of background. When we look at sky maps of the CMB such as those produced by WMAP or Planck, obviously they're littered with very many hot and cold spots on angular scales of about one degree, and a few larger apparent "structures" that are discernible to the naked eye or human imagination. However, as I've blogged about before, the human imagination is an extremely poor guide to deciding whether a particular feature we see on the sky is real, or important: for instance, Stephen Hawking's initials are quite easy to see in the WMAP CMB maps, but this doesn't mean that Stephen Hawking secretly created the universe.

So to discover whether any particular unusual features are actually significant or not we need a well-defined statistical procedure for evaluating them. The statistical procedure used to find the Cold Spot involved filtering the CMB map with a special wavelet (a spherical Mexican hat wavelet, or SMHW) of a particular width (in this case $6^\circ$), and identifying the direction of the pixel with the coldest filtered temperature as the direction of the Cold Spot. Because of the nature of the wavelet used, this ensures that the Cold Spot is actually a reasonably sizable spot on the sky, as you can see in the image below:

The Cold Spot in the CMB sky. Image credit: WMAP/NASA.
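For intuition, here is a minimal flat-sky sketch of that filtering step — a toy illustration of my own, not the actual pipeline, which works with the spherical wavelet on full-sky maps. It uses the fact that scipy's Laplacian-of-Gaussian filter is, up to sign and normalisation, the flat-space Mexican hat:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def coldest_spot(t_map, scale_pix):
    """Filter a flat-sky temperature patch with a Mexican hat wavelet of
    the given scale (in pixels) and return the coldest filtered value
    and its pixel location.

    The Mexican hat is minus the Laplacian of a Gaussian, so we flip the
    sign of scipy's Laplacian-of-Gaussian filter: a cold (negative) spot
    in the input then stays negative in the filtered map.
    """
    filtered = -gaussian_laplace(t_map, sigma=scale_pix)
    idx = np.unravel_index(np.argmin(filtered), filtered.shape)
    return filtered[idx], idx
```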

Well, so we've found a cold spot. To elevate it to the status of "Cold Spot" in capitals and worry about how to explain it, we first need to quantify how unusual it is. Obviously it is unusual compared to other spots on our observed CMB, but this is true by construction and not very informative. Instead the usual procedure quite rightly compares the Cold Spot in our CMB to the coldest spots found in random Gaussian maps analysed with exactly the same SMHW technique. It is this procedure which results in the conclusion that our Cold Spot is statistically significant at roughly the "3-sigma level", i.e. only about 1 in every 1000 random maps has a coldest spot that is as "cold" as* our Cold Spot.** (The reason why I'm putting scare quotes around everything should become clear soon!)
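A toy version of that comparison, reusing the hypothetical coldest_spot helper sketched above: white noise stands in here for proper Gaussian CMB simulations with the right power spectrum, and observed_map is a placeholder for the real filtered data.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims, shape, scale_pix = 1000, (256, 256), 20

observed_map = rng.standard_normal(shape)          # placeholder for real data
coldest_obs, _ = coldest_spot(observed_map, scale_pix)

# coldest filtered spot in each of n_sims random Gaussian maps
coldest_sims = np.array([coldest_spot(rng.standard_normal(shape), scale_pix)[0]
                         for _ in range(n_sims)])

# fraction of random maps whose coldest spot is at least as "cold" as ours;
# p ~ 0.001 would correspond to the quoted "3-sigma" claim
p_value = np.mean(coldest_sims <= coldest_obs)
print(f"p-value: {p_value:.3f}")
```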

So there appears to be a need to explain the existence of the Cold Spot using additional new physics of some kind. One such idea is that of the supervoid: a giant region hundreds of millions of light years across which is substantially emptier than the rest of the universe and lies between us and the Cold Spot. The emptiness of this region has a gravitational effect on the CMB photons that pass through it on their way to us, making them look colder (this is called the integrated Sachs-Wolfe or ISW effect) — hence the Cold Spot.
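For reference, the linear ISW temperature shift is an integral of the time derivative of the gravitational potential $\Phi$ along the photon path,

$$\frac{\Delta T}{T}(\hat{n}) = \frac{2}{c^2}\int_{t_*}^{t_0} \dot{\Phi}(\hat{n}, t)\, dt,$$

so a void only leaves an imprint to the extent that its potential evolves while the photons cross it; in a $\Lambda$-dominated universe the potentials of underdense regions decay, leaving crossing photons slightly colder. The smallness of $\dot{\Phi}$ is exactly why the effect is so weak.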

Now this is a nice idea in principle. In practice, unfortunately, it suffers from a problem: the ISW effect is very weak, so to produce an effect capable of "explaining" the Cold Spot the supervoid would need to be truly super — incredibly large and incredibly empty. And no such void has actually been seen in the distribution of galaxies (a previous claim to have seen one turned out not to be backed up by further analysis).

It was therefore quite exciting when in May a group of astronomers, led by Istvan Szapudi of the Institute for Astronomy in Hawaii, announced that they had found evidence for the existence of a large void in the right part of the sky. Even more excitingly, in a separate theoretical paper, Finelli et al. claimed to have modeled the effect of this void on the CMB and proven that it exactly fit the observations, and that therefore the question had been effectively settled: the Cold Spot was caused by a supervoid.

Except ... things aren't quite that simple. For a start, the void they claimed to have found doesn't actually have a large ISW effect — in terms of central temperature, less than one-seventh of what would be needed to explain the Cold Spot. So Finelli et al. relied on a rather curious argument: that the second-order effect (in perturbation theory terms) of this void on CMB photons was somehow much larger than the first-order (i.e. ISW) effect. A puzzling inversion of our understanding of perturbation theory, then!

In fact there were a number of other reasons to be a bit suspicious of the claim, among which were that N-body simulations don't show this kind of unusual effect, and that several other larger and deeper voids have already been found that aren't aligned with Cold Spot-like CMB features. In our paper we provide a fuller list of these reasons to be skeptical before diving into the details of the calculation, where one might get lost in the fog of equations.

At the end of the day we were able to make several substantive points about the Cold Spot-as-a-supervoid hypothesis:
  1. Contrary to the claim by Finelli et al., the void that has been found is neither large enough nor deep enough to leave a large effect on the CMB, either through the ISW effect or its second-order counterpart — in simple terms, it is not a super enough supervoid.
  2. In order to explain the Cold Spot one needs to postulate a supervoid that is so large and so deep that the probability of its existence is essentially zero; if such a supervoid did exist, it would be more difficult to explain than the Cold Spot currently is!
  3. The possible ISW effect of any kind of void that could reasonably exist in our universe is already sufficiently accounted for in the analysis using random maps that I described above.
  4. There's actually very little need to postulate a supervoid to explain the central temperature of the Cold Spot — the fact that we chose the coldest spot in our CMB maps already does that!
Point number 1 requires a fair bit of effort and a lot of equations to prove (and coincidentally it was also shown in an independent paper by Jim Zibin that appeared just a day before ours), but in the grand scheme of things it is probably not a supremely interesting one. It's nice to know that our perturbation theory intuition is correct after all, of course, but mistakes happen to the best of us, so the fact that one paper on the arXiv contains a mistake somewhere is not tremendously important.

On the other hand, point 2 is actually a fairly broad and important one. It is a result that cosmologists with a good intuition would perhaps have guessed already, but that we are able to quantify in a useful way: to be able to produce even half the temperature effect actually seen in the Cold Spot would require a hypothetical supervoid almost twice as large and twice as empty as the one seen by Szapudi's team, and the odds against such a void existing in our universe would be something like one-in-a-million or one-in-a-billion (whereas the Cold Spot itself is at most a one-in-a-thousand anomaly in random CMB maps). A supervoid therefore cannot help to explain the Cold Spot.***
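To put those probabilities on the same footing, here is a quick back-of-envelope conversion into equivalent Gaussian significances (just an illustration of the comparison, not a calculation from our paper):

```python
from scipy.stats import norm

# one-tailed Gaussian significance equivalent to each quoted probability
for label, p in [("Cold Spot among random maps", 1e-3),
                 ("required supervoid, optimistic", 1e-6),
                 ("required supervoid, pessimistic", 1e-9)]:
    print(f"{label}: p = {p:.0e} -> {norm.isf(p):.1f} sigma")

# roughly 3.1, 4.8 and 6.0 sigma respectively: the proposed explanation
# is a far bigger anomaly than the anomaly it is meant to explain
```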

Point 3 is again something that many people probably already knew, but equally many seem to have forgotten or ignored, and something that has not (to my knowledge) been stated explicitly in any paper. My particular favourite though is point 4, which I could — with just a tiny bit of poetic licence — reword as the statement that
"the Cold Spot is not unusually cold; if anything, what's odd about it is only that it is surrounded by a hot ring"
I won't try to explain the second part of that statement here, but the details are in our paper (in particular Figure 7, in case you are interested). Instead what I will do is to justify the first part by reproducing Figure 6 of our paper here:

The averaged temperature anisotropy profile at angle $\theta$ from the centre of the Cold Spot (in red), and the corresponding $1$ and $2\sigma$ contours from the coldest spots in 10,000 random CMB maps (blue). Figure from arXiv:1408.4720.

What the blue shaded regions show is the confidence limits on the expected temperature anisotropy $\Delta T$ at angles $\theta$ from the direction of the coldest spots found in random CMB maps, using exactly the same SMHW selection procedure. The red line, which is the measured temperature for our actual Cold Spot, never goes outside the $2\sigma$ equivalent confidence region. In particular, at the centre of the Cold Spot the red line is pretty much exactly where we would expect it to be. The Cold Spot is not actually unusually cold.

Just before ending, I thought I'd also mention that Syksy has written about this subject on his own blog (in Finnish only): as I understand it, one of the points he makes is that this form of peer review on the arXiv is actually more efficient than the traditional one that takes place in journals.

Update: You might also want to have a look at Shaun's take on the same topic, which covers the things I left out here ...

* People often compare other properties of the Cold Spot to those in random maps, for instance its kurtosis or other higher-order moments, but for our purposes here the total filtered temperature will suffice.

** Although as Zhang and Huterer pointed out a few years ago, this analysis doesn't account for the particular choice of the SMHW filter or the particular choice of $6^\circ$ width — in other words, it doesn't account for what particle physicists call the "look-elsewhere effect". This means the quoted significance is actually much less impressive than it sounds.
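In its simplest form the correction is just a trials factor: if the choices of filter shape and width amount to $N$ effectively independent analyses, a local p-value inflates to

$$p_{\rm global} \approx 1 - (1 - p_{\rm local})^N \simeq N\, p_{\rm local} \qquad (N p_{\rm local} \ll 1),$$

so even a modest $N\sim10$ turns a one-in-a-thousand cold spot into a one-in-a-hundred one. (In reality the different filter choices are correlated, so $N$ is not well defined; this is only the schematic version.)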

*** If we'd actually seen a supervoid which had the required properties, we'd have a proximate cause for the Cold Spot, but also a new and even bigger anomaly that required an explanation. But as we haven't, the point is moot.

Monday, July 14, 2014

Short news items

Over the past two months I have been on a two-week seminar tour of the UK, taken a short holiday, attended a conference in Estonia and spent a week visiting collaborators in Spain. Posting on the blog has unfortunately suffered as a result: my apologies. Here are some items of interest that have appeared in the meantime:
  • The BICEP and Planck teams are to share their data — here's the BBC report of this news. The information I have from Planck sources is that Planck will put out a paper with new data very soon (about a week ago I heard it would be "maybe in two weeks", so let's say two or three weeks from today). This new data will then be shared with the BICEP team, and the two teams will work together to analyse its implications for the BICEP result. From the timescales involved my guess is that what Planck will be making available is a measurement of the polarised dust foreground in the BICEP sky region, and the joint publication will involve cross-correlating this map with the B-mode map measured by BICEP. A significant cross-correlation would indicate that most (or all) of the signal BICEP detected was due to dust.
  • What Planck will not be releasing in the next couple of weeks is their own measurement of the polarization of the CMB, in particular their own estimate of the value of $r$. The timetable for this release is still October: this is a deadline imposed by the fact that ESA requires Planck to release the data by December, but another major ESA mission (I forget which) is due to be launched in November and ESA don't like scheduling "competing" press conferences in the same month because there's only so much science news Joe Public can absorb at a time. From what I've heard, getting the full polarization data ready for October is a bit of a rush as it is, so it's fairly certain that's not what they're releasing soon.
  • By the way, I think I've recently understood a little better how a collaboration as enormous as Planck manage to remain so disciplined and avoid leaking rumours: it's because most of the people in the collaboration don't know the full details of the results either! That is to say, the collaboration is split into small sub-groups with specified responsibilities, and these sub-groups don't share results with each other. So if you ask a randomly chosen Planck member what the preliminary polarization results are looking like, chances are they don't know any better than you. (Though this may not stop them from saying "Well, I've seen some very interesting plots ..." and smiling enigmatically!)
  • The conference I attended in Estonia was the IAU symposium marking the centenary of the birth of the great Ya. B. Zel'dovich, on the general topic of large-scale structure and the cosmic web. I'll try to write a little about my general impressions of the conference next time. In the meantime all the talks are available for download from the website here.
  • A science news story you may have seen recently is "Biggest void in universe may explain cosmic cold spot": this is a claim that a recently detected region with a relative deficit of galaxies (the "supervoid") explains the existence of the unusual Cold Spot that has been seen in the CMB, without the need to invoke any unusual new physics. The claim of the explanation is based on this paper. Unfortunately this claim is wrong, and the paper itself has several problems. My collaborators and I are in the process of writing a paper of our own discussing why, and when we are done I will try to explain the issues on here as well. In the meantime, you heard it here first: a supervoid does not explain the Cold Spot!
Update: It has been pointed out to me that last week Julien Lesgourgues gave a talk about Planck and particle physics at the Strong and Electroweak Matter (SEWM14) symposium, in which he also discussed the timeline of forthcoming Planck and BICEP papers. You can see this on page 12 of his talk (pdf) and it is roughly the same as what I wrote above (except that there's a typo in the year — it should be 2014 not 2015!).

Friday, May 16, 2014

BICEP and listening to real experts

First up, I'd like to provide a health warning for all people landing here after following links from Sean Carroll or Peter Woit (thanks for the traffic!): I am not a CMB data analysis expert. What I provide on this blog is my own interpretation and understanding of the news and papers I have read, largely because writing such things out helps me understand them better myself. If it also helps people reading this blog, that's great, and you're welcome. But there are no guarantees that any of what I have written about BICEP is correct! If you truly want the best expert opinions on CMB analysis issues, you should listen to the best CMB experts — in this case, probably people who were in the WMAP collaboration, but are not in either Planck or BICEP. Also, if you want to ask somebody to write a scholarly review article on BICEP (yes, I get strange emails!), please don't ask me.

Having said that, I'm not sure whether any WMAP scientists write blogs, so I can at least try to provide some sources for the non-expert reader to refer to. One thing that you definitely should look at is Raphael Flauger's talk (slides and video) at Princeton yesterday. I think it is this work which was the source of the "is BICEP wrong" rumours first publicly posted at Resonaances, and indeed I see that Resonaances today has a follow-up referring to these very slides.

There are several interesting things to take away from this talk. The first is to do with the question of whether BICEP misinterpreted the preliminary Planck data that they admit having taken from a digitized version of a slide shown at a meeting. Here Flauger essentially simulates the process by digitizing the slide in question (and a few others) himself and analyzing them both with and without the correct CIB subtraction. His conclusion is that with the correct treatment, the dust models appear to predict higher dust contamination than BICEP accounted for; the inference being, I guess, that they didn't subtract the CIB correctly.

How important is this dust contribution? Here there is a fair amount of uncertainty: even if the digitization procedure were foolproof, one of the dust models underestimates the contamination and another one overestimates it. Putting the two together, "foregrounds may be OK if the lower end of the estimates is correct, but are potentially dangerous" (page 40). Flauger tries another method of estimation based on the HI column density, using yet more unofficial Planck "data" taken from digitized slides. This seems to give much the same bottom line.

A key point here is that everybody who isn't privy to the actual Planck data is really just groping in the dark, digitizing other people's slides. Flauger acknowledges this by trying to estimate the effect of the process of converting real data into a gif image, converting that into a pdf as part of a talk, somebody nicking the pdf and converting it back to gif and then back to usable data. As you can imagine, the amount of noise introduced in this version of Chinese Whispers is considerable! So I think the following comment from Lyman Page towards the end of the video (as helpfully transcribed by Eiichiro Komatsu for the Facebook audience!) is perhaps the most relevant:
"This is, this is a really, peculiar situation. In that, the best evidence for this not being a foreground, and the best evidence for foregrounds being a possible contaminant, both come from digitizing maps from power point presentations that were not intended to be used this way by teams just sharing the data. So this is not - we all know, this is not sound methodology. You can't bank on this, you shouldn't. And I may be whining, but if I were an editor I wouldn't allow anything based on this in a journal. Just this particular thing, you know. You just can't, you can't do science by digitizing other people's images."
Until Planck answers (or fails to definitively answer) the question of foregrounds in the BICEP window, or some other experiment confirms the signal, we should bear that in mind.

There are some other issues that remain confusing at the moment: the cross-correlation of dust models with BICEP signal doesn't seem to support the idea that all the signal is spurious (though there are possibly some other complicating factors here), and the frequency evidence — such as it is — from the cross power with BICEP1 also doesn't seem to favour a dust contaminant. But all in all, the BICEP result is currently under a lot of pressure. Having seen this latest evidence, I now think the Resonaances verdict ("until [BICEP convincingly demonstrate that foregrounds are under control], I think their result does not stand") is — at least — a justifiable position.

Footnote: I should also perhaps explain that throughout my physics education I have been taught, and had come to believe, that the types of inflation models BICEP provided evidence for (those with inflaton field values larger than the Planck scale) were fundamentally unnatural and incomplete, and that the small-field models that BICEP apparently ruled out were much more likely to be true. So perhaps my conscious attempts to compensate for this acknowledged theoretical prejudice could have biased me too far in the opposite direction in some previous posts!

Wednesday, May 14, 2014

New BICEP rumours: nothing to see here

This week there has been a minor kerfuffle about some rumours, originally posted on Adam Falkowski's Resonaances blog, regarding the claimed gravitational wave detection by BICEP. The rumours asserted that Planck had proven BICEP had made a mistake, BICEP had admitted the mistake, and that this might mean that all the excitement about the detection of gravitational waves was misplaced and all that BICEP had seen was some foreground dust emission contaminating their maps. (Since then there has been a strong public denial of this by the BICEP team.)

Now, with the greatest respect to Resonaances, which is an excellent particle physics blog, this is really a non-issue, and certainly not worth offending lots of people for (see for instance Martin Bucher's comment here). I really do not see what substantial information these rumours have provided us with that was not already known in March, and therefore why we should alter assessments of the data made at that time.

Let me explain a bit more. One of the important limitations of the BICEP2 experiment is that it essentially measured the sky at only one frequency (150 GHz) — the data from BICEP1, which was at 100 GHz, was not good enough to see a signal, and the data from the Keck Array at 100 GHz has not yet been analysed. When you only have one frequency it is much harder to rule out the possibility that the "signal" seen is not due to primordial gravitational waves at all but due to intervening dust or other contamination from our own Galaxy.

The way that BICEP addressed this difficulty was to use a set of different models for the dust distribution in that part of the sky, and to show that all of them predict that the possible level of dust contamination is an order of magnitude too small to account for the signal that they see. Now, some of these models may not be correct. In fact none of them are likely to be exactly right, because they may be based on old and likely less accurate measurements of the dust distribution or rely on a bit of extrapolation, wishful thinking, whatever. But the point is that they all roughly agree about the order of magnitude of dust contamination. This does not mean that we know there is or isn't any foreground contamination; this is merely a plausibility argument from BICEP (that is supported by and supports some other plausibility arguments in the paper).

Now the "new" rumour is based on the fact that it turns out that one of the dust models was based on BICEP's interpretation of preliminary Planck data, and that this data was not officially sanctioned but digitally extracted from a pdf of a slide shown at a talk somewhere. This is not exactly news, since the slide in question is in fact referenced in the BICEP paper. What's new is that now somebody unnamed is suggesting that the slide was in fact misinterpreted, and therefore this one dust model is more wrong than we thought, though we already accepted it was probably somewhat wrong. This is not the same as proving that the BICEP signal has been definitively shown to be caused by dust contamination! In fact I don't see how it changes the current picture we have at all. Ultimately the only way we can be sure about whether the observed signal is truly primordial or due to dust is to have measurements that combine several different frequencies. For that we have to wait a bit for other experiments — and that's the same as we were saying in March.

It's worth noting that when BICEP quote their result in terms of the tensor-to-scalar ratio r, the headline number $r=0.2$ assumes that there is literally zero foreground contamination. This was always an unrealistic assumption, but that hasn't stopped some 300 theorists from writing papers on the arXiv that take the number at face value and use it to rule out or support their favourite theories. The foreground uncertainty means that while we can be reasonably confident that the gravitational wave signal does exist (see here), model comparisons that strongly depend on the precise value of r are probably going to need some revision in the future.

So what new information have we gained since March? Well, Planck released some more data, this time a map of the polarized dust emission close to the Galactic plane.

The polarization fraction at 353 GHz observed by Planck. From arXiv:1405.0871.

Since these maps do not include the part of the sky that BICEP looked at (which is mostly in the grey region at the bottom), they don't tell us a huge amount about whether that part of the sky is or is not contaminated by polarized dust emission! Some people have speculated that this is something to do with the rivalry between Planck and BICEP, which is a bit over-the-top. Instead the reason is more scientific: the mask excludes areas where the error in determining the polarisation fraction is high, or the overall dust signal itself is too small. So the fact that the BICEP patch is in the masked region indicates that the dust emission does not dominate the total emission there, at least at 353 GHz (dust emission increases with frequency). This means there is not a whole lot of dust showing up in the BICEP region — if anything, this is good news! But even this interpretation should be treated with caution: dust doesn't contribute too much to the total intensity in that region, but it may well still contribute a large fraction of whatever B-mode polarization is seen. Based on my understanding and things I have learned from conversations with colleagues, I don't think Planck is going to be sensitive enough to make definitive statements about the dust in that specific region of the sky.

Another interesting paper that has come out since March has been this one, which claims evidence for some contamination in the CMB arising from the "radio loops" of our Galaxy. It also has the great benefit of being an actual scientific paper rather than a rumour on somebody's blog. (Full disclaimer: one of the authors of this paper was my PhD advisor, and another is a friend who was a fellow student when I was at Oxford.) 

The radio loops are believed to be due to ejected material from past supernova explosions; the idea is that if this dust contains ferrimagnetic molecules or iron, it would contribute polarized emission that might be mistaken for true CMB when it is in fact more local. What this paper argues is that there does appear to be some evidence that one of the CMB maps produced by the WMAP satellite (which operated before Planck) shows some correlation between map temperature and the position of one of these radio loops ("Loop I"). In particular, synchrotron emission from Loop I appears to be correlated with the temperature in the WMAP Internal Linear Combination (or ILC) map. I'm not going to comment on the strength of the statistical evidence for this claim; doubtless someone more expert than I will thoroughly check the paper before it is published. For the time being let us take it as given.

The relevance of this to BICEP is somewhat intricate, and proceeds like this: given our physical understanding of how the radio loops formed, it seems likely that they produce both synchrotron and dust emission which follow the same pattern on the sky. Therefore perhaps the correlation of the synchrotron emission from Loop I with the ILC map is because both are correlated with dust emission from the loop. If the correlation is because of dust emission, this might be polarized because of the postulated ferrimagnetic molecules etc., leading to a correlation between the WMAP polarization and Loop I. And if Loop I is contaminating the WMAP ILC map, it is perhaps plausible that a different radio loop, called the "New Loop", is also contaminating other CMB maps, in particular those of BICEP. Whereas Loop I doesn't get very close to the BICEP region, the New Loop goes right through the centre of it (see the figure below), so it is possible that there is some polarized contamination appearing in the BICEP data because of the New Loop. At any rate, the foreground dust models that BICEP used didn't account for any radio loops, so likely underestimate the true contamination.

Positions of some Galactic radio loops and the BICEP window. "Loop I" is the large one in the upper centre, which only skims the BICEP window; the "New Loop" is the one in the lower centre that passes through the centre of it. Figure from Philipp Mertsch.

So far so good, but this is quite a long chain of reasoning and it doesn't prove that it is actually dust contamination that accounts for any part of the BICEP observation. Instead it makes a plausible argument that it might be important; further investigation is required.

At the end of the day then, we are left in pretty much the same position we were in back in March. The BICEP result is exciting, but because it is only at one frequency, it cannot rule out foreground contamination. Other observations at other frequencies are required to confirm whether the signal is indeed cosmological. One scenario is that Planck, operating on the whole sky at many frequencies but with a lower sensitivity than BICEP, confirms a gravitational wave signal, in which case pop the champagne corks and prepare for Stockholm. The other scenario is that Planck can't confirm a detection, but also can't definitively say that BICEP's detection was due to foregrounds (this is still reasonably likely!), in which case we wait for other very sensitive ground-based telescopes pointed at that same region of sky but operating at different frequencies to confirm whether or not dust foregrounds are actually important in that region, and if so, how much they change the inferred value of r.

Until then I would say ignore the rumours.

Monday, March 24, 2014

BICEP2: reasons to be sceptical, part 2

This is the second part of three posts in which I wanted to lay out the various possible causes of concern regarding the BICEP2 result, and provide my own opinion on how seriously we should take these worries. I arranged these reasons to be sceptical into three categories, based on the questions
  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?
The first post dealt with the first of the three questions, this one addresses the second, and a post yet to be written will deal with the third.

How certain can we be that the observed B-mode signal is cosmological? 


Let's take it as given that none of the concerns in the previous post turn out to be important, i.e. that the observed B-mode signal is not an artefact of some hidden systematics in the analysis, leakage or whatever. From my position of knowing a little about data in general, but nothing much about CMB polarization analysis, I guessed that the chances of any such systematic being important were about 1 in 100.

The next question is then whether the signal could be caused by something other than the primordial gravitational waves that we are all so interested in. The most important possible contaminant here is other nearby sources of polarized radiation, particularly dust in our own Galaxy. We don't actually know how much polarized dust or synchrotron emission there might be in the sky maps here, so a lot of what BICEP have done is educated guesswork.

To start with, the region of the sky that BICEP looks at was chosen on the basis of a study by Finkbeiner et al. from 1999, which extrapolated from measurements of dust emission at certain other frequencies to estimate that, at the frequency of relevance to CMB missions such as BICEP, that particular part of the sky would be exceptionally "clean", i.e. with exceptionally low foreground dust emission. Whether this is actually true or not is not yet known for certain, but there exist a number of models of the dust distribution, and most of these models predict that the level of contamination to the B-mode detection from polarized dust emission would be an order of magnitude smaller than the observed signal. Similar model-dependent extrapolation to the observation frequency based on WMAP results suggests that synchrotron contamination is also an order of magnitude too small.

Predictions for foreground contamination for different dust models (the coloured lines at the bottom) versus the actual B-mode signal observed by BICEP2 (black points).


Now one real test of these assumptions will come from Planck, because Planck will soon have the best map of dust in our Galaxy and therefore the best limits on the possible contamination. This is one of the reasons to look forward to Planck's own polarization results, due in about October or November. In the absence of this information, the other thing that we would like to see from BICEP in order to be sure their signal is cosmological is evidence that the signal exists at multiple frequencies (and has the expected frequency dependence).

BICEP do not detect the signal at multiple frequencies. The current experiment, BICEP2, operates at 150 GHz only, and that is where the signal is seen. A previous experiment, BICEP1, did run at 100 GHz as well, but BICEP1 did not have the same sensitivity and could only place an upper limit on the B-mode signal. Data from the Keck Array will eventually also include observations at 100 GHz, but this is not yet available. Until we have confirmation of the signal at different frequencies, most cosmologists will treat the result very carefully.

In the absence of this, we must look at the cross-correlation between B2 and B1. Remember that although B1$\times$B1 did not have the sensitivity to make a detection of non-zero power, B2$\times$B1 can still tell us something useful. If the B1 maps were purely noise, or the B2 maps were due to dust, we would not expect them to be correlated. If both were due to synchrotron radiation, we would expect them to be strongly correlated. In fact the B2$\times$B1 cross power is non-zero at roughly the $3\sigma$ level (about 99.7% confidence), which is something Peter Coles' sceptical summary ignores. This is indeed evidence that the signal seen at 150 GHz is cosmological.

Still, some level of cross-correlation could be produced even if both B2 and B1 were only seeing foregrounds. Combining the B2$\times$B1 data with B2$\times$B2 and B1$\times$B1 means that polarized dust or synchrotron emission of unexpected strength are rejected as explanations – though at a not-particularly-exciting significance of about $2.2-2.3\sigma$.

Verdict 


It's fair to say, on the basis of models of the distribution of polarized dust and synchrotron emission, that the BICEP2 signal probably isn't due to either of these contaminants. However, we don't yet have confirmation of the detection at multiple frequencies, which is required to judge for sure. At the moment, the frequency-based evidence against foreground contamination is not very strong, but we'd still need some quite unexpected stuff to be going on with the foregrounds to explain the amplitude of the observed signal.

Overall, I'd guess the odds are about 1:100 against foregrounds being the whole story. (This should still be compared with the quoted headline result of 1:300,000,000,000 against $r=0$ assuming no foregrounds at all!)

The chances are much higher – I'd be tempted to say perhaps even better than even money – that foregrounds contribute a part of the observed signal, and that therefore the actual value of the tensor-to-scalar ratio will come down from $r=0.2$, perhaps to as low as $r=0.1$, when Planck checks this result using their better dust mapping.

Friday, March 21, 2014

BICEP2: reasons to be sceptical, part 1

As the dust begins to settle following the amazing announcement of the discovery of gravitational waves by the BICEP2 experiment, physicists around the world are taking stock and scrutinizing the results.

Remember that the claimed detection is enormously significant, in more ways than one. The BICEP team have apparently detected an exceedingly faint B-mode polarization pattern in the CMB, at an order of magnitude better sensitivity than any previous experiment probing the same scales. They have then claimed to have been able to ascribe this B-mode signal unambiguously to cosmological gravitational waves, rather than any astrophysical effects due to intervening dust or other sources of radiation. And finally they have interpreted these results as direct evidence for the theory of inflation, which is really the source of all the excitement, because if true it would pin down the energy scale of inflation at an incredibly high level, with extensive and dramatic consequences for our understanding of high energy particle physics.

However, as all physicists have been saying, with results of this magnitude it is important to be very careful indeed. Speculating who should get the Nobel Prize (or Prizes) for this is still premature. The paper containing the results will of course be subjected to anonymous peer review when it is submitted to a journal, but it has also already faced a rather extraordinary open peer review by social media, with a live group on Facebook, and all sorts of other discussion on blogs, Twitter and the like. (And to the great credit of the scientists on the BICEP team, they have patiently responded to questions and comments on these forums, and the whole process has been carried out very civilly!)

What I wanted to do today is to possibly contribute to that by gathering together all the main points of concern and reasons to be sceptical of the BICEP result. This is partly for my own purposes, since writing things down helps to clarify my thoughts. I will divide these concerns into three main categories, addressing the following questions:

  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?

I'll discuss the first category of concerns in part 1 of this post, and the next two in parts 2 and 3. I do not claim that any of the concerns I raise here are original, however any mistakes are definitely mine alone. I'd like to encourage discussion of any of these points via the comments below.

How certain can we be that BICEP2 observed a real B-mode signal?


This is obviously the most basic issue. The general reason for concern here — and this applies to any B-mode detection experiment — is that the experimental pipeline has to be able to decompose the polarization signal seen into two components, the E-mode and the B-mode, and the level of the signal in the B-mode is orders of magnitude smaller than in E. Now, as Peter Coles explains here, the E and B polarization components are in principle orthogonal to each other when the spherical harmonic decomposition can be performed over the whole sky, but this is in practice impossible. BICEP observes only a small portion of the sky, and therefore there is the possibility of "leakage" from E to B when separating out the components. It would not take much leakage to spoil the B-mode observation.
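To see where the leakage comes from, here is a minimal flat-sky version of the decomposition (a toy of my own, not BICEP's pipeline): on a periodic patch the Stokes maps $(Q, U)$ rotate cleanly into $(E, B)$ mode by mode in Fourier space, and it is precisely the boundary information missing from a real, non-periodic cut-sky patch that lets E power masquerade as B.

```python
import numpy as np

def eb_decompose(Q, U, pix_rad):
    """Flat-sky E/B decomposition of Stokes Q and U maps.

    Exact for a periodic patch; applied to a small cut-sky map it
    produces E-to-B leakage, which is the worry discussed above.
    """
    ny, nx = Q.shape
    lx = 2 * np.pi * np.fft.fftfreq(nx, d=pix_rad)
    ly = 2 * np.pi * np.fft.fftfreq(ny, d=pix_rad)
    LX, LY = np.meshgrid(lx, ly)          # shapes (ny, nx), matching fft2
    phi = np.arctan2(LY, LX)              # position angle of each Fourier mode
    Qk, Uk = np.fft.fft2(Q), np.fft.fft2(U)
    Ek = Qk * np.cos(2 * phi) + Uk * np.sin(2 * phi)
    Bk = -Qk * np.sin(2 * phi) + Uk * np.cos(2 * phi)
    return np.fft.ifft2(Ek).real, np.fft.ifft2(Bk).real
```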

Obviously the BICEP team implemented many tests of the obtained maps to check for such systematics. One of the ways to do this is to cross-correlate the E and B maps: if there is no leakage the cross-correlation should be consistent with zero. Another important test is the jackknife technique, also nicely explained here: you split your data into two equal halves, and subtract the signal found in one half from that in the other; the answer should also be consistent with zero.
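Schematically, and reusing the toy eb_decompose above, both tests reduce to statistics that should vanish if there are no systematics. The inputs here are assumed to be maps made from two halves of the data (split by time, detector set, or whatever):

```python
def null_tests(Q1, U1, Q2, U2, pix_rad):
    """Toy versions of the two null tests described above."""
    # jackknife: in the half-difference map the sky signal cancels,
    # so its B-mode power should be consistent with noise alone
    _, B_jk = eb_decompose(0.5 * (Q1 - Q2), 0.5 * (U1 - U2), pix_rad)

    # EB cross power of the combined map: consistent with zero
    # unless E-mode power is leaking into B
    E, B = eb_decompose(0.5 * (Q1 + Q2), 0.5 * (U1 + U2), pix_rad)

    return np.mean(B_jk ** 2), np.mean(E * B)  # crude stand-ins for bandpowers
```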

Now one source of concern arises because of a combination of these two tests. The blue points in the following figure show the results of a jackknife test on the BB power:


These points are consistent with zero ... but they are possibly too consistent with zero! The $1\sigma$ error bar of every one of them passes through zero, whereas it would be more natural to expect rather more scatter. In fact from the number on the plot you can see that there is only a 1% chance that all 9 blue points should be so close to zero.
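A back-of-envelope version of that number: if the nine bandpowers were independent and the errorbars correct, each would lie within $1\sigma$ of zero with probability $\approx 0.68$, so all nine would do so with probability

$$0.68^9 \approx 0.03.$$

The quoted 1% comes from the proper statistic on the plot rather than this crude estimate, but the message is the same: the points hug zero more tightly than they ought to.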

This raises the possibility, pointed out by Hans Kristian Eriksen, that the errorbars on the blue points are overestimated. It may then be the case that the errorbars on other points in other jackknife tests are also too large. If that were the case then reducing those errors might mean that some of the other jackknife tests now fail — the points are no longer consistent with zero. As it happens, of the 168 jackknife test results listed in the table in the paper, quite a large number (about 7) of them already "fail" by the stricter standards (2% probability) some other experiments such as QUIET might apply. Obviously some number of tests are always expected to fail, but more than 7 out of 168 starts to look like quite a large number. This then becomes a little worrying.

On the other hand, this extrapolation may be a little exaggerated, because we are surmising that the errorbars might be too large purely on the basis of the one figure above. Clearly if you do a large number of jackknife tests, it becomes less surprising that one of them gives a surprising result, if you see what I mean. Looking through the table for the other BB jackknife results, the particular example from the figure is the only one that stands out as being odd, so it is hard to conclude from this that the errorbars are too large. Overall I'm not convinced that there is necessarily a problem here, but it is something that deserves a little more quantitative attention.

The second source of concern that has been highlighted is that the data at large multipole values appear to be doing something odd. Look at the 5th, 6th and 7th black points from the figure above, which are quite a long way from the theoretical expectation. Peter Coles helpfully drew a little blue circle around them:


The worry here is that even if the data appear to be passing jackknife tests for internal consistency and null tests for EB cross power, the fact that these points are so high suggests that there is still some undetected systematic that has crept in somewhere. This hypothesized systematic could account for the measured values of the crucial first four points, which constitute the detection of the gravitational waves.

Similarly, people are worried about the EE power spectrum, which appears to be too high in the $50< \ell<100$ region — again this could be a sign of leakage from temperature into polarization, which could perhaps be contaminating the B-mode maps despite not explicitly showing up in the jackknife consistency checks.

Now, the BICEP response to this is that you shouldn't judge things simply "by eye". The EE excess does not appear to be statistically significant. It's also not incredibly unlikely that the final two of the circled BB data points could simultaneously be as high as they are just due to random chance — they say "their joint significance is $<3\sigma$", which means that the chance is about 1%. (Of course the chance that all three of the circled points could simultaneously be high is smaller than that, and so presumably less than 1% ... )

Another justification some people have been providing (mostly people from outside the BICEP collaboration to be fair, though some from within it as well) is that the preliminary data from the Keck array, which is a similar instrument to BICEP but with higher sensitivity, appear to show no anomaly in that region. I think this is a somewhat dangerous argument, because the Keck data also don't seem to be quite so high in the region of the crucial first four bandpowers! In any case, the "official" word from BICEP is that any such speculation on the basis of Keck is to be discouraged, because the Keck data is still very preliminary and has not been properly checked.

Verdict

I'm a little bit worried about the various issues raised here, though overall I would say the odds are in favour of the B-mode detection being secure (this is a different issue to whether this detected signal is due to gravitational waves! More on that in the next post). I would not, however, put those odds at anywhere near 1 in 300,000,000,000 against there being an error, which is the headline significance claimed for the detection of a non-zero tensor-to-scalar ratio ($7\sigma$). If I were forced to quantify my belief, I would say something more like 1 or 2 in 100. That's not particularly secure, but luckily there are follow-up experiments, such as Keck and Planck itself, which should be able to reassure us on that score soon.

A final point: seeing the preliminary Keck data shown in a figure in the paper suggests to me that perhaps the final analysis of Keck data will now not be done "blind". I hope that's not the case, it would be very disturbing indeed if it were. 

Monday, March 17, 2014

First Direct Evidence for Cosmic Inflation

That was the title of the BICEP2 presentation today. Gives you some idea about the magnitude of the result, if it holds up: it really is astonishingly exciting.

Unfortunately it was so exciting that we in Helsinki couldn't even access the Harvard server and so couldn't watch any of the webcast at all. It seems the same was true for most other cosmologists around the world. So my comments here are based purely on a preliminary reading of the paper itself, and a distillation of the conversations occurring via Facebook and the like.

Firstly, the headline results: the BICEP team claim to have detected a B-mode signal in the CMB at exceedingly high statistical significance. Their headline claim is

$r=0.2^{+0.07}_{-0.05}$, with $r=0$ disfavoured at $7.0\sigma$

That is frankly astonishing. Here's the likelihood plot:

BICEP2 constraint on the tensor-to-scalar ratio r. 

(All figures are taken from the paper available here.)

The actual measurement of the BB power spectrum looks like this:


The black points are the new measurements, the other coloured points are the previously available best upper limits. The solid red curve is the theoretical expectation from lensing (the relatively boring contribution to BB), the dashed red curve that dies off is the theoretical expectation from a model with inflationary gravitational waves and $r=0.2$, and the other dashed red curve (were they short of colours?!) is the total.

They've also done a pretty good job of eliminating other foreground sources (dust, synchrotron emission etc.) as possible explanations for the signal seen, which means it is much more likely that the signal is actually due to primordial gravitational waves from inflation. In doing this, it helps that the signal they see is actually as large as it is, since there's less chance of confusing it with these foregrounds (which are much smaller).  [Update: I'm not an expert here, apparently some others were less convinced about the removal of foregrounds. Not sure why though – I'd have thought other systematic errors were far more likely to be a problem than foregrounds.]

So far so good. In fact — and I really can't stress this enough — this is an extraordinary, wonderful, unexpected result and huge congratulations to the BICEP team for achieving it. It will mean a lot of happy theorists as well, because we finally have something new to try to explain!

However, it is very important that as a community we remain skeptical, particularly so when - as here - the result is one that we would so desperately love to be true. Given that, I'm going to list a series of things that are potentially worrying/things to think about/things I don't understand. (Some of these are not things I noticed myself, but were points raised by Dave Spergel, Scott Dodelson and other experts at the ongoing live discussion on Facebook.) Doubtless these are questions the BICEP team will have thought about themselves; perhaps they already have all the answers and will tell us about them in due course — as I said, no one I know was able to watch the webcast live.

  • In the BB-spectrum plot above, the data seem to be showing a significant excess above expectations for multipoles about $\ell\sim200-350$. What's going on with that?
  • This is particularly noticeable in another figure (Fig. 9) in the paper:
  • From the above figure, preliminary results of the cross-correlation with Keck don't show the excess at high-$\ell$ (a reason to believe it might go away), but the same cross-correlation also shows less power at lower $\ell$ (which is a bit confusing).
  • At lower values of $\ell$ the EE power spectrum also shows an excess (Fig. 7):
  • All the above points put together suggest that perhaps there is some leakage in the polarisation maps coming from the temperature anisotropy. A large part of the analysis work is concerned with accounting for and correcting for any such leakage, of course; the question is to what extent independent experts will be satisfied that these methods worked.
  • Although the headline figure is $r=0.2$, they rather confusingly later say that when the best possible dust model is used for foreground subtraction, this becomes $r=0.16^{+0.06}_{-0.05}$. But if this is the best possible dust model, why is it not the quoted headline number? Is this related somehow to the power excess at $\ell\sim200-350$?
  • If $r$ is as large as they have measured, why was it not seen by Planck? Actually this is a fairly complicated question: the point being that if the tensor amplitude is so large, it should make a non-negligible contribution to the temperature power spectrum as well, which would have affected Planck's results. Planck had a constraint $r<0.11$, but this specifically assumed that the primordial power spectrum had a power-law form with no running (sorry about the technical jargon - the standard parametrisation is written out just after this list). So BICEP suggest one way around this tension is simply to introduce a running, but it seems (though this bit was not entirely clear to me from the paper) that you need a fairly large value of the running for this explanation to fly. And if you've got a large running then you have to worry about why not a running of the running, a running of the running of the running, and so on ad infinitum - in fact how do we know that the power-law expansion form of $P(k)$ is the correct way to go at all?
  • Besides, are there viable inflationary models that predict both large $r$ and large running (or a non-power-law form of the primordial power)? Given the vast array of inflationary models, the answer to this question is almost certainly yes, but people may consider some other explanations more worthwhile ...
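For reference, the "running" mentioned a couple of items above generalises the power-law primordial spectrum to

$$\mathcal{P}(k) = A_s \left(\frac{k}{k_*}\right)^{n_s - 1 + \frac{1}{2}\alpha_s\ln(k/k_*)}, \qquad \alpha_s \equiv \frac{dn_s}{d\ln k},$$

and Planck's $r<0.11$ limit assumed $\alpha_s=0$. Relieving the tension apparently needs a negative running of order a few times $10^{-2}$, which is uncomfortably large given that slow-roll models generically predict $\alpha_s$ to be second order in the slow-roll parameters, i.e. of order $10^{-3}$.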
Phew. There are probably lots of other things to think about, but that's about all I can manage today. It's been a very exciting day!

Saturday, March 15, 2014

B-modes, rumours, and inflation

Update: The announcement will definitely be about a major discovery by BICEP2, meaning it can only really be about a B-mode signal. You can follow the webcast at http://www.cfa.harvard.edu/news/news_conferences.html, starting at 10:45 am EDT (14:45 GMT) for scientists, or 12:00 pm EDT (16:00 GMT) for the general public and news organisations.

The big news in cosmology circles at the minute is the rumour that the "major discovery" due to be announced at a press conference on Monday the 17th is in fact a claimed detection of the B-mode signal in the CMB by the BICEP2 experiment.

Now, I'm not particularly well placed to comment on this rumour, since all the information I have comes at second- or third-hand, via people who have heard something from someone, people who think they heard something from someone, or people who are simply unashamedly speculating. (Perhaps this is a function of being on the wrong side of the Atlantic: although the BICEP2 experiment is based at the South Pole, the only non-North-American university participating in the collaboration is Cardiff University in Wales. Even worse, I'm not on Twitter.) In any case, by reading this, this, this and this, you will be starting with essentially the same information as me.

But having got that health warning out of the way, let's pretend that the rumours are entirely accurate and that on Monday we will have an announcement of a detection of a significant B-mode signal. What would this mean for cosmology?

Firstly, the B-mode signal refers to a particular polarisation of the CMB (for a short and somewhat technical introduction, see here; for a slightly longer one, see here). This polarisation can arise in various ways, one of which is the polarisation induced in the CMB by gravitational lensing, as the CMB photons travel through the inhomogeneous Universe on their way from the last scattering surface to us. There have been a few experiments, such as POLARBEAR, which have already claimed a detection of this lensing contribution to the B-mode signal (though in this particular case after skim-reading the paper I was a little underwhelmed by the claim).

Now, detecting a lensing B-mode would be cool, but significantly less exciting than detecting a primordial B-mode. This is because whereas the lensing signal comes from late-time physics that is quite well understood, a primordial signal would be evidence of primordial tensor fluctuations or primordial gravitational waves. And this is cool because inflation provides a possible way to produce primordial gravitational waves – therefore their detection could be a major piece of evidence in favour of inflation.

The contributions to the B-mode signal coming from gravitational waves and lensing are differentiated on the basis of the multipoles (essentially the length scale) at which they are important. Figure from Hu and Dodelson 2002.

People often say that detection of this tensor signal would be a "smoking gun" for inflation; something that would be very welcome, because although inflation has proved to be an attractive and fertile paradigm for cosmology, there is still a bit of a lack of direct, incontrovertible evidence in favour of it. Coupled with certain unresolved theoretical issues it faces, this lack of a smoking gun meant that arguments for or against inflation were threatening to degenerate into what you might call "multiverse territory", definitely an unhealthy place to be.

It may be worth introducing a note of caution about this "smoking gun" though. Although inflation is a possible source of primordial gravitational waves, it is not the only one. Artefacts of possible phase transitions in the early universe, known as cosmic defects, can also produce a spectrum of gravitational waves – and what's more, this spectrum can be exactly scale-invariant, just as that from inflation. I don't know a huge amount about this field, so I am not sure whether the amplitude of the perturbations which could be produced by these cosmic defects could be sufficiently large, nor – if it is – whether there are any other features which could help distinguish this scenario from inflation if the rumours turn out to be true. Perhaps better informed people could comment below.

Suppose we put that issue to one side though, and assume that not only has a significant tensor signal been detected, we have also been able to prove that it could not be due to anything other than inflation. The rumour is that the detection corresponds to a value of the tensor-to-scalar ratio r of about 0.2. What are the implications of this for the different inflation models?

Planck limits on various inflationary models.
Not all models of inflation result in tensor modes large enough to be observed in the CMB, so an observation of a large r would rule out a large class of these models. Generally speaking, the understanding is that models in which the inflaton field $\phi$ takes large values (i.e., values larger than the Planck mass $M_P$) are the ones which could produce observably large r, whereas the so-called "small-field models" where $\phi\ll M_P$ usually predict tiny values of r which could never be observed. (A note for non-experts: irrespective of the field value, the energy scale in both small-field and large-field models is always much less than the Planck scale.) Therefore, at a stroke, all small-field inflation models would be ruled out. Many people regard these as the better-motivated models of inflation, with in some respects fewer theoretical issues than the large-field models, so this would be quite significant.

There are two small caveats to this statement: firstly, it isn't strictly necessary for $\phi$ itself to be larger than $M_P$ to generate a large r, only that the change in $\phi$ be large. So models in which the inflaton field winds around a cylinder, in effect travelling a large distance without actually getting anywhere, can still give large r (hat-tip to Shaun for that phrasing). Also, it is not even strictly true that the change in $\phi$ must be large: if some other rather specific conditions (including the temporary breakdown of the slow-roll approximation) are met, this one can be avoided and even small field models can produce enough gravitational waves. This was something pointed out by a paper I wrote with Shaun Hotchkiss and Anupam Mazumdar in 2011, though other people had similar ideas at about the same time. Such rather forced small-field models would have other specific features though, so could be distinguished by other measurements.
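The quantitative version of these statements is the Lyth bound: in slow roll the field moves by

$$\Delta\phi \simeq M_P \sqrt{\frac{r}{8}}\,\Delta N$$

over $\Delta N$ e-folds of inflation, so $r=0.2$ means the field moves by about $0.16\,M_P$ per e-fold, adding up to $\Delta\phi \sim 8$-$9\,M_P$ over the full 50-60 e-folds if $r$ stays roughly constant. The loopholes just described work precisely by evading one of the assumptions in this estimate.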

One of the more interesting consequences of a detection of large r (aside from the earth-shattering importance of a confirmation of inflation itself) would be that the Higgs inflation model – which has been steadily gaining in popularity given the results from the LHC and Planck, and has begun to be regarded by many as the most plausible mechanism by which inflation could have occurred – would be disfavoured. In the plot above, the Higgs inflation prediction is shown by the orange points at the bottom centre of the figure. So a BICEP2 detection of $r\sim0.2$ as suggested by the rumours would be pretty serious for this model.
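For reference, the standard prediction of Higgs inflation (shared with other plateau models such as Starobinsky's $R^2$ inflation – these numbers are textbook slow-roll results rather than anything read off the plot) is

$$ n_s \simeq 1 - \frac{2}{N} \approx 0.97, \qquad r \simeq \frac{12}{N^2} \approx 0.003 \quad \text{for } N \approx 60 $$

e-folds of inflation: an $r$ nearly two orders of magnitude below the rumoured value.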

On the other hand, a BICEP2 detection of $r\sim0.2$ would also appear to be strongly at odds with the results from the Planck and WMAP satellites. Which probably goes to show that there is not much point believing every rumour ...

We will find out on Monday!

Monday, February 3, 2014

Does the multiverse explain the cosmological constant?

At the end of the last post on falsifiability, I mentioned the possibility that the multiverse hypothesis might provide an explanation for the famous cosmological constant problem. Today I'm going to try to elaborate a little on that argument and why I find it unconvincing.

Limitations of space and time mean that I cannot possibly start this post as I would like to, with an explanation of what the cosmological constant problem is, and why it is so hard to resolve. Readers who would like to learn a bit more could try reading this, this, this or this (arranged in roughly descending order of accessibility to the non-expert). For my purposes I will have to simply summarise the problem by saying that our models of the history of the Universe contain a parameter $\rho_\Lambda$ – which is related to the vacuum energy density and sometimes called the dark energy density – whose expected value, according to our current understanding of quantum field theory, should be at least $10^{-64}$ (in units of the Planck-scale energy density) and quite possibly as large as 1, but whose actual value, deduced from our reconstruction of the history of the Universe, is approximately $1.5\times10^{-123}$. (As ever with this blog, the mathematics may not display correctly in RSS readers, so you might have to click through.)
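To see where the range of "orders of magnitude" quoted in the next paragraph comes from, the arithmetic is simply

$$ \log_{10}\!\left(\frac{10^{-64}}{1.5\times10^{-123}}\right) \approx 59, \qquad \log_{10}\!\left(\frac{1}{1.5\times10^{-123}}\right) \approx 123, $$

depending on whether one takes the conservative ($10^{-64}$) or the most naive ($\sim 1$) theoretical estimate.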

This enormous discrepancy between theory and observation, of somewhere between 60 and 120 orders of magnitude, has long been one of the outstanding problems – not to say embarrassments – of high-energy theory. Many very smart people have tried many ingenious ways of solving it, but it turns out to be a very hard problem indeed. Sections 2 and 3 of this review by Raphael Bousso give some sense of the various attempts that have been made to explain it, and how they have failed (though the review is unfortunately also at a fairly technical level).

This is where the multiverse and the anthropic argument comes in. In this very famous paper back in 1987, Steven Weinberg used the hypothesis of a multiverse consisting of causally separated universes which have different values of $\rho_\Lambda$ to explain why we might be living in a universe with a very small $\rho_\Lambda$, and to predict that if this were true, $\rho_\Lambda$ in our universe would nevertheless be large enough to measure, with a value a few times larger than the energy density of matter, $\rho_m$. This was particularly important because the value of $\rho_\Lambda$ had not at that time been conclusively measured, and many theorists were working under the assumption that the cosmological constant problem would be solved by some theoretical advance which would demonstrate why it had to be exactly zero, rather than some exceedingly small but non-zero number.
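The nub of the anthropic part of the argument, in very rough form (the collapse redshift $z_c$ and the order-one factors here are my illustrative choices, not Weinberg's careful numbers): a region of the universe can only collapse into galaxies if its density perturbations go nonlinear before $\rho_\Lambda$ comes to dominate the expansion, which requires roughly

$$ \rho_\Lambda \lesssim \bar\rho_m(z_c) = (1+z_c)^3\, \rho_{m,0} \sim 10^2\, \rho_{m,0} \quad \text{for } z_c \sim 4, $$

so anthropically allowed values of $\rho_\Lambda$ can exceed the present matter density by at most a couple of orders of magnitude – vastly smaller than the theoretical expectation, yet generically non-zero.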

Weinberg's prediction is generally regarded as having been successful. In 1998, observations of distant supernovae indicated that $\rho_\Lambda$ was in fact non-zero, and in the subsequent decade and a half increasingly precise cosmological measurements, especially of the CMB, have confirmed its value to be a little more than twice that of $\rho_m$.

This has been viewed as strong evidence in favour of the multiverse hypothesis in general, and in particular for string theory, which provides a potential mechanism for realising this multiverse. Indeed, in the absence of any other observational evidence for the multiverse (perhaps even in principle), and given the ongoing lack of experimental evidence for other predictions of string theory, Weinberg's anthropic prediction of the value of the cosmological constant is often regarded as the most important reason for believing that these theories are part of the correct description of the world. For instance, to provide just three arbitrarily chosen examples, Sean Carroll argues this here, Max Tegmark here, and Raphael Bousso in the review linked to above.

I have a problem with this argument, and it is not a purely philosophical one. (The philosophical objection is loosely the one made here.) Instead I disagree that Weinberg's argument still correctly predicts the value of $\rho_\Lambda$. This is partly because Weinberg's argument, though brilliant, relied upon a few assumptions about the theory in which the multiverse was to be realised, and theory has subsequently developed not to support these assumptions but to negate them. And it is partly because, even given these assumptions, the argument gives the wrong value when applied to cosmological observations from 2014 rather than 1987. Both theory and observation have moved away from the anthropic multiverse.

Wednesday, January 22, 2014

Is falsifiability a scientific idea due for retirement?

Sean Carroll argues that it is.

He characterises the belief that "theories should be falsifiable" as a "fortune-cookie-sized motto"; it's a position adopted only by "armchair theorizers" and "amateur philosophers", and people who have no idea how science really works. He thinks we need to move beyond the idea that scientific theories need to be falsifiable; this appears to be because he wants to argue that string theory and the idea of the multiverse are not falsifiable ideas, but are still scientific.

This position is not just wrong, it's ludicrous. 

What's more, I think deep down Sean – who is normally a clear, precise thinker – realises that it is ludicrous. Midway through his essay, therefore, he flaps around trying to square the circle and get out of the corner he has painted himself into: a scientific theory must, apparently, still be "judged on its ability to account for the data", and it's still true that "nature is the ultimate guide". But somehow it isn't necessary for a theory to be falsifiable to be scientific.

Now, I'm not a philosopher by training. Therefore what follows could certainly be dismissed as "amateur philosophising". I'm almost certain that what I say has been said before, and said better, by other people in other places. Nevertheless, as a practising scientist with an argumentative tendency, I'm going to have to rise to the challenge of defending the idea of falsifiability as the essence of science. Let's start by dismantling the alternatives.