Monday, March 24, 2014

BICEP2: reasons to be sceptical, part 2

This is the second part of three posts in which I wanted to lay out the various possible causes of concern regarding the BICEP2 result, and provide my own opinion on how seriously we should take these worries. I arranged these reasons to be sceptical into three categories, based on the questions
  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?
The first post dealt with the first of the three questions, this one addresses the second, and a post yet to be written will deal with the third.

How certain can we be that the observed B-mode signal is cosmological? 

Let's take it as given that none of the concerns in the previous post turn out to be important, i.e. that the observed B-mode signal is not an artefact of some hidden systematics in the analysis, leakage or whatever. From my position of knowing a little about data in general, but nothing much about CMB polarization analysis, I guessed that the chances of any such systematic being important were about 1 in 100.

The next question is then whether the signal could be caused by something other than the primordial gravitational waves that we are all so interested in. The most important possible contaminant here is other nearby sources of polarized radiation, particularly dust in our own Galaxy. We don't actually know how much polarized dust or synchrotron emission there might be in the sky maps here, so a lot of what BICEP have done is educated guesswork.

To start with, the region of the sky that BICEP looks at was chosen on the basis of a study by Finkbeiner et al. from 1999, which extrapolated from measurements of dust emission at certain other frequencies to estimate that, at the frequency of relevance to CMB missions such as BICEP, that particular part of the sky would be exceptionally "clean", i.e. with exceptionally low foreground dust emission. Whether this is actually true or not is not yet known for certain, but there exist a number of models of the dust distribution, and most of these models predict that the level of contamination to the B-mode detection from polarized dust emission would be an order of magnitude smaller than the observed signal. Similar model-dependent extrapolation to the observation frequency based on WMAP results suggests that synchrotron contamination is also an order of magnitude too small.

Predictions for foreground contamination for different dust models (the coloured lines at the bottom) versus the actual B-mode signal observed by BICEP2 (black points).

Now one real test of these assumptions will come from Planck, because Planck will soon have the best map of dust in our Galaxy and therefore the best limits on the possible contamination. This is one of the reasons to look forward to Planck's own polarization results, due in about October or November. In the absence of this information, the other thing that we would like to see from BICEP in order to be sure their signal is cosmological is evidence that the signal exists at multiple frequencies (and has the expected frequency dependence).

BICEP do not detect the signal at multiple frequencies. The current experiment, BICEP2, operates at 150 GHz only, and that is where the signal is seen. A previous experiment, BICEP1, did run at 100 GHz as well, but BICEP1 did not have the same sensitivity and could only place an upper limit on the B-mode signal. Data from the Keck Array will eventually also include observations at 100 GHz, but this is not yet available. Until we have confirmation of the signal at different frequencies, most cosmologists will treat the result very carefully.

In the absence of this, we must look at the cross-correlation between B2 and B1. Remember that although B1$\times$B1 did not have the sensitivity to make a detection of non-zero power, B2$\times$B1 can still tell us something useful. If B1 maps were purely noise, or B2 maps were due to dust, we would not expect them to be correlated. If both were due to synchrotron radiation, we would expect them to be strongly correlated. In fact the B2$\times$B1 cross power is non-zero at the $3\sigma$ level, or about 99.7% confidence for Gaussian errors, which is something Peter Coles' sceptical summary ignores. This is indeed evidence that the signal seen at 150 GHz is cosmological.
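
To see why a cross-correlation can be informative even when one map is too noisy for a detection on its own, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the "maps" are just lists of Gaussian random numbers, the noise levels are invented, and `cross_power` is a hypothetical helper rather than anything from the BICEP pipeline.

```python
import random

def cross_power(map_a, map_b):
    """Mean of the pixel-by-pixel product: a toy stand-in for a cross power spectrum."""
    return sum(a * b for a, b in zip(map_a, map_b)) / len(map_a)

rng = random.Random(2014)
n_pix = 20000

# Common sky signal with unit variance (assumed), seen by both toy instruments.
sky = [rng.gauss(0.0, 1.0) for _ in range(n_pix)]

# Toy "B2": modest noise. Toy "B1": noise three times larger than the signal.
b2 = [s + rng.gauss(0.0, 1.0) for s in sky]
b1 = [s + rng.gauss(0.0, 3.0) for s in sky]

# Toy "B1" that contains no sky signal at all, only noise.
b1_noise_only = [rng.gauss(0.0, 3.0) for _ in range(n_pix)]

# Independent noise averages away in the cross power; the common signal does not.
common = cross_power(b2, b1)
null = cross_power(b2, b1_noise_only)
print(f"B2 x B1 with common signal: {common:.2f}")
print(f"B2 x noise-only B1:         {null:.2f}")
```

In this toy setup the auto power of the noisy map is dominated by its own noise variance, whereas the cross power only picks up what the two maps have in common.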

Still, some level of cross-correlation could be produced even if both B2 and B1 were only seeing foregrounds. Combining the B2$\times$B1 data with B2$\times$B2 and B1$\times$B1 means that polarized dust or synchrotron emission of unexpected strength is rejected as an explanation – though at a not-particularly-exciting significance of about $2.2-2.3\sigma$.


It's fair to say, on the basis of models of the distribution of polarized dust and synchrotron emission, that the BICEP2 signal probably isn't due to either of these contaminants. However, we don't yet have confirmation of the detection at multiple frequencies, which is required to judge for sure. At the moment, the frequency-based evidence against foreground contamination is not very strong, but we'd still need some quite unexpected stuff to be going on with the foregrounds to explain the amplitude of the observed signal.

Overall, I'd guess the odds are about 1:100 against foregrounds being the whole story. (This should still be compared with the quoted headline result of 1:300,000,000,000 against $r=0$ assuming no foregrounds at all!)
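
For anyone wanting to translate between "$n\sigma$" and odds, the conversion can be sketched as below. It assumes Gaussian statistics and a two-sided tail; quoted headline odds depend on the convention used (one-sided or two-sided), so only order-of-magnitude agreement with the numbers in the text should be expected. The function name `two_sided_p` is my own.

```python
import math

def two_sided_p(n_sigma):
    """Probability of a Gaussian fluctuation at least n_sigma from the mean, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

for n_sigma in (2.2, 3.0, 7.0):
    p = two_sided_p(n_sigma)
    print(f"{n_sigma} sigma: p = {p:.2e}, i.e. about 1 in {1.0 / p:.3g}")
```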

The chances are much higher – I'd be tempted to say perhaps even better than even money – that foregrounds contribute a part of the observed signal, and that therefore the actual value of the tensor-to-scalar ratio will come down from $r=0.2$, perhaps to as low as $r=0.1$, when Planck checks this result using their better dust mapping.

Friday, March 21, 2014

BICEP2: reasons to be sceptical, part 1

As the dust begins to settle following the amazing announcement of the discovery of gravitational waves by the BICEP2 experiment, physicists around the world are taking stock and scrutinizing the results.

Remember that the claimed detection is enormously significant, in more ways than one. The BICEP team have apparently detected an exceedingly faint B-mode polarization pattern in the CMB, at an order of magnitude better sensitivity than any previous experiment probing the same scales. They have then claimed to have been able to ascribe this B-mode signal unambiguously to cosmological gravitational waves, rather than any astrophysical effects due to intervening dust or other sources of radiation. And finally they have interpreted these results as direct evidence for the theory of inflation, which is really the source of all the excitement, because if true it would pin down the energy scale of inflation at an incredibly high level, with extensive and dramatic consequences for our understanding of high energy particle physics.

However, as all physicists have been saying, with results of this magnitude it is important to be very careful indeed. Speculating who should get the Nobel Prize (or Prizes) for this is still premature. The paper containing the results will of course be subjected to anonymous peer review when it is submitted to a journal, but it has also already faced a rather extraordinary open peer review by social media, with a live group on Facebook, and all sorts of other discussion on blogs, Twitter and the like. (And to the great credit of the scientists on the BICEP team, they have patiently responded to questions and comments on these forums, and the whole process has been carried out very civilly!)

What I wanted to do today is to possibly contribute to that by gathering together all the main points of concern and reasons to be sceptical of the BICEP result. This is partly for my own purposes, since writing things down helps to clarify my thoughts. I will divide these concerns into three main categories, addressing the following questions:

  • how certain can we be that BICEP2 observed a real B-mode signal?
  • how certain can we be that this B-mode signal is cosmological in origin, i.e. that it is due to gravitational waves rather than something less exciting?
  • how certain can we be that these gravitational waves were caused by inflation?

I'll discuss the first category of concerns in part 1 of this post and the next two together in parts 2 and 3. I do not claim that any of the concerns I raise here are original; however, any mistakes are definitely mine alone. I'd like to encourage discussion of any of these points via the comments below.

How certain can we be that BICEP2 observed a real B-mode signal?

This is obviously the most basic issue. The general reason for concern here (and this applies to any B-mode detection experiment) is that the experimental pipeline has to be able to decompose the polarization signal seen into two components, the E-mode and the B-mode, and the level of the signal in the B-mode is orders of magnitude smaller than in E. Now, as Peter Coles explains here, the E and B polarization components are in principle orthogonal to each other when the spherical harmonic decomposition can be performed over the whole sky, but this is in practice impossible. BICEP observes only a small portion of the sky, and therefore there is the possibility of "leakage" from E to B when separating out the components. It would not take much leakage to spoil the B-mode observation.

Obviously the BICEP team implemented many tests of the obtained maps to check for such systematics. One of the ways to do this is to cross-correlate the E and B maps: if there is no leakage the cross-correlation should be consistent with zero. Another important test is the jackknife technique, also nicely explained here: you split your data into two equal halves, and subtract the signal found in one half from that in the other; the answer should also be consistent with zero.
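
The logic of a jackknife null test can be illustrated with a toy example. This is only a sketch under stated assumptions: the data are made-up Gaussian samples with a known noise level, the "systematic" is an invented constant offset, and `jackknife_null` is a hypothetical helper. The real tests split the data in ways chosen to be sensitive to particular systematics and compare power spectra rather than simple means.

```python
import math
import random

def jackknife_null(samples, sigma):
    """Difference of the two half-sample means and its expected 1-sigma error.

    Assumes every sample has independent Gaussian noise of known width sigma.
    """
    half = len(samples) // 2
    first, second = samples[:half], samples[half:2 * half]
    diff = sum(first) / half - sum(second) / half
    err = sigma * math.sqrt(2.0 / half)
    return diff, err

rng = random.Random(1)
signal, sigma, n = 1.0, 0.5, 2000

# Clean data: the common signal cancels in the difference.
clean = [signal + rng.gauss(0.0, sigma) for _ in range(n)]
diff_clean, err = jackknife_null(clean, sigma)

# Contaminated data: a made-up systematic offset affects only the second half.
dirty = clean[:n // 2] + [x + 0.2 for x in clean[n // 2:]]
diff_dirty, _ = jackknife_null(dirty, sigma)

print(f"clean: {diff_clean / err:+.1f} sigma from zero")
print(f"dirty: {diff_dirty / err:+.1f} sigma from zero")
```

The point is that the true sky signal drops out of the difference whatever its size, so anything left over beyond the noise must come from something that differs between the two halves.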

Now one source of concern arises because of a combination of these two tests. The blue points in the following figure show the results of a jackknife test on the BB power:

These points are consistent with zero ... but they are possibly too consistent with zero! The $1\sigma$ error bar of each one of them passes through zero, whereas it would be more natural to expect some more scatter. In fact from the number on the plot you can see that there is only a 1% chance that all 9 blue points should be so close to zero.
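
A cruder back-of-the-envelope version of the same argument: if the errors are Gaussian and correctly estimated, each point should fall within $1\sigma$ of zero only about 68% of the time. The sketch below assumes the 9 bandpowers are independent, which is a simplification of mine, and it is not the same statistic as the number quoted on the plot, so exact agreement should not be expected.

```python
import math

# Probability that a single Gaussian point lies within 1 sigma of its true value.
p_within_1sigma = math.erf(1.0 / math.sqrt(2.0))

# Assumption: the 9 bandpowers are independent, with correctly estimated errors.
n_points = 9
p_all = p_within_1sigma ** n_points
print(f"P(all {n_points} within 1 sigma) = {p_all:.3f}")
```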

This raises the possibility, pointed out by Hans Kristian Eriksen, that the errorbars on the blue points are overestimated. It may then be the case that the errorbars on other points in other jackknife tests are also too large. If that were the case then reducing those errors might mean that some of the other jackknife tests now fail — the points are no longer consistent with zero. As it happens, of the 168 jackknife test results listed in the table in the paper, quite a large number (about 7) of them already "fail" by the stricter standards (2% probability) some other experiments such as QUIET might apply. Obviously some number of tests are always expected to fail, but more than 7 out of 168 starts to look like quite a large number. This then becomes a little worrying.
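
How surprising is it to have 7 failures out of 168? A rough estimate can be sketched with a binomial distribution. The big assumption here, and it is mine rather than anything stated in the paper, is that the 168 tests are statistically independent; in reality many of them share data, so this is only a guide.

```python
import math

def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(k_min, n + 1))

n_tests, threshold, n_failed = 168, 0.02, 7

# If nothing is wrong, each test "fails" with probability equal to the threshold.
expected = n_tests * threshold
tail = binomial_tail(n_tests, threshold, n_failed)
print(f"expected failures: {expected:.2f}")
print(f"P(at least {n_failed} failures) = {tail:.3f}")
```

Under these assumptions one expects three or four failures by chance alone, and the probability of seeing seven or more comes out at the level of a few percent.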

On the other hand, this extrapolation may be a little exaggerated, because we are surmising that the errorbars might be too large purely on the basis of the one figure above. Clearly if you do a large number of jackknife tests, it becomes less surprising that one of them gives a surprising result, if you see what I mean. Looking through the table for the other BB jackknife results, the particular example from the figure is the only one that stands out as being odd, so it is hard to conclude from this that the errorbars are too large. Overall I'm not convinced that there is necessarily a problem here, but it is something that deserves a little more quantitative attention.

The second source of concern that has been highlighted is that the data at large multipole values appear to be doing something odd. Look at the 5th, 6th and 7th black points from the figure above, which are quite a long way from the theoretical expectation. Peter Coles helpfully drew a little blue circle around them:

The worry here is that even if the data appear to be passing jackknife tests for internal consistency and null tests for EB cross power, the fact that these points are so high suggests that there is still some undetected systematic that has crept in somewhere. This hypothesized systematic could account for the measured values of the crucial first four points, which constitute the detection of the gravitational waves.

Similarly, people are worried about the EE power spectrum, which appears to be too high in the $50< \ell<100$ region — again this could be a sign of leakage from temperature into polarization, which could perhaps be contaminating the B-mode maps despite not explicitly showing up in the jackknife consistency checks.

Now, the BICEP response to this is that you shouldn't judge things simply "by eye". The EE excess does not appear to be statistically significant. It's also not incredibly unlikely that the final two of the circled BB data points could simultaneously be as high as they are just due to random chance — they say "their joint significance is $<3\sigma$", which means that the chance is about 1%. (Of course the chance that all three of the circled points could simultaneously be high is smaller than that, and so presumably less than 1% ... )

Another justification some people have been providing (mostly people from outside the BICEP collaboration to be fair, though some from within it as well) is that the preliminary data from the Keck array, which is a similar instrument to BICEP but with higher sensitivity, appear to show no anomaly in that region. I think this is a somewhat dangerous argument, because the Keck data also don't seem to be quite so high in the region of the crucial first four bandpowers! In any case, the "official" word from BICEP is that any such speculation on the basis of Keck is to be discouraged, because the Keck data is still very preliminary and has not been properly checked.


I'm a little bit worried about the various issues raised here, though overall I would say the odds are in favour of the B-mode detection being secure (this is a different issue to whether this detected signal is due to gravitational waves! More on that in the next post). I would not, however, put those odds at anywhere near 1 in 300,000,000,000 against there being an error, which is the headline significance claimed for the detection of a non-zero tensor-to-scalar ratio ($7\sigma$). If I were forced to quantify my belief, I would say something more like 1 or 2 in 100. That's not particularly secure, but luckily there are follow-up experiments, such as Keck and Planck itself, which should be able to reassure us on that score soon.

A final point: seeing the preliminary Keck data shown in a figure in the paper suggests to me that perhaps the final analysis of Keck data will now not be done "blind". I hope that's not the case, it would be very disturbing indeed if it were. 

Monday, March 17, 2014

First Direct Evidence for Cosmic Inflation

That was the title of the BICEP2 presentation today. Gives you some idea about the magnitude of the result, if it holds up: it really is astonishingly exciting.

Unfortunately it was so exciting that we in Helsinki couldn't even access the Harvard server and so couldn't watch any of the webcast at all. It seems the same was true for most other cosmologists around the world. So my comments here are based purely on a preliminary reading of the paper itself, and a distillation of the conversations occurring via Facebook and the like.

Firstly, the headline results: the BICEP team claim to have detected a B-mode signal in the CMB at exceedingly high statistical significance. Their headline claim is

$r=0.2^{+0.07}_{-0.05}$, with $r=0$ disfavoured at $7.0\sigma$

That is frankly astonishing. Here's the likelihood plot:

BICEP2 constraint on the tensor-to-scalar ratio r. 

(All figures are taken from the paper available here.)

The actual measurement of the BB power spectrum looks like this:

The black points are the new measurements, the other coloured points are the previously available best upper limits. The solid red curve is the theoretical expectation from lensing (the relatively boring contribution to BB), the dashed red curve that dies off is the theoretical expectation from a model with inflationary gravitational waves and $r=0.2$, and the other dashed red curve (were they short of colours?!) is the total.

They've also done a pretty good job of eliminating other foreground sources (dust, synchrotron emission etc.) as possible explanations for the signal seen, which means it is much more likely that the signal is actually due to primordial gravitational waves from inflation. In doing this, it helps that the signal they see is actually as large as it is, since there's less chance of confusing it with these foregrounds (which are much smaller).  [Update: I'm not an expert here, apparently some others were less convinced about the removal of foregrounds. Not sure why though – I'd have thought other systematic errors were far more likely to be a problem than foregrounds.]

So far so good. In fact — and I really can't stress this enough — this is an extraordinary, wonderful, unexpected result and huge congratulations to the BICEP team for achieving it. It will mean a lot of happy theorists as well, because we finally have something new to try to explain!

However, it is very important that as a community we remain sceptical, particularly so when - as here - the result is one that we would so desperately love to be true. Given that, I'm going to list a series of things that are potentially worrying/things to think about/things I don't understand. (Some of these are not things I noticed myself, but were points raised by Dave Spergel, Scott Dodelson and other experts at the ongoing live discussion on Facebook.) Doubtless these are questions the BICEP team will have thought about themselves; perhaps they already have all the answers and will tell us about them in due course. As I said, no one I know was able to watch the webcast live.

  • In the BB-spectrum plot above, the data seem to be showing a significant excess above expectations for multipoles about $\ell\sim200-350$. What's going on with that?
  • This is particularly noticeable in another figure (Fig. 9) in the paper:
  • From the above figure, preliminary results of the cross-correlation with Keck don't show the excess at high-$\ell$ (a reason to believe it might go away), but the same cross-correlation also shows less power at lower $\ell$ (which is a bit confusing).
  • At lower values of $\ell$ the EE power spectrum also shows an excess (Fig. 7):
  • All the above points put together suggest that perhaps there is some leakage in the polarisation maps coming from the temperature anisotropy — a large part of the analysis work is concerned with accounting for and correcting for any such leakage, of course, the question is to what extent independent experts will be satisfied that these methods worked.
  • Although the headline figure is $r=0.2$, they rather confusingly later say that when the best possible dust model is used for foreground subtraction, this becomes $r=0.16^{+0.06}_{-0.05}$. But if this is the best possible dust model, why is this not the quoted headline number? Is this related somehow to the power excess at $\ell\sim200-350$?
  • If $r$ is as large as they have measured why was it not seen by Planck? Actually this is a fairly complicated question: the point being that if the tensor amplitude is so large, it should make a non-negligible contribution to the temperature power spectrum as well, which would have affected Planck's results. Planck had a constraint $r<0.11$, but this specifically assumed that the primordial power spectrum had a power-law form with no running (sorry about the technical jargon, unfortunately not enough time to explain here today). So BICEP suggest one way around this tension is to simply introduce a running, but it seems (but this bit was not entirely clear to me from the paper) that you need a fairly large value of the running for this explanation to fly. And if you've got a large running then you have to worry about why not a running of the running, a running of the running of the running and so on ad infinitum - in fact how do we know that the power-law expansion form of the $P(k)$ is the correct way to go at all?
  • Besides, are there viable inflationary models that predict both large $r$ as well as large running (or non-power-law form of the primordial power)? Given the vast array of inflationary models, the answer to this question is almost certainly yes, but people may consider some other explanations more worthwhile ...
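
For reference, the power-law expansion mentioned in the point about Planck is conventionally written as an expansion of $\ln P(k)$ in powers of $\ln(k/k_*)$. The symbols $A_s$, $k_*$, $\alpha_s$ and $\beta_s$ below are the usual textbook notation and are my addition, not notation taken from the BICEP2 paper:

$$\ln P(k) = \ln A_s + (n_s-1)\,\ln\frac{k}{k_*} + \frac{1}{2}\,\alpha_s\,\ln^2\frac{k}{k_*} + \frac{1}{6}\,\beta_s\,\ln^3\frac{k}{k_*} + \dots$$

Here $\alpha_s = dn_s/d\ln k$ is the running and $\beta_s = d^2n_s/d\ln k^2$ is the running of the running. A pure power law corresponds to $\alpha_s=\beta_s=\dots=0$, and truncating the series is only justified if each successive term is small compared with the previous one over the range of scales observed.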
Phew. There are probably lots of other things to think about, but that's about all I can manage today. It's been a very exciting day!

Saturday, March 15, 2014

B-modes, rumours, and inflation

Update: The announcement will definitely be about a major discovery by BICEP2, meaning it can only really be about a B-mode signal. You can follow the webcast starting at 10:45 am EDT (14:45 GMT) for scientists, or 12:00 pm EDT (16:00 GMT) for the general public and news organisations.

The big news in cosmology circles at the minute is the rumour that the "major discovery" due to be announced at a press conference on Monday the 17th is in fact a claimed detection of the B-mode signal in the CMB by the BICEP2 experiment.

Now, I'm not particularly well placed to comment on this rumour, since all the information I have comes at second- or third-hand, via people who have heard something from someone, people who think they heard something from someone, or people who are simply unashamedly speculating. (Perhaps this is a function of being on the wrong side of the Atlantic: although the BICEP2 experiment is based at the South Pole, the only non-North-American university participating in the collaboration is Cardiff University in Wales. Even worse, I'm not on Twitter.) In any case, by reading this, this, this and this, you will be starting with essentially the same information as me.

But having got that health warning out of the way, let's pretend that the rumours are entirely accurate and that on Monday we will have an announcement of a detection of a significant B-mode signal. What would this mean for cosmology?

Firstly, the B-mode signal refers to a particular polarisation of the CMB (for a short and somewhat technical introduction, see here; for a slightly longer one, see here). This polarisation can arise in various ways, one of which is the polarisation induced in the CMB by gravitational lensing, as the CMB photons travel through the inhomogeneous Universe on their way from the last scattering surface to us. There have been a few experiments, such as POLARBEAR, which have already claimed a detection of this lensing contribution to the B-mode signal (though in this particular case after skim-reading the paper I was a little underwhelmed by the claim).

Now, detecting a lensing B-mode would be cool, but significantly less exciting than detecting a primordial B-mode. This is because whereas the lensing signal comes from late-time physics that is quite well understood, a primordial signal would be evidence of primordial tensor fluctuations or primordial gravitational waves. And this is cool because inflation provides a possible way to produce primordial gravitational waves – therefore their detection could be a major piece of evidence in favour of inflation.

The contributions to the B-mode signal coming from gravitational waves and lensing are differentiated on the basis of the multipoles (essentially the length scale) at which they are important. Figure from Hu and Dodelson 2002.

People often say that detection of this tensor signal would be a "smoking gun" for inflation; something that would be very welcome, because although inflation has proved to be an attractive and fertile paradigm for cosmology, there is still a bit of a lack of direct, incontrovertible evidence in favour of it. Coupled with certain unresolved theoretical issues it faces, this lack of a smoking gun meant that arguments for or against inflation were threatening to degenerate into what you might call "multiverse territory", definitely an unhealthy place to be.

It may be worth introducing a note of caution about this "smoking gun" though. Although inflation is a possible source of primordial gravitational waves, it is not the only one. Artefacts of possible phase transitions in the early universe, known as cosmic defects, can also produce a spectrum of gravitational waves – and what's more, this spectrum can be exactly scale-invariant, just as that from inflation. I don't know a huge amount about this field, so I am not sure whether the amplitude of the perturbations which could be produced by these cosmic defects could be sufficiently large, nor – if it is – whether there are any other features which could help distinguish this scenario from inflation if the rumours turn out to be true. Perhaps better informed people could comment below.

Suppose we put that issue to one side though, and assume that not only has a significant tensor signal been detected, we have also been able to prove that it could not be due to anything other than inflation. The rumour is that the detection corresponds to a value for the tensor-to-scalar ratio r of about 0.2. What are the implications of this for the different inflation models?

Planck limits on various inflationary models.

Not all models of inflation do result in tensor modes large enough to be observed in the CMB, so an observation of a large r would rule out a large class of these models. Generally speaking, the understanding is that models in which the inflaton field $\phi$ takes large values (i.e., values larger than the Planck mass $M_P$) are the ones which could produce observably large r, whereas the so-called "small-field models" where $\phi\ll M_P$ usually predict tiny values of r which could never be observed. (A note for non-experts: irrespective of the field value, the energy scale in both small-field and large-field models is always much less than the Planck scale.) Therefore, at a stroke, all small-field inflation models would be ruled out. Many people regard these as the better-motivated models of inflation, with in some respects fewer theoretical issues than the large-field models, so this would be quite significant.

There are two small caveats to this statement: firstly, it isn't strictly necessary for $\phi$ itself to be larger than $M_P$ to generate a large r, only that the change in $\phi$ be large. So models in which the inflaton field winds around a cylinder, in effect travelling a large distance without actually getting anywhere, can still give large r (hat-tip to Shaun for that phrasing). Also, it is not even strictly true that the change in $\phi$ must be large: if some other rather specific conditions (including the temporary breakdown of the slow-roll approximation) are met, this one can be avoided and even small field models can produce enough gravitational waves. This was something pointed out by a paper I wrote with Shaun Hotchkiss and Anupam Mazumdar in 2011, though other people had similar ideas at about the same time. Such rather forced small-field models would have other specific features though, so could be distinguished by other measurements.
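
The connection between large r and a large change in $\phi$ can be made slightly more quantitative. In single-field slow-roll inflation (an assumption, and precisely the one that the second caveat evades), the standard Lyth relation reads

$$\frac{1}{M_P}\left|\frac{d\phi}{dN}\right| = \sqrt{\frac{r}{8}},$$

where $N$ is the number of e-folds and $M_P$ is here understood as the reduced Planck mass. As a purely illustrative estimate, if r stayed roughly constant at $0.2$ over $\Delta N\approx50$ e-folds, the field would move by $\Delta\phi\approx50\times\sqrt{0.025}\,M_P\approx8\,M_P$. The choice of 50 e-folds and of constant r are my simplifications; the real excursion depends on how r evolves during inflation.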

One of the more interesting consequences of a detection of large r (aside from the earth-shattering importance of a confirmation of inflation itself) would be that the Higgs inflation model – which has been steadily gaining in popularity given the results from the LHC and Planck, and has begun to be regarded by many as the most plausible mechanism by which inflation could have occurred – would be disfavoured. In the plot above, the Higgs inflation prediction is shown by the orange points at the bottom centre of the figure. So a BICEP2 detection of $r\sim0.2$ as suggested by the rumours would be pretty serious for this model.

On the other hand, a BICEP2 detection of $r\sim0.2$ would also appear to be at odds with the results from the Planck and WMAP satellites. Which probably goes to show that there is not much point believing every rumour ...

We will find out on Monday!