Bandpower Estimation

Figure 5 indicates that the next step in the compression process is extracting bandpowers from the map. What is a bandpower and how can it be extracted from the map? To answer these questions, we must construct a new likelihood function, one in which the estimated $\Theta_i$ are the data. No theory predicts an individual $\Theta_i$, but all predict the distribution from which the individual temperatures are drawn. For example, if the theory predicts Gaussian fluctuations, then $\Theta_i$ is distributed as a Gaussian with mean zero and covariance equal to the sum of the noise covariance matrix $C_N$ and the covariance due to the finite sample of the cosmic signal $C_S$. Inverting Equation (1) and using Equation (2) for the ensemble average leads to

\begin{displaymath}
C_{S,ij} \equiv \langle \Theta_i \Theta_j \rangle
= \sum_\ell \Delta_{T,\ell}^2 W_{\ell, ij}
\,,
\end{displaymath} (30)

where $\Delta_{T,\ell}^2$ depends on the theoretical parameters through $C_\ell$ (see Equation (3)). Here $W_\ell$, the window function, is proportional to the Legendre polynomial $P_\ell(\hat n_i\cdot \hat n_j)$ and a beam and pixel smearing factor $b_\ell^2$. For example, a Gaussian beam of width $\sigma$ dictates that the observed map is actually a smoothed picture of the true signal, insensitive to structure on scales smaller than $\sigma$. If the pixel scale is much smaller than the beam scale, $b_\ell^2\propto e^{-\ell(\ell+1)\sigma^2}$. Techniques for handling asymmetric beams have also recently been developed [Wu et al, 2001, Wandelt & Gorski, 2001, Souradeep & Ratra, 2001]. Using bandpowers corresponds to assuming that $\Delta_{T,\ell}^2$ is constant over a finite range, or band, of $\ell$, equal to $B_a$ for $\ell_a-\delta \ell_a/2 < \ell < \ell_a + \delta \ell_a/2$. Plate 1 gives a sense of the width and number of bands $N_b$ probed by existing experiments.
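To make Equation (30) concrete, here is a minimal sketch in Python of assembling ${\bf C}_S$ from a set of bandpowers. It assumes the conventional normalization $\Delta_{T,\ell}^2 = \ell(\ell+1)C_\ell/2\pi$ for Equation (3) and a symmetric Gaussian beam; the function name and arguments are illustrative, not from any particular pipeline.

\begin{verbatim}
import numpy as np
from scipy.special import eval_legendre

def signal_covariance(bandpowers, bands, nhat, sigma_beam):
    """Assemble C_S of Equation (30) from constant bandpowers (a sketch).

    bandpowers : B_a, the assumed-constant Delta_T^2 within each band
    bands      : list of (ell_min, ell_max) tuples defining the bands
    nhat       : (N_p, 3) array of unit vectors to the pixel centers
    sigma_beam : Gaussian beam width sigma in radians
    """
    cos_ij = np.clip(nhat @ nhat.T, -1.0, 1.0)  # cosine of pixel separations
    C_S = np.zeros_like(cos_ij)
    for B_a, (lmin, lmax) in zip(bandpowers, bands):
        for ell in range(lmin, lmax + 1):
            # W_{ell,ij}: Legendre polynomial times beam smearing b_ell^2,
            # normalized assuming Delta_T^2 = ell(ell+1) C_ell / (2 pi)
            b_ell2 = np.exp(-ell * (ell + 1) * sigma_beam**2)
            norm = (2 * ell + 1) / (2 * ell * (ell + 1))
            C_S += B_a * norm * b_ell2 * eval_legendre(ell, cos_ij)
    return C_S
\end{verbatim}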

For Gaussian theories, then, the likelihood function is

\begin{displaymath}
{\cal L}_B (\Theta_i)
= {1\over (2\pi)^{N_p/2} \sqrt{\det {\bf C}_\Theta}}
\exp\left( -{1\over 2}
\Theta_i C^{-1}_{\Theta,ij} \Theta_j \right)\,,
\end{displaymath} (31)

where ${\bf C}_\Theta={\bf C}_S+{\bf C}_N$ and $N_p$ is the number of pixels in the map. As before, ${\cal L}_B$ is Gaussian in the anisotropies $\Theta_i$, but in this case the $\Theta_i$ are not the parameters to be determined; the theoretical parameters are the $B_a$, upon which the covariance matrix depends. Therefore, the likelihood function is not Gaussian in the parameters, and there is no simple, analytic way to find the point in parameter space (which is multi-dimensional depending on the number of bands being fit) at which ${\cal L}_B$ is a maximum. An alternative is to evaluate ${\cal L}_B$ numerically at many points in a grid in parameter space. The maximum of ${\cal L}_B$ on this grid then determines the best-fit values of the parameters. Confidence levels on, say, $B_1$ can be determined by finding the region within which $\int_a^b d B_1 [ \prod_{i=2}^{N_b} \int d B_i ]\,{\cal L}_B = 0.95$ for $95\%$ limits.
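To make the brute-force approach explicit, a single likelihood evaluation of Equation (31) might look like the following sketch (illustrative names; a grid search would call this at every point $\{B_a\}$, rebuilding ${\bf C}_S$ each time):

\begin{verbatim}
import numpy as np

def log_like(theta, C_S, C_N):
    """ln L_B of Equation (31) for one point in band space (a sketch)."""
    C = C_S + C_N                              # C_Theta
    sign, logdet = np.linalg.slogdet(C)        # ln det C_Theta
    chi2 = theta @ np.linalg.solve(C, theta)   # Theta_i C^-1_ij Theta_j
    return -0.5 * (chi2 + logdet + len(theta) * np.log(2 * np.pi))
\end{verbatim}

Both the determinant and the solve are dense linear algebra operations scaling as $N_p^3$, which is the cost that dominates the discussion below.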

This possibility is no longer viable due to the sheer volume of data. Consider the Boomerang experiment with $N_p=57{,}000$. A single evaluation of ${\cal L}_B$ involves computation of the inverse and determinant of the $N_p\times N_p$ matrix ${\bf C}_\Theta$, both of which scale as $N_p^3$. While this single evaluation might be possible with a powerful computer, a single evaluation does not suffice. The parameter space consists of $N_b=19$ bandpowers equally spaced from $\ell_a=100$ up to $\ell_a=1000$. A blindly placed grid on this space would require at least ten evaluations in each dimension, so the time required to adequately evaluate the bandpowers would scale as $10^{19} N_p^3$. No computer can do this. The situation is rapidly getting worse (better) since Planck will have of order $10^7$ pixels and be sensitive to of order $10^3$ bands.
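The arithmetic behind this estimate, using the numbers quoted above:

\begin{verbatim}
N_p = 57_000                # Boomerang pixels
evaluations = 10**19        # >= 10 grid points in each of N_b = 19 bands
ops = evaluations * N_p**3  # each evaluation costs O(N_p^3)
print(f"{ops:.1e}")         # ~1.9e+33 operations
\end{verbatim}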

It is clear that a ``smart'' sampling of the likelihood in parameter space is necessary. The numerical problem, searching for the local maximum of a function, is well-posed, and a number of search algorithms might be used. ${\cal L}_B$ tends to be sufficiently structureless that these techniques suffice. [Bond et al, 1998] proposed the Newton-Raphson method, which has become widely used. One expands the derivative of the log of the likelihood function, which vanishes at the true maximum of ${\cal L}_B$, around a trial point in parameter space, $B_a^{(0)}$. Keeping terms second order in $B_a-B_a^{(0)}$ leads to

\begin{displaymath}
\hat B_a = \hat B_a^{(0)}
+ \hat F_{B,ab}^{-1} {\partial\ln {\cal L}_B\over \partial B_b}\,,
\end{displaymath} (32)

where the curvature matrix $\hat F_{B,ab}$ is the second derivative of $-\ln{\cal L}_B$ with respect to $B_a$ and $B_b$. Note the subtle distinction between the curvature matrix and the Fisher matrix in Equation (29), ${\bf F} = \langle \hat{\bf F} \rangle$. In general, the curvature matrix depends on the data, on the $\Theta_i$. In practice, though, analysts typically use the inverse of the Fisher matrix in Equation (32). In that case, the estimator becomes

\begin{displaymath}
\hat B_a = \hat B_a^{(0)}
+ {1\over 2} F^{-1}_{B,ab} \left(
C_{\Theta,ik}^{-1} \left[ \Theta_k \Theta_l - C_{\Theta,kl} \right]
C_{\Theta,lj}^{-1} {\partial C_{\Theta,ji}\over \partial B_b}
\right) \,,
\end{displaymath} (33)

quadratic in the data $\Theta_i$. The Fisher matrix is equal to

\begin{displaymath}
F_{B,ab} = {1\over 2} C_{\Theta,ij}^{-1} {\partial C_{\Theta,jk}\over \partial B_a}
C_{\Theta,kl}^{-1} {\partial C_{\Theta,li} \over \partial B_b}\,.
\end{displaymath} (34)

In the spirit of the Newton-Raphson method, Equation (33) is used iteratively but often converges after just a handful of iterations. The usual approximation is then to take the covariance between the bands as the inverse of the Fisher matrix evaluated at the convergent point ${\bf C}_{B} = {\bf F}_B^{-1}$. Indeed, [Tegmark, 1997b] derived the identical estimator by considering all unbiased quadratic estimators, and identifying this one as the one with the smallest variance.
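Putting Equations (32)-(34) together, the iteration can be sketched as follows. Because ${\bf C}_\Theta$ is linear in the bandpowers, the derivative matrices $\partial {\bf C}_\Theta/\partial B_a$ are just the binned window functions and never change; all names here are illustrative:

\begin{verbatim}
import numpy as np

def estimate_bandpowers(theta, C_N, dC_dB, B0, n_iter=5):
    """Iterate the quadratic estimator of Eqs. (33)-(34); a sketch.

    theta  : (N_p,) map vector
    C_N    : (N_p, N_p) noise covariance
    dC_dB  : list of N_b matrices dC_Theta/dB_a (binned window functions)
    B0     : (N_b,) float array, trial bandpowers B_a^(0)
    """
    B = B0.copy()
    for _ in range(n_iter):
        C = C_N + sum(B_a * W_a for B_a, W_a in zip(B, dC_dB))
        Cinv = np.linalg.inv(C)               # the O(N_p^3) step
        A = [Cinv @ W for W in dC_dB]         # C^-1 dC/dB_a
        # Fisher matrix, Eq. (34): F_ab = (1/2) Tr[A_a A_b]
        F = 0.5 * np.array([[np.trace(Aa @ Ab) for Ab in A] for Aa in A])
        # Score, the bracket in Eq. (33):
        # Tr[C^-1 (theta theta^T - C) C^-1 dC/dB_b]
        r = Cinv @ theta
        score = np.array([r @ (W @ r) - np.trace(Aa)
                          for W, Aa in zip(dC_dB, A)])
        B = B + 0.5 * np.linalg.solve(F, score)
    return B, np.linalg.inv(F)                # bandpowers and C_B = F^-1
\end{verbatim}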

Although the estimator in Equation (33) represents a $\sim 10^{N_b}$ improvement over brute force coverage of the parameter space, converging in just several iterations, it still requires operations which scale as $N_p^3$. One means of speeding up the calculations is to transform the data from the pixel basis to the so-called signal-to-noise basis, based on an initial guess as to the signal, and to throw out those modes which have low signal-to-noise [Bond, 1995, Bunn & Sugiyama, 1995]. The drawback is that this procedure still requires at least one $N_p^3$ operation and potentially many more as the guess at the signal improves by iteration. Methods to truly avoid this prohibitive $N_p^3$ scaling [Oh et al, 1999, Wandelt & Hansen, 2001] have been devised for experiments with particular scan strategies, but the general problem remains open. A potentially promising approach involves extracting the real space correlation functions as an intermediate step between the map and the bandpowers [Szapudi et al, 2001]. Another involves consistently analyzing coarsely pixelized maps with finely pixelized sub-maps [Dore et al, 2001].
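As an illustration of the signal-to-noise basis, a minimal Karhunen-Loeve style compression might look like the following, assuming ${\bf C}_N$ is positive definite; the cut threshold and names are illustrative, and the construction itself still costs one $N_p^3$ factorization:

\begin{verbatim}
import numpy as np

def sn_compress(theta, C_S_guess, C_N, sn_cut=1.0):
    """Project onto signal-to-noise eigenmodes, cut low-S/N ones (a sketch)."""
    # Whiten the noise with a Cholesky factor, then diagonalize the signal:
    # eigenvalues of L^-1 C_S L^-T are the squared S/N per mode.
    L = np.linalg.cholesky(C_N)
    S_w = np.linalg.solve(L, np.linalg.solve(L, C_S_guess).T)
    lam, V = np.linalg.eigh(S_w)
    keep = lam > sn_cut**2
    # Compressed data: mode amplitudes, with diagonal covariance 1 + lam.
    modes = V[:, keep].T @ np.linalg.solve(L, theta)
    return modes, lam[keep]
\end{verbatim}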

