Figure 5 indicates that the next step
in the compression process is extracting bandpowers
from the map. What is a bandpower and how can it be extracted from the map?
To answer these questions, we must construct a new likelihood function, one
in which the estimated temperatures $\Theta_i$ are the data. No theory predicts an individual
$\Theta_i$, but all predict the distribution from which the individual
temperatures are drawn. For example, if the theory predicts Gaussian fluctuations,
then $\Theta_i$ is distributed as a Gaussian with mean zero and covariance
equal to the sum of the noise covariance matrix $C_{N,ij}$
and the covariance due to the finite sample
of the cosmic signal $C_{S,ij}$.
Inverting Equation (1)
and using Equation (2) for the ensemble average
leads to

$C_{S,ij} \equiv \langle \Theta_i \Theta_j \rangle = \sum_l \Delta T_l^2\, W_{l,ij}, \qquad (30)$

where $\Delta T_l^2$ depends on the theoretical parameters through
$c_l$ (see Equation (3)). Here
$W_{l,ij}$, the window function, is proportional
to the Legendre polynomial $P_l(\hat{n}_i \cdot \hat{n}_j)$
and a beam and pixel smearing factor $b_l^2$.
For example, a Gaussian beam of width
$\sigma$ dictates that the observed map is actually a smoothed picture
of the true signal, insensitive to structure on scales smaller than
$\sigma$. If the pixel scale is much smaller than the beam scale,
$b_l^2 \propto e^{-l(l+1)\sigma^2}$. Techniques for handling asymmetric
beams have also recently been developed [Wu et al, 2001; Wandelt & Gorski, 2001; Souradeep & Ratra, 2001]. Using bandpowers
corresponds to assuming that $\Delta T_l^2$ is constant over a finite
range, or band, of $l$, equal to the bandpower $B_a$ within band $a$. Plate
1 gives a sense of the width and number of bands
probed by existing experiments.
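To make Equation (30) concrete, here is a minimal sketch in Python (the function and argument names are illustrative, not from any pipeline) that assembles $C_{S,ij}$ for a small map from flat bandpowers, assuming the convention $\Delta T_l^2 = l(l+1)C_l/2\pi$ and the Gaussian beam factor $b_l^2 = e^{-l(l+1)\sigma^2}$ quoted above:

```python
import numpy as np
from numpy.polynomial.legendre import legval

def signal_covariance(nhat, bands, bandpowers, sigma_beam):
    """Sketch of Eq. (30): C_S,ij = sum_l dT_l^2 W_l,ij.

    nhat       : (N_p, 3) array of unit vectors to pixel centers
    bands      : list of (l_min, l_max) multipole ranges
    bandpowers : B_a, the assumed-constant dT_l^2 in each band
    sigma_beam : Gaussian beam width sigma, in radians
    """
    # n_i . n_j for every pixel pair; clip guards against roundoff
    cos_theta = np.clip(nhat @ nhat.T, -1.0, 1.0)
    lmax = max(lmx for _, lmx in bands)
    dT2 = np.zeros(lmax + 1)
    for (lmn, lmx), B in zip(bands, bandpowers):
        dT2[lmn:lmx + 1] = B                          # flat bandpower B_a
    ell = np.arange(lmax + 1)
    beam2 = np.exp(-ell * (ell + 1) * sigma_beam**2)  # b_l^2 from the text
    # W_l carries the (2l+1)/4pi Legendre weight and the conversion
    # C_l = 2 pi dT_l^2 / [l(l+1)] (assumed convention, cf. Equation 3)
    ll1 = np.maximum(ell * (ell + 1), 1)
    coeff = dT2 * beam2 * (2 * ell + 1) / (2 * ll1)
    coeff[:2] = 0.0                                   # monopole/dipole removed
    # legval sums coeff[l] * P_l(cos_theta) over l, elementwise
    return legval(cos_theta, coeff)
```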
For Gaussian theories, then, the likelihood function
is
$\mathcal{L}_B(\Theta) = \frac{1}{(2\pi)^{N_p/2} \sqrt{\det C_\Theta}}\, \exp\!\left( -\frac{1}{2}\, \Theta^T C_\Theta^{-1} \Theta \right), \qquad (31)$
where $C_\Theta \equiv C_S + C_N$ and $N_p$
is the number of pixels in the map. As before, $\mathcal{L}_B$
is Gaussian in the anisotropies
$\Theta_i$, but in this case the $\Theta_i$
are not the parameters to be determined; the theoretical
parameters are the bandpowers $B_a$, upon which the covariance matrix depends. Therefore, the likelihood
function is not Gaussian in the parameters, and there is no simple, analytic way
to find the point in parameter space (which is multi-dimensional, depending on
the number of bands being fit) at which $\mathcal{L}_B$
is a maximum. An alternative is to evaluate $\mathcal{L}_B$
numerically at many points on a grid in parameter space.
The maximum of $\mathcal{L}_B$ on this grid then determines the best-fit values of the parameters.
Confidence levels on, say, $B_1$
can be determined by finding the region within which the likelihood,
marginalized over the other bands, integrates to $0.95$, say, for $95\%$
limits.
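A brute-force grid evaluation is simple to write down, which is part of its appeal. The sketch below (names hypothetical) assumes the signal covariance is linear in the bandpowers, $C_S = \sum_a B_a W_a$, with the band window matrices $W_a$ precomputed; each grid point then costs one $N_p^3$ factorization of $C_\Theta$:

```python
import numpy as np
from itertools import product
from scipy.linalg import cho_factor, cho_solve

def log_like(theta, C_theta):
    """ln of Eq. (31) via a Cholesky factorization (the N_p^3 step)."""
    cho, low = cho_factor(C_theta)
    logdet = 2.0 * np.sum(np.log(np.diag(cho)))
    chi2 = theta @ cho_solve((cho, low), theta)
    return -0.5 * (logdet + chi2 + len(theta) * np.log(2.0 * np.pi))

def grid_search(theta, C_noise, band_windows, grid_1d):
    """Evaluate ln L_B at every node of a regular bandpower grid.

    With ~10 nodes per band this loops 10^{N_b} times, which is
    exactly why the approach fails for many bands.
    """
    best_ll, best_B = -np.inf, None
    for B in product(grid_1d, repeat=len(band_windows)):
        C_S = sum(b * W for b, W in zip(B, band_windows))
        ll = log_like(theta, C_S + C_noise)
        if ll > best_ll:
            best_ll, best_B = ll, B
    return best_B, best_ll
```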
This possibility is no longer viable due to the sheer volume of data. Consider
the Boomerang experiment with $N_p = 57{,}000$ pixels. A single evaluation of $\mathcal{L}_B$
involves computation of the inverse and determinant of the $N_p \times N_p$
matrix $C_\Theta$, both of which scale as
$N_p^3$. While this single evaluation might be possible with a powerful
computer, a single evaluation does not suffice. The parameter space consists
of $N_b = 19$ bandpowers equally spaced from $l = 100$
up to $l = 1000$. A blindly placed grid on this space would require at least
ten evaluations in each dimension, so the time required to adequately evaluate
the bandpowers would scale as
$10^{19} N_p^3$ operations. No computer can do this. The situation is rapidly getting
worse (better) since Planck will have of order $10^7$
pixels and be sensitive to of order $10^3$
bands.
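For orientation, a rough count using the numbers above: ten grid values in each of the $N_b = 19$ bands means $10^{19}$ likelihood evaluations at $\mathcal{O}(N_p^3)$ cost each, i.e. roughly

$10^{19} \times (5.7 \times 10^4)^3 \approx 2 \times 10^{33}$ operations.

Even at $10^{12}$ operations per second, this would take of order $10^{21}$ seconds, many orders of magnitude longer than the age of the universe ($\sim 4 \times 10^{17}\,$s).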
It is clear that a ``smart'' sampling of the likelihood
in parameter space is necessary. The numerical problem, searching for
the local maximum
of a function, is well-posed, and a number of search algorithms might be used.
$\mathcal{L}_B$ tends to be sufficiently structureless that these techniques
suffice. [Bond et al, 1998] proposed the Newton-Raphson
method, which has become widely used. One expands the derivative of the
log of the likelihood function, which vanishes at the true maximum of
$\mathcal{L}_B$, around a trial point in parameter space,
$B_a^{(0)}$. Keeping terms second order in $B_a - B_a^{(0)}$
leads to

$B_a = B_a^{(0)} + \hat{F}_{ab}^{-1}\, \frac{\partial \ln \mathcal{L}_B}{\partial B_b}, \qquad (32)$
where the curvature matrix $\hat{F}_{ab}$ is the second derivative of $-\ln \mathcal{L}_B$
with respect to $B_a$ and $B_b$.
Note the subtle distinction between the curvature matrix and the
Fisher matrix in Equation (29): $F_{ab} = \langle \hat{F}_{ab} \rangle$.
In general, the curvature matrix
depends on the data, on the $\Theta_i$.
In practice, though, analysts typically use the inverse of
the Fisher matrix in Equation (32). In that
case, the estimator becomes

$B_a = B_a^{(0)} + \frac{1}{2} F_{ab}^{-1} \left( \Theta^T C_\Theta^{-1} \frac{\partial C_\Theta}{\partial B_b} C_\Theta^{-1} \Theta - \mathrm{Tr}\!\left[ C_\Theta^{-1} \frac{\partial C_\Theta}{\partial B_b} \right] \right), \qquad (33)$

quadratic in the data $\Theta$. The Fisher matrix is equal to

$F_{ab} = \frac{1}{2}\, \mathrm{Tr}\!\left[ C_\Theta^{-1} \frac{\partial C_\Theta}{\partial B_a}\, C_\Theta^{-1} \frac{\partial C_\Theta}{\partial B_b} \right]. \qquad (34)$
In the spirit of the Newton-Raphson method, Equation (33)
is used iteratively but often converges after just a handful of iterations. The
usual approximation is then to take the covariance between the bands as the inverse
of the Fisher matrix evaluated at the convergent point. Indeed, [Tegmark, 1997b] derived the identical estimator
by considering all unbiased quadratic estimators, and identifying this one as
the one with the smallest variance.
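As a concrete illustration, the sketch below implements the iteration of Equations (32)-(34) with the Fisher matrix substituted for the curvature matrix, again assuming $C_\Theta = C_N + \sum_a B_a W_a$ so that $\partial C_\Theta / \partial B_a = W_a$ is constant; the function name, fixed iteration count, and plain matrix inversion are illustrative choices, not anyone's production code:

```python
import numpy as np

def quadratic_estimator(theta, C_noise, band_windows, B0, n_iter=5):
    """Newton-Raphson bandpower iteration, Eqs. (32)-(34)."""
    B = np.asarray(B0, dtype=float)
    for _ in range(n_iter):
        C = C_noise + sum(b * W for b, W in zip(B, band_windows))
        Cinv = np.linalg.inv(C)              # the O(N_p^3) bottleneck
        CW = [Cinv @ W for W in band_windows]
        # Fisher matrix, Eq. (34): F_ab = (1/2) Tr[C^-1 W_a C^-1 W_b]
        F = 0.5 * np.array([[np.trace(Ma @ Mb) for Mb in CW] for Ma in CW])
        z = Cinv @ theta
        # Bracketed term of Eq. (33): quadratic-in-data piece minus trace
        g = 0.5 * np.array([z @ W @ z - np.trace(M)
                            for W, M in zip(band_windows, CW)])
        B = B + np.linalg.solve(F, g)        # Fisher-weighted update
    # Band covariance approximated by F^-1 at the convergent point
    return B, np.linalg.inv(F)
```

In practice the update converges in a handful of iterations, as noted above, because $\ln \mathcal{L}_B$ is close to quadratic near its peak.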
Although the estimator in Equation (33)
represents an enormous improvement over brute-force coverage of the parameter
space, converging in just several iterations, it still requires operations
which scale as
$N_p^3$. One means of speeding up the calculations is to transform the
data from the pixel basis to the so-called signal-to-noise
basis, based on an initial guess as to the signal, and throwing out those
modes which have low signal-to-noise [Bond, 1995; Bunn & Sugiyama, 1995]. The drawback is that
this procedure still requires at least one $N_p^3$
operation, and potentially many more as the guess at the signal improves
by iteration. Methods to truly avoid this prohibitive $N_p^3$
scaling [Oh et al, 1999; Wandelt & Hansen, 2001] have been devised
for experiments with particular scan strategies,
but the general problem remains open. A potentially promising approach involves
extracting the real space correlation functions as an intermediate step between
the map and the bandpowers [Szapudi et al, 2001]. Another involves
consistently analyzing coarsely pixelized maps with finely pixelized sub-maps
[Dore et al, 2001].
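To illustrate the signal-to-noise compression mentioned above [Bond, 1995; Bunn & Sugiyama, 1995], here is a minimal sketch, with assumed function names and a simple eigenvalue cut standing in for whatever threshold an analyst would actually choose:

```python
import numpy as np
from scipy.linalg import cholesky, eigh, solve_triangular

def sn_compress(theta, C_noise, C_signal_guess, snr_min=1.0):
    """Rotate to the signal-to-noise basis and keep only strong modes.

    Whitens the noise (C_N = L L^T), diagonalizes the whitened guess
    for the signal covariance, and discards eigenmodes below snr_min.
    The factorization itself is the one remaining O(N_p^3) step.
    """
    L = cholesky(C_noise, lower=True)
    tmp = solve_triangular(L, C_signal_guess, lower=True)   # L^-1 C_S
    white_S = solve_triangular(L, tmp.T, lower=True)        # L^-1 C_S L^-T
    evals, evecs = eigh(white_S)          # eigenvalues = (S/N)^2 per mode
    keep = evals > snr_min**2
    theta_white = solve_triangular(L, theta, lower=True)    # whitened data
    return evecs[:, keep].T @ theta_white, evals[keep]
```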