Astron. Astrophys. 362, 697-710 (2000)

Astron. Astrophys. 362, 697-710 (2000)

3. Solving radiative transfer and molecular excitation

3.1. The coupled problem

The equation of radiative transport reads, in the notation of Rybicki & Lightman (1979),

[EQUATION]

or, equivalently,

[EQUATION]

Here, [FORMULA] is the intensity at frequency along a particular line of sight parameterized by ds, is the absorption coefficient in units cm^-1, and the emission coefficient with units erg s^-1 cm^-
3 Hz^-1 sr^-1. The second form of the equation is a useful change of variables, with the source function [FORMULA] and the optical depth . We consider both molecules and dust particles as sources of emission and absorption (; ), but ignore scattering. Although not impossible to include in our code, scattering effects are usually negligible at wavelengths longer than mid-infrared.

When [FORMULA] and are known at each position in the source, the distribution of the emission on the sky simply follows from ray tracing. However, in many cases, and will depend on the local mean intensity of the radiation field

[EQUATION]

Here, [FORMULA] is the average intensity received from all solid angles , and is the solution of Eq. (2) along each direction under consideration. The latter integration extends formally to infinity, but in practice only to the edge of the source with any incident isotropic radiation field like the cosmic microwave background (CMB) as boundary condition.

For thermal continuum emission from dust, [FORMULA] and are simply given by

[EQUATION]

where [FORMULA] is the Planck function at the dust temperature , and

[EQUATION]

where [FORMULA] is the dust opacity in cm^-2 per unit (dust) mass and is the mass density of dust. Our code can use any description of (Ossenkopf & Henning, 1994; Pollack et al., 1994; Draine & Lee, 1984; Mathis et al., 1977, e.g.).

In the case of emission and absorption in a spectral line, [FORMULA] and are determined by absorption and emission between radiatively coupled levels u and l with populations (in cm^-3) and . The energy difference between levels corresponds to the rest frequency of the transition, , where h is Planck's constant. The emission and absorption coefficients between levels u and l are strongly peaked around [FORMULA] with a frequency dependence described by a line-profile function ,

[EQUATION]

The Einstein [FORMULA] , , and coefficients determine the transition probabilities for spontaneous emission, absorption, and stimulated emission, respectively, and depend on molecule. In most interstellar clouds the line profile is dominated by Doppler broadening due to the turbulent velocity field

[EQUATION]

where the turbulence is assumed to be Gaussian with a full width b. In the presence of a systematic velocity field, the line profile is angle-dependent and the projection of the local velocity vector onto the photon propagation direction enters [FORMULA] .

Together, collisions and radiation determine the level populations through the equation of statistical equilibrium,

[EQUATION]

The collision rates [FORMULA] depend on the density and the collisional rate coefficients of molecular hydrogen and other collision partners, and on temperature through the detailed balance of the up- and downward coefficients. Eq. (10) can be easily solved through matrix inversion for each position in the source provided the radiation field [FORMULA] is known. However, contains contributions by the CMB, dust and spectral lines, and since the spectral line term depends on the level populations through Eqs. (2), (7) and (8), the problem must be solved iteratively. Starting with an initial guess for the level populations, is calculated, statistical equilibrium is solved and new level populations are obtained; through the Monte Carlo integration, the new populations yield a new value for [FORMULA] , after which the populations are updated; etc., until the radiation field and the populations have converged on a consistent solution.

When the physical conditions do not vary much over the model, an approximate value of [FORMULA] can be found from the local conditions alone. This idea is the basis of the Large Velocity Gradient method, the Sobolev method, microturbulence, or the escape probability formalism (Sobolev, 1960; Goldreich and Kwan, 1974; Leung & Liszt, 1976; de Jong et al., 1980, e.g.). Also, in specific geometries, the integration over all solid angles and along the full length of the line of sight of Eqs. (3) and (4) can be greatly reduced, making the problem tractable. This sort of technique has most application in stellar and planetary atmospheres; the Eddington approximation is an example.

However, in many astrophysical situations including the example of Sect. 2, such simplifications cannot be made, and Eqs. (3) and (4) need to be fully solved to get [FORMULA] . Compared to the relative ease with which statistical equilibrium can be solved (Eq. 10), obtaining becomes the central issue. Direct integration of Eqs. (3) and (4) with, e.g., Romberg's method, is infeasible for realistic models, but based on a finite set of directions a good approximation of [FORMULA] can be obtained. The next two sections describe two different methods to choose this set and construct in this way.

3.2. Constructing and the -operator

For computational purposes, source models are divided into discrete grid cells, each with constant properties (density, temperature, molecular abundance, turbulent line width, etc.). It is also assumed that the molecular excitation can be represented by a single value in each cell, which requires instantaneous spatial and velocity mixing of the gas. Appropriate source models have small enough cells that the assumption of constant excitation is valid. The systematic velocity field is the only quantity that is a vector field rather than a scalar field, and in our code it is allowed to vary in a continuous way within each cell. We divide the integration along a ray into subunits within a cell to track the variation of the velocity projected on the ray.

Such a gridded source model lends itself easily to the construction of a finite set of integration paths to build up [FORMULA] . The average radiation field can be thought of as the sum of the emission received in cell i from each of the other cells j after propagation through the intervening cells and weighted with the solid angle subtended by each of these cells j as seen from cell i. The combination of radiative transfer and statistical equilibrium can be written as

[EQUATION]

This equation states that the radiation field is given by an operator [FORMULA] acting on the source function , which depends on the level populations and hence (Eqs. 7, 8, 10). Considering the narrow frequency interval around the transition u-l, we have replaced by . This corresponds to the assumption of instanteous redistribution of excitation mentioned above. In a gridded source model, one can think of [FORMULA] as a matrix describing how the radiation field in cell i depends on the excitation in all other cells. The elements in the matrix then represent the radiative coupling between cell pairs.

Eq. (11) can be solved iteratively, where an updated value of [FORMULA] is obtained by having operate on the previous populations, ,

[EQUATION]

Since [FORMULA] is already known, this only involves matrix multiplication, compared to the much more expensive matrix inversion required to solve Eq. (11). Because of this elegant notation, iterative schemes for non-LTE excitation and radiative transfer are commonly referred to as -iteration, even if no [FORMULA] -operator is ever actually constructed. These methods share the use of the same set of rays throughout the calculation, in contrast to Monte Carlo methods, which use random rays as discussed in Sects. 3.3 and 3.5.

Constructing the [FORMULA] -operator in multidimensional source models is taxing on computer memory because of all the possible connections to keep track of simultaneously. Techniques exist to reduce the number of elements (Dullemond and Turolla, 2000), but these are complex and may require some fine-tuning for individual source geometries. Alternatively, computer memory can be exchanged for computing time by solving the problem serially, calculating the radiation field in each of the cells due to the other cells one at a time.

3.3. The Monte Carlo method

One way of solving Eq. (12) is to directly sum the contribution from all other cells to the radiation field in each of the individual cells. This corresponds to replacing the integral in Eq. (4) by a summation. With a judiciously chosen fixed set of directions or rays, as most [FORMULA] -iteration codes do, a good approximation of can be found in this way (Phillips, 1999, e.g.). However, this procedure requires care, since the necessary angular sampling depends, in principle, on the characteristics of the excitation solution of the problem at hand.

Since our aim is to construct a method that can be applied to many different source models without too much fine-tuning, we adopt the Monte Carlo approach to obtain [FORMULA] . Analogous to the Monte Carlo method to solve the definite integral of a function [see chapter 7 of Press et al. (1992) for a discussion of Monte Carlo integration, and for further references], Eq. (4) can be approximated by the summation over a random set of directions. This has the advantage that all directions are sampled to sufficient detail: if too few directions are included, subsequent realizations will give different estimates of [FORMULA] (see Sect. 3.5 for further discussion of this issue).

Originally (Bernes, 1979), the Monte Carlo approach was phrased in terms of randomly generated `photon packages', which are followed as they travel through the source and which together approximate the radiation field. Fig. 2 illustrates that a formulation in terms of randomly chosen directions from each cell yields an equivalent estimate of [FORMULA] . The only difference is the direction of integration in Eq. (2). Where the former approach follows the photons as they propagate through the cells, the latter backtracks the paths of the photons incident on each cell. As the next section will discuss, this latter approach lends itself better to improvements in its convergence characteristics. Treatment of non-isotropic scattering is more complicated in this approach, and since scattering is not important at the wavelengths of interest here, [FORMULA] m, scattering is not included in the code. Implementations of the Monte Carlo method more appropriate for scattering are available in the literature (Wood et al., 1996a; Wood et al., 1996b; Wolf et al., 1999).

Fig. 2. a In the `traditional' formulation of the Monte Carlo method for solving radiative transfer, the radiation field is represented by a certain number of photon packages. Each of these packages originates in a random position of the cloud, corresponding to spontaneous emission, and travels in a random direction through the cloud until it either escapes or is absorbed. To include the CMB field, a separate set of packages is included, shown as dashed arrows, that originate at the cloud's edge. The packages traversing a cell during an iteration give [FORMULA] in that cell. b In our implementation, an equivalent estimate of is found by choosing a certain number of rays which enter the cell from infinity (or the cloud's edge, using the CMB field as a boundary condition) from a random direction and contribute to the radiation field at a random point in the cell's volume. As Sect. 3.4 argues, this formulation allows separation between the incident radiation field and the locally produced radiation field, which accelerates convergence in the presence of significant optical depth.

3.4. Convergence and acceleration

Besides estimating [FORMULA] , an important aspect of non-LTE radiative transfer is convergence towards the correct solution in a reasonable amount of time. Since the solution is not a priori known, convergence is often gauged from the difference between subsequent iterative solutions. This relies on the assumption that when [FORMULA] and the populations are far from their equilibrium solution, corrections in each iteration are large. Large optical depth can be a major obstacle to this behaviour: emission passing through an opaque cell will rapidly lose all memory of its incident intensity and quickly tend toward the local source function. The distance over which the information about changes in excitation can travel is one mean free path per iteration, so that the required number of iterations grows [FORMULA] characteristic of random walk. This effect makes it hard to determine if the process has converged.

Accelerated or Approximated Lambda Iteration (Rybicki & Hummer, 1991, ALI), circumvents this problem by defining an approximate operator [FORMULA] such that

[EQUATION]

An appropriate choice for [FORMULA] is one which is easily invertible and which steers rapidly toward convergence. This occurs if is dominated by the second term on the right hand side of the equation, where works on the current source function as opposed to the solution from the previous iteration.

After several attempts (Scharmer, 1981), Olson, Auer, & Buchler (1986) found that a good choice for [FORMULA] is the diagonal, or sometimes tri-diagonal, part of the full operator . This choice for describes the radiation field generated locally by the material in each cell, and its direct neighbours in the case of the tri-diagonal matrix. Eq. (13) then gives as the sum of the field incident on each cell due to the previous solution for the excitation { [FORMULA] }, and a self-consistent solution of the local excitation and radiation field {}. In opaque cells, the radiation field is close to the local source function, and Eq. (13) converges significantly faster than Eq. (12); for optically thin cells, both formalisms converge equally fast.

Formulating the Monte Carlo method in terms of randomly generated photon packages traveling through the source does not easily permit separation of the locally generated field and the incident field for each cell. However, such a separation is possible when [FORMULA] is constructed by summation over a set of rays, which each start at a random position within the cell and point in a random direction. For ray i, call the incident radiation on the cell and the distance to the boundary of the cell . The current level populations translate this distance into an opacity [FORMULA] , and give the source function . The average radiation field from N rays then follows from Eqs. (3) and (7),

[EQUATION]

Here, [FORMULA] and contain both line and continuum terms, and includes the CMB. The radiation field is now the sum of the external () and internal () terms. Since the external term is evaluated using populations from the previous Monte Carlo iteration (through and ), this scheme is akin to accelerated [FORMULA] -iteration. Within Monte Carlo iterations, sub-iterations are used to find the locally self-consistent solution of and for given .

The main computational cost of this strategy lies in following a sufficient number of rays out of each cell through the source. Iteration on Eq. (14) is relatively cheap and speeds up convergence considerably in the presence of opaque cells. Fig. 3 illustrates this, by showing the evolution of the fractional error of the solution of the simple problem posed in Sect. 2 for optically thick HCO⁺ and thin H¹³CO⁺ excitation (for a fixed set of directions - see below).

Fig. 3. Evolution of the fractional error in the level populations of HCO⁺ as function of iteration step with (`accelerated'; solid line) and without (`not accelerated'; dashed line) separation of local and incident radiation field. For the optically thin H¹³CO⁺ molecule, both methods converge equally fast. The solid symbols indicate the iteration where the first stage of the calculation has converged (see Sect. 3.5); after that random noise starts to dominate the fractional error, which is controlled by the increase in rays per cell. The source model is that described in Sect. 2. The `not accelerated' HCO⁺ formally converged at iteration 18, because the difference with iteration 17 became smaller than 1/30, even though the difference from the real solution exceeds that value. This illustrates that acceleration is not only computationally convenient, but may also essential for a correct solution.

Population inversions require careful treatment in radiative transfer codes, since the associated opacity is negative and the intensity grows exponentially. In general, an equilibrium will be reached where the increased radiation field quenches the maser. Since iterative schemes solve the radiative transfer before deriving a new solution for the excitation, the radiation field can grow too fast if population inversions are present. Our code handles negative opacities by limiting the intensity to a fixed maximum which is much larger than any realistic field. Proper treatment requires that the grid is well chosen, so that masing regions are subdivided into small cells where the radiation field remains finite. Our code can deal with the small population inversions that occur in many problems including the model presented in Sect. 2. However, to model astrophysical masers, specialized codes are required (Spaans & van Langevelde, 1992, e.g.).

3.5. The role of variance in Monte Carlo calculations

Because the Monte Carlo method estimates [FORMULA] from a randomly chosen set of directions, the result has a variance, , which depends on the number N of included directions as . As explained above (Sect. 3.3), this variance is a strength rather than a weakness of the Monte Carlo method. Since it is not a priori known how many directions are required for a fiducial estimate of [FORMULA] , this method automatically arrives at an appropriate sampling by increasing N until the variance drops below a predefined value.

The variance of a solution is usually estimated from the largest relative difference between subsequent iterations. In our implementation (see appendix), the number N of rays making up [FORMULA] in a particular cell is doubled each time the variance in that cell exceeds a certain value; the variance is evaluated using the largest relative difference between three subsequent solutions with the same N. This cell-specific scheme ensures that the radiation field is sufficiently sampled everywhere, and at the same time prevents oversampling of cells which are close to LTE and/or weakly coupled to other regions.

The variance as estimated from the difference between subsequent solutions only reflects the noise if random fluctuations dominate the difference. There will be systematic differences between subsequent solutions if these are still far from convergence. Therefore, many Monte Carlo methods consist of two stages. In the first stage, a fixed number of photons will yield a roughly converged solution; in the second stage, the number of photons is increased until the noise becomes sufficiently small.

In our implementation, this first stage consists of iterations with a fixed number of directions making up [FORMULA] in each cell, , which depends on the model. The directions are randomly distributed, but in each iteration, the same set of random directions is used by resetting the random number generator each iteration. Without random fluctuations in , the difference between subsequent solutions purely reflects the evolution toward convergence. The first stage is considered converged when this `noise' is a factor of ten smaller than the user-specified level.

For a sufficiently large [FORMULA] (typically a few hundred), the excitation in each cell now is close to the final solution, except for imperfections in the sampling of . In the second stage, each iteration uses a different set of random directions to estimate : the random number generator is no longer reset. Based on the resulting variance, the number of rays in each cell is doubled each iteration, until the noise on the level populations in each cell is below a given value. If [FORMULA] was initially insufficient, the variance will contain a significant contribution from systematic differences between iterations. Even though this will slow down the code by artificially increasing the number of rays in these cells as the code over-compensates the variance, ultimately the code will still converge to the correct solution.

The separation between local and incident radiation fields in our method (Sect. 3.4) keeps the system responsive to changes even in the presence of significant optical depth. This accelerates the convergence, but also increases the noise level. The literature mentions several methods to reduce the noise of Monte Carlo methods, e.g., with a reference field (Bernes, 1979; Choi et al., 1995) or quasi-random instead of pseudo-random numbers (Juvela, 1997). These schemes are useful when assumptions about the solution are available, but may slow down convergence if the initial guess is far off. Since the `first stage' described above and the presence of noise prevents Monte Carlo methods from `false convergence', we have not included any noise reduction techniques in our code, to keep it as widely applicable as possible.

3.6. Implementation and performance characteristics

Appendix A describes the structure of the program in detail, and provides a reference to a web site where we have made the source code of its spherically symmetric version publicly available. To test its performance, Appendix B compares results obtained with our code to those of other codes.

The main part of the program deals with calculating the excitation through the source model. In a separate part, comparison to observations proceeds by integrating Eq. (2) on a grid of lines of sight for a source at a specified inclination angle and distance. The resulting maps of the sky brightness may be convolved with a beam pattern for comparison with single-dish data, or Fourier transformed for comparison with interferometer data.

Appendix B describes tests of the validity of the results of the program. We have also tested up to what optical depth the program can be used, and found that this depends on source model. These tests were done on a Sun Ultrasparc 2 computer with 256 Mb internal memory and a 296 MHz processor. For a simple, homogeneous HCO⁺ model with [FORMULA] cm^-3 and K, the code produces accurate results within an hour for values of N(HCO⁺) up to cm^-2, corresponding to in the lowest four rotational transitions. Higher lines are less optically thick under these physical conditions. For such opaque models, `local' approximations fail badly, because the excitation drops sharply at the edge of the model (Bernes 1979; Appendix A).

For a power-law, Shu-type model, performance is somewhat slower. The dense and warm region fills only a small volume, while its radiation has a significant influence on the excitation further out, and modeling this effect requires a large number of rays. We have used the specific model from the Leiden workshop on radiative transfer (Appendix B) for various values of the HCO⁺ abundance. Within a few hours, accurate results are obtained for values of HCO⁺/H₂ up to [FORMULA] , corresponding to in the lowest four lines.

These test cases should bracket the range of one-dimensional models of interest. For two-dimensional models, the limitations of present-day hardware are much more prohibitive. As a test case, we have used the flattened infalling envelope model from Sect. 4.1 for various HCO⁺ abundances. Within 24 hours, the above machine provides a converged solution for HCO⁺/H₂ up to [FORMULA] , corresponding to a maximum optical depth of . Realistic models often have higher opacities, and call for the use of parallel computers. However, as faster computers are rapidly becoming available, we expect that these limitations will become less relevant in the near future. For both one- and two-dimensional models, the second, ray-tracing part of the code to create maps of the sky brightness takes only a fraction of the computer resources of the first part.

3.7. Alternative accelerators

Another method to tackle slow convergence in the presence of large opacities is core saturation (Rybicki, 1972; Doty & Neufeld, 1997), where photons in the optically thick line center are replaced by the source function and no longer followed, while photons in the still optically thin line wings which carry most information are more accurately followed. This scheme has been implemented in a Monte Carlo program by Hartstein & Liseau (1998), but involves a choice where to separate the line core from the line wings. Since the effectiveness of the method depend on this choice, we have not included core saturation in our program.

A completely different approach to accelerating convergence is to extrapolate the behaviour of the populations from the last few iterations. This so-called Ng acceleration (Ng, 1974) is not implemented in our code, because extrapolating from an inherently noisy estimate may be dangerous.

Online publication: October 24, 2000
helpdesk@link.springer.de