
Computational Considerations

Finite-difference simulations with Yee's algorithm are computationally intensive. The maximum spatial step size for stable results is $\lambda/10$, which at optical wavelengths of approximately $1\: \mu\text{m}$ is $0.1\: \mu\text{m}$. Modeling structures with the dimensions of typical cells therefore requires the storage of large arrays. In a three-dimensional simulation, arrays for the 6 field components, $\varepsilon$, and $\sigma$ must be stored in memory. In addition, all 6 field components and the incident source must be computed at every grid point for each time step.

The approximate amount of memory required for a simulation is

\begin{displaymath}
\text{Approximate memory (bytes)} = N_x N_y N_z \left[\left(6\times 8\right)+8\right]
\end{displaymath} (4.53)

where $N_x$, $N_y$, and $N_z$ are the array dimensions in each direction. Equation 4.53 assumes that the 6 field components and the permittivity are stored as double-precision floating-point values requiring 8 bytes of memory each. A three-dimensional simulation of a volume $15\: \mu\text{m}$ on a side ($N=200$) would require approximately 450 MB of memory according to Equation 4.53. The actual amount of memory will be greater than this, since Equation 4.53 does not account for the storage of variables for the boundary conditions and the far-field transform.
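As a rough check of the figure quoted above, Equation 4.53 can be evaluated directly. The following Python sketch is illustrative only; the function name `approx_memory_bytes` and its parameters are hypothetical and are not part of the FDTD program described in this work.

```python
def approx_memory_bytes(nx, ny, nz, n_fields=6, n_material=1, bytes_per_value=8):
    """Evaluate Equation 4.53: grid points times bytes stored per grid point.

    n_fields   -- number of field-component arrays (6 in three dimensions)
    n_material -- number of material arrays (1 = permittivity only, as in Eq. 4.53)
    """
    return nx * ny * nz * (n_fields + n_material) * bytes_per_value


n = 200  # grid points per side for the 15 micrometer volume in the text
total = approx_memory_bytes(n, n, n)
print(f"{total} bytes = {total / 1e6:.0f} MB")
```

The result, 448,000,000 bytes, is the roughly 450 megabytes quoted above. Passing `n_material=2`, which assumes that a conductivity array is also stored as 8-byte values, raises the estimate to 512,000,000 bytes.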

A two-dimensional simulation, on the other hand, requires storage of arrays for only 3 field components and the permittivity, each with $N^2$ elements as opposed to $N^3$ elements in three dimensions. Large areas ($\sim50\,\mu\text{m}$ on a side) can be simulated with modest storage requirements (less than 100 MB).
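The two-dimensional requirement can be estimated in the same way. The sketch below assumes, by analogy with Equation 4.53, that the 3 field components and the permittivity are each stored as 8-byte values on an $N \times N$ grid, and that the grid spacing matches the three-dimensional example ($15\: \mu\text{m}$ across 200 points). Neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math

BYTES_PER_VALUE = 8  # double precision, as assumed in Equation 4.53


def approx_memory_2d_bytes(n, n_fields=3, n_material=1):
    """Two-dimensional analogue of Equation 4.53 (an assumption, not from the text)."""
    return n * n * (n_fields + n_material) * BYTES_PER_VALUE


spacing_um = 15.0 / 200              # grid spacing implied by the 3-D example
n_2d = math.ceil(50.0 / spacing_um)  # grid points per side for a 50 micrometer area
mem_2d = approx_memory_2d_bytes(n_2d)
print(f"N = {n_2d}, memory = {mem_2d / 1e6:.1f} MB")
```

Under these assumptions the arrays occupy about 14 MB, comfortably below the 100-megabyte bound given above, which leaves room for boundary-condition and far-field storage.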

Since the same equations are applied to the field components at all grid locations, Yee's algorithm can be implemented efficiently on a parallel computer. Parallel implementations of Yee's algorithm have been reported previously and have demonstrated decreases in computation time compared with serial processors [66,67]. Weedon et al. [68] implemented an FDTD simulation on a Connection Machine CM-5 parallel supercomputer and found that the CM-5, using a 256-node partition, was 17.8 times faster than the same algorithm running on a single processor of a Cray Y-MP.
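The data-parallel character of the update can be illustrated with a deliberately simplified case. The sketch below is not the three-dimensional program described in this work: it assumes a one-dimensional vacuum grid in normalized units with perfectly conducting ends, and the function names and the `courant` parameter are hypothetical. It applies one Yee time step twice, once with an explicit loop over grid points and once with whole-array operations, to show that every grid point uses the same stencil.

```python
import numpy as np


def step_loop(ez, hy, courant=0.5):
    """One Yee time step, visiting grid points one at a time (serial form)."""
    n = ez.size
    for i in range(n - 1):
        hy[i] += courant * (ez[i + 1] - ez[i])   # H update, same stencil at every i
    for i in range(1, n - 1):
        ez[i] += courant * (hy[i] - hy[i - 1])   # E update, end points held fixed
    return ez, hy


def step_vectorized(ez, hy, courant=0.5):
    """The same time step written as whole-array operations (data-parallel form)."""
    hy[:-1] += courant * (ez[1:] - ez[:-1])
    ez[1:-1] += courant * (hy[1:-1] - hy[:-2])
    return ez, hy


# Gaussian pulse as an arbitrary initial condition
x = np.arange(200)
ez0 = np.exp(-((x - 100.0) / 10.0) ** 2)
ez0[0] = ez0[-1] = 0.0
hy0 = np.zeros_like(ez0)

ez_a, hy_a = step_loop(ez0.copy(), hy0.copy())
ez_b, hy_b = step_vectorized(ez0.copy(), hy0.copy())
print(np.allclose(ez_a, ez_b) and np.allclose(hy_a, hy_b))
```

Because each update reads only values from the other field array, the work within a half step can be divided among processors in any order, which is the property that vector and parallel machines exploit.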

The three-dimensional FDTD program developed for this work used the resources of the High Performance Computing Facility (HPCF) at UT-Austin. The primary computer used for the FDTD program was a Cray J90-16 supercomputer. The J90 is a 16-processor parallel/vector supercomputer with 4 GB of shared central memory. The advantage of the shared-memory architecture is that data layout is transparent to the programmer. However, a distributed-memory ``massively parallel'' architecture, such as the newly acquired Cray T3E at the HPCF, should provide better performance for the three-dimensional FDTD program than the J90, although the code has not yet been ported to the T3E.


Andy Dunn
1998-05-12