imager
This page provides some details about the imager program. The purpose of this software is to perform imaging in a parallel/distributed environment or on a single computer system. The software uses MPI, but it can be run on anything from a simple laptop to a large supercomputer. It provides all the functionality of the cimager program but permits a more flexible distribution of data and has a smaller aggregate memory footprint.
- Running the program
- Parallel/distributed execution
- Spectral line processing (replacing simager)
- Configuration parameters
- Parameters of images
- Example
It can be run with the following command, where “config.in” is a file containing the configuration parameters described in the next section.
$ <MPI wrapper> imager -c config.in
The config file uses the same format as that required by cimager; this was done deliberately to accentuate the similarity between the two imagers. Because cimager is much more widely accepted by the development team, a decision was made that the config files be interchangeable, thereby allowing simple testing.
The program is distributed and uses a master/worker pattern to distribute and manage work. It requires at least two processes to execute: failing to launch imager as an MPI process, or specifying only one MPI process, will result in an error.
On the Cray XC30 platform executing with the MPI wrapper takes the form:
$ aprun -n 305 -N 20 imager -c config.in
The -n and -N parameters to the aprun application launcher specify that 305 MPI processes will be used (304 workers and one master) and that each node will host 20 MPI processes. This job therefore requires 16 compute nodes (305 / 20, rounded up).
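The node count quoted above can be checked with a few lines of arithmetic. The following is only a sketch: the function name `nodes_required` is hypothetical and is not part of imager or aprun.

```python
import math


def nodes_required(total_processes: int, processes_per_node: int) -> int:
    """Return how many compute nodes are needed to host all MPI processes."""
    if total_processes < 2:
        # imager needs one master and at least one worker.
        raise ValueError("imager requires at least two MPI processes")
    if processes_per_node < 1:
        raise ValueError("processes_per_node must be positive")
    # A partially filled node still counts as a whole node.
    return math.ceil(total_processes / processes_per_node)


# aprun -n 305 -N 20: 304 workers plus one master, 20 processes per node.
print(nodes_required(305, 20))
```

For the values used above this prints 16, matching the text.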
As with cimager, the imager has no implicit parallelism; rather, the configuration parameters allow the user to specify what subset of the total task each worker will carry out. This is controlled by the nchanpercore and usetmpfs keywords.
It should be noted that imager can be run in exactly the same manner as cimager if required, but its primary benefit is that you do not have to.
Typically we process a single measurement set with 304 spectral channels. Using 304 workers (as described above), the following configuration parameter restricts each worker to a single spectral channel:
Cimager.nchanpercore = 1
You will need 305 cores to run this. However, if you decide that each core should process 4 channels, you would use:
Cimager.nchanpercore = 4
Now you would only need to invoke:
$ aprun -n 77 -N 20 imager -c config.in
This would only require 4 nodes.
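The relationship between the channel count, nchanpercore and the -n argument can be sketched as below. The helper names are hypothetical, and the sketch assumes the channels divide evenly among the workers, which is the only case the text above covers.

```python
def mpi_processes(nchan: int, nchanpercore: int) -> int:
    """Total MPI processes: one worker per block of channels, plus one master."""
    if nchan < 1 or nchanpercore < 1:
        raise ValueError("nchan and nchanpercore must be positive")
    if nchan % nchanpercore != 0:
        # Assumption of this sketch: uneven splits are not handled.
        raise ValueError("this sketch only handles an even split of channels")
    workers = nchan // nchanpercore
    return workers + 1


def aprun_command(nchan: int, nchanpercore: int,
                  per_node: int = 20, parset: str = "config.in") -> str:
    """Build the aprun line in the same form as the examples above."""
    total = mpi_processes(nchan, nchanpercore)
    return f"aprun -n {total} -N {per_node} imager -c {parset}"


print(aprun_command(304, 1))
print(aprun_command(304, 4))
```

With 304 channels this reproduces the two command lines shown above: 305 processes for nchanpercore = 1 and 77 processes for nchanpercore = 4.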
The behaviour of the Channels keyword has been changed. It now behaves in a similar way to cimager:
Cimager.Channels = [<num>,<start>]
You can specify the number of channels and the start channel. This will be taken from the complete list of channels as determined by the input measurement sets.
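As an illustration of the selection described above, the sketch below takes `num` channels beginning at `start` from a complete channel list. It assumes that `start` is a zero-based index into that list, which the text does not state explicitly; `select_channels` is a hypothetical name, not an imager function.

```python
from typing import List, Sequence


def select_channels(all_channels: Sequence[int], num: int, start: int) -> List[int]:
    """Pick `num` consecutive channels beginning at index `start`.

    Assumption: `start` is a zero-based index into the complete channel list.
    """
    if num < 0 or start < 0:
        raise ValueError("num and start must be non-negative")
    if start + num > len(all_channels):
        raise ValueError("requested range runs past the available channels")
    return list(all_channels[start:start + num])


# A 304-channel measurement set, as in the example above.
channels = list(range(304))
# Equivalent in spirit to Cimager.Channels = [4,100]
print(select_channels(channels, 4, 100))
```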
There is now a Frequencies keyword that allows you to specify the dimensions of the spectral cube in frequency space:
Cimager.Frequencies = [<num>,<start>,<width>] - all in Hz
This is under test at the moment. It was introduced at the same time as a change to the way the barycentric corrections are performed.
Cimager.freqframe = topo | bary | lsrk
You can now choose a default frequency frame for the spectral cube: topocentric (the default), barycentric or the kinematic local standard of rest. Other frames can be added on request.
The current (beta) version of this imager cannot determine a reference frequency for multi-frequency synthesis, so you must add it yourself:
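To make the Frequencies triple concrete, the sketch below expands [num, start, width] into a list of channel frequencies. It assumes that `start` is the frequency of the first output channel and that channels are evenly spaced by `width`; the text above does not spell out either detail, so treat this as one plausible reading. The function name is hypothetical.

```python
from typing import List


def frequency_axis(num: int, start_hz: float, width_hz: float) -> List[float]:
    """Expand [num, start, width] (all in Hz) into per-channel frequencies.

    Assumption: channel i sits at start + i * width.
    """
    if num < 1:
        raise ValueError("num must be at least 1")
    return [start_hz + i * width_hz for i in range(num)]


# Equivalent in spirit to Cimager.Frequencies = [3,1.4e9,1e6]
for freq in frequency_axis(3, 1.4e9, 1.0e6):
    print(f"{freq:.0f} Hz")
```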
Cimager.visweights = MFS
Cimager.visweights.MFS.reffreq = 1.10697e+09
This optional feature, which is only really useful for continuum imaging where the number of stored channels is low, allows the cores to store the raw visibilities from their allocated channels. These are stored as complete measurement sets:
Cimager.usetmpfs = True
Cimager.tmpfs = /dev/shm
In the above example each channel (and selected beam) is split out of the original measurement set and stored as a file on the ramfs file system mounted on /dev/shm - this places the entire file in memory. As a result access to the visibilities is very efficient. It also allows imaging with only one pass through the measurement set.
This imager has built-in splitting of the input measurement set. At the moment only a single beam is split out, but it can be any beam and is determined by the first entry in the following vector:
Cimager.beams = [0]
The minor cycle of clean is currently still performed on a single master core, but the individual channel allocations are merged locally on the worker core.
This imager can also be used for spectral line processing, essentially mimicking the performance of simager. Each channel is processed independently and written to an output cube. There are some differences between imager and simager, the two most notable being:
Cimager.solverpercore = true
When writing CASA images this imager can write to more than one output image cube to improve disk throughput. This has been implemented to remove a serious bottleneck in the spectral line processing. For FITS image types a single cube can be written, but multiple writers are used, once again to improve write performance:
Cimager.nwriters = X
Cimager.singleoutputfile = true
This imager can process multiple epochs and generate output cubes in a barycentric reference frame:
Cimager.dataset = [epoch1.ms,epoch2.ms]
Cimager.barycentre = true
This imager can be instructed to process each measurement set independently and to merge the subimages into a larger image for the minor cycles. Note that this is different from faceting, which processes the subimages entirely independently. This scheme grids and images the fields individually but cleans them jointly, as in Cornwell 1989:
Cimager.updatedirection = true
Parset parameters understood by imager are given in the following table (all parameters must have the Cimager prefix, e.g. Cimager.dataset). For a number of parameters certain keywords are substituted: %w is replaced by the worker number (rank-1, if there is only one pool of workers) and %n by the number of nodes in the parallel case. In the serial case, these special strings are substituted by 0 and 1, respectively. This substitution allows the same parameter file to be reused on all nodes of the cluster if the difference between jobs assigned to individual nodes can be encoded using these keywords (e.g. using specially crafted file names). Note that if there is more than one group of workers (e.g. parallel calculation of Taylor terms), the %w index spans the workers in one group rather than the global pool of workers. This is done to allow the same file name to be used for the corresponding worker in different groups (i.e. all Taylor terms are built from the same file). If a parameter supports substitution, it is clearly stated in the description.
A number of other parameters that narrow down the data selection are also understood. They are given in a separate table (see Data Selection) and should also have the Cimager prefix. Note that the option Cimager.CorrelationType will be ignored and defaults to “cross”.
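The %w and %n substitution rules can be illustrated with a short sketch. It covers only the single-pool case described above (worker number = rank-1) and the serial case (0 and 1). The function name `substitute` and the file names are hypothetical; this is not how ASKAPsoft implements the substitution internally.

```python
from typing import Optional


def substitute(value: str, rank: Optional[int] = None,
               nnodes: Optional[int] = None) -> str:
    """Replace %w and %n in a parset value.

    rank=None models the serial case, where %w -> 0 and %n -> 1.
    Otherwise a single pool of workers is assumed, so %w -> rank - 1.
    """
    if rank is None:
        worker, nodes = 0, 1
    else:
        if rank < 1 or nnodes is None:
            # Rank 0 is the master; this sketch only covers worker ranks.
            raise ValueError("need a worker rank >= 1 and the number of nodes")
        worker, nodes = rank - 1, nnodes
    return value.replace("%w", str(worker)).replace("%n", str(nodes))


# Parallel case: the worker with MPI rank 3 reads its own (hypothetical) file.
print(substitute("chunk_%w.ms", rank=3, nnodes=305))
# Serial case.
print(substitute("chunk_%w_of_%n.ms"))
```

In the parallel call the worker with rank 3 resolves the name to chunk_2.ms, while the serial call yields chunk_0_of_1.ms.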
To record the individual channel beams when run in spectral-line mode, the imager will produce an ASCII text file listing the beam parameters for each channel. This is known as the “beam log”. If the image cube name is “image.i.blah”, then the beam log will be called “beamlog.image.i.blah.txt”. The file has columns:
index | major axis[arcsec] | minor axis [arcsec] | position angle [deg]
Should the imaging of a channel fail for some reason, the beam for that channel will be recorded as having zero for all three parameters. This beam log is compatible with other askapsoft tasks, specifically the spectral extraction in Selavy (see Extraction of Spectra, Images and Cubelets).
Here is an example of the start of a beam log:
#Channel BMAJ[arcsec] BMIN[arcsec] BPA[deg]
0 64.4269 59.2985 -70.8055
1 64.4313 59.299 -70.8831
2 64.4333 59.3018 -70.9345
3 64.4338 59.2996 -70.9256
4 64.4349 59.2982 -70.9108
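A beam log in the format shown above is straightforward to read back. The sketch below derives the log file name from the image cube name, parses the four columns, and lists channels whose beam was recorded as all zeros (the failure marker described above). The helper names are hypothetical, and the row for channel 5 in the sample is invented purely to exercise the failure case.

```python
from typing import List, NamedTuple


class Beam(NamedTuple):
    channel: int
    bmaj_arcsec: float
    bmin_arcsec: float
    bpa_deg: float


def beamlog_name(image_name: str) -> str:
    """Beam log file name for a given image cube name."""
    return f"beamlog.{image_name}.txt"


def parse_beamlog(text: str) -> List[Beam]:
    """Parse beam log text, skipping blank lines and '#' comment lines."""
    beams = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chan, bmaj, bmin, bpa = line.split()
        beams.append(Beam(int(chan), float(bmaj), float(bmin), float(bpa)))
    return beams


def failed_channels(beams: List[Beam]) -> List[int]:
    """Channels whose three beam parameters are all zero."""
    return [b.channel for b in beams
            if b.bmaj_arcsec == 0 and b.bmin_arcsec == 0 and b.bpa_deg == 0]


sample = """#Channel BMAJ[arcsec] BMIN[arcsec] BPA[deg]
0 64.4269 59.2985 -70.8055
1 64.4313 59.299 -70.8831
5 0 0 0
"""
beams = parse_beamlog(sample)
print(beamlog_name("image.i.blah"))
print(failed_channels(beams))
```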
Parameter | Type | Default | Description |
---|---|---|---|
imagetype | string | “casa” | Type of the image handler (determines the format of the images, both which are written to or read from the disk). The default is to create casa images but “fits” can also be chosen. |
dataset | string or vector <string> | None | Measurement set file name to read from. Usual substitution rules apply if the parameter is a single string. If the parameter is given as a vector then the sets can be different frequencies of the same observation or different epochs. |
nworkergroups | int | 1 | Number of worker groups. This option can only be used in the parallel mode. If it is greater than 1, the model parameters are distributed (as evenly as possible) between the given number of groups of workers (e.g. if one calculates a Taylor term expansion of the order of 1 for one image, setting this parameter to 3 will allow parallel computation of the Taylor terms for this image). This is on top of the normal parallelism within the group (the %w index spans from 0 to the number of workers per group - 1). Essentially, this option allows several workers to be thrown at the same problem if the model allows partitioning. Taylor terms, faceting and multiple images in the model are the typical use cases. |
nchanpercore | int | 1 | Number of channels allocated to each worker core. |
Channels | vector <int> | [] | Channels to be selected from the measurement set. Syntax is [<num>,<start>]. Defaults to all the channels. |
Frequencies | vector <double> | [] | Dimensions of the output cube in frequency space. Syntax is [<num>,<start>,<width>], all in Hz. Defaults to the same as the input MS. |
beams | vector <int> | [0] | Beam number to be selected from the measurement set. |
nwriters | int | 1 | The number of output cubes to generate in spectral cube mode. |
freqframe | string | topo | Generate output cubes in the given frame. Options are topocentric (topo), barycentric (bary) and kinematic local standard of rest (lsrk). |
singleoutputfile | bool | false | Single output cube. Useful in the case of multiple writers. |
solverpercore | bool | false | Turn on distributed solver (simager) mode. |
datacolumn | string | “DATA” | The name of the data column in the measurement set which will be the source of visibilities. This can be useful to process real telescope data which were passed through casapy at some stage (e.g. to image calibrated data which are stored in the CORRECTED_DATA column). In the measurement set convention, the DATA column which is used by default contains raw uncalibrated data as received directly from the telescope. Calibration tasks in casapy make a copy when calibration is applied, creating a new data column. |
sphfuncforpsf | bool | false | If true, the default spheroidal function gridder is used to compute the PSF regardless of the gridder selected for model degridding and residual gridding. This has the potential to produce a better behaved PSF by taking out two major factors of position dependence. Note, this doesn’t make the PSF correct or otherwise, it is just a different approximation. |
calibrate | bool | false | If true, calibration of visibilities will be performed before imaging. See Access to calibrator solutions for details on calibration parameters used during this application process. |
calibrate.scalenoise | bool | false | If true, the noise estimate will be scaled in accordance with the applied calibrator factor to achieve proper weighting. |
calibrate.allowflag | bool | false | If true, corresponding visibilities are flagged if the inversion of Mueller matrix fails. Otherwise, an exception is thrown should the matrix inversion fail. |
calibrate.ignorebeam | bool | false | If true, the calibration solution corresponding to beam 0 will be applied to all beams. |
gainsfile | string | “” | This is an obsolete parameter, still supported for backwards compatibility, which defines the file with antenna gains (a parset format; keywords look like gain.g11.0, where g11 or g22 in the middle correspond to different polarisations and the trailing number is the zero-based antenna number). The default value (empty string) means no gain correction is performed. The gain file format is the same as produced by Ccalibrator. |
restore | bool | false | If true, the image will be restored (by convolving with the given 2D gaussian). This is an additional step to normal imaging, which, by default, ends with just a model image. The restored image is written into a separate image file (with the .restore suffix). The convolution is done with the restore solver (see also Solvers) which reuses the same parameters used to setup the image solver (and therefore ensuring the same preconditioning is done). The only additional parameter of the restore solver is the shape of the gaussian representing clean beam (or flag to determine the shape). It is given by the restore.beam parameter, which must be present if restore is set to True. |
residuals | bool | true | If true write out the residual image. |
restore.beam | vector <string> | None | Either a single word fit or a quantity string describing the shape of the clean beam (to convolve the model image with). If quantity is given it must have exactly 3 elements, e.g. [30arcsec, 10arcsec, 40deg]. Otherwise an exception is thrown. This parameter is only used if restore is set to True. If restore.beam=fit, the code will fit a 2D gaussian to the PSF image (first encountered if multiple images are solved for) and use the results of this fit. |
restore.beam.cutoff | double | 0.05 | Cutoff for the support search prior to beam fitting, as a fraction of the PSF peak. This parameter is only used if restore.beam=fit. The code does fitting on a limited support (to speed things up and to avoid sidelobes influencing the fit). The extent of this support is controlled by this parameter representing the level of the PSF which should be included into support. This value should be above the first sidelobe level for meaningful results. |
restore.equalise | bool | false | If true, the final residual is multiplied by the square root of the truncated normalised weight (i.e. additional weight described by Sault et al. (1996), which gives a flat noise). Note, that the source flux densities are likely to have position-dependent errors if this option is used because not all flux is recovered during the clean process. However, the images look aesthetically pleasing with this option. |
restore.updateresiduals | bool | true | The residual image written out by the restore solver can be updated using the latest model. This is now the default behaviour. Note that the majorcycle outputs do not pass through the restore solver, so they are not updated and therefore correspond to the residuals at the beginning of the last minor cycle. |
Images.xxx | various | | A number of parameters given in this form define the images one wants to produce (shapes, positions, etc). The details are given in a separate section (see below). |
nUVWMachines | int32 | number of beams | Size of uvw-machines cache. uvw-machines are used to convert uvw from a given phase centre to a common tangent point. To reduce the cost to set the machine up (calculation of the transformation matrix), a number of these machines is cached. The key to the cache is a pair of two directions: the current phase centre and the tangent centre. If the required pair is within the tolerances of that used to setup one of the machines in the cache, this machine is reused. If none of the cache items matches the least accessed one is replaced by the new machine which is set up with the new pair of directions. The code would work faster if this parameter is set to the number of phase centres encountered during imaging. In non-faceting case, the optimal setting would be the number of synthetic beams times the number of fields. For faceting (btw, the performance gain is quite significant in this case), it should be further multiplied by the number of facets. Direction tolerances are given as a separate parameter. |
uvwMachineDirTolerance | quantity string | “1e-6rad” | Direction tolerance for the management of the uvw-machine cache (see nUVWMachines for details). The value should be an angular quantity. The default value corresponds roughly to 0.2 arcsec and seems sufficient for all practical applications within the scope of ASKAPsoft. |
gridder | string | None | Name of the gridder, further parameters are given by gridder.something. See Gridders for details. |
rankstoringcf | int | 1 | In the parallel mode, only this rank will attempt to export convolution functions if this operation is requested (see tablename option in the Gridders). This option is ignored in the serial mode. |
visweights | string | “MFS” if any nterms>1, “” otherwise | If this parameter is set to “MFS”, gridders are set up to grid/degrid with the weight required for multi-frequency synthesis. At the moment, this parameter is decoupled from the image setup, which has to be done separately in a consistent way to use MSMFS (nterms should be set to something greater than 1). |
visweights.MFS.reffreq | double | ave freq (see frequency above) | Reference frequency in Hz for MFS processing (see above). |
solver | string | None | Name of the solver, further parameters are given by solver.something. See Solvers for details. |
threshold.xxx | various | | Thresholds for the minor and major cycle (cycle termination criterion), see Solvers for details. |
preconditioner.xxx | various | | Preconditioners applied to the normal equations before the solver is called, see Solvers for details. |
ncycles | int32 | 0 | Number of major cycles (and iterations over the dataset). |
sensitivityimage | bool | true | If true, an image with theoretical sensitivity will be created in addition to weights image. |
sensitivityimage.cutoff | float | 0.01 | Desired cutoff in the sensitivity image. |
freqframe | string | topo | Frequency frame to work in (the frame is converted when the dataset is read). Either lsrk or topo is supported. |
channeltolerance | double | 0 | Whether to use the floating-point tolerance in comparing frequencies from different datasets, allowing for small differences in the frequency settings. Default is to require the frequencies to match exactly. |
This section describes parameters used to define images, i.e. what area of the sky one wants to image and how. All parameters given in the following table have the Cimager.Images prefix, e.g. Cimager.Images.reuse = false
Parameter | Type | Default | Description |
---|---|---|---|
reuse | bool | false | If true, the model images will be read from the disk (from the image files they are normally written to according to the parset) before the first major cycle. If false (the default), a new empty model image will be initialised for every image solved for. Setting this parameter to true allows cleaning of the same image to continue if more major cycles are required after inspection of the image. Note, there is little cross-checking that the image given as an input is actually a result of a previous run of cimager with the same Image parameters, so the user is responsible for ensuring that the projection, shape, etc. match. |
shape | vector <int> | 1.7 * pb FWHM (~1st null) + 2 * max(pb offset) where pb FWHM = 1.2 * lambda / 12 | Optional parameter to define the default shape for all images. If an individual shape parameter is specified separately for one of the images, this default value of the shape is overridden. Individual shape parameters (see below) must be given for all images if this parameter is not defined. Must be a two-element vector. |
cellsize | vector <string> | 1/max(u,v) / 6 rad | Optional parameter to define the default pixel (or cell) size for all images. If an individual cellsize parameter is specified separately for one of the images, this default value is overridden. Individual cellsize parameters (see below) must be given for all images, if this parameter is omitted. If defined, a 2-element quantity string vector is expected, e.g. [6.0arcsec, 6.0arcsec]. |
writeAtMajorCycle | bool | false | If true, the current images are written to disk after each major cycle (.cycle suffix is added to the name to reflect which major cycle the image corresponds to). By default, the images are only written after ncycles major cycles are completed. |
Names | vector <string> | None | List of image names which this imager will produce. If more than one image is given, a superposition is assumed (i.e. visibilities are fitted with a combined effect of two images; two measurement equations are simply added). Parameters of each image defined in this list must be given in the same parset using ImageName.something keywords (with the usual prefix). Note, all image names must start with the word image (this is how parameters representing images are distinguished from other types of free parameters in ASKAPsoft), otherwise an exception is thrown. Examples of valid names are: image.10uJy, image, imagecena. |
ImageName.nchan | int32 | 1 | Number of spectral planes in the image cube to produce. Set it to 1 if just a 2D image is required. |
ImageName.frequency | vector <double> | [min freq, max freq] if nchan>1, or [ave freq, ave freq] if nchan=1, where ave freq = (min+max)/2 | Frequencies in Hz of the first and the last spectral channels to produce in the cube. The range is binned into nchan channels and the data are gridded (with MFS) into the nearest image channel (therefore, the number of image channels given by the nchan keyword may be less than the number of spectral channels in the data). If nchan is 1, all data are MFS’ed into a single image (however the image will have a degenerate spectral axis with the frequency defined by the average of the first and the last element of this vector; it is practical to make both elements identical when nchan is 1). The vector should contain 2 elements at all times, otherwise an exception is thrown. Note: these are the min and max frequencies being processed, which may be a subset of the full frequency range. |
ImageName.direction | vector <string> | phase centre of the visibilities | Direction to the centre of the required image (or tangent point for facets). This vector should contain a 3-element direction quantity containing right ascension, declination and epoch, e.g. [12h30m00.00, -45.00.00.00, J2000]. Note that a casa style of declination delimiters (dots rather than colons) is essential. Only J2000 directions are currently supported. |
ImageName.tangent | vector <string> | “” | Direction to the user-defined tangent point, if different from the centre of the image. This vector should contain a 3-element direction quantity containing right ascension, declination and epoch, e.g. [12h30m00.00, -45.00.00.00, J2000] or be empty (in this case the tangent point will be in the image centre). Note that a casa style of declination delimiters (dots rather than colons) is essential. Only J2000 directions are currently supported. This option doesn’t work with faceting. |
ImageName.ewprojection | bool | false | If true, the image will be set up with the NCP or SCP projection appropriate for East-West arrays (w-term is equivalent to this coordinate transfer for East-West arrays). |
ImageName.shape | vector <int> | None | Optional parameter if the default shape (without image name prefix) is defined. This value will override the default shape for this particular image. Must be a 2-element vector. |
ImageName.cellsize | vector <string> | None | Optional parameter if the default cell size (without image name prefix) is defined. This value will override the default cell size for this particular image. A two-element vector of quantity strings is expected, e.g. [6.0arcsec, 6.0arcsec]. |
ImageName.nfacets | int32 | 1 | Number of facets for the given image. If greater than one, the image centre is treated as a tangent point and nfacets facets are created for this given image (parameters/output model images will have names like ImageName.facet.x.y, where x and y are 0-based facet indices varying from 0 to nfacets-1). The facets are merged together into a single image in the restore solver (i.e. it would happen only if restore is true). |
ImageName.polarisation | vector <string> | [“I”] | Polarisation planes to be produced for the image (should have at least one). Polarisation conversion is done on-the-fly, so the output polarisation frame may differ from that of the dataset. An exception is thrown if there is insufficient information to obtain the requested polarisation (e.g. there are no cross-pols and a full stokes cube is requested). Note, ASKAPsoft uses the correct definition of stokes parameters, i.e. I=XX+YY, which is different from casa and miriad (which imply I=(XX+YY)/2). The code parsing the value of this parameter is quite flexible and allows many ways to define the stokes axis, e.g. [“XX YY”] or [“XX”,“YY”] or “XX,YY” are all acceptable. |
ImageName.nterms | int32 | 1 | Number of Taylor terms for the given image. If greater than one, the given number of Taylor terms is generated for the image, named ImageName.taylor.x, where x is the 0-based Taylor order (note, it can be combined with faceting, causing the names to be more complex). This name substitution happens behind the scenes (as for faceting) and a number of images (representing Taylor terms) is created instead of a single one. This option should be used in conjunction with visweights (see above) to use the multi-scale multi-frequency algorithm. With visweights=“MFS” the code recognizes different Taylor terms (using the taylor.x name suffix) and applies the appropriate order-dependent weight. |
ImageName.facetstep | int32 | min(shape(0), shape(1)) | Offset in tangent plane pixels between facet centres (assumed the same for both dimensions). The default value is the image size, which means no overlap between facets (no overlap on the shortest axis for rectangular images). Overlap may be required to achieve a reasonable dynamic range with faceting (aliasing from the sources located beyond the facet edge). The alternative way to address the same problem is the padding option of the gridder (see Gridders for details). |
#
# Input measurement set
#
Cimager.dataset = 10uJy_stdtest.ms
#
# Define the image(s) to write
#
Cimager.Images.Names = [image.i.10uJy_clean_stdtest]
Cimager.Images.shape = [2048,2048]
Cimager.Images.cellsize = [6.0arcsec, 6.0arcsec]
Cimager.Images.image.i.10uJy_clean_stdtest.frequency = [1.420e9,1.420e9]
Cimager.Images.image.i.10uJy_clean_stdtest.nchan = 1
Cimager.Images.image.i.10uJy_clean_stdtest.direction = [12h30m00.00, -45.00.00.00, J2000]
#
# Use a multiscale Clean solver
#
Cimager.solver = Clean
Cimager.solver.Clean.algorithm = MultiScale
Cimager.solver.Clean.scales = [0, 3, 10, 30]
Cimager.solver.Clean.niter = 10000
Cimager.solver.Clean.gain = 0.1
Cimager.solver.Clean.tolerance = 0.1
Cimager.solver.Clean.verbose = True
Cimager.threshold.minorcycle = [0.27mJy, 10%]
Cimager.threshold.majorcycle = 0.3mJy
Cimager.ncycles = 10
#
# Restore the image at the end
#
Cimager.restore = True
Cimager.restore.beam = [30arcsec, 30arcsec, 0deg]
#
# Use preconditioning for deconvolution
#
Cimager.preconditioner.Names = [Wiener, GaussianTaper]
Cimager.preconditioner.Wiener.noisepower = 100.0
Cimager.preconditioner.GaussianTaper = [20arcsec, 20arcsec, 0deg]
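Before submitting a job it can be handy to sanity-check a parset like the one above. The sketch below is a deliberately simple reader for the `key = value` form with `#` comments used in the example, plus one check taken from the parameter table (restore.beam must be present if restore is true). It is not the ASKAPsoft parset parser and does not handle any syntax beyond what the example shows; the function names are hypothetical.

```python
from typing import Dict


def parse_parset(text: str) -> Dict[str, str]:
    """Read 'key = value' lines into a dict, skipping blanks and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key = value line: {raw!r}")
        params[key.strip()] = value.strip()
    return params


def check_restore(params: Dict[str, str], prefix: str = "Cimager.") -> None:
    """Raise if restore is enabled but restore.beam is missing."""
    restore = params.get(prefix + "restore", "false").lower() == "true"
    if restore and (prefix + "restore.beam") not in params:
        raise ValueError("restore is true but restore.beam is not set")


example = """
# Restore the image at the end
Cimager.restore = True
Cimager.restore.beam = [30arcsec, 30arcsec, 0deg]
Cimager.ncycles = 10
"""
params = parse_parset(example)
check_restore(params)
print(params["Cimager.restore.beam"])
```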