SAGECAL [#f1]_
==============

This chapter is a step by step description of the SAGECAL method for self-calibration of LOFAR data. The mathematical framework of the algorithm can be found in Yatawatta et al. 2009 and Kazemi et al. 2011. The contents are applicable to the latest version (0.3.8) of SAGECAL and older versions are obsolete. The latest source code can be obtained from `http://sagecal.sf.net <http://sagecal.sf.net>`_.

------------
Introduction
------------

The acronym SAGECAL stands for Space Alternating Generalized Expectation Maximization Calibration. The Expectation Maximization is used as a solution to maximum likelihood estimation to reduce the computational cost and speed of convergence. The commonly used Least Squared calibration method involves the inversion of a matrix corresponding to the full set of unknown parameters; where the convergence to a local minimum impels a slow speed of convergence and significant computational cost. The SAGE algorithm, on the other hand, allows to compute a direct estimation of subsets unknown parameters, providing faster convergence and/or reduced computational costs. If K is the number of sources and N are the stations, the computational cost scales as :math:`{\mathcal O} ((KN)^2)` for the Least Squared, while is :math:`{\mathcal O} (K N^2)` for the Expectation Maximization. 

BBS uses the Least Squared algorithm to solve the Measurement Equation. The maximum number of directions for which solve using directional gains with BBS is currently limited to 5 or 6 directions; this is a consequence of the limited computing power available in CEP2 and CEP3. The typical LOFAR Field customarily requires to solve for a number of directions generally higher than 5 or 6, this is due to the wide field of view (FOV), variable beam pattern and ionospheric errors. SAGECAL has been tested to solve to a maximum of 300 directions (in the GPU EoR cluster not in CEP2 or CEP3) and 512 stations.  The version installed in the CEP clusters is not optimized so that other users can also use the same node while SAGECAL is running; e.g. a total of about 50 directions have been successfully tested in CEP1 cluster.

-------------
Using SAGECAL
-------------

^^^^^^^^^^^^^^^^
Data preparation
^^^^^^^^^^^^^^^^

Before running SAGECAL, you need to take a few precautions. First of all, the data format is the Measurement Set (MS), so no extra conversions are necessary. Averaging in time is not essential, but it is definitely recommended in order to work with smaller datasets (you can average down to e.g. 10 seconds). If data have been already demixed,  this step is not needed. The MS can have more than one channel. Also it is possible to calibrate more than one MS together in SAGECAL. This might be useful to handle situations where the data is very noisy. An interesting note about SAGECAL: If the conventional demixing could not be performed on your data (e.g. if the A-team was too close to the target source) you will be able to successfully remove  the A-team sources using this new algorithm by working on averaged data(!). Another procedure you will need apply to your data before SAGECAL is to  calibrate them in the standard way with BBS, solving for the four G Jones elements as well as for the element beam. After BBS, you will need to flag the outliers using DPPP. You are now ready to start the algorithm. 

All the programs related to the SAGECAL can be found in **/opt/cep/sagecal/bin/**  or see **/home/sarod/bin** for the latest build.

^^^^^
Model
^^^^^

Make an image of your MS (using casapy or awimager). You can use Duchamp to create a mask for the image. To create a sky model, you can adopt buildsky  for point sources and shapelet_gui in case of extended emission (see `Sky model construction using Shapelets <./shapelets.html>`_). Note that you are free to use any other source finder (like `PyBDSF <./sourcedetection.html>`_) if you feel more confident with it. A new functionality has been implemented to transform the the BBS model into the model format used by SAGECAL. The important is that in the sky model any source name starting with 'S' indicates shapelet, 'D' a disk, 'R' a ring, 'G' a Gaussian, while others keys are point sources. In the following, we report a few useful sky models examples::

    ## name h m s d m s I Q U V spectral_index RM extent_X(rad) extent_Y(rad) 
    ## pos_angle(rad) freq0 
    P1C1 0 12 42.996 85 43 21.514 0.030498 0 0 0 -5.713060 0 0 0 0 115039062.0
    P5C1 1 18 5.864 85 58 39.755 0.041839 0 0 0 -6.672879 0 0 0 0 115039062.0
    
    # A Gaussian mjor,minor 0.1375,0.0917 deg diameter, pa 43.4772 deg 
    G0  5 34 31.75 22 00 52.86 100 0 0 0 0.00 0 0.0012  0.0008 -2.329615801 130.0e6 
    
    # A Disk radius=0.041 deg
    D01 23 23 25.67 58 48 58 80 0 0 0 0 0 0.000715 0.000715 0 130e6
    
    # A Ring radius=0.031 deg
    R01 23 23 25.416 58 48 57 70 0 0 0 0 0 0.00052 0.00052 0 130e6
    
    # A shapelet ('S3C61MD.fits.modes' file must be in the current directory)
    S3C61MD 2 22 49.796414 86 18 55.913266 0.135 0 0 0 -6.6 0 1 1 0.0 115000000.0
    
Note that it is also possible to have sources with 3rd order spectra (with -F 1 option). Here is such an example::

    ##  name h m s d m s I Q U V spectral_index0 spectral_index1 spectral_index2 
    ## RM extent_X(rad) extent_Y(rad) pos_angle(rad) freq0
    PJ1C1 18 53 33.616 86 10 19.559 0.008594 0 0 0 -5.649676 -2.0 -60.0 0 0 0 0 152391463.2
    
But all the sources should have either 1st order or 3rd order spectra, mixing is not allowed.

The number of directions you want to solve for are described in the cluster file. In here, one or more sources corresponding to the patch of the skymodel you need to correct for are combined together as the following example::

    ## cluster_id chunk_size source1 source2 ...
    0 1 P0C1 P0C2
    1 3 P11C2 P11C1 P13C1
    2 1 P2C1 P2C2 P2C3

Note: comments starting with a '#' are allowed for both sky model and cluster files.

^^^^^^^
SAGECAL
^^^^^^^

SAGECAL will solve for all directions described in the cluster file and will subtract the sources listed in the sky model. Note that if you are not interested in the residual data, putting negative values for **cluster_id** will not subtract the corresponding sources from data. Run SAGECAL as follows::

    sagecal -d my.MS -s my_skymodel -c my_clustering -t 120 -p my_solutions

This will read the data from the DATA column of the MS and write the calibrated data to the CORRECTED_DATA column. If these columns are not present, you have to create them first.

If you need to calibrate more than one MS together, first create a text file with all the MS names, line by line. Then run ::

    sagecal -f MS_names.txt -s my_skymodel -c my_clustering -t 120 -p my_solutions
    
Running sagecal -h will provide the additional options, some of them are::

    -F sky model format: 0: LSM, 1: LSM with 3 order spectra : default 0
    -I input column (DATA/CORRECTED_DATA) : default DATA
    -O ouput column (DATA/CORRECTED_DATA) : default CORRECTED_DATA
    -e max EM iterations : default 3
    -g max iterations  (within single EM) : default 2
    -l max LBFGS iterations : default 10
    -m LBFGS memory size : default 7
    -n no of worker threads : default 6
    -t tile size : default 120
    -x exclude baselines length (lambda) lower than this in calibration : default 0
    
    Advanced options:
    -k cluster_id : correct residuals with solution of this cluster : default -99999
    -j 0,1,2... 0 : OSaccel, 1 no OSaccel, 2: OSRLM, 3: RLM:  4: RTR, 5: RRTR: default 0

Use a solution interval (e.g. -t 120) that is big enough to get a decent solution and not too big to make the parameters vary too much (about 20 minutes per solution is a reasonable value).

In case of both bright and faint sources in the model, you might need to use different solution intervals for different clusters. In order to do that you need to define the values of the second column of the cluster file in such a way that the cluster with the longest solution interval is 1. While the cluster with shorter solution interval will be equal to n, where n is the number of times the longer solution interval is divided. This means that if -t 120 is used to select 120 timeslots, cluster 0 will find a solution using the full 120 timeslots, while cluster 1 will solve for every 120/3=40 timeslots. The option -k will allow to correct the residuals using the solutions calculated for a specific direction which is defined by the **cluster_id**.  The -k option is analogue to the correct step in BBS. See the simulations section later in this chapter for more detailed use of this.

You are now ready to image  the residual data. Successively, run "restore" (see `Sky model construction using Shapelets <./shapelets.html>`__) on the residual image before updating the sky model and starting another loop of SAGECAL. SAGECAL cycles can be done till you are satisfied by the end product.

^^^^^^^^^^
Robustness
^^^^^^^^^^

Many have experienced flux loss of weaker background sources after running any form of directional calibration. There is a new algorithm in SAGECAL that minimizes this. To enable this, use -j 2 option while running SAGECAL. The theory can be found in Kazemi and Yatawatta, 2013. Also for best speed and robustness, it is recommended to use -j 5. Also for number of stations greater than 64, -j 5 option is faster and less memory consuming.

^^^^^^^^^^^
Simulations
^^^^^^^^^^^

After running SAGECAL, you get solutions for all the directions used in the calibration. Using these solutions, it is possible to correct the data for each direction and make images of that part of the sky. This will in theory enable to image the full field of view with correct solutions applied to each direction. This is a brief description on how to do it. We will use an example to illustrate this. Assume you have run SAGECAL as below::

    sagecal -d my.MS -s my_skymodel -c my_clustering -t 120 -p my_solutions
    
Now you have the my_solutions file that contains solutions for all the directions given in my_clustering file. The -a 1 option or -a 2 option enables simulation mode in SAGECAL. If -a 1 is given, the sky model given by my_skymodel and my_clustering is simulated (with gains taken from solutions if -p my_solutions is given) and written to the output data. If -a 2 is given, the model given by my_skymodel and my_clustering is simulated (with the gains if -p my_solutions is given), and then added to the input data (-I option) and written to the output data (-O option). 

There is one additional thing you can do while doing a simulation: correction for any particular direction. By using -k cluster_id , you can select any cluster number from the my_clustering file to use as the solutions that are used to correct the data. If the cluster_id specified is not present in the my_clustering file, no corrections will be applied.

Simulating the full sky model back to the data might be too much in some cases. It is also possible to simulate only a subset of clusters back to the data. This is done by specifying a list of clusters to ignore during the simulation, using -z ignore_list option. Here, ignore_list is a text file with the cluster numbers (one per line) to ignore. Note also that even while clusters are ignored, their solutions can still be used to correct the data (so these options are independent from each other).

Thus, by cleverly using the -a, -k and -z options, you can get an image that is fully corrected (using SAGECAL solutions) for all directional errors and moreover, simulations take only a fraction of the time taken for calibration.

-----------------------
Distributed calibration
-----------------------

It is possible to use SAGECAL to calibrate a large number of subbands, distributed across many computers. This is done by using OpenMPI and SAGECAL together; see sagecal-mpi -h for further information. Here is a simple example:

^^^
MPI
^^^

Using MPI, you can run a program across a network of computers. The basic way to run any program with MPI is ::

    mpirun -np N -hostfile machines.txt (your program) (program options)
    
where **-np** gives the number of processes (can be different from the number of computers) and **-hostfile** gives the computer names to use (**machines.txt** includes the computer names to use). Always recommended to use the full path for the program to run. If you are only using one computer, you can skip the hostfile option.

^^^^^^^^^^^^^
Initial setup
^^^^^^^^^^^^^

Let's say there are 10 subbands, **SB1.MS**, **SB2.MS**, to **SB10.MS** in one computer. First you need to create a text file (say **mslist.txt**) ::

    SB1.MS
    SB2.MS
    ...
    SM10.MS

that includes all the MS names, one per each line. Since we are running on one computer, no need to create a hostfile.

^^^^^^^^^^^^^^^
SAGECAL options
^^^^^^^^^^^^^^^

Apart from the usual command line options used in running stand alone SAGECal, there are a few special options for running it in a disributed way.

+ The number of ADMM iterations is given by **-A** option.
+ The type of polynomial in freqyency to use as a regularizer is given by **-Q** option.
+ The number of freqyency terms in the polynomial is given by **-P** option.
+ The regularization (penalty) factor is given by the **-r** option.

^^^^^^^
Running
^^^^^^^

Since there are 10 subbands, we need to invoke 10+1 SAGECal processes, this is done by selecting **-np 11**. The simplest way to run is ::

    mpirun -np 11 /home/sarod/bin/sagecal-mpi -A 10 -P 2 -r 10  -s my_skymodel -c my_clustering -t 120 -p my_solutions
    
this will calibrate each subband the usual way, but the solutions will be much better because information across the frequency range covered by the subbands is used. Note that almost all options used for standalone SAGECal can still be used here.

----------
References
----------

+ S. Yatawatta et al., "Radio interferometric calibration using the SAGE algorithm}", **IEEE DSP**, Marco Island, FL, Jan. 2009.
+ S. Kazemi and S. Yatawatta et al., "Radio interferometric calibration using the SAGE algorithm", **MNRAS, 414-2**, 2011.
+ S. Kazemi and S. Yatawatta, "Robust radio interferometric calibration using the T-distribution", **MNRAS**, 2013.

.. rubric:: Footnotes

.. [#f1] The authors of this chapter are `Emanuela Orru <mailto:orru@astron.nl>`_ and `Sarod Yatawatta <mailto:yatawatta@astron.nl>`_.