AOFlagger [#f1]_ ================ The frequencies covered by LOFAR are considerably affected by RFI, both in the low and the high band. Without proper flagging of the RFI, the RFI affects the image quality or can make calibration fail. This chapter describes how to analyze and flag LOFAR observations using the AOFlagger. In most cases, one would normally use the AOFlagger step in NDPPP with the default settings to flag the data. In this chapter we look at what the flagger does and how to alter its behaviour. -------- Tutorial -------- The tutorial data can be downloaded from the LOFAR LTA. Please refer to the chapter on `practical examples <./tutorial.html>`_ for details. The AOFlagger program comes with a tool called **rfigui** that is part of the **lofar** module on the CEP3 cluster. If you haven't done so, load the module:: module load lofar We will use the rfigui to display the measurement set. The rfigui will not write to the measurement set, so you do not need to copy the measurement set yet. Use the following command to open the measurement set in rfigui:: rfigui L74759_SAP000_SB001_uv.MS ^^^^^^^^^^^^^^^^^^ Browsing baselines ^^^^^^^^^^^^^^^^^^ A window should pop-up which can be used to select how to open the measurement set. Note that the visualized column can be selected here, which is useful to analyze data in a different column (e.g. CORRECTED_DATA). Since this is raw data, it will only have a DATA ("observed") column, so leave the settings as they are and press "Open". Another window will pop-up which allows you to select two antennas, a band and a field. Since LOFAR measurement sets never have multiple bands or fields, they will always have only one option. With the two antenna selection boxes, you can select which cross-correlation to visualize. Select antenna CS002HBA0 in both selection frames and press "Load". It will take some while before the visibilities are loaded. You should now see a dynamic spectrum (time on the x-axis, frequency on the y-axis) of auto-correlation CS002HBA0 x CS002HBA0. Note several things: + When you move the mouse over the dynamic spectrum, the status bar will provide some information of the visibilities at the location of the mouse. + At the bottom of the plot is a purple line. The rfigui uses purple to indicate visibilities that are flagged. In this case, it is flagged because in raw LOFAR data, channel 0 contains invalid data due to how the poly-phase filter works. The purple overlay can be turned off with the "Original flags" toggle button in the toolbox. + Note the sharp patterns that recur in a horizontal way, such as at 119.05 and 119.18 MHz. These are common, and are caused by normal transmitters. Use the mouse to find which features I mean. + Around 13.30 h, vertical features are visible. These can be caused by broadband RFI such as lightning, or by a malfunctioning instrument. + Finally, there are some big wavy features visible. These are intermodulations in the receivers, and are only present in auto-correlations (and sometimes in the cross-baseline of a dual split station, e.g. CS002HBA0 x CS002HBA1). Auto-correlations are much more sensitive to RFI. Because the auto-correlations are normally not used in imaging, in fact this RFI situation is not representative. Let's go to a cross-correlation: press the "Forward" button. This will load the correlations for the baseline of this antenna by the next antenna, in this case CS002HBA0 x CS002HBA1. Notice that the RFI situation is much better, and the dominating features are the consistent transmitters at 119.05 MHz and 119.18 MHz. Press "Forward" a few more times and notice the changes. Let's go to a remote station: go to the "Go" (newer version: "Browse") menu and select "Go to...". Load baseline CS004HBA0 x RS106HBA. Note the smooth vertical features in the background. These are fringes caused by observed sources. Therefore, a flagging strategy should not flag these. ^^^^^^^^^^^^^^^^^^^ Plotting & flagging ^^^^^^^^^^^^^^^^^^^ Power spectra and time-power plots can be helpful to analyze the magnitude of RFI. In menu "Plot", select "Power spectrum". A pop-up window appears with the power over frequency over the selected region in the main window. The RFI is clearly visible as spikes in the spectrum. Note also that the bandpass shape is curving down at 119.23 MHz. One normally does not want to include these edge channels because of these features, and they are therefore cut-off in NDPPP (see the NDPPP tutorial). Now select "Plot time scatter" in the "Plot" menu, and notice how the RFI looks like in this plot. In this plot, different colours show the different polarizations. Also try the "mean spectrum" and "power vs time" plots, and notice how they differ. Normally, the time-frequency plot shows the Stokes I power of the visibilities. However, internally AOFlagger has loaded complex correlations for all four polarizations. Select "Data" :math:`\rightarrow` "Keep real data". Fringes are more pronounced in the real data. When you press one of the "Keep..." options, the rfigui will no longer keep the other parts in memory. This can be confusing: for example, try selecting "Data" :math:`\rightarrow` "Keep imaginary data" and understand the error message. To see the imaginary data, first reload the baseline with "Go" :math:`\rightarrow` "Go to..." (new version: refresh button). Try analyzing the different parts of this correlation by using the other "Data" :math:`\rightarrow` "Keep..." buttons and reloading the baseline when necessary. Before continuing, close the plot window and reload the baseline. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Testing a flagging strategy ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Select "Actions" :math:`\rightarrow` "Execute strategy". A progress window will pop-up, which you can close once the flagging is done. Notice that yellow flags have appeared in the time-frequency plot. The default flagging strategy has indeed flagged all visible RFI. Additionally, it has flagged a bit of the edge channels (e.g. in the top-right), because of the curvature in the pass-band at the edges. This is normally not a problem, since these channels will be cut off with NDPPP. Try toggling the yellow flags off and on with the "Alternative flags" button on the toolbar. Before continuing, make sure that the yellow flags are shown. Plot the power spectrum and time scatter plots as before. The power spectrum will now have two lines: the original one, and one that shows the spectrum after the flagged visibilities have been taken off. The time scatter plot will only show unflagged samples. Indeed, the RFI samples have been removed, and a clear difference between XX/YY and XY/YX is visible. If you hide the yellow flags and replot the scatter plot, the flagged samples will be shown again. Clear the flags by reloading the baseline. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Editing the flagging strategy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Press "Action" :math:`\rightarrow` "Edit strategy". A window will pop-up that shows a tree that represents the flagging script. Every row is an action, and certain actions can have sub-actions [#f2]_. Notice that there are two "SumThreshold" actions: one is in the "iterate 2 times" loop, while the other is below the loop. When you press one of the SumThreshold actions, a settings box will appear. Set the "Base sensitivity" to 0.7 and press "Apply" for both SumThreshold actions. Now, rerun the strategy with "Actions" :math:`\rightarrow` "Execute strategy" and notice the difference. Try playing with the settings below, and rerun the flagger each time. Do not forget to press "Apply" after changing the settings of an action. If you make modifications you don't like, you can press the "Default" button in the edit startegy window to reset the strategy. + The strategy normally slowly converges by iterating a couple of times. Only in the last SumThreshold action, the maximum sensitivity is used. The "Iterate 2 times" action performs this slow convergence. In the "Iterate 2 times" action, try changing the number of iterations to 1 and see if this is enough for this baseline. + Normally, each polarization is searched for RFI. Change the "For each polarization" action so that only Stokes Q is searched for RFI. + In the "Statistical flagging" action, play with the "Minimum time ratio" and "Minimum frequency ratio". For example, is a "Minimum time ratio" of 60\% a good setting? + Try setting the threshold of the "Time selection" action to 2.5. This subband is a rather good subbands. While for LOFAR many subbands are like this, there are also subbands which are much worse. Typically you want to try the strategy on some good and bad subband, as well as on some small on some long baselines. You can chose to flag different subbands with a different strategy, but in general it is easier to have one generic strategy. It is also generally better to flag slightly too much than too little. Finally, it is not necessary to consider the flagging speed. Once you have made a modified strategy, you can save it with the "Save" button in the "Edit strategy window". When saving a strategy, give the filename an extension of ".rfis". These strategies can be used by the AOFlagger step inside NDPPP, to flag and average the data at the same time, as well as that they can be used by the "**aoflagger**" command line program. It is normally faster to do this with NDPPP. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Running the **aoflagger** command line program ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ We are now going to flag the measurement set without running NDPPP. To do this, we need write access to the measurement set. Therefore, copy the measurement set mentioned above to your scratch directory. Measurement sets that are created by the LOFAR correlator have a special "raw" format. To make the flag table writeable by aoflagger, it is necessary to store the flags with a different format. This can be done with the following command:: > makeFLAGwritable L74759_SAP000_SB001_uv.MS Successful read/write open of default-locked table L74759_SAP000_SB001_uv.MS: 23 columns, 5337090 rows Created new FLAG column; copying old values ... FLAG column now stored with SSM to make it writable The measurement set can now be flagged with the aoflagger command. To get a list of commands, run the program without parameters:: > aoflagger AOFlagger 2.7.1 (2015-07-01) command line application This program will execute ... Two settings need to be changed: i) for large raw sets, it is best to use the indirect reading mode; ii) since we have made a modified strategy, we want to specify our new strategy. If you are using aoflagger 2.7, you can immediately specify the .rfis file that you have created in the gui. At the point of writing, you will need to use a file which has been prepared by me, because there is an issue with different versions of the gui and the command line aoflagger. Combined all together, run aoflagger like this:: > aoflagger -indirect-read -strategy ~offringa/tutorials/aoflagger-tutorial.rfis L74759_SAP000_SB001_uv.MS AOFlagger 2.7.1 (2015-07-01) command line application ... 0% : +-+-+-+-+-+-Initializing... ... Please note that the current working directory will be used as a temporary storage location, and a lot of temporary data will be written there. Once the flagger is done, you can open the measurement set in the rfigui, and the found flags will be shown with purple. ------------- Documentation ------------- The manual for AOFlagger can be found on the site `https://aoflagger.readthedocs.io/ `_. The properties and performance of the AOFlagger have been described in the following papers: + `Post-correlation radio frequency interference classification methods `_. + `A morphological algorithm for improving radio-frequency interference detection `_. .. rubric:: Footnotes .. [#f1] This author of this chapter is `Andre Offringa `_. .. [#f2] A description of the steps in the pipeline is given in `this paper `_.