QSIM 2014 - Martin Wendt

(Simulated Datacubes for MUSE prior to FirstLight)


Here, I describe in detail how to set up and run QSIM 2014.
You can jump to individual topics, but this document leads you through the general
process chronologically.
Please send comments, questions and bug reports to: qsim@martinwendt.de


CONTENTS
Setting up MPDAF
Setting up QSIM 2014
Setting up a test scene
Simulating an observation
What QSIM does
The config file
Scene object types
Examples
LSF
FWHM for the GC NGC6397 in commissioning.

MPDAF
First, install and set up the MPDAF package.

If you don't know how to set it up, please drop me an email.
For a successful install of MPDAF, you have to satisfy its prerequisites, either with your packaging system or
with pip (e.g. 'pip install numpy'), which may require further packages (e.g. atlas, gfortran):
python-scipy
python-matplotlib
python-pyfits
python-pywcs
python-nose
python-image

In the new mpdaf directory execute:
python setup.py build
python setup.py install --prefix ~/local
(I prefer --prefix ~/local, which installs MPDAF only for the current user. This makes it easier to keep it
up to date, and no root access to the system is required.)
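
Before building MPDAF, it can help to check which prerequisites the current Python interpreter already finds. The following is a minimal sketch, not part of MPDAF or QSIM. The import names are assumptions derived from the package list above; python-image is left out because its import name is not clear from that list.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
PREREQUISITES = ["numpy", "scipy", "matplotlib", "pyfits", "pywcs", "nose"]


def missing_modules(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(PREREQUISITES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed prerequisites were found.")
```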

QSIM2014
Now, get the latest QSIM version. I set up a private version control system.
You can get the latest QSIM via Mercurial (not Git, which MPDAF uses):
hg clone https://enthusi@bitbucket.org/enthusi/muse-qsim
It contains:
- the QSIM python main file
- a default config file
- a sky emission spectrum
- a sky absorption spectrum

Test scene
To test the current QSIM, and also to learn by doing, I strongly suggest rendering a test scene.
One of the major artificial test scenes is the QSIM DRY-RUN00.

Information on the scene and how to retrieve the scene file and its 18 object files is given
in our MUSE-wiki (you should have access to it):
https://musewiki.aip.de/qsim-run00

At the very bottom you can see how to get the scene + object files from the Lyon FTP server
(check the MUSE wiki above for the password):
sftp ftpdast@urania1.univ-lyon1.fr
The directory is DryRuns/Reference/Scene
There you get the data via:
get scene-ref-udf.fits
get -r dir_refobj
I suggest using fv to inspect the scene file:
http://heasarc.gsfc.nasa.gov/ftools/fv/

To inspect cubes, I mostly use ds9:
http://ds9.si.edu/site/Home.html

The SCENE file ('scene-ref-udf.fits') contains all the information on the objects you want
to 'observe' with MUSE. There are several object types:
'PS' for point sources, i.e. 1-dimensional spectra, like stars
'SG' for galaxy-type sources, a more complex format
'UD' for Ultra Deep Field-like complex objects
'RC' for a raw cube that is already set up properly

RUNNING THE SIMULATION
1)
To run the simulation via remote access you will need an X-forwarding connection (e.g. ssh -X), because of matplotlib.
Keep "qsim.py" and the scene file in the same directory. Depending on the scene file, your object
data may be in a subdirectory (which is the case for the DryRun example scene).

2)
Check the mandatory config.txt (see below for more details; all default settings are suitable for this scene).

3)
If you have special requirements for the sky emission/absorption you will have to provide your own spectra.
The default spectra have done a fine job so far. So, optionally, check sky-emi.fits and sky-abs.fits.

4)
Run qsim with the following parameters:
time python qsim.py -c config.txt [-seeing 0.6 -psf_dump "'moffat.fits'" ...]
- the leading 'time' is optional, but it's good to get an idea of CPU usage
- the '-c config.txt' is mandatory
- you can override ANY config parameter on the command line, e.g. for automated
   execution. Stick to the exact same format as in the config file. Note that
   some string parameters, such as filenames, have to be set in single quotes ' '.
   To add such a keyword value to the command line you have to wrap it in
   double quotes " " as in the example above.
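
For automated execution, the command line can also be assembled from Python. The sketch below is an illustration only: the helper name build_qsim_command is hypothetical, and it assumes that every override is passed as '-key value', as in the example above. When the arguments are passed as a list without a shell, the outer double quotes are not needed, since the single quotes reach qsim.py unchanged.

```python
def build_qsim_command(config_file, overrides=None, script="qsim.py"):
    """Build an argument list for one QSIM run (hypothetical helper).

    String values are wrapped in single quotes, mirroring the config file.
    """
    command = ["python", script, "-c", config_file]
    for key, value in (overrides or {}).items():
        if isinstance(value, str):
            value = "'" + value + "'"
        command += ["-" + key, str(value)]
    return command


if __name__ == "__main__":
    cmd = build_qsim_command("config.txt",
                             {"seeing": 0.6, "psf_dump": "moffat.fits"})
    print(cmd)
    # To actually start the simulation:
    # import subprocess; subprocess.call(cmd)
```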

The DryRun scene with the default config takes roughly 90 CPU minutes on a
single core of an AMD Opteron 8435. It scales rather well with multiple cores (shared memory):
using multiple cores, this reduces to < 30 minutes. The bottleneck is complex individual objects, each of which
is handled by a single core (see graph below). The CPU-intensive LSF and PSF steps scale basically linearly
with the number of cores.
That becomes obvious when you compare the execution times of a second run with the same specs
(all cube objects are already stored as oc_ cubes and read in on the fly):
roughly 40 CPU minutes (RAW, LSF, PSF, NOISE) and less than 4 minutes total execution time with multiple cores.

HOW QSIM WORKS:
First, QSIM checks the scene file (which might cover objects far outside the specified FOV) and
computes each object's size to judge which objects to render.

Then all non-stellar objects are rendered and STORED individually on hard disk. Make sure you have
sufficient disk space available. The rendering of the individual objects is parallelized.

Next, an empty data cube is created according to the given FOV, and each object is read in from disk and
inserted serially. Note that a second run of the scene will skip the generation of the individual objects,
which can be quite CPU intensive for large or complex SG and UD objects.
The resulting complete RAW cube is stored on disk.

Now, each spaxel is convolved spectrally and the LSF convolved cube stored on disk.
This process is carried out in parallel.
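
As an illustration of what a spectral convolution of one spaxel involves, here is a minimal sketch in plain Python. It assumes a Gaussian LSF with a fixed FWHM given in spectral bins. This is a simplification chosen for the example and not necessarily the LSF model QSIM uses; both function names are hypothetical.

```python
import math


def gaussian_kernel(fwhm_pix):
    """Normalized Gaussian kernel; fwhm_pix is the FWHM in spectral bins."""
    sigma = fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(4.0 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_spectrum(spectrum, kernel):
    """Convolve one spaxel spectrum, keeping its length (edges zero-padded).

    The kernel is symmetric, so no flip is needed.
    """
    half = len(kernel) // 2
    out = [0.0] * len(spectrum)
    for i in range(len(spectrum)):
        for j, weight in enumerate(kernel):
            k = i + j - half
            if 0 <= k < len(spectrum):
                out[i] += weight * spectrum[k]
    return out
```

Since every spaxel is processed independently, such a loop over spaxels can be distributed over several cores.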

The Moffat-PSF convolution is also carried out in parallel and the cube is stored.

Finally, the noise, instrument throughput and photon rate according to exposure time and number of exposures
are applied to the LSF_PSF cube and stored. This cube has roughly twice the size of the former cubes as it carries
information on the variances.

The CONFIG file
wav_min = 4800.0 #lower wavelength limit (AA)
wav_max = 9300.0 #upper wavelength limit (AA)

This is the wavelength range of the final cube. However, the cube
will be truncated to the range that is covered by all objects.

d_lambda = 1.25 #constant bin size (AA)

The spectral bin size, constant throughout the whole spectrum.
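
To get a feeling for the resulting spectral dimension, the wavelength grid for the default values can be sketched as follows. This assumes that both wav_min and wav_max are included as grid points, which is an assumption made for illustration; QSIM may handle the endpoints differently, and the function name is hypothetical.

```python
def wavelength_grid(wav_min, wav_max, d_lambda):
    """Return evenly spaced wavelengths from wav_min to wav_max (inclusive)."""
    n_bins = int(round((wav_max - wav_min) / d_lambda))
    return [wav_min + i * d_lambda for i in range(n_bins + 1)]


grid = wavelength_grid(4800.0, 9300.0, 1.25)
print(len(grid), grid[0], grid[-1])  # prints: 3601 4800.0 9300.0
```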

namebase = 'testcube3' #name base of generated cubes

The names of the resulting data cubes are as follows:
  namebase +'_raw.fits'
  namebase +'_lsf.fits'
  namebase +'_lsf_psf.fits'
  namebase +'_lsf_psf_noise.fits'

deb_file = 'debug.txt' #debugging log file

Name of the debug-log-file

plt_file = 'plot.txt' #output of coordinates

All object positions are written into this file,
which is useful e.g. for crowded scenes like globular clusters

scene = 'scene-ref-udf.fits' #input scene file

The scene file to be rendered, also if you generate
your own testscene within QSIM

seeing = 0.8 #parameter for PSF (arcsec)
airmass = 1.2 #parameter for PSF
psf_beta = 2.5 #parameter for PSF
psf_l0 = 22 #parameter for PSF (m)

Physical parameters relevant for the MOFFAT PSF

psf_size = 8.0 #truncate moffat at psf_size (arcsec)

Total RANGE of the PSF in arcsec (NOT FWHM), i.e. a psf_size of 4"
results in a PSF image of 4/0.2 = 20 pixels per side (20x20 pixels).
Spaxels outside of that box will not be affected at all.
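
The relation between psf_size and the kernel dimensions can be sketched with a truncated Moffat profile. This is an illustration under assumptions: the FWHM is passed in directly (how QSIM derives it from seeing, airmass and psf_l0 is not covered here), any wavelength dependence is ignored, the 0.2 arcsec pixel scale is taken from the 4/0.2 example above, and the function name is hypothetical.

```python
import math


def moffat_kernel(fwhm, beta, psf_size, pixel_scale=0.2):
    """Normalized, truncated Moffat kernel as a list of rows.

    fwhm, psf_size and pixel_scale are given in arcsec.
    """
    n = int(round(psf_size / pixel_scale))  # pixels per side
    # Standard Moffat relation: FWHM = 2 * alpha * sqrt(2**(1/beta) - 1)
    alpha = fwhm / (2.0 * math.sqrt(2.0 ** (1.0 / beta) - 1.0))
    center = (n - 1) / 2.0
    kernel = []
    for iy in range(n):
        row = []
        for ix in range(n):
            r2 = ((ix - center) ** 2 + (iy - center) ** 2) * pixel_scale ** 2
            row.append((1.0 + r2 / alpha ** 2) ** (-beta))
        kernel.append(row)
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]
```

With the default psf_size = 8.0 this gives a 40x40 pixel kernel; with psf_size = 4.0 it gives the 20x20 pixels mentioned above.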

psf_src = '' #use stored PSF cube (skip if ='')
psf_dump = 'moffat.fits' #store PSF cube (skip if ='' or psf_src !='')

If psf_src is given, the corresponding data cube will be used as PSF.
It has to match the spectral dimension of the data cube.
By default the PSF cube is dumped as moffat.fits and can then, e.g., be edited
and re-used.

noise_r = 2.5 #readout noise level (e-)
noise_d = 1.0 #dark current (e-/hour)

Detector specific parameters

no_exp = 10 #number of exposures
exp_secs = 3600 #exposure time per exposure (sec)
fov_y0 = -30.0 #field of view, bottom (arcsec)
fov_x0 = -30.0 #field of view, left (arcsec)
fov_y1 = 30.0 #field of view, top (arcsec)
fov_x1 = 30.0 #field of view, right (arcsec)

Parameters describing the actual observations. The resulting cube will be
automatically cut around the used scene, so fov_x0 to fov_y1 are the maximum-size
boundaries.

cpus = 0 #number of cpus used

The number of CPU cores you wish to dedicate to QSIM, if you specify
0, all the detected cores will be used.
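
The meaning of cpus = 0 can be expressed in a few lines. This is a sketch of the described behaviour using the standard library, not QSIM's actual code; the function name is hypothetical.

```python
import os


def resolve_cpus(cpus):
    """Return the number of worker processes for a given 'cpus' setting."""
    if cpus == 0:
        # os.cpu_count() may return None on unusual systems; fall back to 1.
        return os.cpu_count() or 1
    return cpus
```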

do_raw = yes #render and write raw object cube
do_lsf = yes #render and write LSF
do_psf = yes #render and write PSF
do_noise = yes #render and write final noise cube

Specify which cubes you want to render. Note that QSIM expects the
cubes according to the name format described above. E.g., if you just
want to render a new noise realisation, set all except do_noise to no
and make sure namebase +'_lsf_psf.fits' exists in your current dir.
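
The interplay of the do_* flags and the cube names can be sketched as follows. The parser handles the 'key = value #comment' lines shown in this section in a simplified way (for instance, a '#' inside a quoted string is not supported). The sketch assumes that each step reads the cube written by the step directly before it, as in the noise example above. Both function names are hypothetical and this is not QSIM's own code.

```python
def parse_config(text):
    """Parse 'key = value #comment' lines into a dict (simplified)."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if "=" not in line:
            continue
        key, value = [part.strip() for part in line.split("=", 1)]
        if len(value) >= 2 and value[0] == value[-1] == "'":
            settings[key] = value[1:-1]  # quoted string
        elif value in ("yes", "no"):
            settings[key] = (value == "yes")
        else:
            try:
                settings[key] = int(value)
            except ValueError:
                try:
                    settings[key] = float(value)
                except ValueError:
                    settings[key] = value
    return settings


STAGES = [("do_raw", "_raw.fits"), ("do_lsf", "_lsf.fits"),
          ("do_psf", "_lsf_psf.fits"), ("do_noise", "_lsf_psf_noise.fits")]


def plan_run(settings):
    """Return (cubes that must already exist, cubes that will be written)."""
    needed, written = [], []
    for index, (flag, suffix) in enumerate(STAGES):
        if not settings[flag]:
            continue
        written.append(settings["namebase"] + suffix)
        # A skipped previous step means its cube has to be on disk already.
        if index > 0 and not settings[STAGES[index - 1][0]]:
            needed.append(settings["namebase"] + STAGES[index - 1][1])
    return needed, written
```

For namebase = 'testcube3' with only do_noise set to yes, plan_run reports testcube3_lsf_psf.fits as required input and testcube3_lsf_psf_noise.fits as output.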

skyemi = 'sky-emi.fits' #emission spectrum for sky noise (as image)
skyabs = 'sky-abs.fits' #absorption spectrum for sky noise (as image)

Names of the sky emission and absorption spectra. In IMAGE format. Inspect
the provided examples for specifications.

seed = 1234 #seed for noise (int, 0=None)

The seed for the random generator used for the noise. Set it to 0 to initialize the generator
internally via a (non-reproducible) timer.
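
The convention 'seed = 0 means no fixed seed' can be mirrored with Python's standard random module. QSIM may well use a different random generator internally, so this is only a sketch of the described behaviour; the function name is hypothetical.

```python
import random


def make_rng(seed):
    """Return a random generator; seed 0 requests a non-reproducible one."""
    return random.Random(seed if seed != 0 else None)


# The same non-zero seed always yields the same sequence.
print(make_rng(1234).random() == make_rng(1234).random())  # prints: True
```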

SCENE OBJECTS
The simplest objects are the point source objects, referred to as 'PS'.
They consist of a 1-dimensional spectrum. Please check the DryRun scene 'PS' objects
for a proper FITS header example. The spectra will be rebinned to d_lambda (config) and,
if properly specified, will be converted to AA if given in nm.
To allow for subpixel positioning, each 'PS' object is represented as a 3x3 spaxel cube
following a small, normalized Gaussian with a correspondingly off-center peak.
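
The idea of the 3x3 representation can be sketched as a set of normalized Gaussian weights whose peak is shifted by the subpixel offset. The width sigma = 0.5 pixels is a placeholder chosen for this example and the function name is hypothetical; the values QSIM actually uses are not given here.

```python
import math


def point_source_weights(dx, dy, sigma=0.5):
    """3x3 weights for a source offset (dx, dy) pixels from the central spaxel.

    Rows run over y = -1, 0, 1 and columns over x = -1, 0, 1; the weights
    sum to 1, so the total flux of the spectrum is preserved.
    """
    weights = [[math.exp(-0.5 * ((x - dx) ** 2 + (y - dy) ** 2) / sigma ** 2)
                for x in (-1, 0, 1)]
               for y in (-1, 0, 1)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]
```

Each of the nine spaxels then carries the source spectrum multiplied by its weight.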

The 'SG' objects usually consist of several images and corresponding spectra.
The standard is one image for the distribution of young stars with a young-star spectrum
and young-star LyAlpha, another image with the distribution of old stars and their spectrum,
and another image for the gas with a gaseous spectrum and gaseous LyAlpha.

The 'UD' objects are more complex. They provide one continuum cube and an arbitrary number of
(Gaussian) emission lines. Two further images are provided: one for the velocity dispersions of the
emission lines and another for the radial velocity shifts of all emission lines. Both
have to be applied spaxel by spaxel for every given emission line.
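
For a single spaxel and a single emission line, applying the two maps could look like the sketch below. It assumes a non-relativistic Doppler shift and a Gaussian line whose integral equals the given flux. The function name and arguments are hypothetical and do not reflect the actual 'UD' file layout.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def emission_line(wavelengths, rest_wavelength, flux, velocity, dispersion):
    """Gaussian emission line for one spaxel.

    velocity and dispersion are in km/s, taken from the two maps at this
    spaxel; wavelengths and rest_wavelength are in AA.
    """
    center = rest_wavelength * (1.0 + velocity / C_KMS)
    sigma = rest_wavelength * dispersion / C_KMS
    norm = flux / (sigma * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((w - center) / sigma) ** 2)
            for w in wavelengths]
```

The resulting line profile is added to the continuum spectrum of that spaxel, once per emission line.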

The 'RC' objects are complete data cubes (e.g. from earlier simulations) that fulfill the specifications of the
scene (spectral and spatial sampling). Any formerly dumped oc_ cube can be used as an 'RC' object.

Fingerprint of QSIM2014
The following graph displays some statistics from a segment of the HUDF simulation, based on the time stamps of the written cubes.
The FOV was 60x60 arcsec. 3096 objects were rendered on ~20 cores.
[Figure: QSIM track, Martin Wendt]
The x-axis gives the total time in minutes. The number of rendered objects is plotted in red, on a linear
scale on the y-axis. Actually it is the number of written files, which also includes the different resulting cubes.
The total size in GB x 10 is plotted in blue; the final total size is about 17 GB.
You can nicely see how many objects are rendered in parallel right at the start. Only a few objects (the large or
complex ones) take a considerable amount of time. After 140 minutes all objects were rendered and the RAW cube
was written (the ~2 GB jump). The LSF convolution took roughly 30 minutes (another ~2 GB cube written), followed by
about 12 minutes for the Moffat PSF convolution after the PSF was generated (the small step for the Moffat cube)
and only a few minutes for the noise cube (the final ~4 GB step).
If you now want a new cube under different seeing and noise conditions, you only have to rerun the last two steps.

Examples
[Figure: globular cluster, Martin Wendt]
Globular cluster scene, 1 frame, 75" x 75", seeing 0.8", ~37,000 objects (thanks to Sebastian Kamann for the scene and obj files).
[Figure: dry run 00, Martin Wendt]
DryRun00, white light image, 20" x 18" (default test scene/config).
[Figures: UDF, Martin Wendt (two panels)]
Hubble Ultra Deep Field scene, 1 frame, 60" x 60" in the center, 3100 objects.
Left: with noise, PSF, seeing 0.8"; right: same as left but RAW, i.e. no PSF or noise applied yet.

LSF:
[Figures: LSF, Martin Wendt (two panels)]
Some information on a more simplistic LSF model is given here.

NGC 6397:
Tested model for the globular cluster NGC6397:
[Figure: LSF, Martin Wendt]