API¶
Base class for parent sample.
Base class for WISE Data
WISEData class to bin lightcurve by visits.
A class to download WISE data with multiple threads and do the binning on the DESY cluster.
Parent Sample¶
The Base Class¶
- class timewise.parent_sample_base.ParentSampleBase(base_name)[source]¶
Base class for parent sample. Any subclass must implement
ParentSample.df: a pandas.DataFrame consisting of at least three columns: two columns holding the sky position of each object in the form of right ascension and declination, and one column with a unique identifier.
ParentSample.default_keymap: a dictionary mapping the columns in ParentSample.df to ‘ra’, ‘dec’ and ‘id’
- Parameters:
base_name – determining the location of any data in the timewise data directory.
- plot_cutout(ind, arcsec=20, interactive=False, **kwargs)[source]¶
Plot the cutout images in all filters around the position of the object with index ind
- Parameters:
ind (int or list-like) – the index in the sample
arcsec (float) – the radius of the cutout
interactive (bool) – interactive mode
kwargs – any additional kwargs will be passed to matplotlib.pyplot.subplots()
- Returns:
figure and axes if interactive=True
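A minimal subclass satisfying the contract above might look like the following sketch. The class name, column names and catalogue entries are invented for illustration; a real implementation would inherit from timewise.parent_sample_base.ParentSampleBase and call super().__init__(base_name=...).

```python
import pandas as pd

class MySample:  # stands in for ParentSampleBase
    # map the DataFrame columns to 'ra', 'dec' and 'id'
    default_keymap = {"ra": "RA_deg", "dec": "DEC_deg", "id": "name"}

    def __init__(self):
        # at least three columns: sky position (ra, dec) and a unique identifier
        self.df = pd.DataFrame({
            "RA_deg": [10.6847, 201.3651],
            "DEC_deg": [41.2687, -43.0191],
            "name": ["M31", "CenA"],
        })
```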
WISEData¶
The Base Class¶
- class timewise.wise_data_base.WISEDataBase(base_name, parent_sample_class, min_sep_arcsec, n_chunks)[source]¶
Base class for WISE Data
- Parameters:
parent_sample_class (ParentSample class) – class for parent sample
base_name (str) – unique name to determine storage directories
min_sep (astropy.units.Quantity) – query region around source for positional query
whitelist_region (astropy.units.Quantity) – region around source where all datapoints are accepted in positional query
n_chunks (int) – number of chunks in declination
parent_wise_source_id_key (str) – key for the WISE source ID in the parent sample
parent_sample_wise_skysep_key (str) – key for the angular separation to the WISE source in the parent sample
parent_sample_default_entries (dict) – default entries for the parent sample
cache_dir (str) – directory for cached data
cluster_dir (str) – directory for cluster data
cluster_log_dir – directory for cluster logs
output_dir (str) – directory for output data
lightcurve_dir (str) – directory for lightcurve data
plots_dir (str) – directory for plots
submit_file (str) – file for cluster submission
tap_jobs (list[pyvo.dal.tap.TAPJob]) – TAP jobs
queue (multiprocessing.Queue) – queue for cluster jobs
clear_unbinned_photometry_when_binning (bool) – whether to clear unbinned photometry when binning
chunk_map (np.ndarray) – map of chunks
service_url (str) – URL of the TAP service
service (timewise.utils.StableTAPService) – custom TAP service, making sure that the TAP jobs are stable
active_tap_phases (set) – phases of TAP jobs that are still active
running_tap_phases (list) – phases of TAP jobs that are still running
done_tap_phases (set) – phases of TAP jobs that are done
query_types (list) – query types
table_names (pd.DataFrame) – map nice and program table names of WISE data tables
bands (list) – WISE bands
flux_key_ext (str) – key extension for flux keys
flux_density_key_ext (str) – key extension for flux density keys
mag_key_ext (str) – key extension for magnitude keys
luminosity_key_ext (str) – key extension for luminosity keys
error_key_ext (str) – key extension for error keys
band_plot_colors (dict) – plot colors for bands
photometry_table_keymap (dict) – keymap for photometry tables, listing the column names for flux, mag etc for the different WISE data tables
magnitude_zeropoints (dict) – magnitude zeropoints
constraints (list) – constraints for TAP queries selecting good datapoints as explained in the explanatory supplements
- add_flux_densities_to_saved_lightcurves(service)[source]¶
Adds flux densities to all downloaded lightcurves
- Parameters:
service (str) – The service with which the lightcurves were downloaded
- add_flux_density(lightcurve, mag_key, emag_key, mag_ul_key, f_key, ef_key, f_ul_key, do_color_correction=False)[source]¶
Adds flux densities to a lightcurve
- Parameters:
lightcurve (pandas.DataFrame) –
mag_key (str) – the key in lightcurve that holds the magnitude
emag_key (str) – the key in lightcurve that holds the error of the magnitude
mag_ul_key (str) – the key in lightcurve that holds the upper limit for the magnitude
f_key (str) – the key that will hold the flux density
ef_key (str) – the key that will hold the flux density error
f_ul_key (str) – the key that will hold the flux density upper limit
do_color_correction (bool) –
- Returns:
the lightcurve with flux density
- Return type:
pandas.DataFrame
- add_luminosity_to_saved_lightcurves(service, redshift_key=None, distance_key=None)[source]¶
Add luminosities to all lightcurves, calculated from flux densities and distance or redshift
- Parameters:
service (str) – the service with which the lightcurves were downloaded
redshift_key (str) – the key in the parent sample data frame that holds the redshift info
distance_key (str) – the key in the parent sample data frame that holds the distance info
- abstract bin_lightcurve(lightcurve)[source]¶
Bins a lightcurve
- Parameters:
lightcurve (pandas.DataFrame) – The unbinned lightcurve
- Returns:
the binned lightcurve
- Return type:
pd.DataFrame
- calculate_metadata(service, chunk_number=None, jobID=None, overwrite=True)[source]¶
- Calculates the metadata for all downloaded lightcurves.
Results will be saved under
</path/to/timewise/data/dir>/output/<base_name>/lightcurves/metadata_<service>.json
- Parameters:
service (str) – the service with which the lightcurves were downloaded
chunk_number (int) – the chunk number to use, default uses all chunks
jobID (int) – the job ID to use, default uses all lightcurves
overwrite (bool) – overwrite existing metadata file
- abstract calculate_metadata_single(lcs)[source]¶
Calculates some properties of the lightcurves
- Parameters:
lcs (pandas.DataFrame) – the lightcurve
- static calculate_position_mask(lightcurve, ra, dec, whitelist_region, return_all=False)[source]¶
Estimates the 90th percentile of the angular separations from the given position. Assuming a 2D Gaussian, calculates the standard deviation corresponding to the 90th percentile and keeps all datapoints within five times that standard deviation.
- Parameters:
lightcurve (pd.DataFrame) – unstacked lightcurve
ra (Sequence[float]) – RA in degrees of the source
dec (Sequence[float]) – Dec in degrees of the source
return_all (bool, optional) – if True, return all info collected in the selection process
whitelist_region (float) – region in which to keep all datapoints [arcsec]
- Returns:
positional mask (and result of the clustering algorithm and the mask for the closest allwise data if return_all is True)
- Return type:
list (return_all is False) or tuple (list, sklearn.cluster.HDBSCAN, list) (return_all is True)
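The selection described above can be sketched as follows. This is a simplified stand-alone version, not the library's exact implementation; in particular, the HDBSCAN clustering step is omitted.

```python
import numpy as np

def position_mask(sep_arcsec, whitelist_arcsec=1.0):
    """Keep datapoints within 5 sigma of the source position.

    For a 2D Gaussian the radial separation is Rayleigh-distributed, so the
    90th-percentile radius r90 relates to the standard deviation via
    r90 = sigma * sqrt(2 * ln(10)). Everything inside the whitelist region
    is kept unconditionally.
    """
    sep = np.asarray(sep_arcsec, dtype=float)
    r90 = np.percentile(sep, 90)
    sigma = r90 / np.sqrt(2 * np.log(10))
    return (sep < 5 * sigma) | (sep <= whitelist_arcsec)
```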
- find_color_correction(w1_minus_w2)[source]¶
Find the color correction factor based on the W1-W2 color.
- Parameters:
w1_minus_w2 (float) –
- Returns:
the color correction factor
- Return type:
float
- static get_db_name(table_name, nice=False)[source]¶
Get the right table name
- Parameters:
table_name – str, table name
nice – bool, whether to get the nice table name
- Returns:
str
- get_photometric_data(tables=None, perc=1, wait=0, service=None, nthreads=100, chunks=None, overwrite=True, remove_chunks=False, query_type='positional', skip_download=False, mask_by_position=False)[source]¶
Load photometric data from the IRSA server for the matched sample. The result will be saved under
</path/to/timewise/data/dir>/output/<base_name>/lightcurves/binned_lightcurves_<service>.json
- Parameters:
remove_chunks (bool) – remove single chunk files after binning
overwrite (bool) – overwrite already existing lightcurves and metadata
tables (str or list-like) – WISE tables to use for photometry query, defaults to AllWISE and NEOWISE-R photometry
perc (float) – percentage of sources to load photometry for, default 1
nthreads (int) – max number of threads to launch
service (str) – either of ‘gator’ or ‘tap’; by default selected based on the number of elements per chunk
wait (float) – time in hours to wait after submitting TAP jobs
chunks (list-like) – containing indices of chunks to download
query_type (str) – ‘positional’: query photometry based on distance from object, ‘by_allwise_id’: select all photometry points within a radius of 50 arcsec with the corresponding AllWISE ID
skip_download (bool) – if True skip downloading and only do binning
mask_by_position (bool) – if True mask single exposures that are too far away from the bulk
- get_position_mask(service, chunk_number)[source]¶
Get the position mask for a chunk
- Parameters:
service (str) – The service that was used to download the data, either of gator or tap
chunk_number (int) – chunk number
- Returns:
position masks
- Return type:
dict
- get_unbinned_lightcurves(chunk_number, clear=False)[source]¶
Get the unbinned lightcurves for a given chunk number.
- Parameters:
chunk_number (int) – int
clear (bool, optional) – remove files after loading, defaults to False
- load_data_product(service, chunk_number=None, jobID=None, return_filename=False, verify_contains_lightcurves=False)[source]¶
Load data product from disk
- Parameters:
service (str) – service used to download data (‘tap’ or ‘gator’)
chunk_number (int, optional) – chunk number to load, if None load combined file for this service
jobID (int, optional) – jobID to load, if None load the combined file for this chunk
return_filename (bool, optional) – return filename of data product, defaults to False
verify_contains_lightcurves (bool, optional) – verify that the data product contains lightcurves, defaults to False
- luminosity_from_flux_density(flux_density, band, distance=None, redshift=None, unit='erg s-1', flux_density_unit='mJy')[source]¶
Converts a flux density into a luminosity
- Parameters:
flux_density (float or numpy.ndarray) –
band (str) –
distance (astropy.Quantity) – distance to source, if not given will use luminosity distance from redshift
redshift (float) – redshift to use when calculating luminosity distance
unit (str or astropy.unit) – unit in which to give the luminosity, default is erg s-1
flux_density_unit (str or astropy.unit) – unit in which the flux density is given, default is mJy
- Returns:
the resulting luminosities
- Return type:
float or ndarray
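As a rough illustration of the underlying conversion, a common convention is nu * L_nu = 4 pi d^2 * nu * F_nu. Whether timewise uses exactly this convention, and which effective frequency it assumes per band, is an assumption here; the W1 frequency below is an illustrative value.

```python
import math

MJY_TO_CGS = 1e-26   # 1 mJy in erg s^-1 cm^-2 Hz^-1
W1_NU_HZ = 8.8e13    # effective frequency of W1 (~3.4 micron), an assumed value

def nu_l_nu(flux_density_mjy, distance_cm, nu_hz=W1_NU_HZ):
    """nu * L_nu = 4 * pi * d^2 * nu * F_nu, in erg s^-1."""
    f_nu = flux_density_mjy * MJY_TO_CGS
    return 4.0 * math.pi * distance_cm ** 2 * nu_hz * f_nu
```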
- match_all_chunks(table_name='AllWISE Source Catalog', save_when_done=True, additional_columns=None)[source]¶
Match the parent sample to a WISE catalogue and add the result to the parent sample.
- Parameters:
table_name (str) – The name of the table you want to match against
save_when_done (bool) – save the parent sample dataframe with the matching info when done
additional_columns (list) – optional, additional columns to add to the matching table
- plot_lc(parent_sample_idx, service='tap', plot_unbinned=False, plot_binned=True, interactive=False, fn=None, ax=None, save=True, lum_key='flux_density', **kwargs)[source]¶
Make a pretty plot of a lightcurve
- Parameters:
parent_sample_idx (int) – The index in the parent sample of the lightcurve
service (str) – the service with which the lightcurves were downloaded
plot_unbinned (bool) – plot unbinned data
plot_binned (bool) – plot binned lightcurve
interactive (bool) – interactive mode
fn (str) – filename, defaults to </path/to/timewise/data/dir>/output/plots/<base_name>/<parent_sample_index>_<lum_key>.pdf
ax – pre-existing matplotlib.Axis
save (bool) – save the plot
lum_key – the unit of luminosity to use in the plot, either of ‘mag’, ‘flux_density’ or ‘luminosity’
kwargs – any additional kwargs will be passed on to matplotlib.pyplot.subplots()
- Returns:
the matplotlib.Figure and matplotlib.Axes if interactive=True
- vegamag_to_flux_density(vegamag, band, unit='mJy', color_correction=None)[source]¶
Converts the detector-level brightness m in Vega magnitudes to a flux density F:
F = (F_nu / f_c) * 10 ^ (-m / 2.5)
where F_nu is the zeropoint flux for the corresponding band and f_c is a color correction factor.
- Parameters:
vegamag (float or numpy.ndarray) –
band (str) –
unit (str) – unit to convert the flux density to
color_correction (float or numpy.ndarray or dict) – the color correction factor; if a dict, the keys have to be ‘f_c(“band”)’
- Returns:
the flux densities
- Return type:
ndarray
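The formula above can be evaluated directly. In this sketch the W1 zeropoint flux of 309.54 Jy is taken from the WISE documentation and should be treated as an assumption, as should the function name.

```python
import numpy as np

F_NU_W1_JY = 309.54  # W1 Vega zeropoint flux in Jy (assumed value)

def vegamag_to_flux_density_mjy(m, f_nu_jy=F_NU_W1_JY, f_c=1.0):
    """F = (F_nu / f_c) * 10**(-m / 2.5), returned in mJy."""
    return (f_nu_jy / f_c) * 10.0 ** (-np.asarray(m, dtype=float) / 2.5) * 1e3
```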
Bin lightcurves by visit¶
- class timewise.wise_data_by_visit.WiseDataByVisit(base_name, parent_sample_class, min_sep_arcsec, n_chunks, clean_outliers_when_binning=True, multiply_flux_error=True)[source]¶
WISEData class to bin lightcurves by visit. A visit typically consists of some tens of observations, and individual visits are separated by about six months. The mean flux for one visit is calculated as the weighted mean of the data; the error on that mean is the root mean square, corrected by the t-value. Datapoints more than 20 times the rms away from the mean of their visit are flagged as outliers. In addition to the attributes of timewise.WISEDataBase, this class has the following attributes:
- Parameters:
clean_outliers_when_binning (bool) – whether to remove outliers by brightness when binning
mean_key (str) – the key for the mean
median_key (str) – the key for the median
rms_key (str) – the key for the rms
upper_limit_key (str) – the key for the upper limit
Npoints_key (str) – the key for the number of points
zeropoint_key_ext (str) – the key for the zeropoint
- bin_lightcurve(lightcurve)[source]¶
Combine the data by visits of the satellite to one region of the sky. A visit typically consists of some tens of observations, and individual visits are separated by about six months. The mean flux for one visit is calculated as the weighted mean of the data; the error on that mean is the root mean square, corrected by the t-value. Datapoints more than 100 times the rms away from the mean are flagged as outliers and, if self.clean_outliers_when_binning is True, removed from the calculation of the mean and its error.
- Parameters:
lightcurve (pandas.DataFrame) – the unbinned lightcurve
- Returns:
the binned lightcurve
- Return type:
pandas.DataFrame
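The weighted-mean binning described above can be sketched as follows. This stand-alone version omits the Student's-t correction and the outlier flagging of the real implementation.

```python
import numpy as np

def bin_visit(flux, flux_err):
    """Weighted mean of one visit and the weighted rms around it."""
    f = np.asarray(flux, dtype=float)
    w = 1.0 / np.asarray(flux_err, dtype=float) ** 2
    mean = np.sum(w * f) / np.sum(w)
    rms = np.sqrt(np.sum(w * (f - mean) ** 2) / np.sum(w))
    return mean, rms
```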
- calculate_epochs(f, e, visit_mask, counts, remove_outliers, outlier_mask=None)[source]¶
Calculates the binned epochs of a lightcurve.
- Parameters:
f (np.array) – the fluxes
e (np.array) – the flux errors
visit_mask (np.array) – the visit mask
counts (np.array) – the counts
remove_outliers (bool) – whether to remove outliers
outlier_mask (np.array) – the outlier mask
- Returns:
the epoch
- Return type:
float
- calculate_metadata_single(lc)[source]¶
Calculates some metadata, describing the variability of the lightcurves.
max_dif: maximum difference in magnitude between any two datapoints
min_rms: the minimum errorbar of all datapoints
N_datapoints: The number of datapoints
max_deltat: the maximum time difference between any two datapoints
mean_weighted_ppb: the weighted average brightness where the weights are the points per bin
- Parameters:
lc (dict) – the lightcurves
- Returns:
the metadata
- Return type:
dict
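A stand-alone sketch of these quantities (mean_weighted_ppb is omitted, and the argument names are illustrative):

```python
import numpy as np

def metadata_single(mag, mag_err, mjd):
    """Compute the variability metadata listed above for one lightcurve."""
    mag = np.asarray(mag, dtype=float)
    return {
        "max_dif": float(np.max(mag) - np.min(mag)),       # brightest minus faintest
        "min_rms": float(np.min(mag_err)),                 # smallest errorbar
        "N_datapoints": int(mag.size),
        "max_deltat": float(np.max(mjd) - np.min(mjd)),    # total time span
    }
```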
- static get_visit_map(lightcurve)[source]¶
Create a map from datapoint to visit
- Parameters:
lightcurve (pd.DataFrame) – the unbinned lightcurve
- Returns:
visit map
- Return type:
np.ndarray
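Since visits are separated by about six months, such a map can be sketched by splitting on large gaps in observation time. The 100-day threshold is an assumption; the library's exact criterion may differ.

```python
import numpy as np

def visit_map(mjd, gap_days=100.0):
    """Assign each (time-sorted) datapoint a visit index; a new visit
    starts wherever consecutive observations are more than gap_days apart."""
    mjd = np.sort(np.asarray(mjd, dtype=float))
    return np.concatenate(([0], np.cumsum(np.diff(mjd) > gap_days)))
```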
- plot_diagnostic_binning(service, ind, lum_key='mag', interactive=False, fn=None, save=True, which='panstarrs', arcsec=20)[source]¶
Show a skymap of the single detections and which bin they belong to next to the binned lightcurve
- Parameters:
service (str) – service used to download data, either of ‘tap’ or ‘gator’
ind (str, int) – index of the object in the parent sample
lum_key (str) – the key of the brightness unit, either of flux (instrument flux in counts) or mag
interactive (bool) – if function is used interactively, return mpl.Figure and mpl.axes if True
fn (str) – filename for saving
save (bool) – saves figure if True
which (str) – survey to get the cutout from, either of ‘sdss’ or ‘panstarrs’
arcsec (float) – size of cutout
- Returns:
Figure and axes if interactive=True
- Return type:
mpl.Figure, mpl.Axes
Use the DESY cluster in Zeuthen to do the binning¶
- class timewise.wise_bigdata_desy_cluster.WISEDataDESYCluster(base_name, parent_sample_class, min_sep_arcsec, n_chunks, clean_outliers_when_binning=True, multiply_flux_error=True)[source]¶
A class to download WISE data with multiple threads and do the binning on the DESY cluster. In addition to the attributes of WiseDataByVisit this class has the following attributes:
- Parameters:
executable_filename (str) – the filename of the executable that will be submitted to the cluster
submit_file_filename (str) – the filename of the submit file that will be submitted to the cluster
job_id (str) – the job id of the submitted job
cluster_jobID_map (dict) – a dictionary mapping the chunk number to the cluster job id
clusterJob_chunk_map (dict) – a dictionary mapping the cluster job id to the chunk number
cluster_info_file (str) – the filename of the file that stores the cluster info, loaded by the cluster jobs
start_time (float) – the time when the download started
- condor_status(job_id)[source]¶
Get the status of jobs running on condor.
- Returns:
number of jobs that are done, running, waiting, total, held
- static get_condor_status()[source]¶
Queries condor to get the cluster status.
- Returns:
str, output of the query command
- get_coverage(chunk, lum_key, load_from_bigdata_dir=False)[source]¶
Get the coverage of the MEASURED median for a given chunk and lum_key
- Parameters:
chunk (int or list[int]) – chunk number
lum_key (str) – luminosity key
load_from_bigdata_dir (bool, optional) – if True, load the coverage from the bigdata directory
- get_red_chi2(chunk, lum_key, use_bigdata_dir=False)[source]¶
Get the reduced chi2 for a given chunk or multiple chunks
- Parameters:
chunk (int or list) – the chunk number or list of chunk numbers
lum_key (str) – the unit of luminosity to use in the plot, either of ‘mag’, ‘flux’ or ‘flux_density’
use_bigdata_dir (bool, optional) – load from the big data storage directory, default is False
- Returns:
the reduced chi2 for each band, the DataFrame will have columns chi2, med_lum and N_datapoints
- Return type:
dict[str, pd.DataFrame]
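The reduced chi-squared here measures variability against a constant (weighted-mean) model. A minimal stand-alone version, assuming N-1 degrees of freedom (the library's exact definition may differ):

```python
import numpy as np

def reduced_chi2(flux, flux_err):
    """Reduced chi2 of a lightcurve with respect to its weighted mean."""
    f = np.asarray(flux, dtype=float)
    e = np.asarray(flux_err, dtype=float)
    w = 1.0 / e ** 2
    mean = np.sum(w * f) / np.sum(w)
    return float(np.sum(((f - mean) / e) ** 2) / (f.size - 1))
```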
- get_sample_photometric_data(max_nTAPjobs=8, perc=1, tables=None, chunks=None, cluster_jobs_per_chunk=100, wait=5, remove_chunks=False, query_type='positional', overwrite=True, storage_directory=None, node_memory='8G', skip_download=False, skip_input=False, mask_by_position=False)[source]¶
An alternative to get_photometric_data() that uses the DESY cluster and is optimised for large datasets.
- Parameters:
max_nTAPjobs (int) – The maximum number of TAP jobs active at the same time.
perc (float) – The percentage of chunks to download
tables (str or list-like) – The tables to query
chunks (list-like) – chunks to download, default is all of the chunks
cluster_jobs_per_chunk (int) – number of cluster jobs per chunk
wait (float) – time in hours to wait after submitting TAP jobs
remove_chunks (bool) – remove single chunk files after binning
query_type (str) – ‘positional’: query photometry based on distance from object, ‘by_allwise_id’: select all photometry points within a radius of 50 arcsec with the corresponding AllWISE ID
overwrite (bool) – overwrite already existing lightcurves and metadata
storage_directory (str) – move binned files and raw data here after work is done
node_memory (str) – memory per node on the cluster, default is 8G
skip_download (bool) – if True, assume data is already downloaded, only do binning in that case
skip_input (bool) – if True do not ask if data is correct before download
mask_by_position (bool) – if True mask single exposures that are too far away from the bulk
- get_submit_file_filename(ids)[source]¶
Get the filename of the submit file for given job ids
- Parameters:
ids (list) – list of job ids
- Returns:
filename
- Return type:
str
- load_data_product(service, chunk_number=None, jobID=None, return_filename=False, use_bigdata_dir=False, verify_contains_lightcurves=False)[source]¶
Load data product from disk
- Parameters:
service (str) – service used to download data (‘tap’ or ‘gator’)
chunk_number (int, optional) – chunk number to load, if None load combined file for this service
jobID (int, optional) – jobID to load, if None load the combined file for this chunk
return_filename (bool, optional) – return filename of data product, defaults to False
verify_contains_lightcurves (bool, optional) – verify that the data product contains lightcurves, defaults to False
- make_chi2_plot(index_mask=None, chunks=None, load_from_bigdata_dir=False, lum_key='_flux_density', interactive=False, save=False, nbins=100, cumulative=True, upper_bound=4)[source]¶
Make a plot of the reduced chi2 distribution for a given chunk or multiple chunks
- Parameters:
index_mask (dict) – a mask to apply to the parent sample, e.g. {‘AGNs’: agn_mask}
chunks (int or list) – the chunk number or list of chunk numbers
load_from_bigdata_dir (bool, optional) – load from the big data storage directory, default is False
lum_key (str) – the unit of luminosity to use in the plot, either of ‘mag’, ‘flux’ or ‘flux_density’
interactive (bool) – return the figure and axes if True, default is False
save (bool) – save the plot, default is False
nbins (int) – the number of bins to use in the histogram, default is 100
cumulative (bool) – plot the cumulative distribution, default is True
upper_bound (float) – the upper bound of the x-axis, default is 4
- Returns:
the matplotlib.Figure and matplotlib.Axes if interactive=True
- Return type:
tuple[mpl.Figure, mpl.Axes]
- make_coverage_plots(index_mask=None, chunks=None, load_from_bigdata_dir=False, lum_key='_flux_density', interactive=False, save=False, nbins=100)[source]¶
Make the coverage plots for the measured median of the specified luminosity unit
- Parameters:
index_mask (dict, optional) – index mask to apply to the data, e.g. {“AGNs”: agn_mask}
chunks (list[int], int, optional) – chunks to use, if None use all chunks
load_from_bigdata_dir (bool, optional) – if True, load the coverage from the bigdata directory
lum_key (str, optional) – luminosity key, either of “_flux_density” or “_mag”, default is “_flux_density”
interactive (bool, optional) – if True, return the figures and axes, otherwise close them
save (bool, optional) – if True, save the figures
nbins (int, optional) – number of bins for the histograms
- Returns:
if interactive, return the figures and axes, otherwise close them
- Return type:
list[tuple[matplotlib.figure.Figure, matplotlib.axes.Axes]]
- make_submit_file(job_ids: int | List[int], node_memory: str = '8G', mask_by_position: bool = False)[source]¶
Produces the submit file that will be submitted to the NPX cluster.
- Parameters:
job_ids (int or list of ints) – The job ID or list of job IDs to submit
node_memory (str) – The amount of memory to request for each node
mask_by_position (bool) – if True mask single exposures that are too far away from the bulk
- plot_lc(parent_sample_idx, service='tap', plot_unbinned=False, plot_binned=True, interactive=False, fn=None, ax=None, save=True, lum_key='flux_density', load_from_bigdata_dir=False, **kwargs)[source]¶
Make a pretty plot of a lightcurve
- Parameters:
parent_sample_idx (int or str) – The index in the parent sample of the lightcurve
service (str) – the service with which the lightcurves were downloaded
plot_unbinned (bool) – plot unbinned data
plot_binned (bool) – plot binned lightcurve
interactive (bool) – interactive mode
fn (str) – filename, defaults to </path/to/timewise/data/dir>/output/plots/<base_name>/<parent_sample_index>_<lum_key>.pdf
ax – pre-existing matplotlib.Axis
save (bool) – save the plot
lum_key (str) – the unit of luminosity to use in the plot, either of ‘mag’, ‘flux_density’ or ‘luminosity’
load_from_bigdata_dir (bool) – load from the big data storage directory
kwargs – any additional kwargs will be passed on to matplotlib.pyplot.subplots()
- Returns:
the matplotlib.Figure and matplotlib.Axes if interactive=True
- run_cluster(node_memory, service)[source]¶
Run the DESY cluster
- Parameters:
node_memory (str) – memory per node
service (str) – service to use for querying the data
- submit_to_cluster(node_memory, single_chunk=None, mask_by_position=False)[source]¶
Submit jobs to cluster
- Parameters:
node_memory (str) – memory per node
single_chunk (int) – number of single chunk to run on the cluster
mask_by_position (bool) – if True mask single exposures that are too far away from the bulk
- Returns:
ID of the cluster job
- Return type:
int
Point Source Utils¶
- timewise.point_source_utils.get_point_source_wise_data(base_name, ra, dec, min_sep_arcsec=10, match=False, **kwargs)[source]¶
Get a WISEData instance for a point source
- Parameters:
base_name (str) – base name for storage in the data directory
ra (float) – right ascension
dec (float) – declination
min_sep_arcsec (float) – search radius in arcsec
match (bool) – match to AllWISE Source Catalogue
kwargs (dict) – keyword arguments passed to WISEData.get_photometric_data()
- Returns:
WISEData