Eclipsing Binaries and Asteroseismology

Logo

Precise fundamental stellar parameters in the golden age of time-domain astronomy

  • Home
  • Registration
  • La Palma
  • Programme
  • Photos
  • Iterative pre-whitening examples

    Ground-based data

    This notebook provides an example for running an iterative pre-whitening (IPW) analysis for an RR Lyrae variable observed in multiple filters from the ground.

    We will perform the IPW on the MeerLICHT q-band data.

    BONUS: Do the same for the other bands and compare the amplitudes.

    For this notebook, you’ll need to download: ML_V02.csv

    %matplotlib notebook
    
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    from pythia.timeseries.periodograms import LS_periodogram
    from pythia.timeseries.iterative_prewhitening import run_ipw
    
    
    
    plt.rcParams.update({
        "text.usetex": True,
            "font.family": "sans-serif",
        "font.sans-serif": ["Helvetica"]})
    plt.rcParams.update({
        "pgf.rcfonts": False,
        "pgf.texsystem": "pdflatex",
        "pgf.preamble": "\n".join([
             r"\usepackage{amsmath}",
             r"\usepackage[utf8x]{inputenc}",
             r"\usepackage[T1]{fontenc}",
             r"\usepackage{cmbright}",
        ]),
    })
    
    plt.rcParams['xtick.labelsize']=18
    plt.rcParams['ytick.labelsize']=18
    
    df = pd.read_csv('ML_V02.csv', usecols=['MJD', 'Filter', 'Mag_Opt', 'Magerr_Opt'])
    

    First, we will work with the $q$-band data.

    In order to make sure that we’re not over or under exploring the data, we need to quantify:

    lc_q = df.loc[df['Filter']=='q']
    times_q = lc_q.MJD.values
    mags_q = lc_q.Mag_Opt.values
    time_base = max(times_q)-min(times_q)
    delta_freq = 1./time_base
    
    delta_t = np.median(np.diff(times_q))
    f_nyq = 1./delta_t
    
    print('$q$-band:')
    print('\t Total time-base: {:.2f} d'.format(time_base))
    print('\t Frequency resolution: {:.5f} c/d'.format(delta_freq))
    print('\t Nyquist Frequency: {:.5f} c/d'.format(f_nyq))
    
    $q$-band:
    	 Total time-base: 1737.11 d
    	 Frequency resolution: 0.00058 c/d
    	 Nyquist Frequency: 325.73290 c/d
    

    Okay, so we can see that the data allows for a high frequency resolution $f_{\rm Rayleigh}$, and a high $f_{\rm Nyquist}$.

    Typically, we would want to oversample in frequency by a factor of 5-10 (we’ll use 7). However, since we know ahead of time that the target is an RR Lyrae (spoiler alert), we don’t need to explore super high frequencies. So, we’ll simply assume $f_{\rm Nyquist}=50~{\rm d^{-1}}$.

    Pythia allows for the user to specify the minimum and maximum frequency searched, as well as the oversampling factor in frequency that is used. Another thing to keep in mind is that you MUST remove the mean magnitude from your observations and center the data around zero. Pythia does not do this for you.

    nu, amp = LS_periodogram(times_q, mags_q-np.median(mags_q), f0=1.5*delta_freq, fn=50., oversample_factor=7.)
    

    You don’t have to calculate these values each time, pythia is clever enough to figure out the frequency resolution, nyquist frequency, etc., by itself.

    Keep in mind that periodogram calculations scale as $N \log\left( N\right)$, where $N$ is the number of data points. So although this particular example ran quickly, if you use a 4-year long {\it Kepler} short-cadence light curve, you’ll be sitting here for a while and your laptop will likely not be happy.

    But, in any case, let’s plot our results.

    fig, ax = plt.subplots(1,1,figsize=(6.69,6.69),num=1)
    ax.plot(nu, amp*1000.)
    ax.set_xlabel(r'${\rm Frequency~[d^{-1}]}$',fontsize=18)
    ax.set_ylabel(r'${\rm Amplitude~[mmag]}$',fontsize=18)
    ax.set_xlim(0.,50.)
    fig.tight_layout()
    
    <IPython.core.display.Javascript object>
    

    Well that looks familiar! Let’s identify the frequency of maximum amplitude.

    f_max = nu[np.argmax(amp)]
    period_max = 1./f_max
    print('The frequency of maximum amplitude occurs at {:.4f} c/d'.format(f_max))
    
    The frequency of maximum amplitude occurs at 1.7178 c/d
    

    We can now phase fold our light curve on this frequency using the relation: $\phi = \left( \frac{t-t_0}{P_{\rm max}} \right) \mod 1 = \left( \left( t-t_0\right ) f_{\rm max} \right) \mod 1 $

    def time_to_phase(times, frequency, t0=0.):
        return ((times-t0)*frequency) % 1.
    
    fig, ax = plt.subplots(1,1,figsize=(6.69,6.69),num=2)
    ax.plot(time_to_phase(times_q, f_max, t0=0.), mags_q, 'kx')
    ax.set_xlabel(r'${\rm Phase}$',fontsize=18)
    ax.set_ylabel(r'${\rm Magnitude}$',fontsize=18)
    ax.set_ylim(ax.get_ylim()[::-1])
    fig.tight_layout()
    
    
    <IPython.core.display.Javascript object>
    

    In this case, we’ve identified the correct frequency manually.

    However, we haven’t removed this signal from the data yet.

    Instead of doing this manually as well, let’s use Pythia, which comes with an iterative pre-whitening function (run_ipw). This function takes the times, flux/magnitude data, as well as the data uncertainties, and any keyword arguments that the LS_periodogram function accepts.

    yerr = 0.0005* np.ones_like(times_q)
    residuals, model, offsets, \
    frequencies, amplitudes, \
    phases, stop_criteria, noise_final = run_ipw(times_q,mags_q-np.mean(mags_q), yerr, maxiter=4, fn=50.)
    
    optimizing logp for variables: [phase]
    
    100.00% [17/17 00:00<00:00 logp = -4.647e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -50924799.969084024 -> -46470818.92124977
    
    
    
    
    
    optimizing logp for variables: [offset, phase, amp, nu]
    
    100.00% [16/16 00:00<00:00 logp = -inf]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -46470818.92124977 -> -46470818.92124977
    
    
    
    
    
    optimizing logp for variables: [phase]
    
    100.00% [60/60 00:00<00:00 logp = -2.453e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -112008450.2974144 -> -24531633.00434024
    
    
    
    
    
    optimizing logp for variables: [phase, amp, nu]
    
    100.00% [48/48 00:00<00:00 logp = -2.450e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -24531633.00434024 -> -24497425.391639415
    
    
    
    
    
    optimizing logp for variables: [phase]
    
    100.00% [44/44 00:00<00:00 logp = -1.417e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -17833305.005050622 -> -14174240.129884329
    
    
    
    
    
    optimizing logp for variables: [phase, amp, nu]
    
    100.00% [47/47 00:00<00:00 logp = -1.413e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -14174240.129884329 -> -14131518.113244643
    
    
    
    
    
    optimizing logp for variables: [phase]
    
    100.00% [72/72 00:00<00:00 logp = -1.025e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -20887900.65313138 -> -10245180.307475703
    
    
    
    
    
    optimizing logp for variables: [phase, amp, nu]
    
    100.00% [60/60 00:00<00:00 logp = -1.023e+07]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -10245180.307475703 -> -10230784.123763653
    
    
    
    Individually optimised sinusoids: 
    Sinusoid 0: Frequency: 1.7177935922246308 Amplitude: 0.23459261072488322 Phase: 0.8363314917221069
    Sinusoid 1: Frequency: 3.4356663926144857 Amplitude: 0.1156819837890806 Phase: -2.2895371589405973
    Sinusoid 2: Frequency: 5.153581114588498 Amplitude: 0.07887281648017042 Phase: 1.1298464566374675
    Sinusoid 3: Frequency: 6.871364519355721 Amplitude: 0.04856713412712954 Phase: -1.1917734779272458
    
    
    optimizing logp for variables: [phase]
    
    100.00% [85/85 00:00<00:00 logp = -9.564e+06]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -120822016.87670055 -> -9563912.818568073
    optimizing logp for variables: [offset, phase, amp, nu]
    
    100.00% [16/16 00:00<00:00 logp = -inf]
    message: Desired error not necessarily achieved due to precision loss.
    logp: -9563912.818568073 -> -9563912.818568073
    

    Excellent – the code ran wihtout any hiccups. Now let’s check the residuals to see what’s left over!

    nu_res, amp_res = LS_periodogram(times_q, residuals, fn=50.)
    fig, ax = plt.subplots(1,1,figsize=(6.69,6.69),num=3)
    ax.plot(nu_res, amp_res*1000.)
    ax.set_xlabel(r'${\rm Frequency~[d^{-1}]}$',fontsize=18)
    ax.set_ylabel(r'${\rm Amplitude~[mmag]}$',fontsize=18)
    ax.set_xlim(0.,50.)
    fig.tight_layout()
    
    <IPython.core.display.Javascript object>
    

    Turns out that there’s a lot of signal left over. This highlights the limitations of IPW for highly non-sinusoidal signals.

    If we phase-fold the signal on the original period, we can see that there’s a lot of harmonic signal leftover.

    fig, ax = plt.subplots(1,1,figsize=(6.69,6.69),num=4)
    ax.plot(time_to_phase(times_q, f_max, t0=0.), residuals, 'kx')
    ax.set_xlabel(r'${\rm Phase}$',fontsize=18)
    ax.set_ylabel(r'${\rm Magnitude}$',fontsize=18)
    ax.set_ylim(ax.get_ylim()[::-1])
    fig.tight_layout()
    
    <IPython.core.display.Javascript object>