Custom Functions

Science analysis is built upon custom data processing. To simplify this task, and enable instrument independent analysis, custom functions may be attached to the Instrument or Constellation object. Each function is run automatically when new data is loaded before it is made available in

This feature enables a user to hand a Constellation or Instrument object to an independent routine and ensure any desired customizations required are performed without any additional user intervention. This feature enables for the transparent modification of a data set in between its state at rest on disk and when the data becomes available for use in


Custom arguments and keywords are supported for these functions. However, these arguments and keywords are only evaluated initially when the function is attached to an Instrument object. Thus the objects passed in must be static or capable of updating themselves from within the custom function itself.

Example Function

If a custom function is attached to an Instrument or Constellation object, the pysat object is passed to function in place when the data is loaded. There is no Instrument or Constellation copy made in memory. The function is expected to modify the supplied pysat object directly and the functions are not allowed to return any information. The example below is appropriate to be applied to an Instrument or a Constellation at the Instrument level.

def modify_value(inst, param_name, factor=1.):
    """Modify param_name in inst by multiplying by factor

    inst : pysat.Instrument
        Object to be modified
    param_name : str
        Label for variable to be multiplied by factor
    factor : int or float
        Value to apply to param_name via multiplication (default=1.)

    # Save the old data
    inst['old_{:s}'.format(param_name)] = inst[param_name]

    # Modify the data by multiplying it by a specified value
    inst[param_name] = inst[param_name] * factor

    # Changes to the instrument object are retained
    inst.modify_value_was_here = True


Custom functions can also be applied at the Constellation level, which allows data from multiple Instrument objects to be used by a single function. Constellation level custom functions are applied after Instrument level functions. Note that the the way this function identifies which Instrument to use means that the Instrument order in Constellation.instruments is important. An alternate method to selected the desired Instrument would be to identify the desired instruments using the platform, name, tag, and inst_id values.

def modify_const(const, inst1_param_name, inst2_param_name, dest_ind=0):
    """Modify param_name in inst by multiplying by factor

    const : pysat.Constellation
        Object to be modified
    inst1_param_name : str
        Label for variable from the first Constellation Instrument
    inst2_param_name : str
        Label for variable from the seccond Constellation Instrument
    dest_ind : int
        Zero-based index identifying the destination Instrument for the
        modified data variable

    # Ensure there are enough Instruments in the constellation
    min_inst = 2 if dest_ind < 2 else dest_ind + 1
    if len(const.instruments) < min_inst:
        raise ValueError('unexpected number of Instruments in Constellation')

    # Modify the data by adding new a new data variable to the destination
    # Instrument, calculated with data from the first and second Instruments
    new_data = const.instruments[0][inst1_param_name] * \
    new_var = " x ".join([inst1_param_name, inst2_param_name])
    const.instruments[dest_ind][new_var] = new_data


Attaching Custom Function to an Instrument

Custom functions must be attached to an Instrument object for pysat to automatically apply the function upon every load.

# Load data
ivm.load(2009, 1)

# Establish current values for 'mlt'
stored_data = ivm['mlt'].copy()

# Attach a custom function and demonstrate execution
ivm.custom_attach(modify_value, args=['mlt'], kwargs={'factor': 2.})

# `modify_value` is executed as part of the `ivm.load` call.
ivm.load(2009, 1)

# Verify result is present
print(ivm['mlt'], stored_result)

# Check for attribute added to ivm

# `modify_vaule` is executed by `ivm.load` call.
ivm.load(2009, 1)

# Verify results are present
print(ivm[['old_mlt', 'mlt']], stored_result)

# Can also set functions via its string name. This example includes
# both required and optional arguments, and requires output from
# the prior custom function
ivm.custom_attach('modify_value', args=['old_mlt'], kwargs={'factor': 3.0})

# All three functions are executed with each load call in the order they
# were attached.
ivm.load(2009, 1)

# Verify results are present
print(ivm[['old_mlt', 'old_old_mlt', 'mlt']], stored_result)

The output of from these and other custom functions will always be available from the Instrument object, regardless of what level the science analysis is performed.

We can repeat the earlier DMSP example, this time using nano-kernel functionality.

import matplotlib.pyplot as plt
import numpy as np
import pandas

# Create custom function
def filter_dmsp(inst, limit=None):
    # Isolate data to locations near geomagnetic equator
    idx, = np.where((dmsp['mlat'] < 5) & (dmsp['mlat'] > -5))

    # Downselect data = dmsp[idx]

# Get a list of dates between start and stop
start = dt.datetime(2001, 1, 1)
stop = dt.datetime(2001, 1, 10)
date_array = pysat.utils.time.create_date_range(start, stop)

# Create empty series to hold result
mean_ti = pandas.Series()

# Instantiate the pysat.Instrument
dmsp = pysat.Instrument(platform='dmsp', name='ivm', tag='utd',

# Attach the custom function defined above

# Attach the first custom function, and declare it should run first
dmsp.custom_attach('modify_value', at_pos=0, args=['ti'],
                    kwargs={'factor': 2.0})

# Iterate over season, calculate the mean Ion Temperature
for date in date_array:
   # Load data into

   # Compute mean ion temperature using pandas functions and store
   if not dmsp.empty:
       mean_ti[] = dmsp['old_ti'].mean(skipna=True)

# Plot the result using pandas functionality
mean_ti.plot(title='Mean Ion Temperature near Magnetic Equator')

# Because the custom function didn't add metadata, use the old data name
plt.ylabel(dmsp.meta['ti', dmsp.desc_label] + ' (' +
           dmsp.meta['ti', dmsp.units_label] + ')')

Note the same result is obtained. The DMSP Instrument object and analysis are performed at the same level, so there is no strict gain by using the pysat nano-kernel in this simple demonstration. However, we can use the nano-kernel to translate this daily mean into an versatile instrument-independent function.

Attaching Custom Function to an Instrument at Instantiation

Custom functions may also be attached to an Instrument object directly at instantiation via the custom keyword.

# Create dictionary for each custom function and associated inputs
custom_func_1 = {'function': modify_value, 'args': ['mlt'],
                 'kwargs': {'factor': 2.})}
custom_func_2 = {'function': modify_value, 'args': ['old_mlt'],
                 'kwargs'={'factor': 3.0}}

# Combine all dicts into a list in order of application and execution.
# However, if you specify the 'at_pos' kwarg, it will take precedence.
custom = [custom_func_1, custom_func_2]

# Instantiate pysat.Instrument
inst = pysat.Instrument(platform, name, inst_id=inst_id, tag=tag,

Attaching Custom Function to a Constellation

Attaching custom functions to Constellation objects is done in the same way as for Instrument objects. The only difference is the additional keyword argument apply_inst, which defaults to True and applies the custom function to all of the Constellation Instrument objects. This example assumes that the pysatSpaceWeather ACE Instruments have been registered.

import datetime as dt

# Apply a Constellation-level custom function at initialization
const = pysat.Constellation(platforms=['ace'], tags=['historic'],
                            custom=[{'function': modify_const,
                                     'apply_inst': False,
                                     'args': ['eflux_38-53', 'bx_gsm'],
                                     'kwargs': {'dest_ind': 2}}])

# Get and load data
stime = dt.datetime(2022, 1, 1)

# Check that the expected new variable is present
# Expected output:
# Index(['jd', 'sec', 'status_10', 'int_pflux_10MeV', 'status_30',
#       'int_pflux_30MeV', 'eflux_38-53 x bx_gsm'], dtype='object')