Initial Instrument Independence
Adding Instrument Independence
pysat features enable the development of instrument-independent methods: code that can work on many, if not all, pysat-supported datasets. This section continues the evolution of the simple DMSP temperature averaging method presented earlier toward greater instrument independence, including application to non-DMSP datasets.
import matplotlib.pyplot as plt
import numpy as np
import pandas
import pysat


def daily_mean(inst, start, stop, data_label):
    """Perform daily mean of data_label over season

    Parameters
    ----------
    inst : pysat.Instrument
        Instrument object
    start : datetime.datetime
        Start date
    stop : datetime.datetime
        Stop date
    data_label : string
        Identifier for variable to be averaged

    Returns
    -------
    mean_val : pandas.Series
        Absolute daily mean of data_label, indexed by date
    """
    # create empty series to hold result; an explicit float dtype keeps the
    # result numeric so it can be plotted directly
    mean_val = pandas.Series(dtype=float)
    # get list of dates between start and stop
    date_array = pysat.utils.time.create_date_range(start, stop)
    # iterate over season, calculate the mean
    for date in date_array:
        inst.load(date=date)
        if not inst.data.empty:
            # compute absolute mean using pandas functions and store
            mean_val[inst.date] = inst[data_label].abs().mean(skipna=True)
    return mean_val
# instantiate pysat.Instrument object to get access to data
vefi = pysat.Instrument(platform='cnofs', name='vefi', tag='dc_b')


# define custom filtering method
def filter_inst(inst, data_label, data_gate):
    # select data within +/- data gate
    min_gate = -np.abs(data_gate)
    max_gate = np.abs(data_gate)
    idx, = np.where((inst[data_label] < max_gate) &
                    (inst[data_label] > min_gate))
    inst.data = inst[idx]
    return


# attach filter to vefi object, function is run upon every load
vefi.custom.add(filter_inst, 'modify', 'latitude', 5.)

# compute the daily mean of 'dB_mer'; start and stop are the season's
# datetime bounds and are assumed to be defined already
mean_dB = daily_mean(vefi, start, stop, 'dB_mer')

# plot the result using pandas functionality
mean_dB.plot(title='Absolute Daily Mean of '
             + vefi.meta['dB_mer'].long_name)
plt.ylabel('Absolute Daily Mean (' + vefi.meta['dB_mer'].units + ')')
The pysat nano-kernel lets you modify any data set as needed so that you can get the daily mean you desire, without having to modify the daily_mean function.
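To try this pattern without downloading any data, the sketch below uses a hypothetical MockInstrument class as a stand-in for pysat.Instrument. It is not part of pysat: the class, its attach method, and the sample values are assumptions made purely for illustration. It mimics only what daily_mean relies on (load, data, date, and indexing by label), plus running attached functions on every load. A compact variant of daily_mean that builds its dates with pandas.date_range stands in for the version above.

```python
import numpy as np
import pandas as pd


class MockInstrument:
    """Hypothetical stand-in for pysat.Instrument (not part of pysat)."""

    def __init__(self, daily_frames):
        self._daily_frames = daily_frames  # maps date -> DataFrame
        self._custom = []                  # (function, args) pairs
        self.data = pd.DataFrame()
        self.date = None

    def attach(self, function, *args):
        """Register a function to run on the data after every load."""
        self._custom.append((function, args))

    def load(self, date):
        """Load one day of data, then apply every attached function."""
        self.date = date
        self.data = self._daily_frames.get(date, pd.DataFrame()).copy()
        if not self.data.empty:
            for function, args in self._custom:
                function(self, *args)

    def __getitem__(self, key):
        # string keys select a variable, anything else selects rows by position
        if isinstance(key, str):
            return self.data[key]
        return self.data.iloc[key]


def filter_inst(inst, data_label, data_gate):
    """Keep only rows where data_label lies strictly within +/- data_gate."""
    max_gate = np.abs(data_gate)
    min_gate = -max_gate
    idx, = np.where((inst[data_label] < max_gate)
                    & (inst[data_label] > min_gate))
    inst.data = inst[idx]


def daily_mean(inst, start, stop, data_label):
    """Absolute daily mean of data_label; knows nothing about the filter."""
    results = {}
    for date in pd.date_range(start, stop, freq='D'):
        inst.load(date=date)
        if not inst.data.empty:
            results[inst.date] = inst[data_label].abs().mean(skipna=True)
    return pd.Series(results, dtype=float)


# invented sample data for two days
day1 = pd.Timestamp('2010-01-01')
day2 = pd.Timestamp('2010-01-02')
frames = {
    day1: pd.DataFrame({'latitude': [-10.0, -4.0, 0.0, 4.5],
                        'dB_mer': [3.0, -2.0, 1.0, -4.0]}),
    day2: pd.DataFrame({'latitude': [1.0, 20.0],
                        'dB_mer': [-6.0, 9.0]}),
}

inst = MockInstrument(frames)
inst.attach(filter_inst, 'latitude', 5.0)
means = daily_mean(inst, day1, day2, 'dB_mer')
print(means)
```

Only rows within 5 degrees of the equator contribute to each daily value, yet daily_mean itself contains no latitude logic. Attaching a different function changes the result without touching the averaging code.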
Check the instrument independence using a different instrument. Whatever instrument is supplied may be modified in arbitrary ways by the nano-kernel.
Note
Downloading data for COSMIC requires an account at the COSMIC Data Analysis and Archive Center (CDAAC).
cosmic = pysat.Instrument('cosmic', 'gps', tag='ionprf', clean_level='clean',
                          altitude_bin=3)

# attach filter method
cosmic.custom.add(filter_inst, 'modify', 'edmaxlat', 15.)

# perform average
mean_max_dens = daily_mean(cosmic, start, stop, 'edmax')

# plot the result using pandas functionality
long_name = cosmic.meta['edmax', cosmic.name_label]
units = cosmic.meta['edmax', cosmic.units_label]
mean_max_dens.plot(title='Absolute Daily Mean of ' + long_name)
plt.ylabel('Absolute Daily Mean (' + units + ')')
daily_mean now works for any instrument, as long as the data to be averaged is 1D. This can be fixed.
Partial Independence from Dimensionality
This section continues the evolution of the daily_mean method presented earlier towards greater instrument independence by supporting datasets with more than one dimension.
import pandas
import pysat


def daily_mean(inst, start, stop, data_label):
    # create empty series to hold result; object dtype lets each entry hold
    # a scalar, a profile, or an image
    mean_val = pandas.Series(dtype=object)
    # get list of dates between start and stop
    date_array = pysat.utils.time.create_date_range(start, stop)
    # iterate over season, calculate the mean
    for date in date_array:
        inst.load(date=date)
        if not inst.data.empty:
            # compute absolute mean using pandas functions and store
            # data could be an image, or lower dimension, account for 2D and lower
            data = inst[data_label]
            if isinstance(data.iloc[0], pandas.DataFrame):
                # 3D data, 2D data at every time
                # note: pandas.Panel was removed in pandas 0.25, so this
                # branch needs an older pandas release
                data_panel = pandas.Panel.from_dict(
                    {i: data.iloc[i] for i in range(len(data))})
                mean_val[inst.date] = data_panel.abs().mean(axis=0, skipna=True)
            elif isinstance(data.iloc[0], pandas.Series):
                # 2D data, 1D data for each time
                data_frame = pandas.DataFrame(data.tolist())
                data_frame.index = data.index
                mean_val[inst.date] = data_frame.abs().mean(axis=0, skipna=True)
            else:
                # 1D data
                mean_val[inst.date] = inst[data_label].abs().mean(axis=0, skipna=True)
    return mean_val
This code works for 1D, 2D, and 3D datasets (in-situ measurements, remote profiles, and remote images), regardless of instrument platform, with only minor changes from the initial VEFI-specific code. Admittedly, the nested if statements aren’t the most elegant, particularly the 3D case. However, this code puts the data into an appropriate structure for pandas to align each of the profiles/images by their respective indices before performing the average. Note that the call to obtain the arithmetic mean is the same in all cases: .mean(axis=0, skipna=True). There is an opportunity here for pysat to clean up the little mess caused by dimensionality.
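The index alignment mentioned above can be seen on a small scale. The following is a minimal sketch using only pandas; the two profiles, their altitude labels, and the timestamps are invented for illustration and do not come from any instrument.

```python
import pandas as pd

# two hypothetical profiles sampled on different altitude grids (km)
profile_a = pd.Series([1.0, -2.0, 3.0], index=[100, 200, 300])
profile_b = pd.Series([-4.0, 6.0], index=[200, 400])

# one row per profile; the columns are the union of the altitude labels,
# and altitudes missing from a profile are filled with NaN
frame = pd.DataFrame([profile_a, profile_b])
frame.index = pd.to_datetime(['2010-01-01 00:00', '2010-01-01 00:10'])

# same averaging call as in daily_mean: NaN entries are skipped, so each
# altitude is averaged over only the profiles that actually sampled it
mean_profile = frame.abs().mean(axis=0, skipna=True)
print(mean_profile)
```

Because values are matched by altitude label rather than by position, the 200 km samples from both profiles are averaged together, while altitudes covered by a single profile keep that profile's absolute value.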
import pandas
import pysat


def daily_mean(inst, start, stop, data_label):
    # create empty series to hold result; object dtype lets each entry hold
    # a scalar, a profile, or an image
    mean_val = pandas.Series(dtype=object)
    # get list of dates between start and stop
    date_array = pysat.utils.time.create_date_range(start, stop)
    # iterate over season, calculate the mean
    for date in date_array:
        inst.load(date=date)
        if not inst.data.empty:
            # compute absolute mean using pandas functions and store
            # data could be an image, or lower dimension, account for 2D and lower
            data = inst[data_label]
            # let pysat put the data into a form pandas can average directly
            data = pysat.ssnl.computational_form(data)
            mean_val[inst.date] = data.abs().mean(axis=0, skipna=True)
    return mean_val
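The kind of conversion delegated to pysat here can be pictured with a small helper. The function to_computational_form below is a hypothetical stand-in written for this sketch, not pysat's implementation. It covers only the 1D and 2D cases and leaves the 3D case out, and the sample values are invented.

```python
import numpy as np
import pandas as pd


def to_computational_form(data):
    """Hypothetical helper: return `data` in a form pandas can average.

    A series whose elements are themselves pandas.Series (2D data) becomes
    a DataFrame with one row per time; anything else is returned unchanged.
    """
    if isinstance(data.iloc[0], pd.Series):
        frame = pd.DataFrame(data.tolist())
        frame.index = data.index
        return frame
    return data


times = pd.to_datetime(['2010-01-01 00:00', '2010-01-01 00:10'])

# 1D data: one scalar per time
one_d = pd.Series([1.0, -3.0], index=times)

# 2D data: one profile per time, placed in an object array so that each
# profile is kept as a single element
profiles = np.empty(2, dtype=object)
profiles[0] = pd.Series([1.0, -2.0], index=[100, 200])
profiles[1] = pd.Series([-3.0, 4.0], index=[100, 200])
two_d = pd.Series(profiles, index=times)

# the averaging line is identical for both dimensionalities
scalar_mean = to_computational_form(one_d).abs().mean(axis=0, skipna=True)
profile_mean = to_computational_form(two_d).abs().mean(axis=0, skipna=True)
print(scalar_mean)
print(profile_mean)
```

Moving the dimensionality check into one helper is what lets the loop body in daily_mean shrink to a single averaging line.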