API

Instrument

class pysat.Instrument(platform=None, name=None, tag=None, inst_id=None, sat_id=None, clean_level='clean', update_files=None, pad=None, orbit_info=None, inst_module=None, multi_file_day=None, manual_org=None, directory_format=None, file_format=None, temporary_file_list=False, strict_time_flag=False, ignore_empty_files=False, units_label='units', name_label='long_name', notes_label='notes', desc_label='desc', plot_label='label', axis_label='axis', scale_label='scale', min_label='value_min', max_label='value_max', fill_label='fill', *arg, **kwargs)

Download, load, manage, modify and analyze science data.

Deprecated since version 2.3.0: Several attributes and methods will be removed or replaced in pysat 3.0.0: sat_id, default, multi_file_day, manual_org, units_label, name_label, notes_label, desc_label, min_label, max_label, fill_label, plot_label, axis_label, scale_label, and _filter_datetime_input

Parameters:
  • platform (string) – name of platform/satellite.
  • name (string) – name of instrument.
  • tag (string, optional) – identifies particular subset of instrument data.
  • inst_id (string) – Replaces sat_id
  • sat_id (string, optional) – identity within constellation
  • clean_level ({'clean','dusty','dirty','none'}, optional) – level of data quality
  • pad (pandas.DateOffset, or dictionary, optional) – Length of time to pad the beginning and end of loaded data for time-series processing. Extra data is removed after applying all custom functions. Dictionary, if supplied, is simply passed to pandas DateOffset.
  • orbit_info (dict) – Orbit information, {‘index’:index, ‘kind’:kind, ‘period’:period}. See pysat.Orbits for more information.
  • inst_module (module, optional) – Provide instrument module directly. Takes precedence over platform/name.
  • update_files (boolean, optional) – If True, immediately query filesystem for instrument files and store.
  • temporary_file_list (boolean, optional) – If true, the list of Instrument files will not be written to disk. Prevents a race condition when running multiple pysat processes.
  • strict_time_flag (boolean, optional (False)) – If true, pysat will check data to ensure times are unique and monotonic. In future versions, this will be fixed to True.
  • multi_file_day (boolean, optional) – Set to True if Instrument data files for a day are spread across multiple files and data for day n could be found in a file with a timestamp of day n-1 or n+1. Deprecated at this level in pysat 3.0.0.
  • manual_org (bool) – if True, then pysat will look directly in pysat data directory for data files and will not use default /platform/name/tag. Deprecated in pysat 3.0.0, as this flag is not needed to use directory_format.
  • directory_format (str) – directory naming structure in string format. Variables such as platform, name, and tag will be filled in as needed using python string formatting. The default directory structure would be expressed as ‘{platform}/{name}/{tag}’
  • file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, and sat_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine.
  • ignore_empty_files (boolean) – if True, the list of files found will be checked to ensure the filesizes are greater than zero. Empty files are removed from the stored list of files.
  • units_label (str) – String used to label units in storage. Defaults to ‘units’.
  • name_label (str) – String used to label long_name in storage. Defaults to ‘long_name’.
  • notes_label (str) – label to use for notes in storage. Defaults to ‘notes’
  • desc_label (str) – label to use for variable descriptions in storage. Defaults to ‘desc’
  • plot_label (str) – label to use to label variables in plots. Defaults to ‘label’
  • axis_label (str) – label to use for axis on a plot. Defaults to ‘axis’
  • scale_label (str) – label to use for plot scaling type in storage. Defaults to ‘scale’
  • min_label (str) – label to use for typical variable value min limit in storage. Defaults to ‘value_min’
  • max_label (str) – label to use for typical variable value max limit in storage. Defaults to ‘value_max’
  • fill_label (str) – label to use for fill values. Defaults to ‘fill’ but some implementations will use ‘FillVal’
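
The directory_format and file_format templates are ordinary Python format strings. A minimal sketch of how such templates expand, using only str.format; the file template is borrowed from the list_files examples in this document and the values are illustrative, so this is not pysat's internal code:

```python
# Illustrative only: expand pysat-style templates with str.format.
directory_format = '{platform}/{name}/{tag}'
file_format = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'

directory = directory_format.format(platform='cnofs', name='vefi', tag='dc_b')
filename = file_format.format(year=2009, month=1, day=2)

print(directory)  # cnofs/vefi/dc_b
print(filename)   # cnofs_vefi_bfield_1sec_20090102_v05.cdf
```
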
data

loaded science data

Type:pandas.DataFrame
date

date for loaded data

Type:pandas.datetime
yr

year for loaded data

Type:int
bounds

bounds for loading data, supply array_like for a season with gaps. Users may provide as a tuple or tuple of lists, but the attribute is stored as a tuple of lists for consistency

Type:(datetime/filename/None, datetime/filename/None)
doy

day of year for loaded data

Type:int
files

interface to instrument files

Type:pysat.Files
meta

interface to instrument metadata, similar to netCDF 1.6

Type:pysat.Meta
orbits

interface to extracting data orbit-by-orbit

Type:pysat.Orbits
custom

interface to instrument nano-kernel

Type:pysat.Custom
kwargs

keyword arguments passed to instrument loading routine

Type:dictionary

Note

Pysat attempts to load the module platform_name.py located in the pysat/instruments directory. This module provides the underlying functionality to download, load, and clean instrument data. Alternatively, the module may be supplied directly using keyword inst_module.

Examples

# 1-second mag field data
vefi = pysat.Instrument(platform='cnofs',
                        name='vefi',
                        tag='dc_b',
                        clean_level='clean')
start = pysat.datetime(2009,1,1)
stop = pysat.datetime(2009,1,2)
vefi.download(start, stop)
vefi.load(date=start)
print(vefi['dB_mer'])
print(vefi.meta['db_mer'])

# 1-second thermal plasma parameters
ivm = pysat.Instrument(platform='cnofs',
                        name='ivm',
                        tag='',
                        clean_level='clean')
ivm.download(start,stop)
ivm.load(2009,1)
print(ivm['ionVelmeridional'])

# Ionosphere profiles from GPS occultation
cosmic = pysat.Instrument('cosmic',
                            'gps',
                            'ionprf',
                            altitude_bin=3)
# bins profile using 3 km step
cosmic.download(start, stop, user=user, password=password)
cosmic.load(date=start)
bounds

Boundaries for iterating over instrument object by date or file.

Parameters:
  • start (datetime object, filename, or None (default)) – start of iteration, if None uses first data date. list-like collection also accepted
  • end (datetime object, filename, or None (default)) – end of iteration, inclusive. If None uses last data date. list-like collection also accepted

Note

Both start and stop must be the same type (date, or filename) or None. Only the year, month, and day are used for date inputs.

Examples

inst = pysat.Instrument(platform=platform,
                        name=name,
                        tag=tag)
start = pysat.datetime(2009,1,1)
stop = pysat.datetime(2009,1,31)
inst.bounds = (start,stop)

start2 = pysat.datetime(2010,1,1)
stop2 = pysat.datetime(2010,2,14)
inst.bounds = ([start, start2], [stop, stop2])
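
The storage convention for bounds (a tuple of lists, with only year, month, and day retained for date inputs) can be sketched in plain Python. The helper normalize_bounds is hypothetical and is not part of pysat; it only illustrates the convention for date inputs:

```python
import datetime as dt


def normalize_bounds(start, stop):
    """Hypothetical helper: return bounds as a tuple of lists of dates."""
    # Wrap single values so both inputs are list-like
    starts = list(start) if isinstance(start, (list, tuple)) else [start]
    stops = list(stop) if isinstance(stop, (list, tuple)) else [stop]
    if len(starts) != len(stops):
        raise ValueError('start and stop must have the same length')

    def strip(day):
        # Keep only year, month, and day
        return dt.datetime(day.year, day.month, day.day)

    return [strip(d) for d in starts], [strip(d) for d in stops]


bounds = normalize_bounds(dt.datetime(2009, 1, 1, 12, 30),
                          dt.datetime(2009, 1, 31))
print(bounds)
```
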
concat_data(data, *args, **kwargs)

Concatenates the supplied data with the data already in the Instrument object, for xarray or pandas as needed

Parameters:data (pandas or xarray) – Data to be appended to data already within the Instrument object
Returns:Instrument.data modified in place.
Return type:void

Notes

For pandas, sort=False is passed along to the underlying pandas.concat method. If sort is supplied as a keyword, the user provided value is used instead.

For xarray, dim=’Epoch’ is passed along to xarray.concat except if the user includes a value for dim as a keyword argument.
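
The keyword handling described in these notes amounts to supplying a default only when the caller has not set one. A sketch of that pattern, with a stand-in function in place of pandas.concat or xarray.concat (fake_concat and concat_with_default are hypothetical names used for illustration):

```python
def fake_concat(objs, **kwargs):
    """Stand-in for the real concat call; returns the keywords it received."""
    return kwargs


def concat_with_default(objs, **kwargs):
    """Apply sort=False unless the caller supplied sort explicitly."""
    kwargs.setdefault('sort', False)
    return fake_concat(objs, **kwargs)


print(concat_with_default([1, 2]))             # {'sort': False}
print(concat_with_default([1, 2], sort=True))  # {'sort': True}
```
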

copy()

Deep copy of the entire Instrument object.

date

Date for loaded data.

download(start=None, stop=None, freq='D', user=None, password=None, date_array=None, **kwargs)

Download data for given Instrument object from start to stop.

Parameters:
  • start (pandas.datetime (yesterday)) – start date to download data
  • stop (pandas.datetime (tomorrow)) – stop date to download data
  • freq (string) – Stepsize between dates for season, ‘D’ for daily, ‘M’ monthly (see pandas)
  • user (string) – username, if required by instrument data archive
  • password (string) – password, if required by instrument data archive
  • date_array (list-like) – Sequence of dates to download data for. Takes precedence over start and stop inputs
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments

Note

Data will be downloaded to pysat_data_dir/platform/name/tag

If Instrument bounds are set to defaults they are updated after files are downloaded.
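
For the default freq='D', the dates requested between start and stop form a daily sequence. A rough standard-library sketch of such a sequence follows; it assumes both endpoints are included, and daily_dates is a hypothetical helper rather than a pysat function:

```python
import datetime as dt


def daily_dates(start, stop):
    """Hypothetical helper: list each day from start through stop."""
    n_days = (stop - start).days
    return [start + dt.timedelta(days=i) for i in range(n_days + 1)]


dates = daily_dates(dt.datetime(2009, 1, 1), dt.datetime(2009, 1, 3))
for day in dates:
    print(day.strftime('%Y-%m-%d'))
```
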

download_updated_files(user=None, password=None, **kwargs)

Grabs a list of remote files, compares to local, then downloads new files.

Parameters:
  • user (string) – username, if required by instrument data archive
  • password (string) – password, if required by instrument data archive
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments

Note

Data will be downloaded to pysat_data_dir/platform/name/tag

If Instrument bounds are set to defaults they are updated after files are downloaded.

empty

Boolean flag reflecting lack of data.

True if there is no Instrument data.

generic_meta_translator(meta_to_translate)

Translates the metadata contained in an object into a dictionary suitable for export.

Parameters:meta_to_translate (Meta) – The metadata object to translate
Returns:A dictionary of the metadata for each variable of an output file e.g. netcdf4
Return type:dict
index

Returns time index of loaded data.

load(yr=None, doy=None, date=None, fname=None, fid=None, verifyPad=False)

Load instrument data into Instrument object .data.

Parameters:
  • yr (integer) – year for desired data
  • doy (integer) – day of year
  • date (datetime object) – date to load
  • fname ('string') – filename to be loaded
  • verifyPad (boolean) – if True, padding data not removed (debug purposes)
Returns:Void. Data is added to self.data

Note

Loads data for a chosen instrument into .data. Any functions chosen by the user and added to the custom processing queue (.custom.add) are automatically applied to the data before it is available to user in .data.
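
The order of operations in this note (load, apply each queued function, then expose the result) can be pictured with a small stand-in. The sketch below uses a plain list and dict rather than pysat objects, and the names custom_queue, add_one, and load_with_custom are hypothetical:

```python
custom_queue = []


def add_one(data):
    """Example user function: modifies the loaded data in place."""
    data['value'] = [v + 1 for v in data['value']]


def load_with_custom(raw):
    """Copy the raw data, then apply every queued function in order."""
    data = {key: list(val) for key, val in raw.items()}
    for func in custom_queue:
        func(data)
    return data


custom_queue.append(add_one)
loaded = load_with_custom({'value': [1, 2, 3]})
print(loaded['value'])  # [2, 3, 4]
```
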

next(verifyPad=False)

Manually iterate through the data loaded in Instrument object.

Bounds of iteration and iteration type (day/file) are set by bounds attribute.

Note

If there were no previous calls to load then the first day(default)/file will be loaded.

prev(verifyPad=False)

Manually iterate backwards through the data in Instrument object.

Bounds of iteration and iteration type (day/file) are set by bounds attribute.

Note

If there were no previous calls to load then the first day(default)/file will be loaded.

remote_date_range(year=None, month=None, day=None)

Returns first and last date for remote data. Default behaviour is to search all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.

remote_file_list(year=None, month=None, day=None)

List remote files for chosen instrument. Default behaviour is to return all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.
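
The year, month, and day keywords narrow the listing step by step. A sketch of that filtering on an in-memory mapping of dates to filenames; the file names and the filter_files helper are invented for illustration and do not query any remote archive:

```python
import datetime as dt


def filter_files(files, year=None, month=None, day=None):
    """Hypothetical helper: keep entries matching the supplied date parts."""
    selected = {}
    for date, fname in files.items():
        if year is not None and date.year != year:
            continue
        if month is not None and date.month != month:
            continue
        if day is not None and date.day != day:
            continue
        selected[date] = fname
    return selected


files = {dt.datetime(2009, 1, 1): 'file_20090101.cdf',
         dt.datetime(2009, 2, 1): 'file_20090201.cdf',
         dt.datetime(2010, 1, 1): 'file_20100101.cdf'}

print(sorted(filter_files(files, year=2009).values()))
print(sorted(filter_files(files, year=2009, month=2).values()))
```
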

to_netcdf4(fname=None, base_instrument=None, epoch_name='Epoch', zlib=False, complevel=4, shuffle=True, preserve_meta_case=False, export_nan=None, unlimited_time=True)

Stores loaded data into a netCDF4 file.

Parameters:
  • fname (string) – full path to save instrument object to
  • base_instrument (pysat.Instrument) – used as a comparison, only attributes that are present with self and not on base_instrument are written to netCDF
  • epoch_name (str) – Label in file for datetime index of Instrument object
  • zlib (boolean) – Flag for engaging zlib compression (True - compression on)
  • complevel (int) – an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False
  • shuffle (boolean) – the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.
  • preserve_meta_case (bool (False)) – if True, then the variable strings within the MetaData object, which preserves case, are used to name variables in the written netCDF file. If False, then the variable strings used to access data from the Instrument object are used instead. By default, the variable strings on both the data and metadata side are the same, though this relationship may be altered by a user.
  • export_nan (list or None) – By default, the metadata variables where a value of NaN is allowed and written to the netCDF4 file is maintained by the Meta object attached to the pysat.Instrument object. A list supplied here will override the settings provided by Meta, and all parameters included will be written to the file. If not listed and a value is NaN then that attribute simply won’t be included in the netCDF4 file.
  • unlimited_time (bool) – If True, then the main epoch dimension will be set to ‘unlimited’ within the netCDF4 file. (default=True)

Note

Stores 1-D data along dimension ‘epoch’ - the date time index.

Stores higher order data (e.g. dataframes within series) separately

  • The name of the main variable column is used to prepend subvariable names within netCDF, var_subvar_sub
  • A netCDF4 dimension is created for each main variable column with higher order data; first dimension Epoch
  • The index organizing the data is stored as a dimension variable
  • from_netcdf4 uses the variable dimensions to reconstruct data structure

All attributes attached to instrument meta are written to netCDF attrs with the exception of ‘Date_End’, ‘Date_Start’, ‘File’, ‘File_Date’, ‘Generation_Date’, and ‘Logical_File_ID’. These are defined within to_netCDF at the time the file is written, as per the adopted standard, SPDF ISTP/IACG Modified for NetCDF. Attributes ‘Conventions’ and ‘Text_Supplement’ are given default values if not present.
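
The export_nan rule can be sketched as a filter over one variable's metadata: NaN-valued attributes are dropped unless their label is listed. The function and the sample metadata below are hypothetical and only mirror the behaviour described for the export_nan parameter:

```python
import math


def filter_nan_attrs(attrs, export_nan):
    """Hypothetical helper: drop NaN attributes not named in export_nan."""
    kept = {}
    for label, value in attrs.items():
        is_nan = isinstance(value, float) and math.isnan(value)
        if is_nan and label not in export_nan:
            continue
        kept[label] = value
    return kept


attrs = {'units': 'nT', 'fill': float('nan'), 'value_min': float('nan')}
kept = filter_nan_attrs(attrs, export_nan=['fill'])
print(sorted(kept))  # ['fill', 'units']
```
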

today()

Returns today’s date, with no hour, minute, second, etc.

Parameters:None
Returns:Today’s date
Return type:datetime
tomorrow()

Returns tomorrow’s date, with no hour, minute, second, etc.

Parameters:None
Returns:Tomorrow’s date
Return type:datetime
variables

Returns list of variables within loaded data.

yesterday()

Returns yesterday’s date, with no hour, minute, second, etc.

Parameters:None
Returns:Yesterday’s date
Return type:datetime
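
A date with no hour, minute, or second is a datetime truncated to midnight. A standard-library sketch of the three values; pysat's own implementation may differ (for example in its time zone handling), so treat this as an illustration only:

```python
import datetime as dt

now = dt.datetime.now()
today = dt.datetime(now.year, now.month, now.day)  # midnight today
tomorrow = today + dt.timedelta(days=1)
yesterday = today - dt.timedelta(days=1)

print(yesterday, today, tomorrow)
```
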

Instrument Methods

The following methods support the variety of actions needed by underlying pysat.Instrument modules.

Demeter

Provides non-instrument routines for DEMETER microsatellite data

Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatIncubator (https://github.com/pysat/pysatIncubator)

pysat.instruments.methods.demeter.download(date_array, tag, sat_id, data_path=None, user=None, password=None)

Download

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

pysat.instruments.methods.demeter.bytes_to_float(chunk)

Convert a chunk of bytes to a float

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:chunk (string or bytes) – A chunk of bytes
Returns:value – A 32 bit float
Return type:float
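
Decoding four bytes as a 32 bit float can be done with the standard struct module. The sketch below assumes big-endian IEEE 754 byte order, which is an assumption here and not something this page states; decode_float32 is a hypothetical name:

```python
import struct


def decode_float32(chunk, byte_order='>'):
    """Hypothetical helper: unpack 4 bytes as a 32 bit float.

    byte_order is '>' for big-endian (assumed) or '<' for little-endian.
    """
    if len(chunk) != 4:
        raise ValueError('expected exactly 4 bytes')
    return struct.unpack(byte_order + 'f', chunk)[0]


print(decode_float32(b'\x3f\x80\x00\x00'))  # 1.0
```
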
pysat.instruments.methods.demeter.load_general_header(fhandle)

Load the general header block (block 1 for each time)

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:fhandle ((file handle)) – File handle
Returns:
  • data (list) – List of data values containing: P field, Number of days from 01/01/1950, number of milliseconds in the day, UT as datetime, Orbit number, downward (False) upward (True) indicator
  • meta (dict) – Dictionary with meta data for keys: ‘telemetry station’, ‘software processing version’, ‘software processing subversion’, ‘calibration file version’, and ‘calibration file subversion’, ‘data names’, ‘data units’
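
The UT value in the data list follows from the day count since 01/01/1950 and the milliseconds in the day. A sketch of that conversion with the standard library; the function name and sample numbers are illustrative:

```python
import datetime as dt

EPOCH_1950 = dt.datetime(1950, 1, 1)


def days_ms_to_datetime(days, millisec):
    """Hypothetical helper: days since 01/01/1950 plus ms of day to UT."""
    return EPOCH_1950 + dt.timedelta(days=days, milliseconds=millisec)


# One day and 1.5 seconds after the reference date
print(days_ms_to_datetime(1, 1500))  # 1950-01-02 00:00:01.500000
```
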
pysat.instruments.methods.demeter.load_location_parameters(fhandle)

Load the orbital and geomagnetic parameter block (block 1 for each time)

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:fhandle ((file handle)) – File handle
Returns:
  • data (list) – List of data values containing: geoc lat, geoc lon, alt, lt, geom lat, geom lon, mlt, inv lat, L-shell, geoc lat of conj point, geoc lon of conj point, geoc lat of N conj point at 110 km, geoc lon of N conj point at 110 km, geoc lat of S conj point at 110 km, geoc lon of S conj point at 110 km, components of magnetic field at sat point, proton gyrofreq at sat point, solar position in geog coords
  • meta (dict) – Dictionary with meta data for keys: ‘software processing version’, ‘software processing subversion’, ‘data names’, ‘data units’
pysat.instruments.methods.demeter.load_attitude_parameters(fhandle)

Load the attitude parameter block (block 1 for each time)

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:fhandle ((file handle)) – File handle
Returns:
  • data (list) – list of data values containing: matrix elements from satellite coord system to geographic coordinate system, matrix elements from geographic coordinate system to local geomagnetic coordinate system, quality index of attitude parameters.
  • meta (dict) – Dictionary with meta data for keys: ‘software processing version’, ‘software processing subversion’, ‘data names’, ‘data units’
pysat.instruments.methods.demeter.load_binary_file(fname, load_experiment_data)

Load the binary data from a DEMETER file

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:
  • fname (string) – Filename
  • load_experiment_data (function) – Function to load experiment data, taking the file handle as input
Returns:

  • data (np.array) – Data from file stored in a numpy array
  • meta (dict) – Meta data for file, including data names and units

pysat.instruments.methods.demeter.set_metadata(name, meta_dict)

Set metadata for each DEMETER instrument, using dict containing metadata

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter

Parameters:
  • name (string) – DEMETER instrument name
  • meta_dict (dict) – Dictionary containing metadata information and data attributes. Data attributes are available in the keys ‘data names’ and ‘data units’
Returns:

meta – Meta class object

Return type:

pysat.Meta

General

Provides generalized routines for integrating instruments into pysat.

pysat.instruments.methods.general.convert_timestamp_to_datetime(inst, sec_mult=1.0)

Use datetime instead of timestamp for Epoch

Parameters:
  • inst (pysat.Instrument) – associated pysat.Instrument object
  • sec_mult (float) – Multiplier needed to convert epoch time to seconds (default=1.0)
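
The role of sec_mult is to scale the stored epoch values to seconds before they are converted. A sketch with the standard library, assuming the values count from the Unix epoch (1970-01-01 UTC), which this page does not state explicitly; to_datetimes is a hypothetical helper:

```python
import datetime as dt


def to_datetimes(timestamps, sec_mult=1.0):
    """Hypothetical helper: scale epoch values to seconds, then convert."""
    epoch = dt.datetime(1970, 1, 1)
    return [epoch + dt.timedelta(seconds=stamp * sec_mult)
            for stamp in timestamps]


# Millisecond timestamps need sec_mult=1.0e-3
times = to_datetimes([0, 86400000], sec_mult=1.0e-3)
print(times[1])  # 1970-01-02 00:00:00
```
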
pysat.instruments.methods.general.list_files(tag=None, sat_id=None, data_path=None, format_str=None, supported_tags=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, file_cadance=datetime.timedelta(days=1))

Return a Pandas Series of every file for chosen satellite data.

This routine provides a standard interface for pysat instrument modules.

Deprecated since version 2.3.0: The fake_daily_files_from_monthly kwarg has been deprecated and replaced with file_cadance in pysat 3.0.0.

Parameters:
  • tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • data_path (string or NoneType) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • format_str (string or NoneType) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
  • supported_tags (dict or NoneType) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day. This keyword arg has been deprecated. In pysat 2.3.0, setting file_cadance=dt.timedelta(days=1) is equivalent to setting this to False, while using file_cadance=pds.DateOffset(months=1) is equivalent to setting this to True. (default=False)
  • two_digit_year_break (int) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
  • file_cadence (dt.timedelta or pds.DateOffset) – pysat assumes a daily file cadence, but some instrument data files contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). In pysat 2.3.0, only daily and monthly cadences are supported. (default=dt.timedelta(days=1))
Returns:

pysat.Files.from_os – A class containing the verified available files

Return type:

(pysat._files.Files)

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_files = functools.partial(nasa_cdaweb.list_files,
                               supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_files = functools.partial(mm_gen.list_files,
                               supported_tags=supported_tags)
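
The two_digit_year_break rule maps a two-digit year to a full year. A sketch of that rule as stated in the parameter description; expand_year is a hypothetical helper and the break value of 50 is only an example:

```python
def expand_year(two_digit_year, two_digit_year_break):
    """Hypothetical helper: apply the 1900/2000 rule to a two-digit year."""
    if two_digit_year >= two_digit_year_break:
        return 1900 + two_digit_year
    return 2000 + two_digit_year


print(expand_year(99, 50))  # 1999
print(expand_year(9, 50))   # 2009
```
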
pysat.instruments.methods.general.remove_leading_text(inst, target=None)

Removes leading text on variable names

Parameters:
  • inst (pysat.Instrument) – associated pysat.Instrument object
  • target (str or list of strings) – Leading string to remove. If none supplied, returns unmodified

Returns:Modifies Instrument object in place
Return type:None
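
Stripping a leading string from variable names can be sketched on a plain list of names. The helper below is hypothetical and works on strings only (the variable names are made up); the real routine modifies the Instrument object in place:

```python
def strip_leading(names, target=None):
    """Hypothetical helper: remove leading text from each name."""
    if target is None:
        # Nothing to remove, return the names unmodified
        return list(names)
    targets = [target] if isinstance(target, str) else list(target)
    cleaned = []
    for name in names:
        for prefix in targets:
            if name.startswith(prefix):
                name = name[len(prefix):]
                break
        cleaned.append(name)
    return cleaned


print(strip_leading(['ICON_L2_Altitude', 'Epoch'], target='ICON_L2_'))
```
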

NASA CDAWeb

Provides default routines for integrating NASA CDAWeb instruments into pysat. Adding new CDAWeb datasets should only require minimal user intervention.

pysat.instruments.methods.nasa_cdaweb.load(fnames, tag=None, sat_id=None, fake_daily_files_from_monthly=False, flatten_twod=True)

Load NASA CDAWeb CDF files.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • fnames ((pandas.Series)) – Series of filenames
  • tag ((str or NoneType)) – tag or None (default=None)
  • sat_id ((str or NoneType)) – satellite id or None (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, parses the daily dates that the list_files routine appended internally to monthly filenames, when so flagged. These dates are used here to provide data by day.
  • flatten_twod (bool) – Flattens 2D data into different columns of root DataFrame rather than produce a Series of DataFrames
Returns:

  • data ((pandas.DataFrame)) – Object containing satellite data
  • meta ((pysat.Meta)) – Object containing metadata such as column names and units

Examples

# within the new instrument module, at the top level define
# a new variable named load, and set it equal to this load method
# code below taken from cnofs_ivm.py.

# support load routine
# use the default CDAWeb method
load = cdw.load
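
The idea behind fake_daily_files_from_monthly is that one monthly file stands in for every day of its month. A sketch of expanding a monthly entry into daily dates with the standard library; how pysat tags the filenames internally is not shown on this page, so the hypothetical helper below only lists the dates:

```python
import calendar
import datetime as dt


def daily_dates_for_month(year, month):
    """Hypothetical helper: every date covered by a monthly file."""
    n_days = calendar.monthrange(year, month)[1]
    return [dt.datetime(year, month, day) for day in range(1, n_days + 1)]


dates = daily_dates_for_month(2009, 2)
print(len(dates), dates[0].date(), dates[-1].date())
```
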
pysat.instruments.methods.nasa_cdaweb.list_files(tag=None, sat_id=None, data_path=None, format_str=None, supported_tags=None, fake_daily_files_from_monthly=False, two_digit_year_break=None)

Return a Pandas Series of every file for chosen satellite data.

Deprecated since version 2.2.0: list_files will be removed in pysat 3.0.0 and replaced by the copy in instruments.methods.general

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • format_str ((string or NoneType)) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
  • supported_tags ((dict or NoneType)) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
  • fake_daily_files_from_monthly ((bool)) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day.
  • two_digit_year_break ((int)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
Returns:

pysat.Files.from_os – A class containing the verified available files

Return type:

(pysat._files.Files)

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_files = functools.partial(nasa_cdaweb.list_files,
                               supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_files = functools.partial(cdw.list_files,
                               supported_tags=supported_tags)
pysat.instruments.methods.nasa_cdaweb.list_remote_files(tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', supported_tags=None, user=None, password=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, delimiter=None, year=None, month=None, day=None)

Return a Pandas Series of every file for chosen remote data.

Deprecated since version 2.3.0: This routine will be removed in pysat 3.0.0 and moved to the pysatNASA repository. In addition, the year/month/day keywords have been deprecated since 2.2.0 and will be removed in pysat 3.0.0, where they will be replaced with a start/stop syntax consistent with the download routine

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. (default=None)
  • remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
  • supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
  • user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
  • password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame. (default=False)
  • two_digit_year_break ((int or NoneType)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. (default=None)
  • delimiter ((string or NoneType)) – If filename is delimited, then provide delimiter alone e.g. ‘_’ (default=None)
  • year ((int or NoneType)) – Selects a given year to return remote files for. None returns all years. (default=None)
  • month ((int or NoneType)) – Selects a given month to return remote files for. None returns all months. Requires year to be defined. (default=None)
  • day ((int or NoneType)) – Selects a given day to return remote files for. None returns all days. Requires year and month to be defined. (default=None)
Returns:

pysat.Files.from_os – A class containing the verified available files

Return type:

(pysat._files.Files)

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_remote_files = functools.partial(nasa_cdaweb.list_remote_files,
                                      supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_remote_files = functools.partial(cdw.list_remote_files,
                                      supported_tags=supported_tags)
pysat.instruments.methods.nasa_cdaweb.download(supported_tags, date_array, tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', data_path=None, user=None, password=None, fake_daily_files_from_monthly=False, multi_file_day=False)

Routine to download NASA CDAWeb CDF data.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
  • date_array (array_like) – Array of datetimes to download data for. Provided by pysat.
  • tag (str or NoneType (None)) – tag or None
  • sat_id ((str or NoneType)) – satellite id or None (default=None)
  • remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
  • data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
  • password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame.
Returns:

Void – Downloads data to disk.

Return type:

(NoneType)

Examples

# download support added to cnofs_vefi.py using code below
# parentheses allow the remote filename template to span two lines
rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
      '_v05.cdf')
ln = 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf'
dc_b_tag = {'dir':'/pub/data/cnofs/vefi/bfield_1sec',
            'remote_fname': rn,
            'local_fname': ln}
supported_tags = {'dc_b': dc_b_tag}

download = functools.partial(nasa_cdaweb.download,
                             supported_tags=supported_tags)
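
functools.partial fixes supported_tags ahead of time, so the resulting download function only needs the remaining arguments. The sketch below uses a stand-in in place of nasa_cdaweb.download (fake_download is hypothetical and performs no network access) to show the pre-set keyword and how a remote_fname template expands for one date:

```python
import datetime as dt
import functools


def fake_download(date_array, tag, sat_id, supported_tags=None):
    """Stand-in for a download routine: return the remote paths it would use."""
    info = supported_tags[tag]
    return ['/'.join((info['dir'],
                      info['remote_fname'].format(year=day.year,
                                                  month=day.month,
                                                  day=day.day)))
            for day in date_array]


rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
      '_v05.cdf')
supported_tags = {'dc_b': {'dir': '/pub/data/cnofs/vefi/bfield_1sec',
                           'remote_fname': rn}}
download = functools.partial(fake_download, supported_tags=supported_tags)

# Remaining arguments are passed by keyword
paths = download(date_array=[dt.datetime(2009, 1, 1)], tag='dc_b', sat_id='')
print(paths[0])
```
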

NASA ICON

Provides non-instrument specific routines for ICON data

Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatNASA (https://github.com/pysat/pysatNASA)

pysat.instruments.methods.icon.list_remote_files(tag, sat_id, user=None, password=None, supported_tags=None, year=None, month=None, day=None, start=None, stop=None)

Return a Pandas Series of every file for chosen remote data.

This routine is intended to be used by pysat instrument modules supporting a particular UC-Berkeley SSL dataset related to ICON.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.icon

Parameters:
  • tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • user (string or NoneType) – Username to be passed along to resource with relevant data. (default=None)
  • password (string or NoneType) – User password to be passed along to resource with relevant data. (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start with the first file found. (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop with the last file found. (default=None)
Returns:

A Series formatted for the Files class (pysat._files.Files) containing filenames and indexed by date and time

Return type:

pandas.Series
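The start and stop semantics described in the parameter list can be pictured with a small standard-library sketch. The helper select_files and the file names are hypothetical, and treating stop as inclusive is an assumption rather than confirmed pysat behavior.

```python
import datetime as dt


def select_files(files, start=None, stop=None):
    """Hypothetical helper: keep entries with start <= date <= stop.

    files maps datetime -> filename; None leaves that end open.
    """
    selected = {}
    for date in sorted(files):
        if start is not None and date < start:
            continue
        if stop is not None and date > stop:
            continue
        selected[date] = files[date]
    return selected


# Made-up file list covering five consecutive days
files = {dt.datetime(2020, 1, day): 'file_{:02d}.nc'.format(day)
         for day in range(1, 6)}
subset = select_files(files, start=dt.datetime(2020, 1, 3))
```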

pysat.instruments.methods.icon.ssl_download(date_array, tag, sat_id, data_path=None, user=None, password=None, supported_tags=None)

Download ICON data from public area of SSL ftp server

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0. It is replaced by the pysatNASA.instruments.methods.cdaweb.download method.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string) – Path to directory to download data to. (default=None)
  • user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error when user is not supplied. (default=None)
  • password (string) – Password for data download. (default=None)
  • **kwargs (dict) – Additional keywords supplied by user when invoking the download routine attached to a pysat.Instrument object are passed to this routine via kwargs.

Madrigal

Provides default routines for integrating CEDAR Madrigal instruments into pysat, reducing the amount of user intervention.

Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatMadrigal (https://github.com/pysat/pysatMadrigal)

pysat.instruments.methods.madrigal.cedar_rules()

General acknowledgement statement for Madrigal data.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal

Returns:ackn – String with general acknowledgement for all CEDAR Madrigal data
Return type:string
pysat.instruments.methods.madrigal.load(fnames, tag=None, sat_id=None, xarray_coords=[])

Loads data from Madrigal into Pandas.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
  • tag (string ('')) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. While tag defaults to None here, pysat provides ‘’ as the default tag unless specified by user at Instrument instantiation.
  • sat_id (string ('')) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
  • xarray_coords (list) – List of keywords to use as coordinates if xarray output is desired instead of a Pandas DataFrame (default=[])
Returns:

  • data (pds.DataFrame or xr.DataSet) – A pandas DataFrame or xarray DataSet holding the data from the HDF5 file
  • metadata (pysat.Meta) – Metadata from the HDF5 file, as well as default values from pysat

Examples

inst = pysat.Instrument('jro', 'isr', 'drifts')
inst.load(2010, 18)

pysat.instruments.methods.madrigal.download(date_array, inst_code=None, kindat=None, data_path=None, user=None, password=None, url='http://cedar.openmadrigal.org', file_format='hdf5')

Downloads data from Madrigal.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • inst_code (string (None)) – Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas.
  • kindat (string (None)) – Experiment instrument code(s), cast as a string. If multiple are used, separate them with commas.
  • data_path (string (None)) – Path to directory to download data to.
  • user (string (None)) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error when user is not supplied.
  • password (string (None)) – Password for data download.
  • url (string (’http://cedar.openmadrigal.org’)) – URL for Madrigal site
  • file_format (string ('hdf5')) – File format for Madrigal data. Load routines currently only accept ‘hdf5’, but any of the Madrigal options may be used here.
Returns:

Void – Downloads data to disk.

Return type:

(NoneType)

Notes

The user’s name should be provided in the user field, with spaces replaced by plus signs. For example, Ruby Payne-Scott should be entered as Ruby+Payne-Scott

The password field should be the user’s email address. These parameters are passed to Madrigal when downloading.

The affiliation field is set to pysat to enable tracking of pysat downloads.
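As a minimal sketch of the name formatting described in these notes, the hypothetical helper below joins the parts of a name with '+'. It assumes spaces are the only characters that need replacing, which the notes imply but do not state outright.

```python
def madrigal_user(full_name):
    """Hypothetical helper: join the parts of a name with '+'."""
    return '+'.join(full_name.split())


user = madrigal_user('Ruby Payne-Scott')
```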

pysat.instruments.methods.madrigal.filter_data_single_date(self)

Filters data to a single date.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal

Parameters:self (pysat.Instrument) – This object

Note

Madrigal serves multiple days within a single JRO file. To counter this, each loaded day is filtered so that it only contains the relevant day of data. This is only applied if loading by date. It is not applied when supplying pysat with a specific filename to load, nor when data padding is enabled. Note that when data padding is enabled, the final data available within the instrument will be downselected by pysat to only include the date specified.

This routine is intended to be added to the Instrument nanokernel processing queue via

inst = pysat.Instrument()
inst.custom.add(filter_data_single_date, 'modify')

This function will then be automatically applied to the Instrument object data on every load by the pysat nanokernel.

Warning

For the best performance, this function should be added first in the queue. This may be ensured by setting the default function in a pysat instrument file to this one.

Within platform_name.py, set the following at the top level:

default = pysat.instruments.methods.madrigal.filter_data_single_date
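The effect of filtering to a single date can be sketched without pysat. The helper keep_single_date is hypothetical and works on plain (time, value) pairs rather than an Instrument object; it only illustrates the half-open day window.

```python
import datetime as dt


def keep_single_date(records, date):
    """Hypothetical helper: keep (time, value) pairs that fall on date."""
    start = dt.datetime(date.year, date.month, date.day)
    stop = start + dt.timedelta(days=1)
    return [(time, value) for time, value in records
            if start <= time < stop]


# Made-up samples spanning three days, as a multi-day file might
records = [(dt.datetime(2010, 1, 17, 23, 30), 1.0),
           (dt.datetime(2010, 1, 18, 0, 0), 2.0),
           (dt.datetime(2010, 1, 18, 12, 0), 3.0),
           (dt.datetime(2010, 1, 19, 0, 0), 4.0)]
filtered = keep_single_date(records, dt.datetime(2010, 1, 18))
```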

Space Weather

Provides default routines for solar wind and geospace indices

Deprecated since version 2.3.0: This Instrument module has been removed from pysat in the 3.0.0 release and can now be found in pysatSpaceWeather (https://github.com/pysat/pysatSpaceWeather)

pysat.instruments.methods.sw.calc_daily_Ap(ap_inst, ap_name='3hr_ap', daily_name='Ap', running_name=None)

Calculate the daily Ap index from the 3hr ap index

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.calc_daily_Ap

Parameters:
  • ap_inst ((pysat.Instrument)) – pysat instrument containing 3-hourly ap data
  • ap_name ((str)) – Column name for 3-hourly ap data (default=’3hr_ap’)
  • daily_name ((str)) – Column name for daily Ap data (default=’Ap’)
  • running_name ((str or NoneType)) – Column name for daily running average of ap, not output if None (default=None)
Returns:

Void

Return type:

updates instrument to include daily Ap index under daily_name

Notes

Ap is the mean of the 3hr ap indices measured for a given day

Option for running average is included since this information is used by MSIS when running with sub-daily geophysical inputs
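The daily mean described in the notes can be sketched with the standard library. The helper daily_mean_ap and the ap values are illustrative only; the pysat routine operates on an Instrument object and can also produce a running average, which is omitted here.

```python
import datetime as dt
from collections import defaultdict


def daily_mean_ap(times, ap_3hr):
    """Hypothetical helper: average the 3-hourly ap values of each day."""
    by_day = defaultdict(list)
    for time, value in zip(times, ap_3hr):
        by_day[time.date()].append(value)
    return {day: sum(vals) / len(vals)
            for day, vals in sorted(by_day.items())}


# Two days of made-up 3-hourly ap values (eight samples per day)
times = [dt.datetime(2019, 1, 1) + dt.timedelta(hours=3 * i)
         for i in range(16)]
ap_3hr = [4, 5, 6, 7, 9, 12, 15, 22] + [2] * 8
daily = daily_mean_ap(times, ap_3hr)
```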

pysat.instruments.methods.sw.combine_f107(standard_inst, forecast_inst, start=None, stop=None)

Combine the output from the measured and forecasted F10.7 sources

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.f107.combine_f107

Parameters:
  • standard_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘f107’ name, and ‘’, ‘all’, ‘prelim’, or ‘daily’ tag
  • forecast_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘f107’ name, and ‘prelim’, ‘45day’ or ‘forecast’ tag
  • start ((dt.datetime or NoneType)) – Starting time for combining data, or None to use earliest loaded date from the pysat Instruments (default=None)
  • stop ((dt.datetime)) – Ending time for combining data, or None to use the latest loaded date from the pysat Instruments (default=None)
Returns:

f107_inst – Instrument object containing F10.7 observations for the desired period of time, merging the standard, 45day, and forecasted values based on their reliability

Return type:

(pysat.Instrument)

Notes

Merging prioritizes the standard data, then the 45day data, and finally the forecast data

Will not attempt to download any missing data, but will load data

pysat.instruments.methods.sw.combine_kp(standard_inst=None, recent_inst=None, forecast_inst=None, start=None, stop=None, fill_val=nan)

Combine the output from the different Kp sources for a range of dates

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.combine_kp

Parameters:
  • standard_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘’ tag or None to exclude (default=None)
  • recent_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘recent’ tag or None to exclude (default=None)
  • forecast_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘forecast’ tag or None to exclude (default=None)
  • start ((dt.datetime or NoneType)) – Starting time for combining data, or None to use earliest loaded date from the pysat Instruments (default=None)
  • stop ((dt.datetime)) – Ending time for combining data, or None to use the latest loaded date from the pysat Instruments (default=None)
  • fill_val ((int or float)) – Desired fill value (since the standard instrument fill value differs from the other sources) (default=np.nan)
Returns:

kp_inst – Instrument object containing Kp observations for the desired period of time, merging the standard, recent, and forecasted values based on their reliability

Return type:

(pysat.Instrument)

Notes

Merging prioritizes the standard data, then the recent data, and finally the forecast data

Will not attempt to download any missing data, but will load data
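The merge priority described in the notes can be sketched as follows. The helper combine_by_priority and the sample values are hypothetical; the pysat routine works with Instrument objects and returns a new Instrument, whereas this sketch only uses dictionaries keyed by date.

```python
import datetime as dt
import math


def combine_by_priority(dates, sources, fill_val=math.nan):
    """Hypothetical helper: take each date from the first source with it.

    sources is ordered from most to least reliable; each maps
    date -> value. Dates found in no source receive fill_val.
    """
    combined = {}
    for date in dates:
        combined[date] = fill_val
        for source in sources:
            if date in source:
                combined[date] = source[date]
                break
    return combined


days = [dt.date(2020, 1, day) for day in range(1, 5)]
standard = {days[0]: 2.0, days[1]: 3.0}   # made-up values
recent = {days[1]: 3.3, days[2]: 1.0}
forecast = {days[2]: 1.7}
kp = combine_by_priority(days, [standard, recent, forecast])
```

The same ordering idea applies to combine_f107, with the standard, 45day, and forecast sources in place of standard, recent, and forecast.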

pysat.instruments.methods.sw.convert_ap_to_kp(ap_data, fill_val=-1, ap_name='ap')

Convert Ap into Kp

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.convert_ap_to_kp

Parameters:
  • ap_data (array-like) – Array-like object containing Ap data
  • fill_val (int, float, NoneType) – Fill value for the data set (default=-1)
  • ap_name (str) – Name of the input ap
Returns:

  • kp_data (array-like) – Array-like object containing Kp data
  • meta (Metadata) – Metadata object containing information about transformed data
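A conversion of this kind can be sketched as a table lookup. The list below is the commonly published ap equivalent for each Kp step of one third; rounding intermediate values up to the next tabulated step is an assumption and may differ from the pysat routine. The helper ap_to_kp is hypothetical and does not build the metadata object that the routine returns.

```python
import bisect

# Published ap values for Kp = 0, 1/3, 2/3, ..., 9 (28 steps)
AP_STEPS = [0, 2, 3, 4, 5, 6, 7, 9, 12, 15, 18, 22, 27, 32, 39, 48, 56,
            67, 80, 94, 111, 132, 154, 179, 207, 236, 300, 400]


def ap_to_kp(ap_value, fill_val=-1):
    """Hypothetical helper: round ap up to the next tabulated step."""
    if ap_value is None or not 0 <= ap_value <= AP_STEPS[-1]:
        return fill_val
    index = bisect.bisect_left(AP_STEPS, ap_value)
    return index / 3.0


kp_values = [ap_to_kp(ap) for ap in (0, 27, 400, -5)]
```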

Instrument Templates

General Instrument

This is a template for a pysat.Instrument support file. Modify this file as needed when adding a new Instrument to pysat.

This is a good area to introduce the instrument, provide background on the mission, operations, instrumentation, and measurements.

Also a good place to provide contact information. This text will be included in the pysat API documentation.

Properties

platform
List platform string here
name
List name string here
sat_id
List supported sat_ids here
tag
List supported tag strings here

Note

  • Optional section, remove if no notes

Warning

  • Optional section, remove if no warnings
  • Two blank lines needed afterward for proper formatting

Examples

Example code can go here

Authors

Author name and institution

pysat.instruments.templates.template_instrument.init(self)

Initializes the Instrument object with instrument specific values.

Runs once upon instantiation. Object modified in place. Optional.

Parameters:self (pysat.Instrument) – This object
pysat.instruments.templates.template_instrument.default(self)

Default customization function.

This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue). Object modified in place.

Parameters:self (pysat.Instrument) – This object
pysat.instruments.templates.template_instrument.load(fnames, tag=None, sat_id=None, custom_keyword=None)

Loads PLATFORM data into (PANDAS/XARRAY).

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. While tag defaults to None here, pysat provides ‘’ as the default tag unless specified by user at Instrument instantiation. (default=’’)
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • custom_keyword (type to be set) – Developers may include any custom keywords, with default values defined in the method signature. This is included here as a place holder and should be removed.
Returns:

Data and Metadata are formatted for pysat. Data is a pandas DataFrame or xarray DataSet while metadata is a pysat.Meta instance.

Return type:

data, metadata

Note

Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine.

Examples

inst = pysat.Instrument('ucar', 'tiegcm')
inst.load(2019, 1)
pysat.instruments.templates.template_instrument.list_files(tag=None, sat_id=None, data_path=None, format_str=None)

Produce a list of files corresponding to PLATFORM/NAME.

This routine is invoked by pysat and is not intended for direct use by the end user. Arguments are provided by pysat.

Parameters:
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
  • format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns:

Series of filename strings, including the path, indexed by datetime.

Return type:

pandas.Series

Examples

If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template
is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' +
'v{version:02d}r{revision:04d}.NC'
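One way to picture how such a template identifies files is to turn it into a regular expression and read the fields back out of a filename. The helper parse_filename is hypothetical, handles only integer fields that each appear once in the template, and is not the parser pysat uses.

```python
import datetime as dt
import re
import string


def parse_filename(format_str, filename):
    """Hypothetical helper: extract integer fields named in format_str."""
    pattern = ''
    for literal, field, spec, _ in string.Formatter().parse(format_str):
        pattern += re.escape(literal)
        if field is not None:
            # A spec such as '04d' or '4d' fixes the number of digits
            width = re.match(r'0?(\d+)', spec or '')
            digits = r'\d{%s}' % width.group(1) if width else r'\d+'
            pattern += '(?P<%s>%s)' % (field, digits)
    match = re.fullmatch(pattern, filename)
    if match is None:
        return None
    return {key: int(val) for key, val in match.groupdict().items()}


format_str = ('SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_'
              'v{version:02d}r{revision:04d}.NC')
parsed = parse_filename(format_str, 'SPORT_L2_IVM_2019-01-01_v01r0000.NC')
date = dt.datetime(parsed['year'], parsed['month'], parsed['day'])
```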

Note

The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.

Multiple data levels may be supported via the ‘tag’ input string. Multiple instruments via the sat_id string.
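The rule about duplicate datetimes can be sketched as follows. The helper keep_latest is hypothetical, and it assumes that "most recent" means the highest version, then the highest revision; pysat.Files.from_os performs its own selection.

```python
import datetime as dt


def keep_latest(entries):
    """Hypothetical helper: keep the highest (version, revision) per date.

    entries is an iterable of (date, version, revision, filename).
    """
    latest = {}
    for date, version, revision, fname in entries:
        current = latest.get(date)
        if current is None or (version, revision) > current[:2]:
            latest[date] = (version, revision, fname)
    return {date: latest[date][2] for date in sorted(latest)}


entries = [
    (dt.datetime(2019, 1, 1), 1, 0, 'SPORT_L2_IVM_2019-01-01_v01r0000.NC'),
    (dt.datetime(2019, 1, 1), 1, 2, 'SPORT_L2_IVM_2019-01-01_v01r0002.NC'),
    (dt.datetime(2019, 1, 2), 1, 0, 'SPORT_L2_IVM_2019-01-02_v01r0000.NC'),
]
files = keep_latest(entries)
```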

pysat.instruments.templates.template_instrument.list_remote_files(tag, sat_id, user=None, password=None)

Return a Pandas Series of every file for chosen remote data.

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • user (string or NoneType) – Username to be passed along to resource with relevant data. (default=None)
  • password (string or NoneType) – User password to be passed along to resource with relevant data. (default=None)
Returns:

A Series formatted for the Files class (pysat._files.Files) containing filenames and indexed by date and time

Return type:

pandas.Series

pysat.instruments.templates.template_instrument.download(date_array, tag, sat_id, data_path=None, user=None, password=None, custom_keywords=None)

Placeholder for PLATFORM/NAME downloads.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string) – Path to directory to download data to. (default=None)
  • user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error when user is not supplied. (default=None)
  • password (string) – Password for data download. (default=None)
  • custom_keywords (placeholder) – Additional keywords supplied by user when invoking the download routine attached to a pysat.Instrument object are passed to this routine. Use of custom keywords here is discouraged.
pysat.instruments.templates.template_instrument.clean(inst)

Routine to return PLATFORM/NAME data cleaned to the specified level

Cleaning level is specified in inst.clean_level and pysat will accept user input for several strings. The clean_level is specified at instantiation of the Instrument object.

  • ‘clean’ : All parameters should be good, suitable for statistical and case studies
  • ‘dusty’ : All parameters should generally be good, though some may not be great
  • ‘dirty’ : There are data areas that have issues, data should be used with caution
  • ‘none’ : No cleaning applied, routine not called in this case.

Parameters:inst (pysat.Instrument) – Instrument class object, whose attribute clean_level is used to return the desired level of data selectivity.
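How a clean routine might map the levels onto data selection is sketched below. The quality flags, their meanings, and the helper clean_values are all hypothetical; real instruments define their own flags and thresholds, and a real clean routine modifies the Instrument object in place. The 'none' level is not handled because the routine is not called in that case.

```python
# Hypothetical quality flags: 0 = good, 1 = marginal,
# 2 = questionable, 3 = bad
MAX_FLAG = {'clean': 0, 'dusty': 1, 'dirty': 2}


def clean_values(values, flags, clean_level):
    """Hypothetical helper: keep values whose flag passes clean_level."""
    limit = MAX_FLAG[clean_level]
    return [val for val, flag in zip(values, flags) if flag <= limit]


values = [10, 11, 12, 13]
flags = [0, 1, 2, 3]
kept = {level: clean_values(values, flags, level) for level in MAX_FLAG}
```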

netCDF Pandas

Generic module for loading netCDF4 files into the pandas format within pysat.

This file may be used as a template for adding pysat support for a new dataset based upon netCDF4 files, or other file types (with modification).

This routine may also be used to add quick local support for a netCDF4 based dataset without having to define an instrument module for pysat. Relevant parameters may be specified when instantiating this Instrument object to support the relevant file location and naming schemes. This presumes the pysat developed utils.load_netCDF4 routine is able to load the file. See the load routine docstring in this module for more.

The routines defined within may also be used when adding a new instrument to pysat by importing this module and using functools.partial to attach these functions to the new instrument module. See pysat/instruments/cnofs_ivm.py for more. NASA CDAWeb datasets, such as C/NOFS IVM, use the methods within pysat/instruments/methods/nasa_cdaweb.py to make adding new CDAWeb instruments easy.

pysat.instruments.templates.netcdf_pandas.init(self)

Initializes the Instrument object with instrument specific values.

Runs once upon instantiation. This routine provides a convenient location to print Acknowledgements or restrictions from the mission.

pysat.instruments.templates.netcdf_pandas.load(fnames, tag=None, sat_id=None, **kwargs)

Loads data using pysat.utils.load_netcdf4.

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
  • **kwargs (extra keywords) – Passthrough for additional keyword arguments specified when instantiating an Instrument object. These additional keywords are passed through to this routine by pysat.
Returns:

Data and Metadata are formatted for pysat. Data is a pandas DataFrame while metadata is a pysat.Meta instance.

Return type:

data, metadata

Note

Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine and through to the load_netcdf4 call.

Examples

inst = pysat.Instrument('sport', 'ivm')
inst.load(2019,1)

# create quick Instrument object for a new, random netCDF4 file
# define filename template string to identify files
# this is normally done by instrument code, but in this case
# there is no built in pysat instrument support
# presumes files are named default_2019-01-01.NC
format_str = 'default_{year:04d}-{month:02d}-{day:02d}.NC'
inst = pysat.Instrument('netcdf', 'pandas',
                        custom_kwarg='test',
                        data_path='./',
                        format_str=format_str)
inst.load(2019,1)
pysat.instruments.templates.netcdf_pandas.list_files(tag=None, sat_id=None, data_path=None, format_str=None)

Produce a list of files corresponding to format_str located at data_path.

This routine is invoked by pysat and is not intended for direct use by the end user.

Multiple data levels may be supported via the ‘tag’ and ‘sat_id’ input strings.

Parameters:
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
  • format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns:

Series of filename strings, including the path, indexed by datetime.

Return type:

pandas.Series

Examples

If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template
is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' +
'v{version:02d}r{revision:04d}.NC'

Note

The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.

Normally the format_str for each supported tag and sat_id is defined within this routine. However, as this is a generic routine, those definitions can’t be made here. This method could be used in an instrument specific module where the list_files routine in the new package defines the format_str based upon inputs, then calls this routine passing both data_path and format_str.

Alternately, the list_files routine in methods.nasa_cdaweb may also be used and has more built-in functionality. Supported tags and format strings may be defined within the new instrument module and passed as arguments to methods.nasa_cdaweb.list_files. For an example of using this routine, see pysat/instruments/cnofs_ivm.py or cnofs_vefi, cnofs_plp, omni_hro, timed_see, etc.

pysat.instruments.templates.netcdf_pandas.download(date_array, tag, sat_id, data_path=None, user=None, password=None)

Downloads data for supported instruments; however, this is only a template call.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string (None)) – Path to directory to download data to. (default=None)
  • user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error when user is not supplied. (default=None)
  • password (string) – Password for data download. (default=None)

NASA CDAWeb Instrument

This is a template for a pysat.Instrument support file that utilizes CDAWeb methods. Copy and modify this file as needed when adding a new Instrument to pysat.

This is a good area to introduce the instrument, provide background on the mission, operations, instrumentation, and measurements.

Also a good place to provide contact information. This text will be included in the pysat API documentation.

Properties

platform
List platform string here
name
List name string here
sat_id
List supported sat_ids here
tag
List supported tag strings here

Note

  • Optional section, remove if no notes

Warning

  • Optional section, remove if no warnings
  • Two blank lines needed afterward for proper formatting

Examples

Example code can go here

Authors

Author name and institution

pysat.instruments.templates.template_cdaweb_instrument.default(self)

Default customization function.

This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue).

Parameters:self (pysat.Instrument) – This object
pysat.instruments.templates.template_cdaweb_instrument.load(fnames, tag=None, sat_id=None, fake_daily_files_from_monthly=False, flatten_twod=True)

Load NASA CDAWeb CDF files.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • fnames ((pandas.Series)) – Series of filenames
  • tag ((str or NoneType)) – tag or None (default=None)
  • sat_id ((str or NoneType)) – satellite id or None (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, parses the daily dates that were appended to the monthly filenames internally by the list_files routine, when flagged. These dates are used here to provide data by day.
  • flatten_twod (bool) – Flattens 2D data into different columns of root DataFrame rather than producing a Series of DataFrames
Returns:

  • data ((pandas.DataFrame)) – Object containing satellite data
  • meta ((pysat.Meta)) – Object containing metadata such as column names and units

Examples

# within the new instrument module, at the top level define
# a new variable named load, and set it equal to this load method
# code below taken from cnofs_ivm.py.

# support load routine
# use the default CDAWeb method
load = cdw.load
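The fake_daily_files_from_monthly mechanism can be pictured as a pair of helpers: one that lists a monthly file once per day with a date suffix, and one that splits the suffix off again at load time. Both helpers, the '_YYYY-MM-DD' suffix format, and the sample filename are assumptions made for illustration, not the confirmed pysat implementation.

```python
import calendar
import datetime as dt


def fake_daily_names(monthly_fname, year, month):
    """Hypothetical helper: one '<fname>_YYYY-MM-DD' entry per day."""
    ndays = calendar.monthrange(year, month)[1]
    dates = [dt.datetime(year, month, day) for day in range(1, ndays + 1)]
    return {date: monthly_fname + '_' + date.strftime('%Y-%m-%d')
            for date in dates}


def split_fake_name(fake_fname):
    """Hypothetical helper: recover the monthly filename and the day."""
    fname, date_str = fake_fname[:-11], fake_fname[-10:]
    return fname, dt.datetime.strptime(date_str, '%Y-%m-%d')


# Illustrative monthly filename
names = fake_daily_names('omni_hro_1min_20090201_v01.cdf', 2009, 2)
fname, day = split_fake_name(names[dt.datetime(2009, 2, 14)])
```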
pysat.instruments.templates.template_cdaweb_instrument.list_files(tag=None, sat_id=None, data_path=None, format_str=None, *, supported_tags={'': {'': 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'}}, fake_daily_files_from_monthly=False, two_digit_year_break=None)

Return a Pandas Series of every file for chosen satellite data.

Deprecated since version 2.2.0: list_files will be removed in pysat 3.0.0, it will be replaced by the copy in instruments.methods.general

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • format_str ((string or NoneType)) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
  • supported_tags ((dict or NoneType)) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
  • fake_daily_files_from_monthly ((bool)) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day.
  • two_digit_year_break ((int)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
Returns:

pysat.Files.from_os – A class containing the verified available files

Return type:

(pysat._files.Files)

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_files = functools.partial(nasa_cdaweb.list_files,
                               supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_files = functools.partial(cdw.list_files,
                               supported_tags=supported_tags)
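The two_digit_year_break rule stated in the parameter list amounts to the following arithmetic. The helper name and the break value of 50 are illustrative only.

```python
def expand_two_digit_year(year, two_digit_year_break):
    """Apply the rule stated for two_digit_year_break."""
    if year >= two_digit_year_break:
        return 1900 + year
    return 2000 + year


years = [expand_two_digit_year(yy, 50) for yy in (99, 50, 8)]
```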
pysat.instruments.templates.template_cdaweb_instrument.list_remote_files(tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', *, supported_tags={'': {'': {'dir': '/pub/data/cnofs/vefi/bfield_1sec', 'local_fname': 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf', 'remote_fname': '{year:4d}/cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'}}}, user=None, password=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, delimiter=None, year=None, month=None, day=None)

Return a Pandas Series of every file for chosen remote data.

Deprecated since version 2.3.0: This routine will be removed in pysat 3.0.0 and moved to the pysatNASA repository. Also, as of 2.2.0, the year/month/day keywords are deprecated; they will be removed in pysat 3.0.0 and replaced with a start/stop syntax consistent with the download routine.

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. (default=None)
  • remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
  • supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
  • user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
  • password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame. (default=False)
  • two_digit_year_break ((int or NoneType)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. (default=None)
  • delimiter ((string or NoneType)) – If filename is delimited, then provide delimiter alone e.g. ‘_’ (default=None)
  • year ((int or NoneType)) – Selects a given year to return remote files for. None returns all years. (default=None)
  • month ((int or NoneType)) – Selects a given month to return remote files for. None returns all months. Requires year to be defined. (default=None)
  • day ((int or NoneType)) – Selects a given day to return remote files for. None returns all days. Requires year and month to be defined. (default=None)
Returns:

pysat.Files.from_os – A class containing the verified available files

Return type:

(pysat._files.Files)
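
How the year, month, and day keywords narrow a listing can be sketched as follows. select_dates is a hypothetical helper, and raising ValueError when month is given without year (or day without year and month) is an assumption made for this sketch; the parameter descriptions only state the requirement.

```python
import datetime as dt


def select_dates(dates, year=None, month=None, day=None):
    """Filter dates by year, then month, then day (hypothetical helper)."""
    if month is not None and year is None:
        raise ValueError('month requires year')
    if day is not None and (year is None or month is None):
        raise ValueError('day requires year and month')
    selected = []
    for date in dates:
        # A keyword left as None places no restriction on that field
        if year is not None and date.year != year:
            continue
        if month is not None and date.month != month:
            continue
        if day is not None and date.day != day:
            continue
        selected.append(date)
    return selected


dates = [dt.datetime(2008, 12, 31), dt.datetime(2009, 1, 1),
         dt.datetime(2009, 1, 2), dt.datetime(2009, 2, 1)]
january = select_dates(dates, year=2009, month=1)
```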

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_remote_files = functools.partial(nasa_cdaweb.list_remote_files,
                                      supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_remote_files = functools.partial(cdw.list_remote_files,
                                      supported_tags=supported_tags)
pysat.instruments.templates.template_cdaweb_instrument.download(date_array, tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', data_path=None, user=None, password=None, fake_daily_files_from_monthly=False, multi_file_day=False)

Routine to download NASA CDAWeb CDF data.

Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
  • date_array (array_like) – Array of datetimes to download data for. Provided by pysat.
  • tag (str or NoneType (None)) – tag or None
  • sat_id ((str or NoneType)) – satellite id or None (default=None)
  • remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
  • data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
  • password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
  • fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame.
Returns:

Void – Downloads data to disk.

Return type:

(NoneType)

Examples

# download support added to cnofs_vefi.py using code below
rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
      '_v05.cdf')
ln = 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf'
dc_b_tag = {'dir':'/pub/data/cnofs/vefi/bfield_1sec',
            'remote_fname': rn,
            'local_fname': ln}
supported_tags = {'dc_b': dc_b_tag}

download = functools.partial(nasa_cdaweb.download,
                             supported_tags=supported_tags)
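
The example above relies on functools.partial to pre-set supported_tags, so that the routine can later be called with only date_array, tag, and sat_id. A minimal sketch of that mechanism follows. template_download is a hypothetical stand-in for the real routine, not the pysat implementation: it performs no network access and only returns the remote paths it would request.

```python
import datetime as dt
import functools


def template_download(date_array, tag, sat_id, supported_tags=None):
    """Hypothetical stand-in: build remote paths instead of downloading."""
    info = supported_tags[tag]
    return ['/'.join((info['dir'],
                      info['remote_fname'].format(year=date.year,
                                                  month=date.month,
                                                  day=date.day)))
            for date in date_array]


rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
      '_v05.cdf')
supported_tags = {'dc_b': {'dir': '/pub/data/cnofs/vefi/bfield_1sec',
                           'remote_fname': rn}}

# Pre-set the keyword so callers only pass date_array, tag, and sat_id
download = functools.partial(template_download,
                             supported_tags=supported_tags)
paths = download([dt.datetime(2009, 1, 1)], 'dc_b', '')
```
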
pysat.instruments.templates.template_cdaweb_instrument.clean(inst)

Routine to return PLATFORM/NAME data cleaned to the specified level

Cleaning level is specified in inst.clean_level and pysat will accept user input for several strings. The clean_level is specified at instantiation of the Instrument object.

  • ‘clean’ : All parameters should be good, suitable for statistical and case studies
  • ‘dusty’ : All parameters should generally be good, though some may not be great
  • ‘dirty’ : There are data areas that have issues, data should be used with caution
  • ‘none’ : No cleaning applied, routine not called in this case.
Parameters:inst (pysat.Instrument) – Instrument class object, whose attribute clean_level is used to return the desired level of data selectivity.
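
A clean routine typically branches on inst.clean_level. The sketch below is illustrative only: the 'flag' variable and the threshold attached to each level are assumptions, and SimpleNamespace stands in for a pysat.Instrument.

```python
from types import SimpleNamespace

# Assumed mapping from clean level to the largest acceptable quality flag
MAX_FLAG = {'clean': 0, 'dusty': 1, 'dirty': 2}


def clean(inst):
    """Keep only samples whose quality flag passes inst.clean_level."""
    # The routine is not called when clean_level is 'none'; guard anyway
    if inst.clean_level == 'none':
        return
    limit = MAX_FLAG[inst.clean_level]
    inst.data = [row for row in inst.data if row['flag'] <= limit]


inst = SimpleNamespace(clean_level='dusty',
                       data=[{'value': 1.0, 'flag': 0},
                             {'value': 2.0, 'flag': 1},
                             {'value': 3.0, 'flag': 2}])
clean(inst)
```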

netCDF Pandas

Generic module for loading netCDF4 files into the pandas format within pysat.

This file may be used as a template for adding pysat support for a new dataset based upon netCDF4 files, or other file types (with modification).

This routine may also be used to add quick local support for a netCDF4 based dataset without having to define an instrument module for pysat. Relevant parameters may be specified when instantiating this Instrument object to support the relevant file location and naming schemes. This presumes the pysat developed utils.load_netCDF4 routine is able to load the file. See the load routine docstring in this module for more.

The routines defined within may also be used when adding a new instrument to pysat by importing this module and using the functools.partial methods to attach these functions to the new instrument module. See pysat/instruments/cnofs_ivm.py for more. NASA CDAWeb datasets, such as C/NOFS IVM, use the methods within pysat/instruments/methods/nasa_cdaweb.py to make adding new CDAWeb instruments easy.

pysat.instruments.templates.netcdf_pandas.init(self)

Initializes the Instrument object with instrument specific values.

Runs once upon instantiation. This routine provides a convenient location to print Acknowledgements or restrictions from the mission.

pysat.instruments.templates.netcdf_pandas.load(fnames, tag=None, sat_id=None, **kwargs)

Loads data using pysat.utils.load_netcdf4.

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
  • **kwargs (extra keywords) – Passthrough for additional keyword arguments specified when instantiating an Instrument object. These additional keywords are passed through to this routine by pysat.
Returns:

Data and Metadata are formatted for pysat. Data is a pandas DataFrame while metadata is a pysat.Meta instance.

Return type:

data, metadata

Note

Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine and through to the load_netcdf4 call.

Examples

inst = pysat.Instrument('sport', 'ivm')
inst.load(2019,1)

# create quick Instrument object for a new, random netCDF4 file
# define filename template string to identify files
# this is normally done by instrument code, but in this case
# there is no built in pysat instrument support
# presumes files are named default_2019-01-01.NC
format_str = 'default_{year:04d}-{month:02d}-{day:02d}.NC'
inst = pysat.Instrument('netcdf', 'pandas',
                        custom_kwarg='test',
                        data_path='./',
                        format_str=format_str)
inst.load(2019,1)
pysat.instruments.templates.netcdf_pandas.list_files(tag=None, sat_id=None, data_path=None, format_str=None)

Produce a list of files corresponding to format_str located at data_path.

This routine is invoked by pysat and is not intended for direct use by the end user.

Multiple data levels may be supported via the ‘tag’ and ‘sat_id’ input strings.

Parameters:
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
  • format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns:

Series of filename strings, including the path, indexed by datetime.

Return type:

pandas.Series

Examples

If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template
is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' +
'v{version:02d}r{revision:04d}.NC'
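
The template in the example above uses standard Python format fields, so filling it in reproduces the filename. This quick check uses only str.format:

```python
format_str = ('SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_'
              'v{version:02d}r{revision:04d}.NC')

# Fill the template with the date, version, and revision of one file
fname = format_str.format(year=2019, month=1, day=1, version=1, revision=0)
print(fname)
```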

Note

The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.

Normally the format_str for each supported tag and sat_id is defined within this routine. However, as this is a generic routine, those definitions can’t be made here. This method could be used in an instrument specific module where the list_files routine in the new package defines the format_str based upon inputs, then calls this routine passing both data_path and format_str.

Alternately, the list_files routine in methods.nasa_cdaweb may also be used and has more built-in functionality. Supported tags and format strings may be defined within the new instrument module and passed as arguments to methods.nasa_cdaweb.list_files. For an example of using this routine, see pysat/instruments/cnofs_ivm.py or cnofs_vefi, cnofs_plp, omni_hro, timed_see, etc.
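
The first paragraph of this note asks that only the most recent version of a file be kept for each datetime. One way to sketch that rule with the standard library is shown below. latest_versions and the (date, version, revision, filename) tuples are hypothetical and do not reflect how pysat.Files.from_os stores its results.

```python
import datetime as dt


def latest_versions(records):
    """Keep the highest (version, revision) filename for each date."""
    best = {}
    for date, version, revision, fname in records:
        key = (version, revision)
        if date not in best or key > best[date][0]:
            best[date] = (key, fname)
    # Return filenames keyed by date, in time order
    return {date: best[date][1] for date in sorted(best)}


day = dt.datetime(2019, 1, 1)
records = [(day, 1, 0, 'SPORT_L2_IVM_2019-01-01_v01r0000.NC'),
           (day, 1, 2, 'SPORT_L2_IVM_2019-01-01_v01r0002.NC'),
           (day, 1, 1, 'SPORT_L2_IVM_2019-01-01_v01r0001.NC')]
files = latest_versions(records)
```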

pysat.instruments.templates.netcdf_pandas.download(date_array, tag, sat_id, data_path=None, user=None, password=None)

Downloads data for supported instruments; this version is only a template.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string (None)) – Path to directory to download data to. (default=None)
  • user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
  • password (string) – Password for data download. (default=None)

Constellation

class pysat.Constellation(instruments=None, name=None, const_module=None)

Manage and analyze data from multiple pysat Instruments.

Created as part of a Spring 2018 UTDesign project.

Deprecated since version 2.3.0: The name kwarg was changed to const_module in pysat 3.0.0

Constructs a Constellation given a list of instruments or the name of a file with a pre-defined constellation.

Parameters:
  • instruments (list) – a list of pysat Instruments
  • name (string) – Name of a file in pysat/constellations containing a list of instruments.
  • const_module (string or NoneType) – Name of a pysat constellation module (default=None)

Note

The name and instruments parameters should not both be set. If neither is given, an empty constellation will be created.

add(bounds1, label1, bounds2, label2, bin3, label3, data_label)

Combines signals from multiple instruments within given bounds.

Deprecated since version 2.2.0: add will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:
  • bounds1 ((min, max)) – Bounds for selecting data on the axis of label1. Data points with label1 in [min, max) will be considered.
  • label1 (string) – Data label for bounds1 to act on.
  • bounds2 ((min, max)) – Bounds for selecting data on the axis of label2. Data points with label2 in [min, max) will be considered.
  • label2 (string) – Data label for bounds2 to act on.
  • bin3 ((min, max, #bins)) – Min and max bounds and number of bins for third axis.
  • label3 (string) – Data label for third axis.
  • data_label (array of strings) – Data label(s) for data product(s) to be averaged.
Returns:

median – Dictionary indexed by data label, each value of which is a dictionary with keys ‘median’, ‘count’, ‘avg_abs_dev’, and ‘bin’ (the values of the bin edges.)

Return type:

dictionary
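
The bounds use half-open intervals: a point is kept when min <= value < max. A minimal sketch of that selection, followed by binning along the third axis, is given below. bin_medians is a hypothetical helper written with the standard library; it returns only the per-bin median, whereas the documented return also carries 'count', 'avg_abs_dev', and 'bin'. Equal-width bins and the variable names in the sample data are assumptions.

```python
import statistics


def bin_medians(points, bounds1, label1, bounds2, label2,
                bin3, label3, data_label):
    """Median of data_label per label3 bin for points within both bounds."""
    lo3, hi3, nbins = bin3
    width = (hi3 - lo3) / nbins
    binned = [[] for _ in range(nbins)]
    for point in points:
        # Half-open selection: min <= value < max
        if not bounds1[0] <= point[label1] < bounds1[1]:
            continue
        if not bounds2[0] <= point[label2] < bounds2[1]:
            continue
        if not lo3 <= point[label3] < hi3:
            continue
        index = min(int((point[label3] - lo3) // width), nbins - 1)
        binned[index].append(point[data_label])
    return [statistics.median(vals) if vals else None for vals in binned]


points = [{'lat': 0.0, 'lon': 10.0, 'alt': 100.0, 'dens': 1.0},
          {'lat': 5.0, 'lon': 20.0, 'alt': 150.0, 'dens': 3.0},
          {'lat': 10.0, 'lon': 20.0, 'alt': 150.0, 'dens': 9.0},
          {'lat': 5.0, 'lon': 30.0, 'alt': 350.0, 'dens': 5.0}]
medians = bin_medians(points, (0, 10), 'lat', (0, 40), 'lon',
                      (0, 400, 2), 'alt', 'dens')
```

The third point is excluded because its 'lat' equals the upper bound, which a half-open interval does not include.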

data_mod(*args, **kwargs)

Register a function to modify data of member Instruments.

The function is not partially applied to modify member data.

When the Constellation receives a function call to register a function for data modification, it passes the call to each instrument and registers it in the instrument’s pysat.Custom queue.

(Wraps pysat.Custom.add; documentation of that function is reproduced here.)

Parameters:
  • function (string or function object) – name of function or function object to be added to queue
  • kind ({'add', 'modify', 'pass'}) –
    add
    Adds data returned from function to instrument object.
    modify
    pysat instrument object supplied to routine. Any and all changes to object are retained.
    pass
    A copy of pysat object is passed to function. No data is accepted from return.
  • at_pos (string or int) – insert at position. (default, insert at end).
  • args (extra arguments) –

Note

Allowed add function returns:

  • {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
  • pandas DataFrame, names of columns are used
  • pandas Series, .name required
  • (string/list of strings, numpy array/list of arrays)
difference(instrument1, instrument2, bounds, data_labels, cost_function)

Calculates the difference in signals from multiple instruments within the given bounds.

Deprecated since version 2.2.0: difference will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:
  • instrument1 (Instrument) – Information must already be loaded into the instrument.
  • instrument2 (Instrument) – Information must already be loaded into the instrument.
  • bounds (list of tuples in the form (inst1_label, inst2_label, min, max, max_difference)) – inst1_label and inst2_label are labels for the data in instrument1 and instrument2; min and max are bounds on the data considered; max_difference is the maximum difference between two points for the difference to be calculated.
  • data_labels (list of tuples of data labels) – The first key is used to access data in instrument1 and the second to access data in instrument2.
  • cost_function (function) – function that operates on two rows of the instrument data. used to determine the distance between two points for finding closest points
Returns:

  • data_df (pandas DataFrame) – Each row has a point from instrument1, with the keys preceded by 1_, and a point within bounds on that point from instrument2 with the keys preceded by 2_, and the difference between the instruments’ data for all the labels in data_labels
  • Created as part of a Spring 2018 UTDesign project.

load(*args, **kwargs)

Load instrument data into instrument object.data

(Wraps pysat.Instrument.load; documentation of that function is reproduced here.)

Parameters:
  • yr (integer) – Year for desired data
  • doy (integer) – day of year
  • date (datetime object) – date to load
  • fname ('string') – filename to be loaded
  • verifyPad (boolean) – if true, padding data not removed (debug purposes)
set_bounds(start, stop)

Sets boundaries for all instruments in constellation

Custom

class pysat.Custom

Applies a queue of functions when instrument.load is called.

Deprecated since version 2.3.0: Custom will be removed in pysat 3.0.0, it is incorporated into Instrument

Nano-kernel functionality enables instrument objects that are ‘set and forget’. The functions are always run whenever the instrument load routine is called so instrument objects may be passed safely to other routines and the data will always be processed appropriately.

Examples

def custom_func(inst, opt_param1=False, opt_param2=False):
    return None
instrument.custom.attach(custom_func, 'modify', opt_param1=True)

def custom_func2(inst, opt_param1=False, opt_param2=False):
    return data_to_be_added
instrument.custom.attach(custom_func2, 'add', opt_param2=True)
instrument.load(date=date)
print(instrument['data_to_be_added'])

See also

Custom.attach

Note

User should interact with Custom through pysat.Instrument instance’s attribute, instrument.custom

add(function, kind='add', at_pos='end', *args, **kwargs)

Add a function to custom processing queue.

Deprecated since version 2.2.0: Custom.add will be removed in pysat 3.0.0, it is replaced by Instrument.custom_attach to clarify the syntax

Custom functions are applied automatically to associated pysat instrument whenever instrument.load command called.

Parameters:
  • function (string or function object) – name of function or function object to be added to queue
  • kind ({'add', 'modify', 'pass'}) –
    add
    Adds data returned from function to instrument object. A copy of pysat instrument object supplied to routine.
    modify
    pysat instrument object supplied to routine. Any and all changes to object are retained.
    pass
    A copy of pysat object is passed to function. No data is accepted from return.
  • at_pos (string or int) – insert at position. (default, insert at end).
  • args (extra arguments) – extra arguments are passed to the custom function (once)
  • kwargs (extra keyword arguments) – extra keyword args are passed to the custom function (once)

Note

Allowed add function returns:

  • {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
  • pandas DataFrame, names of columns are used
  • pandas Series, .name required
  • (string/list of strings, numpy array/list of arrays)
attach(function, kind='add', at_pos='end', *args, **kwargs)

Attach a function to custom processing queue.

Deprecated since version 2.3.0: Custom.attach will be removed in pysat 3.0.0, it is replaced by Instrument.custom_attach

Custom functions are applied automatically to associated pysat instrument whenever instrument.load command called.

Parameters:
  • function (string or function object) – name of function or function object to be added to queue
  • kind ({'add', 'modify', 'pass'}) –
    add
    Adds data returned from function to instrument object. A copy of pysat instrument object supplied to routine.
    modify
    pysat instrument object supplied to routine. Any and all changes to object are retained.
    pass
    A copy of pysat object is passed to function. No data is accepted from return.
  • at_pos (string or int) – insert at position. (default, insert at end).
  • args (extra arguments) – extra arguments are passed to the custom function (once)
  • kwargs (extra keyword arguments) – extra keyword args are passed to the custom function (once)

Note

Allowed attach function returns:

  • {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
  • pandas DataFrame, names of columns are used
  • pandas Series, .name required
  • (string/list of strings, numpy array/list of arrays)
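
The three kinds differ in what the function receives and in what is kept afterwards. The following standard-library sketch mimics that behaviour with a hypothetical MiniQueue class and a plain dict in place of a pysat.Instrument. It is not the pysat.Custom implementation, and only a simplified dict return with 'name' and 'data' is handled for 'add'.

```python
import copy


class MiniQueue:
    """Hypothetical stand-in for a custom-function queue."""

    def __init__(self):
        self._queue = []

    def attach(self, function, kind='add', at_pos='end', *args, **kwargs):
        entry = (function, kind, args, kwargs)
        if at_pos == 'end':
            self._queue.append(entry)
        else:
            self._queue.insert(at_pos, entry)

    def apply_all(self, inst):
        for function, kind, args, kwargs in self._queue:
            if kind == 'modify':
                # The object itself is supplied; changes are retained
                function(inst, *args, **kwargs)
            elif kind == 'add':
                # A copy is supplied; returned data is added
                out = function(copy.deepcopy(inst), *args, **kwargs)
                inst[out['name']] = out['data']
            elif kind == 'pass':
                # A copy is supplied; any return value is ignored
                function(copy.deepcopy(inst), *args, **kwargs)


def double(inst):
    inst['x'] = [2 * val for val in inst['x']]


def offset(inst, shift=0):
    return {'name': 'x_shifted', 'data': [val + shift for val in inst['x']]}


def clobber(inst):
    inst['x'] = []


queue = MiniQueue()
queue.attach(double, 'modify')
queue.attach(offset, 'add', shift=1)
queue.attach(clobber, 'pass')
inst = {'x': [1, 2]}
queue.apply_all(inst)
```

Because clobber is attached with 'pass', it only empties a copy, and the original 'x' values survive.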
clear()

Clear custom function list.

Deprecated since version 2.3.0: Custom.clear will be removed in pysat 3.0.0, it is replaced by Instrument.custom_clear

Files

class pysat.Files(sat, manual_org=False, directory_format=None, update_files=False, file_format=None, write_to_disk=True, ignore_empty_files=False)

Maintains collection of files for instrument object.

Uses the list_files functions for each specific instrument to create an ordered collection of files in time. Used by instrument object to load the correct files. Files also contains helper methods for determining the presence of new files and creating an ordered list of files.

base_path

path to .pysat directory in user home

Type:string
start_date

date of first file, used as default start bound for instrument object

Type:datetime
stop_date

date of last file, used as default stop bound for instrument object

Type:datetime
data_path

path to the directory containing instrument files, top_dir/platform/name/tag/

Type:string
manual_org

if True, then Files will look directly in pysat data directory for data files and will not use /platform/name/tag

Type:bool
update_files

updates files on instantiation if True

Type:bool

Note

User should generally use the interface provided by a pysat.Instrument instance. The exception is the classmethod from_os, provided to assist in generating the appropriate output for an instrument routine.

Examples

# convenient file access
vefi = pysat.Instrument(platform=platform, name=name, tag=tag,
                        sat_id=sat_id)
# first file
vefi.files[0]

# files from start up to stop (exclusive on stop)
start = pysat.datetime(2009,1,1)
stop = pysat.datetime(2009,1,3)
print(vefi.files[start:stop])

# files for date
print(vefi.files[start])

# files by slicing
print(vefi.files[0:4])

# get a list of new files
# new files are those that weren't present the last time
# a given instrument's file list was stored
new_files = vefi.files.get_new()

# search pysat appropriate directory for instrument files and
# update Files instance.
vefi.files.refresh()

Initialization for Files class object

Parameters:
  • sat (pysat._instrument.Instrument) – Instrument object
  • manual_org (boolean) – If True, then pysat will look directly in pysat data directory for data files and will not use default /platform/name/tag (default=False)
  • directory_format (string or NoneType) – directory naming structure in string format. Variables such as platform, name, and tag will be filled in as needed using python string formatting. The default directory structure would be expressed as ‘{platform}/{name}/{tag}’ (default=None)
  • update_files (boolean) – If True, immediately query filesystem for instrument files and store (default=False)
  • file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, and sat_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine. (default=None)
  • write_to_disk (boolean) – If True, the list of Instrument files will be written to disk. Setting this to False prevents a race condition when running multiple pysat processes.
  • ignore_empty_files (boolean) – If True, the list of files found will be checked to ensure the file sizes are greater than zero. Empty files are removed from the stored list of files.
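
The directory_format template is filled with ordinary Python string formatting. A small sketch follows, assuming a hypothetical top-level data directory and using posixpath so the result is the same on every platform:

```python
import posixpath

directory_format = '{platform}/{name}/{tag}'
top_dir = '/data/pysat'  # hypothetical pysat data directory

# Fill in the instrument identifiers, then join with the top directory
sub_dir = directory_format.format(platform='cnofs', name='vefi', tag='dc_b')
data_path = posixpath.join(top_dir, sub_dir)
print(data_path)
```
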
classmethod from_os(data_path=None, format_str=None, two_digit_year_break=None, delimiter=None)

Produces a list of files and formats it for Files class.

Requires fixed_width or delimited filename

Parameters:
  • data_path (string) – Top level directory to search files for. This directory is provided by pysat to the instrument_module.list_files functions as data_path.
  • format_str (string with python format codes) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, and ‘revision’ Ex: ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf’
  • two_digit_year_break (int) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
  • delimiter (string (None)) – If set, then filename will be processed using delimiter rather than assuming a fixed width

Note

Does not produce a Files instance, but the proper output from instrument_module.list_files method.

The ‘?’ may be used to indicate a set number of spaces for a variable part of the name that need not be extracted, for example ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf’.
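
To illustrate how a fixed-width format_str locates date information in a filename, the sketch below converts a template into a regular expression. template_to_regex is a hypothetical helper built on string.Formatter and re. It assumes every field uses an integer width spec such as 4d or 02d, treats '?' as any single character, and is not how pysat.Files.from_os is implemented.

```python
import re
import string


def template_to_regex(format_str):
    """Convert a fixed-width filename template into a compiled regex."""
    pattern = ''
    for literal, field, spec, _ in string.Formatter().parse(format_str):
        # Escape literal text, then let '?' match any one character
        pattern += re.escape(literal).replace(r'\?', '.')
        if field is not None:
            # '4d' and '02d' both give the number of digits to capture
            width = int(spec.rstrip('d'))
            pattern += '(?P<{:s}>[0-9]{{{:d}}})'.format(field, width)
    return re.compile(pattern + '$')


regex = template_to_regex(
    'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf')
match = regex.match('cnofs_cindi_ivm_500ms_20090101_v01.cdf')
date_info = {key: int(val) for key, val in match.groupdict().items()}
```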

get_file_array(start, end)

Return a list of filenames between and including start and end.

Parameters:
  • start (array_like or single string) – filenames for start of returned filelist
  • end (array_like or single string) – filenames for inclusive end of returned filelist
Returns:

list of filenames between and including start and end over all intervals.

get_index(fname)

Return index for a given filename.

Parameters:fname (string) – filename

Note

If fname not found in the file information already attached to the instrument.files instance, then a files.refresh() call is made.

get_new()

List new files since last recorded file state.

pysat stores filenames in the user_home/.pysat directory. Returns a list of all new filenames since the last known change to files. Filenames are stored if there is a change and either update_files is True at instrument object level or files.refresh() is called.

Returns:files are indexed by datetime
Return type:pandas.Series
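
Conceptually, get_new compares the current file list with the last stored one. A rough sketch with plain dicts keyed by datetime is shown below. The real method returns a pandas.Series, and new_files is a hypothetical helper rather than pysat code.

```python
import datetime as dt


def new_files(stored, current):
    """Return entries of current whose filenames were not stored before."""
    known = set(stored.values())
    return {date: fname for date, fname in current.items()
            if fname not in known}


stored = {dt.datetime(2009, 1, 1): 'file_20090101.cdf'}
current = {dt.datetime(2009, 1, 1): 'file_20090101.cdf',
           dt.datetime(2009, 1, 2): 'file_20090102.cdf'}
fresh = new_files(stored, current)
```
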
refresh()

Update list of files, if there are changes.

Calls underlying list_rtn for the particular science instrument. Typically, these routines search in the pysat provided path, pysat_data_dir/platform/name/tag/, where pysat_data_dir is set by pysat.utils.set_data_dir(path=path).

Meta

class pysat.Meta(metadata=None, units_label='units', name_label='long_name', notes_label='notes', desc_label='desc', plot_label='label', axis_label='axis', scale_label='scale', min_label='value_min', max_label='value_max', fill_label='fill', export_nan=[])

Stores metadata for Instrument instance, similar to CF-1.6 netCDF data standard.

Parameters:
  • metadata (pandas.DataFrame) – DataFrame should be indexed by variable name that contains at minimum the standard_name (name), units, and long_name for the data stored in the associated pysat Instrument object.
  • units_label (str) – String used to label units in storage. Defaults to ‘units’.
  • name_label (str) – String used to label long_name in storage. Defaults to ‘long_name’.
  • notes_label (str) – String used to label ‘notes’ in storage. Defaults to ‘notes’
  • desc_label (str) – String used to label variable descriptions in storage. Defaults to ‘desc’
  • plot_label (str) – String used to label variables in plots. Defaults to ‘label’
  • axis_label (str) – Label used for axis on a plot. Defaults to ‘axis’
  • scale_label (str) – string used to label plot scaling type in storage. Defaults to ‘scale’
  • min_label (str) – String used to label typical variable value min limit in storage. Defaults to ‘value_min’
  • max_label (str) – String used to label typical variable value max limit in storage. Defaults to ‘value_max’
  • fill_label (str) – String used to label fill value in storage. Defaults to ‘fill’ per netCDF4 standard
data

index is variable standard name, ‘units’, ‘long_name’, and other defaults are also stored along with additional user provided labels.

Type:pandas.DataFrame
units_label

String used to label units in storage. Defaults to ‘units’.

Type:str
name_label

String used to label long_name in storage. Defaults to ‘long_name’.

Type:str
notes_label

String used to label ‘notes’ in storage. Defaults to ‘notes’

Type:str
desc_label

String used to label variable descriptions in storage. Defaults to ‘desc’

Type:str
plot_label

String used to label variables in plots. Defaults to ‘label’

Type:str
axis_label

Label used for axis on a plot. Defaults to ‘axis’

Type:str
scale_label

string used to label plot scaling type in storage. Defaults to ‘scale’

Type:str
min_label

String used to label typical variable value min limit in storage. Defaults to ‘value_min’

Type:str
max_label

String used to label typical variable value max limit in storage. Defaults to ‘value_max’

Type:str
fill_label

String used to label fill value in storage. Defaults to ‘fill’ per netCDF4 standard

Type:str
export_nan

List of labels that should be exported even if their value is nan. By default, metadata with a value of nan will be excluded from export.

Type:list

Notes

Meta object preserves the case of variables and attributes as it first receives the data. Subsequent calls to set new metadata with the same variable or attribute will use case of first call. Accessing or setting data thereafter is case insensitive. In practice, use is case insensitive but the original case is preserved. Case preservation is built in to support writing files with a desired case to meet standards.

Metadata for higher order data objects, those that have multiple products under a single variable name in a pysat.Instrument object, are stored by providing a Meta object under the single name.

Supports any custom metadata values in addition to the expected metadata attributes (units, name, notes, desc, plot_label, axis, scale, value_min, value_max, and fill). These base attributes may be used to programmatically access and set types of metadata regardless of the string values used for the attribute. String values for attributes may need to be changed depending upon the standards of code or files interacting with pysat.

Meta objects returned as part of pysat loading routines are automatically updated to use the same values of plot_label, units_label, etc. as found on the pysat.Instrument object.
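
The case handling described in these notes can be illustrated with a small mapping that looks names up without regard to case while remembering the case first used. CasePreservingDict is a hypothetical class for illustration only and is far simpler than pysat.Meta.

```python
class CasePreservingDict:
    """Case-insensitive mapping that keeps the case of the first key seen."""

    def __init__(self):
        self._names = {}   # lower-case name -> name as first supplied
        self._values = {}  # lower-case name -> stored value

    def __setitem__(self, name, value):
        # Later assignments reuse the case of the first call
        self._names.setdefault(name.lower(), name)
        self._values[name.lower()] = value

    def __getitem__(self, name):
        return self._values[name.lower()]

    def keys(self):
        return list(self._names.values())


meta = CasePreservingDict()
meta['Ion_Density'] = {'units': 'N/cc'}
meta['ion_density'] = {'units': 'cm^-3'}
```

After both assignments there is a single entry, stored under the case first supplied and readable with any case.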

Examples

# instantiate Meta object, default values for attribute labels are used
meta = pysat.Meta()
# set a couple base units
# note that other base parameters not set below will
# be assigned a default value
meta['name'] = {'long_name':string, 'units':string}
# update 'units' to new value
meta['name'] = {'units':string}
# update 'long_name' to new value
meta['name'] = {'long_name':string}
# attach new info with partial information, 'long_name' set to 'name2'
meta['name2'] = {'units':string}
# units are set to '' by default
meta['name3'] = {'long_name':string}

# assigning custom meta parameters
meta['name4'] = {'units':string, 'long_name':string,
                 'custom1':string, 'custom2':value}
meta['name5'] = {'custom1':string, 'custom3':value}

# assign multiple variables at once
meta[['name1', 'name2']] = {'long_name':[string1, string2],
                            'units':[string1, string2],
                            'custom10':[string1, string2]}

# assigning metadata for n-Dimensional variables
meta2 = pysat.Meta()
meta2['name41'] = {'long_name':string, 'units':string}
meta2['name42'] = {'long_name':string, 'units':string}
meta['name4'] = {'meta':meta2}
# or
meta['name4'] = meta2
meta['name4'].children['name41']

# mixture of 1D and higher dimensional data
meta = pysat.Meta()
meta['dm'] = {'units':'hey', 'long_name':'boo'}
meta['rpa'] = {'units':'crazy', 'long_name':'boo_whoo'}
meta2 = pysat.Meta()
meta2[['higher', 'lower']] = {'meta':[meta, None],
                              'units':[None, 'boo'],
                              'long_name':[None, 'boohoo']}

# assign from another Meta object
meta[key1] = meta2[key2]

# access fill info for a variable, presuming default label
meta[key1, 'fill']
# access same info, even if 'fill' not used to label fill values
meta[key1, meta.fill_label]

# change a label used by Meta object
# note that all instances of fill_label
# within the meta object are updated
meta.fill_label = '_FillValue'
meta.plot_label = 'Special Plot Variable'
# this feature is useful when converting metadata within pysat
# so that it is consistent with externally imposed file standards
accept_default_labels(other)

Applies labels for default meta labels from other onto self.

Parameters:other (Meta) – Meta object to take default labels from
Returns:
Return type:Meta
apply_default_labels(other)

Applies labels for default meta labels from self onto other.

Parameters:other (Meta) – Meta object to have default labels applied
Returns:
Return type:Meta
attr_case_name(name)

Returns preserved case name for case insensitive value of name.

Checks first within standard attributes. If not found there, checks attributes for higher order data structures. If not found, returns supplied name as it is available for use. Intended to be used to help ensure that the same case is applied to all repetitions of a given variable name.

Parameters:name (str) – name of variable to get stored case form
Returns:name in proper case
Return type:str
attrs()

Yields metadata products stored for each variable name

concat(other, strict=False)

Concats two metadata objects together.

Parameters:
  • other (Meta) – Meta object to be concatenated
  • strict (bool) – if True, ensure there are no duplicate variable names

Notes

Uses units and name label of self if other is different

Returns:Concatenated object
Return type:Meta
drop(names)

Drops variables (names) from metadata.

empty

Return boolean True if there is no metadata

classmethod from_csv(name=None, col_names=None, sep=None, **kwargs)

Create instrument metadata object from csv.

Parameters:
  • name (string) – absolute filename for csv file or name of file stored in pandas instruments location
  • col_names (list-like collection of strings) – column names in csv and resultant meta object
  • sep (string) – column separator for supplied csv filename

Note

column names must include at least [‘name’, ‘long_name’, ‘units’], assumed if col_names is None.

has_attr(name)

Returns boolean indicating presence of given attribute name

Case-insensitive check

Notes

Does not check higher order meta objects

Parameters:name (str) – name of attribute to check (case-insensitive)
Returns:True if case-insensitive check for attribute name is True
Return type:bool
keep(keep_names)

Keeps variables (keep_names) while dropping other parameters

Parameters:keep_names (list-like) – variables to keep
keys()

Yields variable names stored for 1D variables

keys_nD()

Yields keys for higher order metadata

merge(other)

Adds metadata variables to self that are in other but not in self.

Parameters:other (pysat.Meta) –
pop(name)

Remove and return metadata about variable

Parameters:name (str) – variable name
Returns:Series of metadata for variable
Return type:pandas.Series
transfer_attributes_to_instrument(inst, strict_names=False)

Transfer non-standard attributes in Meta to Instrument object.

Pysat’s load_netCDF and similar routines are only able to attach netCDF4 attributes to a Meta object. This routine identifies these attributes and removes them from the Meta object. Intent is to support simple transfers to the pysat.Instrument object.

Will not transfer names that conflict with pysat default attributes.

Parameters:
  • inst (pysat.Instrument) – Instrument object to transfer attributes to
  • strict_names (boolean (False)) – If True, produces an error if the Instrument object already has an attribute with the same name to be copied.
Returns:

pysat.Instrument object modified in place with new attributes

Return type:

None

var_case_name(name)

Provides stored name (case preserved) for case insensitive input

If name is not found (case-insensitive check) then name is returned, as input. This function is intended to be used to help ensure the case of a given variable name is the same across the Meta object.

Parameters:name (str) – variable name in any case
Returns:string with case preserved as in metaobject
Return type:str
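
The case handling described for var_case_name and attr_case_name can be pictured with a small stand-alone sketch. This is not pysat's implementation: the class CasePreservingNames and its add method are hypothetical, and only the behaviour (the first-seen case is stored, lookups ignore case, unknown names are returned as given) is taken from the text above.

```python
class CasePreservingNames:
    """Hypothetical helper: stores each name with the case first seen."""

    def __init__(self):
        # maps lower-cased name -> name as first stored
        self._stored = {}

    def add(self, name):
        # keep the case of the first call; later calls do not change it
        self._stored.setdefault(name.lower(), name)

    def var_case_name(self, name):
        # return the stored case if known, otherwise the input unchanged
        return self._stored.get(name.lower(), name)


names = CasePreservingNames()
names.add('dB_mer')
names.add('DB_MER')  # same variable in a different case; first case is kept
print(names.var_case_name('db_mer'))   # stored case
print(names.var_case_name('Unknown'))  # not stored, returned as given
```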

Orbits

class pysat.Orbits(sat=None, index=None, kind=None, period=None)

Determines orbits on the fly and provides orbital data in .data.

Determines the locations of orbit breaks in the loaded data in inst.data and provides iteration tools and convenient orbit selection via inst.orbit[orbit num].

Parameters:
  • sat (pysat.Instrument instance) – instrument object to determine orbits for
  • index (string) – name of the data series to use for determining orbit breaks
  • kind ({'local time', 'longitude', 'polar', 'orbit'}) –

    kind of orbit, determines how orbital breaks are determined

    • local time: negative gradients in lt or breaks in inst.data.index
    • longitude: negative gradients or breaks in inst.data.index
    • polar: zero crossings in latitude or breaks in inst.data.index
    • orbit: uses unique values of orbit number
  • period (np.timedelta64) – length of time for orbital period, used to gauge when a break in the datetime index (inst.data.index) is large enough to consider it a new orbit
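
As a rough illustration of the 'local time' kind, the sketch below marks a new orbit wherever the local time decreases or the gap between consecutive samples exceeds the supplied period. The function find_orbit_breaks is hypothetical, and the exact gap threshold used by pysat is an assumption here, not taken from the source.

```python
from datetime import datetime, timedelta


def find_orbit_breaks(times, local_time, period):
    """Return indices where a new orbit is assumed to start."""
    breaks = [0]
    for i in range(1, len(times)):
        # a drop in local time signals a wrap past midnight
        negative_gradient = local_time[i] < local_time[i - 1]
        # assumption: a gap longer than one period also starts a new orbit
        large_gap = (times[i] - times[i - 1]) > period
        if negative_gradient or large_gap:
            breaks.append(i)
    return breaks


start = datetime(2009, 1, 1)
times = [start + timedelta(minutes=20 * i) for i in range(6)]
local_time = [20.0, 22.0, 23.5, 1.0, 3.0, 5.0]
print(find_orbit_breaks(times, local_time, timedelta(minutes=97)))
```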

Note

class should not be called directly by the user, use the interface provided by inst.orbits where inst = pysat.Instrument()

Warning

This class is still under development.

Examples

info = {'index':'longitude', 'kind':'longitude'}
vefi = pysat.Instrument(platform='cnofs', name='vefi', tag='dc_b',
                        clean_level=None, orbit_info=info)
start = pysat.datetime(2009,1,1)
stop = pysat.datetime(2009,1,10)
vefi.load(date=start)
vefi.bounds(start, stop)

# iterate over orbits
for vefi in vefi.orbits:
    print('Next available orbit ', vefi['dB_mer'])

# load fifth orbit of first day
vefi.load(date=start)
vefi.orbits[5]

# less convenient load
vefi.orbits.load(5)

# manually iterate orbit
vefi.orbits.next()
# backwards
vefi.orbits.prev()
current

Current orbit number.

Returns:None if no orbit data. Otherwise, returns orbit number, beginning with zero. The first and last orbit of a day is somewhat ambiguous. The first orbit for day n is generally also the last orbit on day n - 1. When iterating forward, the orbit will be labeled as first (0). When iterating backward, the orbit is labeled as the last.
Return type:int or None
load(orbit=None)

Load a particular orbit into .data for loaded day.

Parameters:orbit (int) – orbit number, 1 indexed

Note

A day of data must be loaded before this routine functions properly. If the last orbit of the day is requested, it will automatically be padded with data from the next day. The orbit counter will be reset to 1.

next(*arg, **kwarg)

Load the next orbit into .data.

Note

Forms complete orbits across day boundaries. If no data loaded then the first orbit from the first date of data is returned.

prev(*arg, **kwarg)

Load the previous orbit into .data.

Note

Forms complete orbits across day boundaries. If no data loaded then the last orbit of data from the last day is loaded into .data.

Seasonal Analysis

Occurrence Probability

Occurrence probability routines, daily or by orbit.

Routines calculate the occurrence of an event greater than a supplied gate occurring at least once per day, or once per orbit. The probability is calculated as the (number of times with at least one hit in bin)/(number of times in the bin). The data used to determine the occurrence must be 1D. If a property of a 2D or higher dataset is needed, attach a custom function that performs the check and returns a 1D Series.
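
The probability definition above can be sketched in one dimension with plain Python. The function daily_occurrence_probability and its input layout (one list of (coordinate, value) samples per day) are hypothetical and are not part of pysat; bins without data are given None here as an assumption.

```python
def daily_occurrence_probability(days, edges, gate):
    """days: list of days, each a list of (bin_coordinate, value) samples."""
    nbins = len(edges) - 1
    hits = [0] * nbins    # days with at least one value > gate in the bin
    total = [0] * nbins   # days with any data in the bin
    for samples in days:
        seen, hit = set(), set()
        for coord, value in samples:
            for k in range(nbins):
                if edges[k] <= coord < edges[k + 1]:
                    seen.add(k)
                    if value > gate:
                        hit.add(k)
                    break
        # each day counts at most once per bin
        for k in seen:
            total[k] += 1
        for k in hit:
            hits[k] += 1
    # bins with no data have an undefined probability; None is used here
    return [h / t if t else None for h, t in zip(hits, total)]


edges = [0.0, 12.0, 24.0]
days = [
    [(3.0, 0.2), (4.0, 1.5), (15.0, 0.1)],  # one hit, in the first bin
    [(5.0, 0.3), (16.0, 0.4)],              # no hits
]
print(daily_occurrence_probability(days, edges, gate=1.0))
```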

Deprecated since version 2.2.0: ssnl.occur_prob will be removed in pysat 3.0.0, it will be added to pysatSeasons: https://github.com/pysat/pysatSeasons

Note

The included routines use the bounds attached to the supplied instrument object as the season of interest.

pysat.ssnl.occur_prob.by_orbit2D(inst, bin1, label1, bin2, label2, data_label, gate, returnBins=False)

2D Occurrence Probability of data_label orbit-by-orbit over a season.

Deprecated since version 2.2.0: by_orbit2D will be removed in pysat 3.0.0, it will be added to pysatSeasons

If data_label is greater than gate at least once per orbit, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)

Parameters:
  • inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
  • binx (list) – [min value, max value, number of bins]
  • labelx (string) – identifies data product for binx
  • data_label (list of strings) – identifies data product(s) to calculate occurrence probability
  • gate (list of values) – values that data_label must achieve to be counted as an occurrence
  • returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns:

occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of orbits with any data; ‘bin_x’ and ‘bin_y’ are also returned if requested. Note that arrays are organized for direct plotting, y values along rows, x along columns.

Return type:

dictionary

Note

Season delineated by the bounds attached to Instrument object.

pysat.ssnl.occur_prob.by_orbit3D(inst, bin1, label1, bin2, label2, bin3, label3, data_label, gate, returnBins=False)

3D Occurrence Probability of data_label orbit-by-orbit over a season.

Deprecated since version 2.2.0: by_orbit3D will be removed in pysat 3.0.0, it will be added to pysatSeasons

If data_label is greater than gate at least once per orbit, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)

Parameters:
  • inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
  • binx (list) – [min value, max value, number of bins]
  • labelx (string) – identifies data product for binx
  • data_label (list of strings) – identifies data product(s) to calculate occurrence probability
  • gate (list of values) – values that data_label must achieve to be counted as an occurrence
  • returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns:

occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of orbits with any data; ‘bin_x’, ‘bin_y’, and ‘bin_z’ are also returned if requested. Note that arrays are organized for direct plotting, z,y,x.

Return type:

dictionary

Note

Season delineated by the bounds attached to Instrument object.

pysat.ssnl.occur_prob.daily2D(inst, bin1, label1, bin2, label2, data_label, gate, returnBins=False)

2D Daily Occurrence Probability of data_label > gate over a season.

Deprecated since version 2.2.0: daily2D will be removed in pysat 3.0.0, it will be added to pysatSeasons

If data_label is greater than gate at least once per day, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)

Parameters:
  • inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
  • binx (list) – [min, max, number of bins]
  • labelx (string) – name for data product for binx
  • data_label (list of strings) – identifies data product(s) to calculate occurrence probability e.g. inst[data_label]
  • gate (list of values) – values that data_label must achieve to be counted as an occurrence
  • returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns:

occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of days with any data; ‘bin_x’ and ‘bin_y’ are also returned if requested. Note that arrays are organized for direct plotting, y values along rows, x along columns.

Return type:

dictionary

Note

Season delineated by the bounds attached to Instrument object.

pysat.ssnl.occur_prob.daily3D(inst, bin1, label1, bin2, label2, bin3, label3, data_label, gate, returnBins=False)

3D Daily Occurrence Probability of data_label > gate over a season.

Deprecated since version 2.2.0: daily3D will be removed in pysat 3.0.0, it will be added to pysatSeasons

If data_label is greater than gate at least once per day, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)

Parameters:
  • inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
  • binx (list) – [min, max, number of bins]
  • labelx (string) – name for data product for binx
  • data_label (list of strings) – identifies data product(s) to calculate occurrence probability
  • gate (list of values) – values that data_label must achieve to be counted as an occurrence
  • returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns:

occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of days with any data; ‘bin_x’, ‘bin_y’, and ‘bin_z’ are also returned if requested. Note that arrays are organized for direct plotting, z,y,x.

Return type:

dictionary

Note

Season delineated by the bounds attached to Instrument object.

Average

Instrument independent seasonal averaging routine. Supports averaging 1D and 2D data.

Deprecated since version 2.2.0: ssnl.avg will be removed in pysat 3.0.0, it will be added to pysatSeasons: https://github.com/pysat/pysatSeasons

pysat.ssnl.avg.mean_by_day(inst, data_label)

Mean of data_label by day over Instrument.bounds

Deprecated since version 2.2.0: mean_by_day will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:data_label (string) – string identifying data product to be averaged
Returns:mean – simple mean of data_label indexed by day
Return type:pandas Series
pysat.ssnl.avg.mean_by_file(inst, data_label)

Mean of data_label by file over Instrument.bounds

Deprecated since version 2.2.0: mean_by_file will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:data_label (string) – string identifying data product to be averaged
Returns:mean – simple mean of data_label indexed by start of each file
Return type:pandas Series
pysat.ssnl.avg.mean_by_orbit(inst, data_label)

Mean of data_label by orbit over Instrument.bounds

Deprecated since version 2.2.0: mean_by_orbit will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:data_label (string) – string identifying data product to be averaged
Returns:mean – simple mean of data_label indexed by start of each orbit
Return type:pandas Series
pysat.ssnl.avg.median1D(const, bin1, label1, data_label, auto_bin=True, returnData=False)

Return a 1D median of data_label over a season and label1

Deprecated since version 2.2.0: median1D will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:
  • const (Constellation or Instrument) – Constellation or Instrument object
  • bin1 ((array-like)) – List holding [min, max, number of bins] or array-like containing bin edges
  • label1 ((string)) – data column name that the binning will be performed over (i.e., lat)
  • data_label ((list-like )) – contains strings identifying data product(s) to be averaged
  • auto_bin (boolean) – if True, function will create bins from the min, max, and number of bins. If False, bin edges must be manually entered
  • returnData ((boolean)) – Return data in output dictionary as well as statistics
Returns:

median – 1D median accessed by data_label as a function of label1 over the season delineated by bounds of passed instrument objects. Also includes ‘count’ and ‘avg_abs_dev’ as well as the values of the bin edges in ‘bin_x’

Return type:

dictionary
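
The binning described for median1D with auto_bin=True can be sketched as follows. The function binned_median is a hypothetical stand-alone helper, not the pysat routine: it builds equal-width edges from [min, max, number of bins] and reports the median and count per bin. The 'avg_abs_dev' entry is left out because its exact definition is not given here.

```python
from statistics import median


def binned_median(bin_spec, coords, values):
    """Median of values within equal-width bins over coords."""
    low, high, nbins = bin_spec
    width = (high - low) / nbins
    edges = [low + k * width for k in range(nbins + 1)]
    groups = [[] for _ in range(nbins)]
    for coord, value in zip(coords, values):
        if low <= coord < high:
            # guard against floating point rounding at the upper edge
            k = min(int((coord - low) // width), nbins - 1)
            groups[k].append(value)
    return {
        'median': [median(g) if g else None for g in groups],
        'count': [len(g) for g in groups],
        'bin_x': edges,
    }


out = binned_median([0, 90, 3],
                    [5, 10, 40, 50, 55, 80],
                    [1.0, 3.0, 2.0, 8.0, 4.0, 7.0])
print(out['median'], out['count'])
```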

pysat.ssnl.avg.median2D(const, bin1, label1, bin2, label2, data_label, returnData=False, auto_bin=True)

Return a 2D median of data_label over a season and label1, label2.

Deprecated since version 2.2.0: median2D will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:
  • const (Constellation or Instrument) –
  • bin# ([min, max, number of bins], or array-like containing bin edges) –
  • label# (string) – identifies data product for bin#
  • data_label (list-like) – contains strings identifying data product(s) to be averaged
  • auto_bin (boolean) – if True, function will create bins from the min, max, and number of bins. If False, bin edges must be manually entered
Returns:

median – 2D median accessed by data_label as a function of label1 and label2 over the season delineated by bounds of passed instrument objects. Also includes ‘count’ and ‘avg_abs_dev’ as well as the values of the bin edges in ‘bin_x’ and ‘bin_y’.

Return type:

dictionary

Plot

pysat.ssnl.plot.scatterplot(inst, labelx, labely, data_label, datalim, xlim=None, ylim=None)

Return scatterplot of data_label(s) as functions of labelx,y over a season.

Deprecated since version 2.2.0: scatterplot will be removed in pysat 3.0.0, it will be added to pysatSeasons

Parameters:
  • labelx (string) – data product for x-axis
  • labely (string) – data product for y-axis
  • data_label (string, array-like of strings) – data product(s) to be scatter plotted
  • datalim (numpy array) – plot limits for data_label
Returns:

Returns a list of scatter plots of data_label as a function of labelx and labely over the season delineated by start and stop datetime objects.

Utilities

pysat.utils - utilities for running pysat

pysat.utils contains a number of functions used throughout the pysat package. This includes conversion of formats, loading of files, and user-supplied info for the pysat data directory structure.

Coordinates

pysat.utils.coords - coordinate transformations for pysat

pysat.utils.coords contains a number of coordinate-transformation functions used throughout the pysat package.

pysat.utils.coords.adjust_cyclic_data(samples, high=6.283185307179586, low=0.0)

Adjust cyclic values such as longitude to a different scale

Parameters:
  • samples (array_like) – Input array
  • high (float or int) – Upper boundary of the cyclic range (default=2 pi)
  • low (float or int) – Lower boundary of the cyclic range (default=0)
Returns:

out_samples – Samples adjusted to the new cyclic range

Return type:

array_like
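
A minimal sketch of adjusting cyclic values such as longitude to a new range is given below. The function wrap_cyclic is hypothetical, and the single-period shift it applies is an assumption about the behaviour rather than a copy of the pysat code, so values more than one period outside the range are not fully wrapped.

```python
import math


def wrap_cyclic(samples, high=2.0 * math.pi, low=0.0):
    """Shift out-of-range values by one period toward [low, high)."""
    period = high - low
    wrapped = []
    for value in samples:
        if value >= high:
            value -= period
        elif value < low:
            value += period
        wrapped.append(value)
    return wrapped


# longitudes given in 0 to 360 moved to the -180 to 180 convention
print(wrap_cyclic([190.0, -170.0, 350.0], high=180.0, low=-180.0))
```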

pysat.utils.coords.calc_solar_local_time(inst, lon_name=None, slt_name='slt')

Append solar local time to an instrument object

Parameters:
  • inst (pysat.Instrument instance) – instrument object to be updated
  • lon_name (string) – name of the longitude data key (assumes data are in degrees)
  • slt_name (string) – name of the output solar local time data key (default=’slt’)
Returns:

Return type:

updates instrument data in column specified by slt_name
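
For orientation, the usual relation between universal time and solar local time is sketched below: 15 degrees of longitude correspond to one hour. The function solar_local_time is hypothetical and works on a single datetime and longitude; whether pysat applies exactly this form is assumed rather than confirmed by the text.

```python
from datetime import datetime


def solar_local_time(when, lon_deg):
    """Solar local time in hours from UT and longitude in degrees."""
    ut_hours = when.hour + when.minute / 60.0 + when.second / 3600.0
    # 15 degrees of longitude per hour, wrapped to the range [0, 24)
    return (ut_hours + lon_deg / 15.0) % 24.0


print(solar_local_time(datetime(2009, 1, 1, 12, 0, 0), -90.0))
```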

pysat.utils.coords.geodetic_to_geocentric(lat_in, lon_in=None, inverse=False)

Converts position from geodetic to geocentric or vice-versa.

Deprecated since version 2.2.0: geodetic_to_geocentric will be removed in pysat 3.0.0, it will be added to pysatMadrigal

Parameters:
  • lat_in (float) – latitude in degrees.
  • lon_in (float or NoneType) – longitude in degrees. Remains unchanged, so does not need to be included. (default=None)
  • inverse (bool) – False for geodetic to geocentric, True for geocentric to geodetic. (default=False)
Returns:

  • lat_out (float) – latitude [degree] (geocentric/detic if inverse=False/True)
  • lon_out (float or NoneType) – longitude [degree] (geocentric/detic if inverse=False/True)
  • rad_earth (float) – Earth radius [km] (geocentric/detic if inverse=False/True)

Notes

Uses WGS-84 values

References

Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro

pysat.utils.coords.geodetic_to_geocentric_horizontal(lat_in, lon_in, az_in, el_in, inverse=False)

Converts from local horizontal coordinates in a geodetic system to local horizontal coordinates in a geocentric system

Deprecated since version 2.2.0: geodetic_to_geocentric_horizontal will be removed in pysat 3.0.0, it will be added to pysatMadrigal

Parameters:
  • lat_in (float) – latitude in degrees of the local horizontal coordinate system center
  • lon_in (float) – longitude in degrees of the local horizontal coordinate system center
  • az_in (float) – azimuth in degrees within the local horizontal coordinate system
  • el_in (float) – elevation in degrees within the local horizontal coordinate system
  • inverse (bool) – False for geodetic to geocentric, True for inverse (default=False)
Returns:

  • lat_out (float) – latitude in degrees of the converted horizontal coordinate system center
  • lon_out (float) – longitude in degrees of the converted horizontal coordinate system center
  • rad_earth (float) – Earth radius in km at the geocentric/detic (False/True) location
  • az_out (float) – azimuth in degrees of the converted horizontal coordinate system
  • el_out (float) – elevation in degrees of the converted horizontal coordinate system

References

Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro

pysat.utils.coords.global_to_local_cartesian(x_in, y_in, z_in, lat_cent, lon_cent, rad_cent, inverse=False)

Converts a position from global to local cartesian or vice-versa

Deprecated since version 2.2.0: global_to_local_cartesian will be removed in pysat 3.0.0, it will be added to pysatMadrigal

Parameters:
  • x_in (float) – global or local cartesian x in km (inverse=False/True)
  • y_in (float) – global or local cartesian y in km (inverse=False/True)
  • z_in (float) – global or local cartesian z in km (inverse=False/True)
  • lat_cent (float) – geocentric latitude in degrees of local cartesian system origin
  • lon_cent (float) – geocentric longitude in degrees of local cartesian system origin
  • rad_cent (float) – distance from center of the Earth in km of local cartesian system origin
  • inverse (bool) – False to convert from global to local cartesian coordinates, and True for the inverse (default=False)
Returns:

  • x_out (float) – local or global cartesian x in km (inverse=False/True)
  • y_out (float) – local or global cartesian y in km (inverse=False/True)
  • z_out (float) – local or global cartesian z in km (inverse=False/True)

Notes

The global cartesian coordinate system has its origin at the center of the Earth, while the local system has its origin specified by the input latitude, longitude, and radius. The global system has x intersecting the equatorial plane and the prime meridian, z pointing North along the rotational axis, and y completing the right-handed coordinate system. The local system has z pointing up, y pointing North, and x pointing East.

pysat.utils.coords.local_horizontal_to_global_geo(az, el, dist, lat_orig, lon_orig, alt_orig, geodetic=True)

Convert from local horizontal coordinates to geodetic or geocentric coordinates

Deprecated since version 2.2.0: local_horizontal_to_global_geo will be removed in pysat 3.0.0, it will be added to pysatMadrigal

Parameters:
  • az (float) – Azimuth (angle from North) of point in degrees
  • el (float) – Elevation (angle from ground) of point in degrees
  • dist (float) – Distance from origin to point in km
  • lat_orig (float) – Latitude of origin in degrees
  • lon_orig (float) – Longitude of origin in degrees
  • alt_orig (float) – Altitude of origin in km from the surface of the Earth
  • geodetic (bool) – True if origin coordinates are geodetic, False if they are geocentric. Will return coordinates in the same system as the origin input. (default=True)
Returns:

  • lat_pnt (float) – Latitude of point in degrees
  • lon_pnt (float) – Longitude of point in degrees
  • rad_pnt (float) – Distance to the point from the centre of the Earth in km

References

Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro

pysat.utils.coords.scale_units(out_unit, in_unit)

Determine the scaling factor between two units

Deprecated since version 2.2.0: utils.coords.scale_units will be removed in pysat 3.0.0, it will be moved to utils.scale_units

Parameters:
  • out_unit (str) – Desired unit after scaling
  • in_unit (str) – Unit to be scaled
Returns:

unit_scale – Scaling factor that will convert from in_unit to out_unit

Return type:

float

pysat.utils.coords.spherical_to_cartesian(az_in, el_in, r_in, inverse=False)

Convert a position from spherical to cartesian, or vice-versa

Deprecated since version 2.2.0: spherical_to_cartesian will be removed in pysat 3.0.0, it will be added to pysatMadrigal

Parameters:
  • az_in (float) – azimuth/longitude in degrees or cartesian x in km (inverse=False/True)
  • el_in (float) – elevation/latitude in degrees or cartesian y in km (inverse=False/True)
  • r_in (float) – distance from origin in km or cartesian z in km (inverse=False/True)
  • inverse (boolean) – False to go from spherical to cartesian and True for the inverse
Returns:

  • x_out (float) – cartesian x in km or azimuth/longitude in degrees (inverse=False/True)
  • y_out (float) – cartesian y in km or elevation/latitude in degrees (inverse=False/True)
  • z_out (float) – cartesian z in km or distance from origin in km (inverse=False/True)

Notes

This transform is the same for local or global spherical/cartesian transformations.

Returns elevation angle (angle from the xy plane) rather than zenith angle (angle from the z-axis)
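
The elevation-angle convention noted above can be sketched as a pair of stand-alone functions. The names sph_to_cart and cart_to_sph are hypothetical, and measuring azimuth from the x-axis toward the y-axis is an assumption made for this sketch.

```python
import math


def sph_to_cart(az_deg, el_deg, r):
    """Spherical (azimuth, elevation, radius) to cartesian (x, y, z)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    x = r * math.cos(el) * math.cos(az)
    y = r * math.cos(el) * math.sin(az)
    z = r * math.sin(el)  # elevation is measured from the xy plane
    return x, y, z


def cart_to_sph(x, y, z):
    """Cartesian (x, y, z) to spherical (azimuth, elevation, radius)."""
    r = math.sqrt(x * x + y * y + z * z)
    az = math.degrees(math.atan2(y, x))
    el = math.degrees(math.atan2(z, math.hypot(x, y)))
    return az, el, r


x, y, z = sph_to_cart(30.0, 45.0, 10.0)
print(cart_to_sph(x, y, z))
```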

pysat.utils.coords.update_longitude(inst, lon_name=None, high=180.0, low=-180.0)

Update longitude to the desired range

Parameters:
  • inst (pysat.Instrument instance) – instrument object to be updated
  • lon_name (string) – name of the longitude data
  • high (float) – Highest allowed longitude value (default=180.0)
  • low (float) – Lowest allowed longitude value (default=-180.0)
Returns:

Return type:

updates instrument data in column ‘lon_name’

Statistics

pysat.utils.stats - statistical operations in pysat

pysat.utils.stats contains a number of statistical functions used throughout the pysat package.

pysat.utils.stats.median1D(self, bin_params, bin_label, data_label)

Calculates the median for a series of binned data.

Deprecated since version 2.2.0: median1D will be removed in pysat 3.0.0, a similar function will be added to pysatSeasons

Parameters:
  • bin_params (array_like) – Input array defining the bins in which the median is calculated
  • bin_label (string) – Name of data parameter which the bins cover
  • data_label (string) – Name of data parameter to take the median of in each bin
Returns:

medians – The median data value in each bin

Return type:

array_like

pysat.utils.stats.nan_circmean(samples, high=6.283185307179586, low=0.0, axis=None)

NaN insensitive version of scipy’s circular mean routine

Deprecated since version 2.1.0: nan_circmean will be removed in pysat 3.0.0, this functionality has been added to scipy 1.4

Parameters:
  • samples (array_like) – Input array
  • high (float or int) – Upper boundary for circular mean range (default=2 pi)
  • low (float or int) – Lower boundary for circular mean range (default=0)
  • axis (int or NoneType) – Axis along which means are computed. The default is to compute the mean of the flattened array
Returns:

circmean – Circular mean

Return type:

float
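
A NaN-insensitive circular mean can be sketched with the standard library alone. The function circmean_ignoring_nan is hypothetical and handles only flat sequences (no axis argument): it drops NaN values, maps the samples onto angles, and takes the angle of the summed unit vectors.

```python
import math


def circmean_ignoring_nan(samples, high=2.0 * math.pi, low=0.0):
    """Circular mean of samples on [low, high), skipping NaN values."""
    valid = [s for s in samples if not math.isnan(s)]
    if not valid:
        return math.nan
    scale = 2.0 * math.pi / (high - low)
    angles = [(s - low) * scale for s in valid]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    # angle of the resultant vector, wrapped to [0, 2 pi)
    mean_angle = math.atan2(sin_sum, cos_sum) % (2.0 * math.pi)
    return mean_angle / scale + low


# an arithmetic mean of 355 and 15 would give 185; the circular mean is near 5
print(circmean_ignoring_nan([355.0, 15.0, math.nan], high=360.0, low=0.0))
```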

pysat.utils.stats.nan_circstd(samples, high=6.283185307179586, low=0.0, axis=None)

NaN insensitive version of scipy’s circular standard deviation routine

Deprecated since version 2.1.0: nan_circstd will be removed in pysat 3.0.0, this functionality has been added to scipy 1.4

Parameters:
  • samples (array_like) – Input array
  • high (float or int) – Upper boundary for circular standard deviation range (default=2 pi)
  • low (float or int) – Lower boundary for circular standard deviation range (default=0)
  • axis (int or NoneType) – Axis along which standard deviations are computed. The default is to compute the standard deviation of the flattened array
Returns:

circstd – Circular standard deviation

Return type:

float
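
The circular standard deviation can be sketched in the same way. The function circstd_ignoring_nan is hypothetical, handles only flat sequences, and uses the definition sqrt(-2 ln R) scaled to the sample range, where R is the mean resultant length; that this matches the scipy routine referenced above is an assumption.

```python
import math


def circstd_ignoring_nan(samples, high=2.0 * math.pi, low=0.0):
    """Circular standard deviation on [low, high), skipping NaN values."""
    valid = [s for s in samples if not math.isnan(s)]
    if not valid:
        return math.nan
    scale = 2.0 * math.pi / (high - low)
    angles = [(s - low) * scale for s in valid]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    # mean resultant length, clipped so rounding cannot push it above 1
    resultant = min(math.hypot(mean_sin, mean_cos), 1.0)
    if resultant == 0.0:
        return math.inf
    return math.sqrt(-2.0 * math.log(resultant)) / scale


print(circstd_ignoring_nan([0.0, 90.0, math.nan], high=360.0, low=0.0))
```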

Time

pysat.utils.time - date and time operations in pysat

pysat.utils.time contains a number of functions used throughout the pysat package, including interactions with datetime objects, seasons, and calculation of solar local time

pysat.utils.time.calc_freq(index)

Determine the frequency for a time index

Parameters:index ((array-like)) – Datetime list, array, or Index
Returns:freq – Frequency string as described in Pandas Offset Aliases
Return type:(str)

Notes

Calculates the minimum time difference and sets that as the frequency.

To reduce the amount of calculations done, the returned frequency is either in seconds (if no sub-second resolution is found) or nanoseconds.
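
The approach in the notes above can be sketched with the standard library. The function min_step_freq is hypothetical; it assumes a sorted index, and the returned strings ('S' for seconds, 'N' for nanoseconds) are an assumed format based on the pandas offset aliases, not a confirmed pysat output.

```python
from datetime import datetime, timedelta


def min_step_freq(index):
    """Frequency string built from the smallest step in a sorted index."""
    steps = [b - a for a, b in zip(index[:-1], index[1:])]
    smallest = min(steps)
    whole_seconds = smallest.days * 86400 + smallest.seconds
    if smallest.microseconds == 0:
        # no sub-second resolution found, report seconds
        return '{:d}S'.format(whole_seconds)
    nanoseconds = whole_seconds * 10**9 + smallest.microseconds * 1000
    return '{:d}N'.format(nanoseconds)


start = datetime(2009, 1, 1)
index = [start + timedelta(seconds=s) for s in (0, 2, 4, 5)]
print(min_step_freq(index))
```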

pysat.utils.time.create_date_range(start, stop, freq='D')

Return array of datetime objects using input frequency from start to stop

Supports single datetime object or list, tuple, ndarray of start and stop dates.

freq codes correspond to pandas date_range codes, D daily, M monthly, S secondly

pysat.utils.time.create_datetime_index(year=None, month=None, day=None, uts=None)

Create a timeseries index using supplied year, month, day, and ut in seconds.

Parameters:
  • year (array_like of ints) –
  • month (array_like of ints or None) –
  • day (array_like of ints) – for day (default) or day of year (use month=None)
  • uts (array_like of floats) –
Returns:

Return type:

Pandas timeseries index.

Note

Leap seconds have no meaning here.
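
The two ways of supplying day (day of month, or day of year when month is None) can be sketched as follows. The function build_datetimes is a hypothetical stand-alone helper that returns a list of datetime objects rather than a pandas index.

```python
from datetime import datetime, timedelta


def build_datetimes(year, day, uts, month=None):
    """Combine year, (month,) day and UT seconds into datetime objects."""
    out = []
    for i in range(len(year)):
        if month is None:
            # day is interpreted as day of year, with 1 meaning 1 January
            base = datetime(year[i], 1, 1) + timedelta(days=day[i] - 1)
        else:
            base = datetime(year[i], month[i], day[i])
        out.append(base + timedelta(seconds=uts[i]))
    return out


print(build_datetimes([2009, 2009], [1, 32], [0.0, 3600.0]))
```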

pysat.utils.time.getyrdoy(date)

Return a tuple of year, day of year for a supplied datetime object.

Parameters:date (datetime.datetime) – Datetime object
Returns:
  • year (int) – Integer year
  • doy (int) – Integer day of year
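
An equivalent result can be obtained from the standard library, as in this short sketch. The function year_and_doy is hypothetical and is shown only to illustrate the returned tuple.

```python
from datetime import datetime


def year_and_doy(date):
    """Return (year, day of year) for a datetime object."""
    return date.year, date.timetuple().tm_yday


print(year_and_doy(datetime(2009, 12, 31)))
```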
pysat.utils.time.parse_date(str_yr, str_mo, str_day, str_hr='0', str_min='0', str_sec='0', century=2000)

Basic date parser for file reading

Parameters:
  • str_yr (string) – String containing the year (2 or 4 digits)
  • str_mo (string) – String containing month digits
  • str_day (string) – String containing day of month digits
  • str_hr (string ('0')) – String containing the hour of day
  • str_min (string ('0')) – String containing the minutes of hour
  • str_sec (string ('0')) – String containing the seconds of minute
  • century (int (2000)) – Century, only used if str_yr is a 2-digit year
Returns:

out_date – Pandas datetime object

Return type:

pds.datetime
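
The role of century for 2-digit years can be sketched with a stand-alone function. The name parse_date_sketch is hypothetical, it returns a standard library datetime, and treating any year string of at most two characters as a 2-digit year is an assumption about the rule.

```python
from datetime import datetime


def parse_date_sketch(str_yr, str_mo, str_day, str_hr='0', str_min='0',
                      str_sec='0', century=2000):
    """Build a datetime from strings, adding century to 2-digit years."""
    year = int(str_yr)
    if len(str_yr.strip()) <= 2:
        # assumption: short year strings are offsets from the century
        year += century
    return datetime(year, int(str_mo), int(str_day),
                    int(str_hr), int(str_min), int(str_sec))


print(parse_date_sketch('09', '01', '15'))
print(parse_date_sketch('98', '12', '31', century=1900))
```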

pysat.utils.time.season_date_range(start, stop, freq='D')

Deprecated Function, will be removed in future version.

Deprecated since version 2.1.0: season_date_range will be removed in pysat 3.0.0, this will be replaced by create_date_range