API

Instrument

class pysat.Instrument(platform=None, name=None, tag=None, inst_id=None, clean_level=None, update_files=None, pad=None, orbit_info=None, inst_module=None, directory_format=None, file_format=None, temporary_file_list=False, strict_time_flag=True, ignore_empty_files=False, labels={'desc': ('desc', <class 'str'>), 'fill_val': ('fill', <class 'numpy.float64'>), 'max_val': ('value_max', <class 'numpy.float64'>), 'min_val': ('value_min', <class 'numpy.float64'>), 'name': ('long_name', <class 'str'>), 'notes': ('notes', <class 'str'>), 'units': ('units', <class 'str'>)}, custom=None, **kwargs)

Download, load, manage, modify and analyze science data.

Parameters:
  • platform (string) – name of instrument platform (default=’’)
  • name (string) – name of instrument (default=’’)
  • tag (string) – identifies particular subset of instrument data (default=’’)
  • inst_id (string) – Secondary level of identification, such as spacecraft within a constellation platform (default=’’)
  • clean_level (str or NoneType) – Level of data quality. If not provided, will default to the setting in pysat.params[‘clean_level’] (default=None)
  • pad (pandas.DateOffset, dictionary, or NoneType) – Length of time to pad the begining and end of loaded data for time-series processing. Extra data is removed after applying all custom functions. Dictionary, if supplied, is simply passed to pandas DateOffset. (default=None)
  • orbit_info (dict) – Orbit information, {‘index’: index, ‘kind’: kind, ‘period’: period}. See pysat.Orbits for more information. (default={})
  • inst_module (module or NoneType) – Provide instrument module directly, takes precedence over platform/name (default=None)
  • update_files (boolean or Nonetype) – If True, immediately query filesystem for instrument files and store. If False, the local files are presumed to be the same. By default, this setting will be obtained from pysat.params (default=None)
  • temporary_file_list (boolean) – If true, the list of Instrument files will not be written to disk. Prevents a race condition when running multiple pysat processes. (default=False)
  • strict_time_flag (boolean) – If true, pysat will check data to ensure times are unique and monotonically increasing. (default=True)
  • directory_format (string, function, or NoneType) – Directory naming structure in string format. Variables such as platform, name, and tag will be filled in as needed using python string formatting. The default directory structure, which is used if None is specified, is ‘{platform}/{name}/{tag}’. If a function is provided, it must take tag and inst_id as arguments and return an appropriate string. (default=None)
  • file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, and inst_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine. (default=None)
  • ignore_empty_files (boolean) – if True, the list of files found will be checked to ensure the filesizes are greater than zero. Empty files are removed from the stored list of files. (default=False)
  • labels (dict) – Dict where keys are the label attribute names and the values are tuples that have the label values and value types in that order. (default={‘units’: (‘units’, str), ‘name’: (‘long_name’, str), ‘notes’: (‘notes’, str), ‘desc’: (‘desc’, str), ‘min_val’: (‘value_min’, float), ‘max_val’: (‘value_max’, float), ‘fill_val’: (‘fill’, float)})
bounds

bounds for loading data, supply array_like for a season with gaps. Users may provide as a tuple or tuple of lists, but the attribute is stored as a tuple of lists for consistency

Type:(datetime/filename/None, datetime/filename/None)
custom_functions

List of functions to be applied by instrument nano-kernel

Type:list
custom_args

List of lists containing arguments to be passed to particular custom function

Type:list
custom_kwargs

List of dictionaries with keywords and values to be passed to a custom function

Type:list
data

loaded science data

Type:pandas.DataFrame or xarray.Dataset
date

date for loaded data

Type:dt.datetime
yr

year for loaded data

Type:int
doy

day of year for loaded data

Type:int
files

interface to instrument files

Type:pysat.Files
kwargs

keyword arguments passed to the standard Instrument routines

Type:dictionary
meta_labels

Dict containing defaults for new Meta data labels

Type:dict
meta

interface to instrument metadata, similar to netCDF 1.6

Type:pysat.Meta
orbits

interface to extracting data orbit-by-orbit

Type:pysat.Orbits

Note

pysat attempts to load the module platform_name.py located in the pysat/instruments directory. This module provides the underlying functionality to download, load, and clean instrument data. Alternatively, the module may be supplied directly using keyword inst_module.

Examples

# 1-second mag field data
vefi = pysat.Instrument(platform='cnofs',
                        name='vefi',
                        tag='dc_b',
                        clean_level='clean')
start = dt.datetime(2009,1,1)
stop = dt.datetime(2009,1,2)
vefi.download(start, stop)
vefi.load(date=start)
print(vefi['dB_mer'])
print(vefi.meta['db_mer'])

# 1-second thermal plasma parameters
ivm = pysat.Instrument(platform='cnofs',
                       name='ivm',
                       tag='',
                       clean_level='clean')
ivm.download(start,stop)
ivm.load(2009,1)
print(ivm['ionVelmeridional'])

# Ionosphere profiles from GPS occultation. Enable binning profile
# data using a constant step-size. Feature provided by the underlying
# COSMIC support code.
cosmic = pysat.Instrument('cosmic',
                          'gps',
                          'ionprf',
                          altitude_bin=3)
cosmic.download(start, stop, user=user, password=password)
cosmic.load(date=start)

# Nano-kernel functionality enables instrument objects that are
# 'set and forget'. The functions are always run whenever
# the instrument load routine is called so instrument objects may
# be passed safely to other routines and the data will always
# be processed appropriately.

# Define custom function to modify Instrument in place.
def custom_func(inst, opt_param1=False, opt_param2=False):
    # perform calculations and store in new_data
    inst['new_data'] = new_data
    return

inst = pysat.Instrument('pysat', 'testing')
inst.custom_attach(custom_func, kwargs={'opt_param1': True})

# Custom methods are applied to data when loaded.
inst.load(date=date)

print(inst['new_data2'])

# Custom methods may also be attached at instantiation.
# Create a dictionary for each custom method and associated inputs
custom_func_1 = {'function': custom_func,
                 'kwargs': {'opt_param1': True}}
custom_func_2 = {'function': custom_func, 'args'=[True, False]}
custom_func_3 = {'function': custom_func, 'at_pos'=0,
                 'kwargs': {'opt_param2': True}}

# Combine all dicts into a list in order of application and execution,
# although this can be modified by specifying 'at_pos'. The actual
# order these functions will run is: 3, 1, 2
custom = [custom_func_1, custom_func_2, custom_func_3]

# Instantiate pysat.Instrument
inst = pysat.Instrument(platform, name, inst_id=inst_id, tag=tag,
                        custom=custom)
bounds

Boundaries for iterating over instrument object by date or file.

Parameters:
  • start (datetime object, filename, or None) – start of iteration, if None uses first data date. list-like collection also accepted. (default=None)
  • stop (datetime object, filename, or None) – stop of iteration, inclusive. If None uses last data date. list-like collection also accepted. (default=None)
  • step (str, int, or None) – Step size used when iterating from start to stop. Use a Pandas frequency string (‘3D’, ‘1M’) when setting bounds by date, an integer when setting bounds by file. Defaults to a single day/file (default=’1D’, 1).
  • width (pandas.DateOffset, int, or None) – Data window used when loading data within iteration. Defaults to a single day/file if not assigned. (default=dt.timedelta(days=1), 1)

Note

Both start and stop must be the same type (date, or filename) or None. Only the year, month, and day are used for date inputs.

Examples

import datetime as dt
import pandas as pds
import pysat

inst = pysat.Instrument(platform=platform,
                        name=name,
                        tag=tag)
start = dt.datetime(2009, 1, 1)
stop = dt.datetime(2009, 1, 31)

# Defaults to stepping by a single day and a data loading window
# of one day/file.
inst.bounds = (start, stop)

# Set bounds by file. Iterates a file at a time.
inst.bounds = ('filename1', 'filename2')

# Create a more complicated season, multiple start and stop dates.
start2 = dt.datetetime(2010,1,1)
stop2 = dt.datetime(2010,2,14)
inst.bounds = ([start, start2], [stop, stop2])

# Iterate via a non-standard step size of two days.
inst.bounds = ([start, start2], [stop, stop2], '2D')

# Load more than a single day/file at a time when iterating
inst.bounds = ([start, start2], [stop, stop2], '2D',
               dt.timedelta(days=3))
concat_data(new_data, prepend=False, **kwargs)

Concats new_data to self.data for xarray or pandas as needed

Parameters:
  • new_data (pds.DataFrame, xr.Dataset, or list of such objects) – New data objects to be concatonated
  • prepend (boolean) – If True, assign new data before existing data; if False append new data (default=False)
  • **kwargs (dict) – Optional keyword arguments passed to pds.concat or xr.concat

Note

For pandas, sort=False is passed along to the underlying pandas.concat method. If sort is supplied as a keyword, the user provided value is used instead. Recall that sort orders the data columns, not the data values or the index.

For xarray, dim=Instrument.index.name is passed along to xarray.concat except if the user includes a value for dim as a keyword argument.

copy()

Deep copy of the entire Instrument object.

Returns:
Return type:pysat.Instrument
custom_apply_all()

Apply all of the custom functions to the satellite data object.

Raises:ValueError – Raised when function returns any value

Note

This method does not generally need to be invoked directly by users.

custom_attach(function, at_pos='end', args=[], kwargs={})

Attach a function to custom processing queue.

Custom functions are applied automatically whenever .load() command called.

Parameters:
  • function (string or function object) – name of function or function object to be added to queue
  • at_pos (string or int) – Accepts string ‘end’ or a number that will be used to determine the insertion order if multiple custom functions are attached to an Instrument object. (default=’end’).
  • args (list or tuple) – Ordered arguments following the instrument object input that are required by the custom function (default=[])
  • kwargs (dict) – Dictionary of keyword arguments required by the custom function (default={})

Note

Functions applied using custom_attach may add, modify, or use the data within Instrument inside of the function, and so should not return anything.

custom_clear()

Clear the custom function list.

date

Date for loaded data.

download(start=None, stop=None, freq='D', date_array=None, **kwargs)

Download data for given Instrument object from start to stop.

Parameters:
  • start (pandas.datetime (yesterday)) – start date to download data
  • stop (pandas.datetime (tomorrow)) – stop date (inclusive) to download data
  • freq (string) – Stepsize between dates for season, ‘D’ for daily, ‘M’ monthly (see pandas)
  • date_array (list-like) – Sequence of dates to download date for. Takes precedence over start and stop inputs
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments. The keyword arguments ‘user’ and ‘password’ are expected for remote databases requiring sign in or registration.

Note

Data will be downloaded to pysat_data_dir/patform/name/tag

If Instrument bounds are set to defaults they are updated after files are downloaded.

download_updated_files(**kwargs)

Grabs a list of remote files, compares to local, then downloads new files.

Parameters:**kwargs (dict) – Dictionary of keywords that may be options for specific instruments

Note

Data will be downloaded to pysat_data_dir/patform/name/tag

If Instrument bounds are set to defaults they are updated after files are downloaded.

empty

Boolean flag reflecting lack of data, True if there is no data.

generic_meta_translator(input_meta)

Translates the metadata contained in an object into a dictionary

Parameters:input_meta (Meta) – The metadata object to translate
Returns:A dictionary of the metadata for each variable of an output file e.g. netcdf4
Return type:dict
index

Returns time index of loaded data.

load(yr=None, doy=None, end_yr=None, end_doy=None, date=None, end_date=None, fname=None, stop_fname=None, verifyPad=False, **kwargs)

Load instrument data into Instrument.data object.

Parameters:
  • yr (integer) – Year for desired data. pysat will load all files with an associated date between yr, doy and yr, doy + 1 (default=None)
  • doy (integer) – Day of year for desired data. Must be present with yr input. (default=None)
  • end_yr (integer) – Used when loading a range of dates, from yr, doy to end_yr, end_doy based upon the dates associated with the Instrument’s files. Date range is inclusive for yr, doy but exclusive for end_yr, end_doy. (default=None)
  • end_doy (integer) – Used when loading a range of dates, from yr, doy to end_yr, end_doy based upon the dates associated with the Instrument’s files. Date range is inclusive for yr, doy but exclusive for end_yr, end_doy. (default=None)
  • date (dt.datetime) – Date to load data. pysat will load all files with an associated date between date and date + 1 day (default=None)
  • end_date (dt.datetime) – Used when loading a range of data from date to end_date based upon the dates associated with the Instrument’s files. Date range is inclusive for date but exclusive for end_date. (default=None)
  • fname (str or NoneType) – Filename to be loaded (default=None)
  • stop_fname (str or NoneType) – Used when loading a range of filenames from fname to stop_fname, inclusive. (default=None)
  • verifyPad (bool) – If True, padding data not removed for debugging. Padding parameters are provided at Instrument instantiation. (default=False)
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments.
Raises:
  • TypeError – For incomplete or incorrect input
  • ValueError – For input incompatible with Instrument set-up

Note

Loads data for a chosen instrument into .data. Any functions chosen by the user and added to the custom processing queue (.custom.attach) are automatically applied to the data before it is available to user in .data.

A mixed combination of .load() keywords such as yr and date are not allowed.

Note

end kwargs have exclusive ranges (stop before the condition is reached), while stop kwargs have inclusive ranges (stop once the condition is reached).

Examples

import datetime as dt
import pysat

inst = pysat.Instrument('pysat', 'testing')

# load a single day by year and day of year
inst.load(2009, 1)

# load a single day by date
date = dt.datetime(2009, 1, 1)
inst.load(date=date)

# load a single file, first file in this example
inst.load(fname=inst.files[0])

# load a range of days, data between
# Jan. 1st (inclusive) - Jan. 3rd (exclusive)
inst.load(2009, 1, 2009, 3)

# same procedure using datetimes
date = dt.datetime(2009, 1, 1)
end_date = dt.datetime(2009, 1, 3)
inst.load(date=date, end_date=end_date)

# same procedure using filenames
# note the change in index due to inclusive slicing on filenames!
inst.load(fname=inst.files[0], stop_fname=inst.files[1])
next(verifyPad=False)

Manually iterate through the data loaded in Instrument object.

Bounds of iteration and iteration type (day/file) are set by bounds attribute.

Parameters:verifyPad (bool) – Passed to self.load(). If True, then padded data within the load method will be retained. (default=False)

Note

If there were no previous calls to load then the first day(default)/file will be loaded.

prev(verifyPad=False)

Manually iterate backwards through the data in Instrument object.

Bounds of iteration and iteration type (day/file) are set by bounds attribute.

Parameters:verifyPad (bool) – Passed to self.load(). If True, then padded data within the load method will be retained. (default=False)

Note

If there were no previous calls to load then the first day(default)/file will be loaded.

remote_date_range(start=None, stop=None, **kwargs)

Returns fist and last date for remote data

Parameters:
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start with the first file found. (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop with the last file found. (default=None)
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments. The keyword arguments ‘user’ and ‘password’ are expected for remote databases requiring sign in or registration.
Returns:

First and last datetimes obtained from remote_file_list

Return type:

List

Note

Default behaviour is to search all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.

remote_file_list(start=None, stop=None, **kwargs)

List remote files for chosen instrument

Parameters:
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start with the first file found. (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop with the last file found. (default=None)
  • **kwargs (dict) – Dictionary of keywords that may be options for specific instruments. The keyword arguments ‘user’ and ‘password’ are expected for remote databases requiring sign in or registration.
Returns:

pandas Series of filenames indexed by date and time

Return type:

Series

Note

Default behaviour is to return all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.

rename(var_names, lowercase_data_labels=False)

Renames variable within both data and metadata.

Parameters:
  • var_names (dict or other map) – Existing var_names are keys, values are new var_names
  • lowercase_data_labels (bool) – If True, the labels applied to inst.data are forced to lowercase. The supplied case in var_names is retained within inst.meta.

Examples

# standard renaming
new_var_names = {'old_name': 'new_name',
             'old_name2':, 'new_name2'}
inst.rename(new_var_names)

If using a pandas DataFrame as the underlying data object, to rename higher-order variables supply a modified dictionary. Note that this rename will be invoked individually for all times in the dataset.

# applies to higher-order datasets
# that are loaded into pandas
# general example
new_var_names = {'old_name': 'new_name',
                 'old_name2':, 'new_name2',
                 'col_name': {'old_ho_name': 'new_ho_name'}}
inst.rename(new_var_names)

# specific example
inst = pysat.Instrument('pysat', 'testing2D')
inst.load(2009, 1)
var_names = {'uts': 'pysat_uts',
         'profiles': {'density': 'pysat_density'}}
inst.rename(var_names)

pysat supports differing case for variable labels across the data and metadata objects attached to an Instrument. Since metadata is case-preserving (on assignment) but case-insensitive, the labels used for data are always valid for metadata. This feature may be used to provide friendlier variable names within pysat while also maintaining external format compatibility when writing files.

# example with lowercase_data_labels
inst = pysat.Instrument('pysat', 'testing2D')
inst.load(2009, 1)
var_names = {'uts': 'Pysat_UTS',
         'profiles': {'density': 'PYSAT_density'}}
inst.rename(var_names, lowercase_data_labels=True)

# note that 'Pysat_UTS' was applied to data as 'pysat_uts'
print(inst['pysat_uts'])

# case is retained within inst.meta, though
# data access to meta is case insensitive
print('True meta variable name is ', inst.meta['pysat_uts'].name)

# Note that the labels in meta may be used when creating a file
# thus 'Pysat_UTS' would be found in the resulting file
inst.to_netcdf4('./test.nc', preserve_meta_case=True)

# load in file and check
raw = netCDF4.Dataset('./test.nc')
print(raw.variables['Pysat_UTS'])
to_netcdf4(fname=None, base_instrument=None, epoch_name='Epoch', zlib=False, complevel=4, shuffle=True, preserve_meta_case=False, export_nan=None, unlimited_time=True)

Stores loaded data into a netCDF4 file.

Parameters:
  • fname (str) – full path to save instrument object to
  • base_instrument (pysat.Instrument) – used as a comparison, only attributes that are present with self and not on base_instrument are written to netCDF
  • epoch_name (str) – Label in file for datetime index of Instrument object
  • zlib (bool) – Flag for engaging zlib compression (True - compression on)
  • complevel (int) – an integer between 1 and 9 describing the level of compression desired. Ignored if zlib=False. (default=4)
  • shuffle (bool) – The HDF5 shuffle filter will be applied before compressing the data. This significantly improves compression. Ignored if zlib=False. (default=True)
  • preserve_meta_case (bool) – if True, then the variable strings within the MetaData object, which preserves case, are used to name variables in the written netCDF file. If False, then the variable strings used to access data from the Instrument object are used instead. By default, the variable strings on both the data and metadata side are the same, though this relationship may be altered by a user. (default=False)
  • export_nan (list or None) – By default, the metadata variables where a value of NaN is allowed and written to the netCDF4 file is maintained by the Meta object attached to the pysat.Instrument object. A list supplied here will override the settings provided by Meta, and all parameters included will be written to the file. If not listed and a value is NaN then that attribute simply won’t be included in the netCDF4 file. (default=None)
  • unlimited_time (bool) – If True, then the main epoch dimension will be set to ‘unlimited’ within the netCDF4 file. (default=True)

Note

Stores 1-D data along dimension ‘epoch’ - the date time index.

Stores higher order data (e.g. dataframes within series) separately

  • The name of the main variable column is used to prepend subvariable names within netCDF, var_subvar_sub
  • A netCDF4 dimension is created for each main variable column with higher order data; first dimension Epoch
  • The index organizing the data stored as a dimension variable
  • from_netcdf4 uses the variable dimensions to reconstruct data structure

All attributes attached to instrument meta are written to netCDF attrs with the exception of ‘Date_End’, ‘Date_Start’, ‘File’, ‘File_Date’, ‘Generation_Date’, and ‘Logical_File_ID’. These are defined within to_netCDF at the time the file is written, as per the adopted standard, SPDF ISTP/IACG Modified for NetCDF. Atrributes ‘Conventions’ and ‘Text_Supplement’ are given default values if not present.

today()

Returns today’s date (UTC), with no hour, minute, second, etc.

Returns:today_utc – Today’s date in UTC
Return type:datetime
tomorrow()

Returns tomorrow’s date (UTC), with no hour, minute, second, etc.

Returns:Tomorrow’s date in UTC
Return type:datetime
variables

Returns list of variables within loaded data.

yesterday()

Returns yesterday’s date (UTC), with no hour, minute, second, etc.

Returns:Yesterday’s date in UTC
Return type:datetime

Constellation

class pysat.Constellation(const_module=None, instruments=None, index_res=None, common_index=True)

Manage and analyze data from multiple pysat Instruments.

Parameters:
  • const_module (module or NoneType) – Name of a pysat constellation module (default=None)
  • instruments (list-like or NoneType) – A list of pysat Instruments to include in the Constellation (default=None)
  • index_res (float or NoneType) – Output index resolution in seconds or None to determine from Constellation instruments (default=None)
  • common_index (bool) – True to include times where all instruments have data, False to use the maximum time range from the Constellation (default=True)
instruments

A list of pysat Instruments that make up the Constellation

Type:list
bounds

bounds for loading data, supply array_like for a season with gaps. Users may provide as a tuple or tuple of lists, but the attribute is stored as a tuple of lists for consistency

Type:(datetime/filename/None, datetime/filename/None)

Constructs a Constellation given a list of instruments or the name of a file with a pre-defined constellation.

Parameters:
  • const_module (module or NoneType) – Name of a pysat constellation module (default=None)
  • instruments (list-like or NoneType) – A list of pysat Instruments to include in the Constellation (default=None)
  • index_res (float or NoneType) – Output index resolution in seconds or None to determine from Constellation instruments (default=None)
  • common_index (bool) – True to include times where all instruments have data, False to use the maximum time range from the Constellation (default=True)
Raises:

ValueError – When instruments is not list-like

Note

Omit instruments and const_module to create an empty constellation.

custom_attach(*args, **kwargs)

Register a function to modify data of member Instruments.

Parameters:
  • *args (list reference) – References a list of input arguments
  • **kwargs (dict reference) – References a dict of input keyword arguments

See also

Instrument.custom_attach()
base method for attaching custom functions
custom_clear()

Clear the custom function list

See also

Instrument.custom_clear()
base method for clearing custom functions
date

Date for loaded data.

download(*args, **kwargs)

Download instrument data into Instrument object.data

Parameters:
  • *args (list reference) – References a list of input arguments
  • **kwargs (dict reference) – References a dict of input keyword arguments

See also

Instrument.download()
base method for loading Instrument data

Note

If individual instruments require specific kwargs that differ from other instruments, define that in the individual instrument rather than this method.

empty

Boolean flag reflecting lack of data, True if there is no Instrument data in any Constellation Instrument.

index

Returns time index of loaded data.

load(*args, **kwargs)

Load instrument data into Instrument object.data

Parameters:
  • *args (list reference) – References a list of input arguments
  • **kwargs (dict reference) – References a dict of input keyword arguments

See also

Instrument.load()
base method for loading Instrument data
today()

Returns UTC date for today, see pysat.Instrument for details.

tomorrow()

Returns UTC date for tomorrow, see pysat.Instrument for details.

variables

Returns a list of uniquely named variables from all the loaded data.

yesterday()

Returns UTC date for yesterday, see pysat.Instrument for details.

Files

class pysat.Files(inst, directory_format=None, update_files=False, file_format=None, write_to_disk=True, ignore_empty_files=False)

Maintains collection of files and associated methods.

Parameters:
  • inst (pysat.Instrument) – Instrument object
  • directory_format (str or NoneType) – Directory naming structure in string format. Variables such as platform, name, tag, and inst_id will be filled in as needed using python string formatting. The default directory structure would be expressed as ‘{platform}/{name}/{tag}/{inst_id}’. If None, the default directory structure is used (default=None)
  • update_files (boolean) – If True, immediately query filesystem for instrument files and store (default=False)
  • file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, day, and inst_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine. (default=None)
  • write_to_disk (boolean) – If true, the list of Instrument files will be written to disk. (default=True)
  • ignore_empty_files (boolean) – If True, the list of files found will be checked to ensure the filesizes are greater than zero. Empty files are removed from the stored list of files. (default=False)
home_path

Path to the pysat information directory.

Type:str
data_path

Path to the top-level directory containing instrument files, selected from data_paths.

Type:str
data_paths

Available paths that pysat will use when looking for files. The class uses the first directory with relevant data, stored in data_path.

Type:list of str
files

Series of data files, indexed by file start time

Type:pds.Series
inst_info

Contains pysat.Instrument parameters ‘platform’, ‘name’, ‘tag’, and ‘inst_id’, identifying the source of the files.

Type:dict
list_files_creator

Experimental feature for Instruments that internally generate data and thus don’t have a defined supported date range.

Type:functools.partial or NoneType
list_files_rtn

Method used to locate relevant files on the local system. Provided by associated pysat.Instrument object.

Type:method
multi_file_day

Flag copied from associated pysat.Instrument object that indicates when data for day n may be found in files for days n-1, or n+1

Type:boolean
start_date

Date of first file, used as default start bound for instrument object, or None if no files are loaded.

Type:datetime or NoneType
stop_date

Date of last file, used as default stop bound for instrument object, or None if no files are loaded.

Type:datetime or NoneType
stored_file_name

Name of the hidden file containing the list of archived data files for this instrument.

Type:str
sub_dir_path

directory_format string formatted for the local system.

Type:str

Note

Interfaces with the list_files method for a given instrument support module to create an ordered collection of files in time, used primarily by the pysat.Instrument object to identify files to be loaded. The Files class mediates access to the files by datetime and contains helper methods for determining the presence of new files and filtering out empty files.

User should generally use the interface provided by a pysat.Instrument instance. Exceptions are the classmethod from_os, provided to assist in generating the appropriate output for an instrument routine.

Examples

# convenient file access
inst = pysat.Instrument(platform=platform, name=name, tag=tag,
                        inst_id=inst_id)
# first file
inst.files[0]

# files from start up to stop (exclusive on stop)
start = dt.datetime(2009,1,1)
stop = dt.datetime(2009,1,3)
print(inst.files[start:stop])

# files for date
print(inst.files[start])

# files by slicing
print(inst.files[0:4])

# get a list of new files
# new files are those that weren't present the last time
# a given instrument's file list was stored
new_files = inst.files.get_new()

# search pysat appropriate directory for instrument files and
# update Files instance.
inst.files.refresh()
copy()

Provide a deep copy of object

Returns:Copy of self
Return type:Files class instance
classmethod from_os(data_path=None, format_str=None, two_digit_year_break=None, delimiter=None)

Produces a list of files and and formats it for Files class.

Parameters:
  • data_path (string) – Top level directory to search files for. This directory is provided by pysat to the instrument_module.list_files functions as data_path.
  • format_str (string with python format codes) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’ Ex: ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf’
  • two_digit_year_break (int or None) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. If None, then four-digit years are assumed. (default=None)
  • delimiter (string or NoneType) – Delimiter string upon which files will be split (e.g., ‘.’). If None, filenames will be parsed presuming a fixed width format. (default=None)
Returns:

A Series of filenames indexed by time. See pysat.utils.files.process_parsed_filenames for details.

Return type:

pds.Series

Note

Requires fixed_width or delimited filename

Does not produce a Files instance, but the proper output from instrument_module.list_files method.

The ‘?’ may be used to indicate a set number of spaces for a variable part of the name that need not be extracted. ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf’

The ‘day’ format keyword may be used to specify either day of month (if month is included) or day of year.

get_file_array(start, stop)

Return a list of filenames between and including start and stop.

Parameters:
  • start (array_like or single string) – filenames for start of returned filelist
  • stop (array_like or single string) – filenames inclusive of the ending of list provided by the stop time
Returns:

files – A list of filenames between and including start and stop times over all intervals.

Return type:

list

Note

start and stop must be of the same type: both array-like or both strings

get_index(fname)

Return index for a given filename.

Parameters:fname (string) – Filename for the desired time index

Note

If fname not found in the file information already attached to the instrument.files instance, then a files.refresh() call is made.

get_new()

List new files since last recorded file state.

Returns:A datetime-index Series of all new fileanmes since the last known change to the files.
Return type:pandas.Series

Note

pysat stores filenames in the user_home/.pysat directory. Filenames are stored if there is a change and either update_files is True at instrument object level or files.refresh() is called.

refresh()

Update list of files, if there are changes.

Note

Calls underlying list_files_rtn for the particular science instrument. Typically, these routines search in the pysat provided path, pysat_data_dir/platform/name/tag/inst_id, where pysat_data_dir is set by pysat.utils.set_data_dir(path=path).

set_top_level_directory(path)

Sets top-level data directory.

Sets a valid self.data_path using provided top-level directory path and the associated pysat subdirectories derived from the directory_format attribute as stored in self.sub_dir_path.

Parameters:path (str) – Top-level path to use when looking for files. Must be in pysat.params[‘data_dirs’]

Note

If there are Instrument files on the system under a top-level directory other than path, then, under certain conditions, self.data_path may be later updated by the object to point back to the directory with files.

Meta

class pysat.Meta(metadata=None, labels={'desc': ('desc', <class 'str'>), 'fill_val': ('fill', <class 'float'>), 'max_val': ('value_max', <class 'float'>), 'min_val': ('value_min', <class 'float'>), 'name': ('long_name', <class 'str'>), 'notes': ('notes', <class 'str'>), 'units': ('units', <class 'str'>)}, export_nan=None)

Stores metadata for Instrument instance, similar to CF-1.6 netCDFdata standard.

Parameters:
  • metadata (pandas.DataFrame) – DataFrame should be indexed by variable name that contains at minimum the standard_name (name), units, and long_name for the data stored in the associated pysat Instrument object.
  • labels (dict) – Dict where keys are the label attribute names and the values are tuples that have the label values and value types in that order. (default={‘units’: (‘units’, str), ‘name’: (‘long_name’, str), ‘notes’: (‘notes’, str), ‘desc’: (‘desc’, str), ‘min_val’: (‘value_min’, float), ‘max_val’: (‘value_max’, float), ‘fill_val’: (‘fill’, float)})
  • export_nan (list or NoneType) – List of labels that should be exported even if their value is nan or None for an empty list. When used, metadata with a value of nan will be excluded from export. Will always allow nan export for labels of the float type (default=None)
data

index is variable standard name, ‘units’, ‘long_name’, and other defaults are also stored along with additional user provided labels.

Type:pandas.DataFrame
labels

Labels for MetaData attributes

Type:MetaLabels

Note

Meta object preserves the case of variables and attributes as it first receives the data. Subsequent calls to set new metadata with the same variable or attribute will use case of first call. Accessing or setting data thereafter is case insensitive. In practice, use is case insensitive but the original case is preserved. Case preseveration is built in to support writing files with a desired case to meet standards.

Metadata for higher order data objects, those that have multiple products under a single variable name in a pysat.Instrument object, are stored by providing a Meta object under the single name.

Supports any custom metadata values in addition to the expected metadata attributes (units, name, notes, desc, value_min, value_max, and fill). These base attributes may be used to programatically access and set types of metadata regardless of the string values used for the attribute. String values for attributes may need to be changed depending upon the standards of code or files interacting with pysat.

Meta objects returned as part of pysat loading routines are automatically updated to use the same values of units, etc. as found in the pysat.Instrument object.

Examples

# instantiate Meta object, default values for attribute labels are used
meta = pysat.Meta()
# set a couple base units
# note that other base parameters not set below will
# be assigned a default value
meta['name'] = {'long_name': string, 'units': string}
# update 'units' to new value
meta['name'] = {'units': string}
# update 'long_name' to new value
meta['name'] = {'long_name': string}
# attach new info with partial information, 'long_name' set to 'name2'
meta['name2'] = {'units': string}
# units are set to '' by default
meta['name3'] = {'long_name': string}

# assigning custom meta parameters
meta['name4'] = {'units': string, 'long_name': string
                 'custom1': string, 'custom2': value}
meta['name5'] = {'custom1': string, 'custom3': value}

# assign multiple variables at once
meta[['name1', 'name2']] = {'long_name': [string1, string2],
                            'units': [string1, string2],
                            'custom10': [string1, string2]}

# assiging metadata for n-Dimensional variables
meta2 = pysat.Meta()
meta2['name41'] = {'long_name': string, 'units': string}
meta2['name42'] = {'long_name': string, 'units': string}
meta['name4'] = {'meta': meta2}

# or
meta['name4'] = meta2
meta['name4'].children['name41']

# mixture of 1D and higher dimensional data
meta = pysat.Meta()
meta['dm'] = {'units': 'hey', 'long_name': 'boo'}
meta['rpa'] = {'units': 'crazy', 'long_name': 'boo_whoo'}
meta2 = pysat.Meta()
meta2[['higher', 'lower']] = {'meta': [meta, None],
                              'units': [None, 'boo'],
                              'long_name': [None, 'boohoo']}

# assign from another Meta object
meta[key1] = meta2[key2]

# access fill info for a variable, presuming default label
meta[key1, 'fill']

# access same info, even if 'fill' not used to label fill values
meta[key1, meta.fill_label]

# change a label used by Meta object
meta.labels.fill_val = '_FillValue'

# Note that the fill label is intended for use when interacting
# with external files. Thus, any fill values (NaN) within the Meta
# object are not updated when changing the metadata string label,
# or when updating the value representing fill data. A future update
# (Issue #707) will expand functionality to include these custom
# fill values when producing files.
accept_default_labels(other_meta)

Applies labels for default meta labels from other onto self.

Parameters:other_meta (Meta) – Meta object to take default labels from
apply_meta_labels(other_meta)

Applies the existing meta labels from self onto different MetaData

Parameters:other_meta (Meta) – Meta object to have default labels applied
Returns:other_updated – Meta object with the default labels applied
Return type:Meta
attr_case_name(name)

Returns preserved case name for case insensitive value of name.

Parameters:name (str) – Name of variable to get stored case form
Returns:out_name – Name in proper case
Return type:str

Note

Checks first within standard attributes. If not found there, checks attributes for higher order data structures. If not found, returns supplied name as it is available for use. Intended to be used to help ensure that the same case is applied to all repetitions of a given variable name.

attrs()

Yields metadata products stored for each variable name

concat(other_meta, strict=False)

Concats two metadata objects together.

Parameters:
  • other_meta (Meta) – Meta object to be concatenated
  • strict (bool) – If True, this flag ensures there are no duplicate variable names (default=False)
Returns:

mdata – Concatenated object

Return type:

Meta

Note

Uses units and name label of self if other_meta is different

copy()

Deep copy of the meta object.

drop(names)

Drops variables (names) from metadata.

Parameters:names (list-like) – List of string specifying the variable names to drop
empty

Return boolean True if there is no metadata

Returns:Returns True if there is no data, and False if there is data
Return type:bool
classmethod from_csv(filename=None, col_names=None, sep=None, **kwargs)

Create instrument metadata object from csv.

Parameters:
  • filename (string) – absolute filename for csv file or name of file stored in pandas instruments location
  • col_names (list-like collection of strings) – column names in csv and resultant meta object
  • sep (string) – column seperator for supplied csv filename
  • **kwargs (dict) – Optional kwargs used by pds.read_csv

Note

column names must include at least [‘name’, ‘long_name’, ‘units’], assumed if col_names is None.

hasattr_case_neutral(attr_name)

Case-insensitive check for attribute names in this class

Parameters:attr_name (str) – Name of attribute to find
Returns:has_name – True if case-insesitive check for attribute name is True
Return type:bool

Note

Does not check higher order meta objects

keep(keep_names)

Keeps variables (keep_names) while dropping other parameters

Parameters:keep_names (list-like) – variables to keep
keys()

Yields variable names stored for 1D variables

keys_nD()

Yields keys for higher order metadata

merge(other)

Adds metadata variables to self that are in other but not in self.

Parameters:other (pysat.Meta) –
pop(label_name)

Remove and return metadata about variable

Parameters:label_name (str) – Meta key for a data variable
Returns:output – Series of metadata for variable
Return type:pds.Series
transfer_attributes_to_instrument(inst, strict_names=False)

Transfer non-standard attributes in Meta to Instrument object.

Parameters:
  • inst (pysat.Instrument) – Instrument object to transfer attributes to
  • strict_names (bool) – If True, produces an error if the Instrument object already has an attribute with the same name to be copied (default=False).

Note

pysat’s load_netCDF and similar routines are only able to attach netCDF4 attributes to a Meta object. This routine identifies these attributes and removes them from the Meta object. Intent is to support simple transfers to the pysat.Instrument object.

Will not transfer names that conflict with pysat default attributes.

var_case_name(name)

Provides stored name (case preserved) for case insensitive input

Parameters:name (str) – variable name in any case
Returns:out_name – String with case preserved as in the meta object
Return type:str

Note

If name is not found (case-insensitive check) then name is returned, as input. This function is intended to be used to help ensure the case of a given variable name is the same across the Meta object.

MetaLabels

class pysat.MetaLabels(metadata=None, units=('units', <class 'str'>), name=('long_name', <class 'str'>), notes=('notes', <class 'str'>), desc=('desc', <class 'str'>), min_val=('value_min', <class 'float'>), max_val=('value_max', <class 'float'>), fill_val=('fill', <class 'float'>), **kwargs)

Stores metadata labels for Instrument instance

Parameters:
  • units (tuple) – Units label name and value type (default=(‘units’, str))
  • name (tuple) – Name label name and value type (default=(‘long_name’, str))
  • notes (tuple) – Notes label name and value type (default=(‘notes’, str))
  • desc (tuple) – Description label name and value type (default=(‘desc’, str))
  • min_val (tuple) – Minimum value label name and value type (default=(‘value_min’, float))
  • max_val (tuple) – Maximum value label name and value type (default=(‘value_max’, float))
  • fill_val (tuple) – Fill value label name and value type (default=(‘fill’, float))
  • kwargs (dict) – Dictionary containing optional label attributes, where the keys are the attribute names and the values are tuples containing the label name and value type
data

index is variable standard name, ‘units’, ‘long_name’, and other defaults are also stored along with additional user provided labels.

Type:pandas.DataFrame
units

String used to label units in storage. (default=’units’)

Type:str
name

String used to label long_name in storage. (default=’long_name’)

Type:str
notes

String used to label ‘notes’ in storage. (default=’notes’)

Type:str
desc

String used to label variable descriptions in storage. (default=’desc’)

Type:str
min_val

String used to label typical variable value min limit in storage. (default=’value_min’)

Type:str
max_val

String used to label typical variable value max limit in storage. (default=’value_max’)

Type:str
fill_val

String used to label fill value in storage. The default follows the netCDF4 standards (default=’fill’)

Type:str

Note

Meta object preserves the case of variables and attributes as it first receives the data. Subsequent calls to set new metadata with the same variable or attribute will use case of first call. Accessing or setting data thereafter is case insensitive. In practice, use is case insensitive but the original case is preserved. Case preseveration is built in to support writing files with a desired case to meet standards.

Metadata for higher order data objects, those that have multiple products under a single variable name in a pysat.Instrument object, are stored by providing a Meta object under the single name.

Supports any custom metadata values in addition to the expected metadata attributes (units, name, notes, desc, value_min, value_max, and fill). These base attributes may be used to programatically access and set types of metadata regardless of the string values used for the attribute. String values for attributes may need to be changed depending upon the standards of code or files interacting with pysat.

Meta objects returned as part of pysat loading routines are automatically updated to use the same values of units, etc. as found in the pysat.Instrument object.

Initialize the MetaLabels class

Parameters:
  • units (tuple) – Units label name and value type (default=(‘units’, str))
  • name (tuple) – Name label name and value type (default=(‘long_name’, str))
  • notes (tuple) – Notes label name and value type (default=(‘notes’, str))
  • desc (tuple) – Description label name and value type (default=(‘desc’, str))
  • min_val (tuple) – Minimum value label name and value type (default=(‘value_min’, float))
  • max_val (tuple) – Maximum value label name and value type (default=(‘value_max’, float))
  • fill_val (tuple) – Fill value label name and value type (default=(‘fill’, float))
  • kwargs (dict) – Dictionary containing optional label attributes, where the keys are the attribute names and the values are tuples containing the label name and value type
default_values_from_attr(attr_name)

Return the default values for each label based on their type

Parameters:attr_name (str) – Label attribute name (e.g., max_val)
Returns:default_val – Sets NaN for all float values, -1 for all int values, and ‘’ for all str values except for ‘scale’, which defaults to ‘linear’, and None for any other data type
Return type:str, float, int, or NoneType
Raises:ValueError – For unknown attr_name
default_values_from_type(val_type)

Return the default values for each label based on their type

Parameters:val_type (type) – Variable type for the value to be assigned to a MetaLabel
Returns:default_val – Sets NaN for all float values, -1 for all int values, and ‘’ for all str values except for ‘scale’, which defaults to ‘linear’, and None for any other data type
Return type:str, float, int, NoneType

Orbits

class pysat.Orbits(inst, index=None, kind='local time', period=None)

Determines orbits on the fly and provides orbital data in .data.

Parameters:
  • inst (pysat.Instrument) – Instrument object for which the orbits will be determined
  • index (str or NoneType) – Name of the data series to use for determing orbit breaks (default=None)
  • kind (str) – Kind of orbit, which specifies how orbital breaks are determined. Expects one of: ‘local time’, ‘longitude’, ‘polar’, or ‘orbit’ - local time: negative gradients in lt or breaks in inst.data.index - longitude: negative gradients or breaks in inst.data.index - polar: zero crossings in latitude or breaks in inst.data.index - orbit: uses unique values of orbit number (default=’local time’)
  • period (np.timedelta64 or NoneType) – length of time for orbital period, used to gauge when a break in the datetime index (inst.data.index) is large enough to consider it a new orbit
inst

Instrument object for which the orbits will be determined

Type:pysat.Instrument
kind

Kind of orbit, which specifies how orbital breaks are determined. Expects one of: ‘local time’, ‘longitude’, ‘polar’, or ‘orbit’ - local time : negative gradients in lt or breaks in inst.data.index - longitude : negative gradients or breaks in inst.data.index - polar : zero crossings in latitude or breaks in inst.data.index - orbit : uses unique values of orbit number (default=’local time’)

Type:str
orbit_period

Pandas Timedelta that specifies the orbit period. Used instead of dt.timedelta to enable np.timedelta64 input. (default=97 min)

Type:pds.Timedelta
num

Number of orbits in loaded data

Type:int
orbit_index

Index of currently loaded orbit, zero indexed

Type:int

Note

Determines the locations of orbit breaks in the loaded data in inst.data and provides iteration tools and convenient orbit selection via inst.orbit[orbit num].

This class should not be called directly by the user, it uses the interface provided by inst.orbits where inst = pysat.Instrument()

Examples

# Use orbit_info Instrument keyword to pass all Orbit kwargs
orbit_info = {'index': 'longitude', 'kind': 'longitude'}
vefi = pysat.Instrument(platform='cnofs', name='vefi', tag='dc_b',
                        clean_level=None, orbit_info=orbit_info)

# Set the instrument bounds
start = dt.datetime(2009, 1, 1)
stop = dt.datetime(2009, 1, 10)
vefi.load(date=start)
vefi.bounds(start, stop)

# Iterate over orbits
for vefi in vefi.orbits:
    print('Next available orbit ', vefi['dB_mer'])

# Load fifth orbit of first day
vefi.load(date=start)
vefi.orbits[5]

# Less convenient load
vefi.orbits.load(5)

# Manually iterate forwards in the orbit
vefi.orbits.next()

# Manually iterate backwards in the orbit
vefi.orbits.prev()
copy()

Provide a deep copy of object

Returns:Copy of self
Return type:Orbits class instance
current

Current orbit number.

Returns:None if no orbit data. Otherwise, returns orbit number, beginning with zero. The first and last orbit of a day is somewhat ambiguous. The first orbit for day n is generally also the last orbit on day n - 1. When iterating forward, the orbit will be labeled as first (0). When iterating backward, orbit labeled as the last.
Return type:int or NoneType
load(orbit_num)

Load a particular orbit into .data for loaded day.

Parameters:orbit_num (int) – orbit number, 1 indexed (1-length or -1 to -length) with sign denoting forward or backward indexing
Raises:ValueError – If index requested lies beyond the number of orbits

Note

A day of data must be loaded before this routine functions properly. If the last orbit of the day is requested, it will automatically be padded with data from the next day. The orbit counter will be reset to 1.

next()

Load the next orbit into associated Instrument.data object

Note

Forms complete orbits across day boundaries. If no data loaded then the first orbit from the first date of data is returned.

prev()

Load the previous orbit into associated Instrument.data object

Note

Forms complete orbits across day boundaries. If no data loaded then the last orbit of data from the last day is loaded.

Parameters

class pysat._params.Parameters(path=None, create_new=False)

Stores user parameters used by pysat.

Also stores custom user parameters provided the keys don’t conflict with default pysat parameters.

Parameters:
  • path (str) – If provided, the directory path will be used to load/store a parameters file with name ‘pysat_settings.json’ (default=None).
  • create_new (bool) – If True, a new parameters file is created. Will be created at path if provided. If not, file will be created in pysat.pysat_dir.
data

pysat user settings dictionary

Type:dict
defaults

Default parameters (keys) and values used by pysat that include {‘clean_level’: ‘clean’, ‘directory_format’: os.path.join(‘{platform}’, ‘{name}’, ‘{tag}’, ‘{inst_id}’), ‘ignore_empty_files’: False, ‘update_files’: True, ‘file_timeout’: 10, ‘user_modules’ : {}, ‘warn_empty_file_list’: False}

Type:dict
file_path

Location of file used to store settings

Type:str
non_defaults

List of pysat parameters (strings) that don’t have a defined default and are unaffected by self.restore_defaults().

Type:list
Raises:ValueError – The ‘user_modules’ parameter may not be set directly by the user. Please use the pysat.utils.regsitry module to modify the packages stored in ‘user_modules’.

Note

This method will look for ‘pysat_settings.json’ file first in the current working directory and then in the home ‘~/.pysat’ directory.

All pysat parameters are automatically stored whenever a parameter is assigned or modified. The default parameters and values tracked by this class are grouped by type below.

Values that map to the corresponding keywords on pysat.Instrument: clean_level, directory_format, ignore_empty_files, and update_files. See the Instrument docstring for more information on these keywords.

Values that map to internal pysat settings: file_timeout, user_modules, and warn_empty_file_list.

Stored pysat parameters without a working default value: data_dirs.

file_timeout - Time in seconds that pysat will wait to modify a busy file

user_modules - Stores information on modules registered by pysat

warn_empty_file_list - Raise a warning when no Instrument files are found

data_dirs - Directory(ies) where data are stored, in access order

clear_and_restart()

Clears all stored settings and sets pysat defaults

pysat parameters without a default value are set to []

restore_defaults()

Restore default pysat parameters

Does not modify any stored custom user keys or pysat parameters without a default value.

store()

Store parameters using the filename specified in self.file_path.

Instrument Methods

The following methods support a variety of actions commonly needed by pysat.Instrument modules regardless of the data source.

General

Provides generalized routines for integrating instruments into pysat.

pysat.instruments.methods.general.convert_timestamp_to_datetime(inst, sec_mult=1.0, epoch_name='Epoch')

Use datetime instead of timestamp for Epoch

Parameters:
  • inst (pysat.Instrument) – associated pysat.Instrument object
  • sec_mult (float) – Multiplier needed to convert epoch time to seconds (default=1.0)
  • epoch_name (str) – variable name for instrument index (default=’Epoch’)
pysat.instruments.methods.general.filename_creator(value, format_str=None, start_date=None, stop_date=None)

Creates filenames as needed to support use of generated pysat data sets

Parameters:
  • value (slice) – Datetime slice, see _instrument.py, fname = self.files[date:(date + inc)]
  • format_str (str or NoneType) – File format template string (default=None)
  • start_date (datetime.datetime or NoneType) – First date supported (default=None)
  • stop_date (datetime.datetime or NoneType) – Last date supported (default=None)
Returns:

Created filenames from format_str indexed by datetime

Return type:

pandas.Series

Raises:

NotImplementedError – This method is a stub to support development of a potential feature slated for a future release.

pysat.instruments.methods.general.is_daily_file_cadence(file_cadence)

Evaluate file cadence to see if it is daily or greater than daily

Parameters:file_cadence (dt.timedelta or pds.DateOffset) – pysat assumes a daily file cadence, but some instrument data file contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). (default=dt.timedelta(days=1))
Returns:is_daily – True if the cadence is daily or less, False if the cadence is greater than daily
Return type:bool
pysat.instruments.methods.general.list_files(tag=None, inst_id=None, data_path=None, format_str=None, supported_tags=None, file_cadence=datetime.timedelta(days=1), two_digit_year_break=None, delimiter=None)

Return a Pandas Series of every file for chosen Instrument data.

This routine provides a standard interface for pysat instrument modules.

Parameters:
  • tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • inst_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • data_path (string or NoneType) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
  • format_str (string or NoneType) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
  • supported_tags (dict or NoneType) – Keys are inst_id, each containing a dict keyed by tag where the values file format template strings. See Files.from_os format_str kwarg for more details. (default=None)
  • file_cadence (dt.timedelta or pds.DateOffset) – pysat assumes a daily file cadence, but some instrument data file contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). (default=dt.timedelta(days=1))
  • two_digit_year_break (int or NoneType) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. If None, then four-digit years are assumed. (default=None)
  • delimiter (string or NoneType) – Delimiter string upon which files will be split (e.g., ‘.’). If None, filenames will be parsed presuming a fixed width format. (default=None)
Returns:

out – A class containing the verified available files

Return type:

pysat.Files.from_os : pysat._files.Files

Examples

fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_files = functools.partial(nasa_cdaweb.list_files,
                               supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_files = functools.partial(mm_gen.list_files,
                               supported_tags=supported_tags)
pysat.instruments.methods.general.load_csv_data(fnames, read_csv_kwargs=None)

Load CSV data from a list of files into a single DataFrame

Parameters:
  • fnames (array-like) – Series, list, or array of filenames
  • read_csv_kwargs (dict or NoneType) – Dict of kwargs to apply to pds.read_csv. (default=None)
Returns:

data – Data frame with data from all files in the fnames list

Return type:

pds.DataFrame

See also

pds.read_csv()

pysat.instruments.methods.general.remove_leading_text(inst, target=None)

Removes leading text on variable names

Parameters:
  • inst (pysat.Instrument) – associated pysat.Instrument object
  • target (str or list of strings) – Leading string to remove. If none supplied, returns unmodified

Testing

pysat.instruments.methods.testing.clean(self, test_clean_kwarg=None)

Cleaning function

Parameters:
  • self (pysat.Instrument) – This object
  • test_clean_kwarg (any or NoneType) – Testing keyword (default=None)
pysat.instruments.methods.testing.define_period()

Define the default periods for the fake data functions

Returns:def_period – Dictionary of periods to use in test instruments
Return type:dict

Note

Local time and longitude slightly out of sync to simulate motion of Earth

pysat.instruments.methods.testing.define_range()

Define the default ranges for the fake data functions

Returns:def_range – Dictionary of periods to use in test instruments
Return type:dict
pysat.instruments.methods.testing.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.methods.testing.eval_dep_warnings(warns, check_msgs)

Evaluate deprecation warnings by category and message

Parameters:
  • warns (list) – List of warnings.WarningMessage objects
  • check_msgs (list) – List of strings containing the expected warning messages
Returns:

found_msgs – List of booleans corresponding to check_msgs, which are True when the messages are found and False when they are not

Return type:

list

pysat.instruments.methods.testing.generate_fake_data(t0, num_array, period=5820, data_range=[0.0, 24.0], cyclic=True)

Generates fake data over a given range

Parameters:
  • t0 (float) – Start time in seconds
  • num_array (array_like) – Array of time steps from t0. This is the index of the fake data
  • period (int) – The number of seconds per period. (default = 5820)
  • data_range (float) – For cyclic functions, the range of data values cycled over one period. Not used for non-cyclic functions. (default = 24.0)
  • cyclic (bool) – If True, assume that fake data is a cyclic function (ie, longitude, slt) that will reset to data_range[0] once it reaches data_range[1]. If False, continue to monotonically increase
Returns:

data – Array with fake data

Return type:

array-like

pysat.instruments.methods.testing.generate_times(fnames, num, freq='1S')

Construct list of times for simulated instruments

Parameters:
  • fnames (list) – List of filenames.
  • num (int) – Number of times to generate
  • freq (string) – Frequency of temporal output, compatible with pandas.date_range [default : ‘1S’]
Returns:

  • uts (array) – Array of integers representing uts for a given day
  • index (pds.DatetimeIndex) – The DatetimeIndex to be used in the pysat test instrument objects
  • date (datetime) – The requested date reconstructed from the fake file name

pysat.instruments.methods.testing.init(self, test_init_kwarg=None)

Initializes the Instrument object with instrument specific values.

Runs once upon instantiation.

Shifts time index of files by 5-minutes if mangle_file_dates set to True at pysat.Instrument instantiation.

Creates a file list for a given range if the file_date_range keyword is set at instantiation.

Parameters:
  • self (pysat.Instrument) – This object
  • test_init_kwarg (any or NoneType) – Testing keyword (default=None)
pysat.instruments.methods.testing.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, test_dates=None, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.methods.testing.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, test_dates=None, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.methods.testing.preprocess(self, test_preprocess_kwarg=None)

Customization method that performs standard preprocessing.

This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue). Object modified in place.

Parameters:test_preprocess_kwarg (any or NoneType) – Testing keyword (default=None)

Utilities

The utilites module contains functions used throughout the pysat package. This includes utilities for determining the available Instruments, loading files, et cetera.

Core Utilities

These utilities are available directly from the pysat.utils module.

class pysat.utils._core.NetworkLock(*args, **kwargs)

Lock manager compatible with networked file systems.

Parameters:
  • *args (list reference) – References a list of input arguments
  • **kwargs (dict reference) – References a dict of input keyword argument

Note

See portalocker.utils.Lock for more details (portalocker.utils.Lock)

Example

from pysat.utils import NetworkLock

with NetworkLock(file_to_be_written, 'w') as locked_file:
    locked_file.write('content')
release()

Releases the Lock so the file system

From portalocker docs:
On some networked filesystems it might be needed to force a os.fsync() before closing the file so it’s actually written before another client reads the file.
pysat.utils._core.available_instruments(inst_loc=None)

Obtain basic information about instruments in a given subpackage.

Parameters:inst_loc (python subpackage or NoneType) – The location of the instrument subpackage (e.g., pysat.instruments) or None to list all registered instruments (default=None)
Returns:inst_info – Nested dictionary with ‘platform’, ‘name’, ‘inst_module’, ‘inst_ids_tags’, ‘inst_id’, and ‘tag’ with the tag descriptions given as the value for each unique dictionary combination.
Return type:dict
pysat.utils._core.display_available_instruments(inst_loc=None, show_inst_mod=None, show_platform_name=None)

Display basic information about instruments in a given subpackage.

Parameters:
  • inst_loc (python subpackage or NoneType) – The location of the instrument subpackage (e.g., pysat.instruments) or None to list all registered instruments (default=None)
  • show_inst_mod (boolean or NoneType) – Displays the instrument module if True, does not include it if False, and reverts to standard display based on inst_loc type if None. (default=None)
  • show_platform_name (boolean or NoneType) – Displays the platform and name if True, does not include it if False, and reverts to standard display based on inst_loc type if None. (default=None)

Note

Prints to standard out, a user-friendly interface for availabe_instruments. Defaults to including the instrument module and not the platform/name values if inst_loc is an instrument module and to including the platform/name values and not the instrument module if inst_loc is None (listing the registered instruments).

pysat.utils._core.fmt_output_in_cols(out_strs, ncols=3, max_num=6, lpad=None)

Format a string with desired output values in columns

Parameters:
  • out_strs (array-like) – Array like object containing strings to print
  • ncols (int) – Number of columns to print (default=3)
  • max_num (int) – Maximum number of out_strs members to print. Best display achieved if this number is divisable by 2 and ncols (default=6)
  • lpad (int or NoneType) – Left padding or None to use length of longest string + 1 (default=None)
Returns:

output – String with desired data formatted in columns

Return type:

string

pysat.utils._core.generate_instrument_list(inst_loc, user_info=None)

Iterate through and classify instruments in a given subpackage.

Parameters:
  • inst_loc (python subpackage) – The location of the instrument subpackage to test, e.g., ‘pysat.instruments’
  • user_info (dict or NoneType) – Nested dictionary with user and password info for instrument module name. If None, no user or password is assumed. (default=None) EX: user_info = {‘jro_isr’: {‘user’: ‘myname’, ‘password’: ‘email’}}
Returns:

output – Dictionary with keys ‘names’, ‘download’, ‘no_download’ that contain lists with different information for each key: ‘names’ - list of platform_name combinations ‘download’ - dict containing ‘inst_module’, ‘tag’, and ‘inst_id’ for instruments with download routines ‘no_download’ - dict containing ‘inst_module’, ‘tag’, and ‘inst_id’ for instruments without download routines

Return type:

dict

Note

This routine currently supports classification of instruments for unit tests both in the core package and in seperate instrument packages that use pysat.

pysat.utils._core.listify(iterable)

Returns a flattened list of iterable if not already a list

Parameters:iterable (iter-like) – An iterable object that will be wrapped within a list
Returns:An enclosing 1-D list of iterable if not already a list
Return type:list
pysat.utils._core.load_netcdf4(fnames=None, strict_meta=False, file_format=None, epoch_name='Epoch', pandas_format=True, labels={'axis': ('axis', <class 'str'>), 'desc': ('desc', <class 'str'>), 'fill_val': ('fill', <class 'numpy.float64'>), 'max_val': ('value_max', <class 'numpy.float64'>), 'min_val': ('value_min', <class 'numpy.float64'>), 'name': ('long_name', <class 'str'>), 'notes': ('notes', <class 'str'>), 'plot': ('plot_label', <class 'str'>), 'scale': ('scale', <class 'str'>), 'units': ('units', <class 'str'>)})

Load netCDF-3/4 file produced by pysat.

Parameters:
  • fnames (string, array_like of strings, or NoneType) – Filename(s) to load, will fail if None (default=None)
  • strict_meta (boolean) – Flag that checks if metadata across fnames is the same if True (default=False)
  • file_format (string or NoneType) – file_format keyword passed to netCDF4 routine. Expects one of ‘NETCDF3_CLASSIC’, ‘NETCDF3_64BIT’, ‘NETCDF4_CLASSIC’, or ‘NETCDF4’. If None, defaults to ‘NETCDF4’. (default=None)
  • epoch_name (string) – Data key for time variable (default=’Epoch’)
  • pandas_format (bool) – Flag specifying if data is stored in a pandas DataFrame (True) or xarray Dataset (False). (default=False)
  • labels (dict) – Dict where keys are the label attribute names and the values are tuples that have the label values and value types in that order. (default={‘units’: (‘units’, str), ‘name’: (‘long_name’, str), ‘notes’: (‘notes’, str), ‘desc’: (‘desc’, str), ‘plot’: (‘plot_label’, str), ‘axis’: (‘axis’, str), ‘scale’: (‘scale’, str), ‘min_val’: (‘value_min’, np.float64), ‘max_val’: (‘value_max’, np.float64), ‘fill_val’: (‘fill’, np.float64)})
Returns:

  • out (pandas.DataFrame) – DataFrame output
  • meta (pysat.Meta) – Meta data

Raises:

ValueError – If kwargs that should be args are not set on instantiation.

pysat.utils._core.scale_units(out_unit, in_unit)

Determine the scaling factor between two units

Parameters:
  • out_unit (str) – Desired unit after scaling
  • in_unit (str) – Unit to be scaled
Returns:

unit_scale – Scaling factor that will convert from in_units to out_units

Return type:

float

Note

Accepted units include degrees (‘deg’, ‘degree’, ‘degrees’), radians (‘rad’, ‘radian’, ‘radians’), hours (‘h’, ‘hr’, ‘hrs’, ‘hour’, ‘hours’), and lengths (‘m’, ‘km’, ‘cm’). Can convert between degrees, radians, and hours or different lengths.

Example

import numpy as np
two_pi = 2.0 * np.pi
scale = scale_units("deg", "RAD")
two_pi *= scale
two_pi # will show 360.0

Coordinates

Coordinate transformation functions for pysat

pysat.utils.coords.adjust_cyclic_data(samples, high=6.283185307179586, low=0.0)

Adjust cyclic values such as longitude to a different scale

Parameters:
  • samples (array_like) – Input array
  • high (float or int) – Upper boundary for circular standard deviation range (default=2 pi)
  • low (float or int) – Lower boundary for circular standard deviation range (default=0)
  • axis (int or NoneType) – Axis along which standard deviations are computed. The default is to compute the standard deviation of the flattened array
Returns:

out_samples – Circular standard deviation

Return type:

float

pysat.utils.coords.calc_solar_local_time(inst, lon_name=None, slt_name='slt', apply_modulus=True, ref_date=None)

Append solar local time to an instrument object

Parameters:
  • inst (pysat.Instrument instance) – instrument object to be updated
  • lon_name (string) – name of the longtiude data key (assumes data are in degrees)
  • slt_name (string) – name of the output solar local time data key (default=’slt’)
  • apply_modulus (bool) – If True, SLT values are confined to [0, 24), if False they may be positive or negative based on the value of their universal time relative to that of the reference date ref_date. (default=True)
  • ref_date (dt.datetime or NoneType) – Reference initial date. If None, will use the date found at inst.date. Only valid if apply_modulus is True. (default=None)

Note

Updates Instrument data in column specified by slt_name, as well as Metadata

pysat.utils.coords.update_longitude(inst, lon_name=None, high=180.0, low=-180.0)

Update longitude to the desired range

Parameters:
  • inst (pysat.Instrument instance) – instrument object to be updated
  • lon_name (string) – name of the longtiude data
  • high (float) – Highest allowed longitude value (default=180.0)
  • low (float) – Lowest allowed longitude value (default=-180.0)
Returns:

Return type:

updates instrument data in column ‘lon_name’

Files

pysat.utils.files.check_and_make_path(path)

Checks if path exists, creates it if it doesn’t.

Parameters:path (string) – Directory path without any file names. Creates all necessary directories to complete the path.
Raises:ValueError – If input path and internally constructed path are not equal.
pysat.utils.files.construct_searchstring_from_format(format_str, wildcard=False)

Parses format file string and returns string formatted for searching.

Parameters:
  • format_str (str) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’. For example, instrument_{year:04d}{month:02d}{day:02d}_v{version:02d}.cdf
  • wildcard (bool) – If True, replaces the ? sequence with a * . This option may be well suited when dealing with delimited filenames.
Returns:

out_dict – An output dict with the following keys: - ‘search_string’ (format_str with data to be parsed replaced with ?) - ‘keys’ (keys for data to be parsed) - ‘lengths’ (string length for data to be parsed) - ‘string_blocks’ (the filenames are broken into fixed width segments).

Return type:

dict

Note

The ‘?’ may be used to indicate a set number of spaces for a variable part of the name that need not be extracted. cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf

A standards compliant filename can be constructed by starting with string_blocks, adding keys in order, and replacing the ‘’ locations with data of length length.

pysat.utils.files.parse_delimited_filenames(files, format_str, delimiter)

Parses list of files, extracting data identified by format_str

Parameters:
  • files (list) – List of files
  • format_str (str) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports string formatting codes ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’. For example, instrument_{year:4d}{month:02d}{day:02d}_v{version:02d}.cdf
  • delimiter (string) – Delimiter string upon which files will be split (e.g., ‘.’)
Returns:

stored – Information parsed from filenames that inclues: ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’. Also includes ‘files’, an input list of files, and ‘format_str’, a formatted string from input

Return type:

collections.OrderedDict

pysat.utils.files.parse_fixed_width_filenames(files, format_str)

Parses list of files, extracting data identified by format_str

Parameters:
  • files (list) – List of files
  • format_str (str) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports string formatting codes ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’. For example, instrument_{year:4d}{month:02d}{day:02d}_v{version:02d}.cdf
Returns:

stored – Information parsed from filenames that inclues: ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, ‘revision’, and ‘cycle’. Also includes ‘files’, an input list of files, and ‘format_str’, a formatted string from input

Return type:

collections.OrderedDict

pysat.utils.files.process_parsed_filenames(stored, two_digit_year_break=None)

Create a Files pandas Series of filenames from a formatted dict

Parameters:
  • stored (collections.orderedDict) – Dict produced by parse_fixed_width_filenames or parse_delimited_filenames
  • two_digit_year_break (int or NoneType) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. If None, then four-digit years are assumed. (default=None)
Returns:

Series, indexed by datetime, with file strings

Return type:

pds.Series

Note

If two files have the same date and time information in the filename then the file with the higher version/revision/cycle is used. Series returned only has one file per datetime. Version is required for this filtering, revision and cycle are optional.

pysat.utils.files.search_local_system_formatted_filename(data_path, search_str)

Parses format file string and returns string formatted for searching.

Parameters:
  • data_path (string) – Top level directory to search files for. This directory is provided by pysat to the instrument_module.list_files functions as data_path.
  • search_str (string) – String used to search for local files. For example, cnofs_cindi_ivm_500ms_????????_v??.cdf or inst-name-*-v??.cdf
Returns:

files – list of files matching the specified file format

Return type:

list

Note

The use of ?s (1 ? per character) rather than the full wildcard * provides a more specific filename search string that limits the false positive rate.

pysat.utils.files.update_data_directory_structure(new_template, test_run=True, full_breakdown=False, remove_empty_dirs=False)

Update pysat data directory structure to match supplied template

Translates all of pysat’s managed science files to a new directory structure. By default, pysat uses the template string stored in pysat.params[‘directory_format’] to organize files. This method makes it possible to transition an existing pysat installation so it works with the supplied new template.

Parameters:
  • new_template (str) –
    New directory template string. The default value for pysat is
    os.path.join((‘{platform}’, ‘{name}’, ‘{tag}’, ‘{inst_id}’))
  • test_run (bool) – If True, a printout of all proposed changes will be made, but the directory changes will not be enacted. (default=True)
  • full_breakdown (bool) – If True, a full path for every file is printed to terminal. (default=False)
  • remove_empty_dirs (bool) – If True, all directories that had pysat.Instrument data moved to another location and are now empty are deleted. Traverses the directory chain up to the top-level directories in pysat.params[‘data_dirs’]. (default=False)

Note

After updating the data directory structures users should nominally assign new_template as the directory format via

pysat.params['directory_format'] = new_template

Registry

pysat user module registry utilities

This module allows pysat to provide direct access to external or custom instrument modules by maintaining information about these instrument modules.

Examples

Instrument support modules must be registered before use. This may be done individually or for a collection of Instruments at once. For example, assume there is an implementation for myInstrument in the module my.package.myInstrument with platform and name attributes ‘myplatform’ and ‘myname’. Such an instrument may be registered with

registry.register(['my.package.myInstrument'])

The full module name “my.package.myInstrument” will be registered in pysat.params[‘user_modules’] and stored as a dict of dicts keyed by platform and name.

Once registered, subsequent calls to Instrument may use the platform and name string identifiers.

Instrument('myplatform', 'myname')

A full suite of instrument support modules may be registered at once using

# General form where my.package contains a collection of
# submodules to support Instrument data sets.
registry.register_by_module(my.package)

# Register published packages from pysat team
import pysatSpaceWeather
registry.register_by_module(pysatSpaceWeather.instruments)

import pysatNASA
registry.register_by_module(pysatNASA.instruments)

import pysatModels
registry.register_by_module(pysatModels.models)
pysat.utils.registry.load_saved_modules()

Load registered pysat.Instrument modules

Returns:instrument module strings are keyed by platform then name
Return type:dict of dicts
pysat.utils.registry.register(module_names, overwrite=False)

Registers a user pysat.Instrument module by name

Enables instantiation of a third-party Instrument module using

inst = pysat.Instrument(platform, name, tag=tag, inst_id=inst_id)
Parameters:
  • module_names (list-like of str) – specify package name and instrument modules
  • overwrite (bool) – If True, an existing registration will be updated with the new module information.
Raises:

ValueError – If a new module is input with a platform and name that is already associated with a registered module and the overwrite flag is set to False.

Warning

Registering a module that contains code other than pysat instrument files could result in unexpected consequences.

Note

Modules should be importable using

from my.package.name import my_instrument

Module names do not have to follow the pysat platform_name naming convection.

Current registered modules bay be found at

pysat.params['user_modules']

which is stored as a dict of dicts keyed by platform and name.

Examples

from pysat import Instrument
from pysat.utils import registry

registry.register(['my.package.name.myInstrument'])

testInst = Instrument(platform, name)
pysat.utils.registry.register_by_module(module)

Register all sub-modules attached to input module

Enables instantiation of a third-party Instrument module using

inst = pysat.Instrument(platform, name)
Parameters:module (Python module) – Module with one or more pysat.Instrument support modules attached as sub-modules to the input module
Raises:ValueError – If platform and name associated with a module are already registered

Note

Gets a list of sub-modules by using the __all__ attribute, defined in the module’s __init__.py

Examples

import pysat
import pysatModels
pysat.utils.registry.register_by_module(pysatModels.models)
pysat.utils.registry.remove(platforms, names)

Removes module from registered user modules

Parameters:
  • platforms (list-like of str) – Platform identifiers to remove
  • names (list-like of str) – Name identifiers, paired with platforms, to remove. If the names element paired with the platform element is None, then all instruments under the specified platform will be removed. Should be the same type as platforms.
Raises:

ValueError – If platform and/or name are not currently registered

Note

Current registered user modules available at pysat.params[‘user_modules’]

Examples

platforms = ['platform1', 'platform2']
names = ['name1', 'name2']

# remove all instruments with platform=='platform1'
registry.remove(['platform1'], [None])
# remove all instruments with platform 'platform1' or 'platform2'
registry.remove(platforms, [None, None])
# remove all instruments with 'platform1', and individual instrument
# for 'platform2', 'name2'
registry.remove(platforms, [None, 'name2']
# remove 'platform1', 'name1', as well as 'platform2', 'name2'
registry.remove(platforms, names)
pysat.utils.registry.store()

Store current registry onto disk

Time

pysat date and time utilities

pysat.utils.time.calc_freq(index)

Determine the frequency for a time index

Parameters:index (array-like) – Datetime list, array, or Index
Returns:freq – Frequency string as described in Pandas Offset Aliases
Return type:str

Note

Calculates the minimum time difference and sets that as the frequency.

To reduce the amount of calculations done, the returned frequency is either in seconds (if no sub-second resolution is found) or nanoseconds.

See also

pds.offsets.DateOffset()

pysat.utils.time.calc_res(index, use_mean=False)

Determine the resolution for a time index

Parameters:
  • index (array-like) – Datetime list, array, or Index
  • use_mean (bool) – Use the minimum time difference if False, use the mean time difference if True (default=False)
Returns:

res_sec – Resolution value in seconds

Return type:

float

pysat.utils.time.create_date_range(start, stop, freq='D')

Return array of datetime objects using input frequency from start to stop

Supports single datetime object or list, tuple, ndarray of start and stop dates.

freq codes correspond to pandas date_range codes, D daily, M monthly, S secondly

pysat.utils.time.create_datetime_index(year=None, month=None, day=None, uts=None)

Create a timeseries index using supplied year, month, day, and ut in seconds.

Parameters:
  • year (array_like or NoneType) – Array of year values as np.int (default=None)
  • month (array_like or NoneType) – Array of month values as np.int. Leave None if using day for day of year. (default=None)
  • day (array_like or NoneType) – Array of number of days as np.int. If month=None then value interpreted as day of year, otherwise, day of month. (default=None)
  • uts (array-like or NoneType) – Array of UT seconds of minute as np.float64 values (default=None)
Returns:

Return type:

Pandas timeseries index.

Note

Leap seconds have no meaning here.

pysat.utils.time.filter_datetime_input(date)

Returns datetime that only includes year, month, and day.

Parameters:date (NoneType, array-like, or datetime) – Single or sequence of datetime inputs
Returns:out_date – NoneType input yeilds NoneType output, array-like yeilds list, datetime object yeilds like. All datetime output excludes the sub-daily temporal increments (keeps only date information).
Return type:NoneType, datetime, or list of datetimes

Note

Checks for timezone information not in UTC

pysat.utils.time.freq_to_res(freq)

Convert a frequency string to a resolution value in seconds

Parameters:freq (str) – Frequency string as described in Pandas Offset Aliases
Returns:res_sec – Resolution value in seconds
Return type:np.float64

See also

pds.offsets.DateOffset(), Reference(), ---------(), Separating(), as()

https()
//stackoverflow.com/a/12409995
pysat.utils.time.getyrdoy(date)

Return a tuple of year, day of year for a supplied datetime object.

Parameters:date (datetime.datetime) – Datetime object
Returns:
  • year (int) – Integer year
  • doy (int) – Integer day of year
pysat.utils.time.parse_date(str_yr, str_mo, str_day, str_hr='0', str_min='0', str_sec='0', century=2000)

Basic date parser for file reading

Parameters:
  • str_yr (string) – String containing the year (2 or 4 digits)
  • str_mo (string) – String containing month digits
  • str_day (string) – String containing day of month digits
  • str_hr (string) – String containing the hour of day (default=’0’)
  • str_min (string) – String containing the minutes of hour (default=’0’)
  • str_sec (string) – String containing the seconds of minute (default=’0’)
  • century (int) – Century, only used if str_yr is a 2-digit year (default=2000)
Returns:

out_date – datetime object

Return type:

dt.datetime

pysat.utils.time.today()

Returns today’s date (UTC), with no hour, minute, second, etc.

Returns:today_utc – Today’s date in UTC
Return type:datetime

Testing

Utilities to perform common evaluations

pysat.utils.testing.assert_list_contains(small_list, big_list)

Assert all elements of one list exist within the other list

Parameters:
  • small_list (list) – List whose values must all be present within big_list
  • big_list (list) – List that must contain all the values in small_list
Raises:

AssertionError – If a small_list value is missing from big_list

pysat.utils.testing.assert_lists_equal(list1, list2)

Assert that the lists contain the same elements

Parameters:
  • list1 (list) – Input list one
  • list2 (list) – Input list two
Raises:

AssertionError – If a list1 value is missing from list2 or list lengths are unequal

Note

This test does not require that the lists have the same elements in the same order, and so is also a good test for keys.

pysat.utils.testing.nan_equal(value1, value2)

Determine if values are equal or are both NaN

Parameters:
  • value1 (scalar-like) – Value of any type that can be compared without iterating
  • value2 (scalar-like) – Another value of any type that can be compared without iterating
Returns:

is_equal – True if both values are equal or NaN, False if they are not

Return type:

bool

Instrument Template

This is a template for a pysat.Instrument support file. Modify this file as needed when adding a new Instrument to pysat.

This is a good area to introduce the instrument, provide background on the mission, operations, instrumentation, and measurements.

Also a good place to provide contact information. This text will be included in the pysat API documentation.

Properties

platform
List platform string here
name
List name string here
inst_id
List supported inst_ids here
tag
List supported tag strings here

Note

  • Optional section, remove if no notes

Warning

  • Optional section, remove if no warnings
  • Two blank lines needed afterward for proper formatting

Examples

Example code can go here

Authors

Author name and institution

pysat.instruments.templates.template_instrument.init(self)

Initializes the Instrument object with instrument specific values.

Runs once upon instantiation. Object modified in place. Use this to set the acknowledgements and references.

pysat.instruments.templates.template_instrument.clean(self)

Method to return PLATFORM/NAME data cleaned to the specified level

Cleaning level is specified in inst.clean_level and pysat will accept user input for several strings. The clean_level is specified at instantiation of the Instrument object, though it may be updated to a more stringent level and re-applied after instantiation. The clean method is applied after default every time data is loaded.

Note

  • ‘clean’ All parameters are good, suitable for scientific studies
  • ‘dusty’ Most parameters are good, requires instrument familiarity
  • ‘dirty’ There are data areas that have issues, use with caution
  • ‘none’ No cleaning applied, routine not called in this case.
pysat.instruments.templates.template_instrument.preprocess(self)

Customization method that performs standard preprocessing.

This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue). Object modified in place.

pysat.instruments.templates.template_instrument.list_files(tag=None, inst_id=None, data_path=None, format_str=None)

Produce a list of files corresponding to PLATFORM/NAME.

This routine is invoked by pysat and is not intended for direct use by the end user. Arguments are provided by pysat.

Parameters:
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • inst_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
  • format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns:

Series of filename strings, including the path, indexed by datetime.

Return type:

pandas.Series

Examples

If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template
is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' +
'v{version:02d}r{revision:04d}.NC'

Note

The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.

Multiple data levels may be supported via the ‘tag’ input string. Multiple instruments via the inst_id string.

pysat.instruments.templates.template_instrument.download(date_array, tag, inst_id, data_path=None, user=None, password=None, **kwargs)

Placeholder for PLATFORM/NAME downloads.

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string) – Path to directory to download data to. (default=None)
  • user (string (OPTIONAL)) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string (OPTIONAL)) – Password for data download. (default=None)
  • custom_keywords (placeholder) – Additional keywords supplied by user when invoking the download routine attached to a pysat.Instrument object are passed to this routine. Use of custom keywords here is discouraged.
pysat.instruments.templates.template_instrument.load(fnames, tag=None, inst_id=None, custom_keyword=None)

Loads PLATFORM data into (PANDAS/XARRAY).

This routine is called as needed by pysat. It is not intended for direct user interaction.

Parameters:
  • fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
  • tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. While tag defaults to None here, pysat provides ‘’ as the default tag unless specified by user at Instrument instantiation. (default=’’)
  • inst_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
  • custom_keyword (type to be set) – Developers may include any custom keywords, with default values defined in the method signature. This is included here as a place holder and should be removed.
Returns:

Data and Metadata are formatted for pysat. Data is a pandas DataFrame or xarray DataSet while metadata is a pysat.Meta instance.

Return type:

data, metadata

Note

Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine.

Examples

inst = pysat.Instrument('ucar', 'tiegcm')
inst.load(2019, 1)
pysat.instruments.templates.template_instrument.list_remote_files(tag, inst_id, user=None, password=None)

Return a Pandas Series of every file for chosen remote data.

This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.

Parameters:
  • tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
  • inst_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
  • user (string or NoneType) – Username to be passed along to resource with relevant data. (default=None)
  • password (string or NoneType) – User password to be passed along to resource with relevant data. (default=None)

Note

If defined, the expected return variable is a pandas.Series formatted for the Files class (pysat._files.Files) containing filenames and indexed by date and time

Test Instruments

The following Instrument modules support unit and integration testing for packages that depend on pysat.

pysat_testing

Produces fake instrument data for testing.

pysat.instruments.pysat_testing.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.pysat_testing.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0), 'default_meta': datetime.datetime(2009, 1, 1, 0, 0), 'no_download': datetime.datetime(2009, 1, 1, 0, 0), 'non_strict': datetime.datetime(2009, 1, 1, 0, 0), 'user_password': datetime.datetime(2009, 1, 1, 0, 0)}}, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.pysat_testing.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0), 'default_meta': datetime.datetime(2009, 1, 1, 0, 0), 'no_download': datetime.datetime(2009, 1, 1, 0, 0), 'non_strict': datetime.datetime(2009, 1, 1, 0, 0), 'user_password': datetime.datetime(2009, 1, 1, 0, 0)}}, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.pysat_testing.load(fnames, tag=None, inst_id=None, sim_multi_file_right=False, sim_multi_file_left=False, root_date=None, malformed_index=False, num_samples=None, test_load_kwarg=None)

Loads the test files

Parameters:
  • fnames (list) – List of filenames
  • tag (str or NoneType) – Instrument tag (accepts ‘’ or a string to change the behaviour of certain instrument aspects for testing)
  • inst_id (str or NoneType) – Instrument satellite ID (accepts ‘’)
  • sim_multi_file_right (boolean) – Adjusts date range to be 12 hours in the future or twelve hours beyond root_date (default=False)
  • sim_multi_file_left (boolean) – Adjusts date range to be 12 hours in the past or twelve hours before root_date (default=False)
  • root_date (NoneType) – Optional central date, uses _test_dates if not specified. (default=None)
  • malformed_index (boolean) – If True, time index will be non-unique and non-monotonic (default=False)
  • num_samples (int) – Number of samples per day
  • test_load_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

  • data (pds.DataFrame) – Testing data
  • meta (pysat.Meta) – Metadata

pysat_testing_xarray

Produces fake instrument data for testing.

pysat.instruments.pysat_testing_xarray.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.pysat_testing_xarray.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.pysat_testing_xarray.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.pysat_testing_xarray.load(fnames, tag=None, inst_id=None, sim_multi_file_right=False, sim_multi_file_left=False, malformed_index=False, num_samples=None, test_load_kwarg=None)

Loads the test files

Parameters:
  • fnames (list) – List of filenames
  • tag (str or NoneType) – Instrument tag (accepts ‘’)
  • inst_id (str or NoneType) – Instrument satellite ID (accepts ‘’)
  • sim_multi_file_right (boolean) – Adjusts date range to be 12 hours in the future or twelve hours beyond root_date (default=False)
  • sim_multi_file_left (boolean) – Adjusts date range to be 12 hours in the past or twelve hours before root_date (default=False)
  • malformed_index (boolean) – If True, time index will be non-unique and non-monotonic.
  • num_samples (int) – Number of samples
  • test_load_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

  • data (xr.Dataset) – Testing data
  • meta (pysat.Meta) – Metadata

pysat_testing2d

Produces fake instrument data for testing.

pysat.instruments.pysat_testing2d.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.pysat_testing2d.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.pysat_testing2d.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.pysat_testing2d.load(fnames, tag=None, inst_id=None, malformed_index=False, num_samples=None, test_load_kwarg=None)

Loads the test files

Parameters:
  • fnames (list) – List of filenames
  • tag (str or NoneType) – Instrument tag (accepts ‘’)
  • inst_id (str or NoneType) – Instrument satellite ID (accepts ‘’)
  • malformed_index (bool) – If True, the time index will be non-unique and non-monotonic. (default=False)
  • num_samples (int) – Number of samples
  • test_load_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

  • data (pds.DataFrame) – Testing data
  • meta (pysat.Meta) – Testing metadata

pysat_testing2d_xarray

Produces fake instrument data for testing.

pysat.instruments.pysat_testing2d_xarray.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.pysat_testing2d_xarray.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.pysat_testing2d_xarray.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.pysat_testing2d_xarray.load(fnames, tag=None, inst_id=None, malformed_index=False, num_samples=None, test_load_kwarg=None)

Loads the test files

Parameters:
  • fnames (list) – List of filenames
  • tag (str or NoneType) – Instrument tag (accepts ‘’)
  • inst_id (str or NoneType) – Instrument satellite ID (accepts ‘’)
  • malformed_index (bool False) – If True, the time index will be non-unique and non-monotonic.
  • num_samples (int) – Number of samples
  • test_load_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

  • data (xr.Dataset) – Testing data
  • meta (pysat.Meta) – Testing metadata

pysat_testmodel

Produces fake instrument data for testing.

pysat.instruments.pysat_testmodel.download(date_array, tag, inst_id, data_path=None, user=None, password=None, test_download_kwarg=None)

Simple pass function for pysat compatibility for test instruments.

This routine is invoked by pysat and is not intended for direct use by the end user.

Parameters:
  • date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
  • tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • inst_id (string) – Instrument ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
  • data_path (string or NoneType) – Path to directory to download data to. (default=None)
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • test_download_kwarg (any or NoneType) – Testing keyword (default=None)
Raises:

ValueError – When user/password are required but not supplied

Warning

When no download support will be provided

pysat.instruments.pysat_testmodel.list_files(tag=None, inst_id=None, data_path=None, format_str=None, file_date_range=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, mangle_file_dates=False, test_list_files_kwarg=None)

Produce a fake list of files spanning three years

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • file_date_range (pds.date_range) – File date range. The default mode generates a list of 3 years of daily files (1 year back, 2 years forward) based on the test_dates passed through below. Otherwise, accepts a range of files specified by the user. (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_files_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Return type:

Series of filenames indexed by file time

pysat.instruments.pysat_testmodel.list_remote_files(tag=None, inst_id=None, data_path=None, format_str=None, start=None, stop=None, *, test_dates={'': {'': datetime.datetime(2009, 1, 1, 0, 0)}}, user=None, password=None, mangle_file_dates=False, test_list_remote_kwarg=None)

Produce a fake list of files spanning three years and one month to simulate new data files on a remote server

Parameters:
  • tag (str or NoneType) – pysat instrument tag (default=None)
  • inst_id (str or NoneType) – pysat satellite ID tag (default=None)
  • data_path (str or NoneType) – pysat data path (default=None)
  • format_str (str or NoneType) – file format string (default=None)
  • start (dt.datetime or NoneType) – Starting time for file list. A None value will start 1 year before test_date (default=None)
  • stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop 2 years 1 month after test_date (default=None)
  • test_dates (dt.datetime or NoneType) – Pass the _test_date object through from the test instrument files
  • user (string or NoneType) – User string input used for download. Provided by user and passed via pysat. If an account is required for dowloads this routine here must error if user not supplied. (default=None)
  • password (string or NoneType) – Password for data download. (default=None)
  • mangle_file_dates (bool) – If True, file dates are shifted by 5 minutes. (default=False)
  • test_list_remote_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

Filenames indexed by file time, see list_files for more info

Return type:

pds.Series

pysat.instruments.pysat_testmodel.load(fnames, tag=None, inst_id=None, num_samples=None, test_load_kwarg=None)

Loads the test files

Parameters:
  • fnames (list) – List of filenames
  • tag (str or NoneType) – Instrument tag (accepts ‘’)
  • inst_id (str or NoneType) – Instrument satellite ID (accepts ‘’)
  • num_samples (int) – Number of samples
  • test_load_kwarg (any or NoneType) – Testing keyword (default=None)
Returns:

  • data (xr.Dataset) – Testing data
  • meta (pysat.Meta) – Metadata

Test Constellations

The following Constellation modules support unit and integration testing for packages that depend on pysat.

Testing

Single Test

Testing Empty