API
Instrument
-
class
pysat.
Instrument
(platform=None, name=None, tag=None, inst_id=None, sat_id=None, clean_level='clean', update_files=None, pad=None, orbit_info=None, inst_module=None, multi_file_day=None, manual_org=None, directory_format=None, file_format=None, temporary_file_list=False, strict_time_flag=False, ignore_empty_files=False, units_label='units', name_label='long_name', notes_label='notes', desc_label='desc', plot_label='label', axis_label='axis', scale_label='scale', min_label='value_min', max_label='value_max', fill_label='fill', *arg, **kwargs)¶ Download, load, manage, modify and analyze science data.
Deprecated since version 2.3.0: Several attributes and methods will be removed or replaced in pysat 3.0.0: sat_id, default, multi_file_day, manual_org, units_label, name_label, notes_label, desc_label, min_label, max_label, fill_label, plot_label, axis_label, scale_label, and _filter_datetime_input
Parameters: - platform (string) – name of platform/satellite.
- name (string) – name of instrument.
- tag (string, optional) – identifies particular subset of instrument data.
- inst_id (string) – Replaces sat_id
- sat_id (string, optional) – identity within constellation
- clean_level ({'clean','dusty','dirty','none'}, optional) – level of data quality
- pad (pandas.DateOffset, or dictionary, optional) – Length of time to pad the beginning and end of loaded data for time-series processing. Extra data is removed after applying all custom functions. Dictionary, if supplied, is simply passed to pandas DateOffset.
- orbit_info (dict) – Orbit information, {‘index’:index, ‘kind’:kind, ‘period’:period}. See pysat.Orbits for more information.
- inst_module (module, optional) – Provide instrument module directly. Takes precedence over platform/name.
- update_files (boolean, optional) – If True, immediately query filesystem for instrument files and store.
- temporary_file_list (boolean, optional) – If true, the list of Instrument files will not be written to disk. Prevents a race condition when running multiple pysat processes.
- strict_time_flag (boolean, optional (default=False)) – If true, pysat will check data to ensure times are unique and monotonic. In future versions, this will be fixed to True.
- multi_file_day (boolean, optional) – Set to True if Instrument data files for a day are spread across multiple files and data for day n could be found in a file with a timestamp of day n-1 or n+1. Deprecated at this level in pysat 3.0.0.
- manual_org (bool) – if True, then pysat will look directly in pysat data directory for data files and will not use default /platform/name/tag. Deprecated in pysat 3.0.0, as this flag is not needed to use directory_format.
- directory_format (str) – directory naming structure in string format. Variables such as platform, name, and tag will be filled in as needed using python string formatting. The default directory structure would be expressed as ‘{platform}/{name}/{tag}’
- file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, and sat_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine.
- ignore_empty_files (boolean) – if True, the list of files found will be checked to ensure the filesizes are greater than zero. Empty files are removed from the stored list of files.
- units_label (str) – String used to label units in storage. Defaults to ‘units’.
- name_label (str) – String used to label long_name in storage. Defaults to ‘long_name’.
- notes_label (str) – label to use for notes in storage. Defaults to ‘notes’
- desc_label (str) – label to use for variable descriptions in storage. Defaults to ‘desc’
- plot_label (str) – label to use to label variables in plots. Defaults to ‘label’
- axis_label (str) – label to use for axis on a plot. Defaults to ‘axis’
- scale_label (str) – label to use for plot scaling type in storage. Defaults to ‘scale’
- min_label (str) – label to use for typical variable value min limit in storage. Defaults to ‘value_min’
- max_label (str) – label to use for typical variable value max limit in storage. Defaults to ‘value_max’
- fill_label (str) – label to use for fill values. Defaults to ‘fill’ but some implementations will use ‘FillVal’
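The directory_format and file_format parameters are described as templates filled in with Python string formatting. The following is a minimal sketch of that idea only, not pysat's internal code. The file template is borrowed from the list_files examples elsewhere on this page, and joining the pieces with a forward slash is an assumption made for illustration.

```python
# Default directory template quoted in the parameter description above.
directory_format = '{platform}/{name}/{tag}'

# Illustrative file template; real templates come from each instrument's
# list_files routine.
file_format = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'

# Fill the placeholders with ordinary str.format calls.
directory = directory_format.format(platform='cnofs', name='vefi', tag='dc_b')
filename = file_format.format(year=2009, month=1, day=1)

# Joining with '/' is an assumption made for this sketch.
relative_path = '/'.join([directory, filename])
print(relative_path)
```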
-
data
¶ loaded science data
Type: pandas.DataFrame
-
date
¶ date for loaded data
Type: pandas.datetime
-
yr
¶ year for loaded data
Type: int
-
bounds
¶ bounds for loading data, supply array_like for a season with gaps. Users may provide as a tuple or tuple of lists, but the attribute is stored as a tuple of lists for consistency
Type: (datetime/filename/None, datetime/filename/None)
-
doy
¶ day of year for loaded data
Type: int
-
files
¶ interface to instrument files
Type: pysat.Files
-
meta
¶ interface to instrument metadata, similar to netCDF 1.6
Type: pysat.Meta
-
orbits
¶ interface to extracting data orbit-by-orbit
Type: pysat.Orbits
-
custom
¶ interface to instrument nano-kernel
Type: pysat.Custom
-
kwargs
¶ keyword arguments passed to instrument loading routine
Type: dictionary
Note
Pysat attempts to load the module platform_name.py located in the pysat/instruments directory. This module provides the underlying functionality to download, load, and clean instrument data. Alternatively, the module may be supplied directly using keyword inst_module.
Examples
    # 1-second mag field data
    vefi = pysat.Instrument(platform='cnofs', name='vefi', tag='dc_b',
                            clean_level='clean')
    start = pysat.datetime(2009,1,1)
    stop = pysat.datetime(2009,1,2)
    vefi.download(start, stop)
    vefi.load(date=start)
    print(vefi['dB_mer'])
    print(vefi.meta['db_mer'])

    # 1-second thermal plasma parameters
    ivm = pysat.Instrument(platform='cnofs', name='ivm', tag='',
                           clean_level='clean')
    ivm.download(start,stop)
    ivm.load(2009,1)
    print(ivm['ionVelmeridional'])

    # Ionosphere profiles from GPS occultation
    # bins profile using 3 km step
    cosmic = pysat.Instrument('cosmic', 'gps', 'ionprf', altitude_bin=3)
    cosmic.download(start, stop, user=user, password=password)
    cosmic.load(date=start)
-
bounds
Boundaries for iterating over instrument object by date or file.
Parameters: - start (datetime object, filename, or None (default)) – start of iteration, if None uses first data date. list-like collection also accepted
- end (datetime object, filename, or None (default)) – end of iteration, inclusive. If None uses last data date. list-like collection also accepted
Note
Both start and stop must be the same type (date, or filename) or None. Only the year, month, and day are used for date inputs.
Examples
    inst = pysat.Instrument(platform=platform, name=name, tag=tag)
    start = pysat.datetime(2009,1,1)
    stop = pysat.datetime(2009,1,31)
    inst.bounds = (start,stop)

    # season with a gap: paired lists of starts and stops
    start2 = pysat.datetime(2010,1,1)
    stop2 = pysat.datetime(2010,2,14)
    inst.bounds = ([start, start2], [stop, stop2])
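To make the "season with gaps" form of bounds concrete, the sketch below expands paired lists of inclusive start and stop dates into the individual days they cover. It uses only the standard library. The helper name expand_bounds is hypothetical and this is not how pysat iterates internally; it only illustrates which dates such bounds describe.

```python
import datetime as dt


def expand_bounds(starts, stops):
    """Return every date covered by paired, inclusive start/stop bounds."""
    dates = []
    for start, stop in zip(starts, stops):
        current = start
        while current <= stop:
            dates.append(current)
            current += dt.timedelta(days=1)
    return dates


# Same dates as the bounds example above.
starts = [dt.datetime(2009, 1, 1), dt.datetime(2010, 1, 1)]
stops = [dt.datetime(2009, 1, 31), dt.datetime(2010, 2, 14)]
season = expand_bounds(starts, stops)
print(len(season), season[0], season[-1])
```

The gap between 31 January 2009 and 1 January 2010 is skipped because each start/stop pair is expanded independently.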
-
concat_data
(data, *args, **kwargs)¶ Concats data1 and data2 for xarray or pandas as needed
Parameters: data (pandas or xarray) – Data to be appended to data already within the Instrument object
Returns: Instrument.data modified in place.
Return type: void
Notes
For pandas, sort=False is passed along to the underlying pandas.concat method. If sort is supplied as a keyword, the user provided value is used instead.
For xarray, dim=’Epoch’ is passed along to xarray.concat except if the user includes a value for dim as a keyword argument.
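The notes above describe defaults that apply only when the caller has not supplied the same keyword. One common way to get that behaviour is dict.setdefault, sketched here. The function resolve_concat_kwargs is hypothetical and only models the keyword handling; it does not call pandas or xarray and is not taken from pysat's source.

```python
def resolve_concat_kwargs(is_pandas, **kwargs):
    """Apply the documented defaults unless the caller supplied a value."""
    if is_pandas:
        # pandas path: default to sort=False
        kwargs.setdefault('sort', False)
    else:
        # xarray path: default to dim='Epoch'
        kwargs.setdefault('dim', 'Epoch')
    return kwargs


print(resolve_concat_kwargs(True))              # default applied
print(resolve_concat_kwargs(True, sort=True))   # user value kept
print(resolve_concat_kwargs(False, dim='time')) # user value kept
```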
-
copy
()¶ Deep copy of the entire Instrument object.
-
date
Date for loaded data.
-
download
(start=None, stop=None, freq='D', user=None, password=None, date_array=None, **kwargs)¶ Download data for given Instrument object from start to stop.
Parameters: - start (pandas.datetime (yesterday)) – start date to download data
- stop (pandas.datetime (tomorrow)) – stop date to download data
- freq (string) – Stepsize between dates for season, ‘D’ for daily, ‘M’ monthly (see pandas)
- user (string) – username, if required by instrument data archive
- password (string) – password, if required by instrument data archive
- date_array (list-like) – Sequence of dates to download data for. Takes precedence over start and stop inputs
- **kwargs (dict) – Dictionary of keywords that may be options for specific instruments
Note
Data will be downloaded to pysat_data_dir/platform/name/tag
If Instrument bounds are set to defaults they are updated after files are downloaded.
-
download_updated_files
(user=None, password=None, **kwargs)¶ Grabs a list of remote files, compares to local, then downloads new files.
Parameters: - user (string) – username, if required by instrument data archive
- password (string) – password, if required by instrument data archive
- **kwargs (dict) – Dictionary of keywords that may be options for specific instruments
Note
Data will be downloaded to pysat_data_dir/platform/name/tag
If Instrument bounds are set to defaults they are updated after files are downloaded.
-
empty
¶ Boolean flag reflecting lack of data.
True if there is no Instrument data.
-
generic_meta_translator
(meta_to_translate)¶ Translates the metadata contained in an object into a dictionary suitable for export.
Parameters: meta_to_translate (Meta) – The metadata object to translate
Returns: A dictionary of the metadata for each variable of an output file e.g. netcdf4
Return type: dict
-
index
¶ Returns time index of loaded data.
-
load
(yr=None, doy=None, date=None, fname=None, fid=None, verifyPad=False)¶ Load instrument data into Instrument object .data.
Parameters: - yr (integer) – year for desired data
- doy (integer) – day of year
- date (datetime object) – date to load
- fname ('string') – filename to be loaded
- verifyPad (boolean) – if True, padding data not removed (debug purposes)
Returns: Return type: Void. Data is added to self.data
Note
Loads data for a chosen instrument into .data. Any functions chosen by the user and added to the custom processing queue (.custom.add) are automatically applied to the data before it is available to user in .data.
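The load method accepts either a date or a year and day-of-year pair. The sketch below shows the standard-library conversion between the two forms. It assumes day of year is 1-based (1 January is day 1); the helper names are hypothetical and are not part of pysat.

```python
import datetime as dt


def date_from_year_doy(yr, doy):
    """Convert a year and a 1-based day of year into a datetime."""
    return dt.datetime(yr, 1, 1) + dt.timedelta(days=doy - 1)


def year_doy_from_date(date):
    """Return (year, day of year) for a datetime, with 1 January as day 1."""
    return date.year, date.timetuple().tm_yday


print(date_from_year_doy(2009, 1))
print(year_doy_from_date(dt.datetime(2009, 2, 1)))
```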
-
next
(verifyPad=False)¶ Manually iterate through the data loaded in Instrument object.
Bounds of iteration and iteration type (day/file) are set by bounds attribute.
Note
If there were no previous calls to load then the first day(default)/file will be loaded.
-
prev
(verifyPad=False)¶ Manually iterate backwards through the data in Instrument object.
Bounds of iteration and iteration type (day/file) are set by bounds attribute.
Note
If there were no previous calls to load then the first day(default)/file will be loaded.
-
remote_date_range
(year=None, month=None, day=None)¶ Returns first and last date for remote data. Default behaviour is to search all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.
-
remote_file_list
(year=None, month=None, day=None)¶ List remote files for chosen instrument. Default behaviour is to return all files. User may additionally specify a given year, year/month, or year/month/day combination to return a subset of available files.
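The year, year/month, and year/month/day selection can be pictured as progressively narrower filters on a dated file list. The sketch below is an illustration under stated assumptions: filter_by_date and the file names are hypothetical, and the rule that month needs year (and day needs month) is taken from the list_remote_files parameter descriptions later on this page rather than from pysat's code.

```python
import datetime as dt


def filter_by_date(dated_files, year=None, month=None, day=None):
    """Keep filenames whose dates match the requested date parts."""
    if month is not None and year is None:
        raise ValueError('month requires year to be defined')
    if day is not None and month is None:
        raise ValueError('day requires year and month to be defined')

    selected = []
    for date, fname in dated_files:
        if year is not None and date.year != year:
            continue
        if month is not None and date.month != month:
            continue
        if day is not None and date.day != day:
            continue
        selected.append(fname)
    return selected


# Hypothetical remote listing, used only for illustration.
remote = [(dt.datetime(2008, 12, 31), 'file_20081231.cdf'),
          (dt.datetime(2009, 1, 1), 'file_20090101.cdf'),
          (dt.datetime(2009, 1, 2), 'file_20090102.cdf'),
          (dt.datetime(2009, 2, 1), 'file_20090201.cdf')]

print(filter_by_date(remote))                      # all files
print(filter_by_date(remote, year=2009, month=1))  # January 2009 only
```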
-
to_netcdf4
(fname=None, base_instrument=None, epoch_name='Epoch', zlib=False, complevel=4, shuffle=True, preserve_meta_case=False, export_nan=None, unlimited_time=True)¶ Stores loaded data into a netCDF4 file.
Parameters: - fname (string) – full path to save instrument object to
- base_instrument (pysat.Instrument) – used as a comparison, only attributes that are present with self and not on base_instrument are written to netCDF
- epoch_name (str) – Label in file for datetime index of Instrument object
- zlib (boolean) – Flag for engaging zlib compression (True - compression on)
- complevel (int) – an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False
- shuffle (boolean) – the HDF5 shuffle filter will be applied before compressing the data, which significantly improves compression (default True). Ignored if zlib=False.
- preserve_meta_case (bool (False)) – if True, then the variable strings within the MetaData object, which preserves case, are used to name variables in the written netCDF file. If False, then the variable strings used to access data from the Instrument object are used instead. By default, the variable strings on both the data and metadata side are the same, though this relationship may be altered by a user.
- export_nan (list or None) – By default, the metadata variables where a value of NaN is allowed and written to the netCDF4 file is maintained by the Meta object attached to the pysat.Instrument object. A list supplied here will override the settings provided by Meta, and all parameters included will be written to the file. If not listed and a value is NaN then that attribute simply won’t be included in the netCDF4 file.
- unlimited_time (bool) – If True, then the main epoch dimension will be set to ‘unlimited’ within the netCDF4 file. (default=True)
Note
Stores 1-D data along dimension ‘epoch’ - the date time index.
Stores higher order data (e.g. dataframes within series) separately
- The name of the main variable column is used to prepend subvariable names within netCDF, var_subvar_sub
- A netCDF4 dimension is created for each main variable column with higher order data; first dimension Epoch
- The index organizing the data stored as a dimension variable
- from_netcdf4 uses the variable dimensions to reconstruct data structure
All attributes attached to instrument meta are written to netCDF attrs with the exception of ‘Date_End’, ‘Date_Start’, ‘File’, ‘File_Date’, ‘Generation_Date’, and ‘Logical_File_ID’. These are defined within to_netCDF at the time the file is written, as per the adopted standard, SPDF ISTP/IACG Modified for NetCDF. Attributes ‘Conventions’ and ‘Text_Supplement’ are given default values if not present.
-
today
()¶ Returns today’s date, with no hour, minute, second, etc.
Parameters: None – Returns: Today’s date Return type: datetime
-
tomorrow
()¶ Returns tomorrow’s date, with no hour, minute, second, etc.
Parameters: None – Returns: Tomorrow’s date Return type: datetime
-
variables
¶ Returns list of variables within loaded data.
-
yesterday
()¶ Returns yesterday’s date, with no hour, minute, second, etc.
Parameters: None – Returns: Yesterday’s date Return type: datetime
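The today, tomorrow, and yesterday methods all return dates with the time of day removed. A minimal standard-library sketch of that truncation follows. Whether the reference clock is local time or UTC is not stated above, so the use of the local clock here is an assumption, and strip_time is a hypothetical helper.

```python
import datetime as dt


def strip_time(when):
    """Drop hour, minute, second, and microsecond from a datetime."""
    return dt.datetime(when.year, when.month, when.day)


# Local clock used here as an assumption; see the note above.
now = dt.datetime.now()
today = strip_time(now)
tomorrow = today + dt.timedelta(days=1)
yesterday = today - dt.timedelta(days=1)
print(yesterday, today, tomorrow)
```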
Instrument Methods
The following methods support the variety of actions needed by underlying pysat.Instrument modules.
Demeter
Provides non-instrument routines for DEMETER microsatellite data
Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatIncubator (https://github.com/pysat/pysatIncubator)
-
pysat.instruments.methods.demeter.
download
(date_array, tag, sat_id, data_path=None, user=None, password=None)¶ Download
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
-
pysat.instruments.methods.demeter.
bytes_to_float
(chunk)¶ Convert a chunk of bytes to a float
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: chunk (string or bytes) – A chunk of bytes
Returns: value – A 32 bit float
Return type: float
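Converting four raw bytes into a 32 bit float is typically done with the standard struct module. The sketch below shows the general technique. The big-endian default is an assumption made for illustration, since the byte order is not stated above, and chunk_to_float is a hypothetical stand-in rather than the pysat routine.

```python
import struct


def chunk_to_float(chunk, byte_order='>'):
    """Unpack a 4-byte chunk as a 32 bit float.

    byte_order is '>' (big-endian, assumed here) or '<' (little-endian).
    """
    return struct.unpack(byte_order + 'f', chunk)[0]


# Round trip: pack a value that is exactly representable, then unpack it.
packed = struct.pack('>f', 1.5)
print(chunk_to_float(packed))
```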
-
pysat.instruments.methods.demeter.
load_general_header
(fhandle)¶ Load the general header block (block 1 for each time)
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: fhandle ((file handle)) – File handle
Returns: - data (list) – List of data values containing: P field, Number of days from 01/01/1950, number of milliseconds in the day, UT as datetime, Orbit number, downward (False) upward (True) indicator
- meta (dict) – Dictionary with meta data for keys: ‘telemetry station’, ‘software processing version’, ‘software processing subversion’, ‘calibration file version’, and ‘calibration file subversion’, ‘data names’, ‘data units’
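The header stores time as a day count from 01/01/1950 plus milliseconds within the day, alongside the equivalent UT datetime. The relationship between those fields can be sketched with the standard library as follows. It assumes the day count is zero on 1 January 1950, which the description above does not state explicitly; the helper name is hypothetical.

```python
import datetime as dt

# Reference date named in the header description.
REFERENCE_DATE = dt.datetime(1950, 1, 1)


def header_time_to_datetime(days_since_1950, msec_of_day):
    """Combine a day count and milliseconds of day into a datetime.

    Assumes days_since_1950 == 0 corresponds to 1950-01-01.
    """
    return REFERENCE_DATE + dt.timedelta(days=days_since_1950,
                                         milliseconds=msec_of_day)


print(header_time_to_datetime(1, 3600000))
```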
-
pysat.instruments.methods.demeter.
load_location_parameters
(fhandle)¶ Load the orbital and geomagnetic parameter block (block 1 for each time)
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: fhandle ((file handle)) – File handle Returns: - data (list) – List of data values containing: geoc lat, geoc lon, alt, lt, geom lat, geom lon, mlt, inv lat, L-shell, geoc lat of conj point, geoc lon of conj point, geoc lat of N conj point at 110 km, geoc lon of N conj point at 110 km, geoc lat of S conj point at 110 km, geoc lon of S conj point at 110 km, components of magnetic field at sat point, proton gyrofreq at sat point, solar position in geog coords
- meta (dict) – Dictionary with meta data for keys: ‘software processing version’, ‘software processing subversion’, ‘data names’, ‘data units’
-
pysat.instruments.methods.demeter.
load_attitude_parameters
(fhandle)¶ Load the attitude parameter block (block 1 for each time)
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: fhandle ((file handle)) – File handle Returns: - data (list) – list of data values containing: matrix elements from satellite coord system to geographic coordinate system, matrix elements from geographic coordinate system to local geomagnetic coordinate system, quality index of attitude parameters.
- meta (dict) – Dictionary with meta data for keys: ‘software processing version’, ‘software processing subversion’, ‘data names’, ‘data units’
-
pysat.instruments.methods.demeter.
load_binary_file
(fname, load_experiment_data)¶ Load the binary data from a DEMETER file
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: - fname (string) – Filename
- load_experiment_data (function) – Function to load experiment data, taking the file handle as input
Returns: - data (np.array) – Data from file stored in a numpy array
- meta (dict) – Meta data for file, including data names and units
-
pysat.instruments.methods.demeter.
set_metadata
(name, meta_dict)¶ Set metadata for each DEMETER instrument, using dict containing metadata
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatIncubator.instruments.methods.demeter
Parameters: - name (string) – DEMETER instrument name
- meta_dict (dict) – Dictionary containing metadata information and data attributes. Data attributes are available in the keys ‘data names’ and ‘data units’
Returns: meta – Meta class object
Return type:
General
Provides generalized routines for integrating instruments into pysat.
-
pysat.instruments.methods.general.
convert_timestamp_to_datetime
(inst, sec_mult=1.0)¶ Use datetime instead of timestamp for Epoch
Parameters: - inst (pysat.Instrument) – associated pysat.Instrument object
- sec_mult (float) – Multiplier needed to convert epoch time to seconds (default=1.0)
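The sec_mult parameter scales the stored epoch values to seconds before conversion. The sketch below illustrates the arithmetic for a single value. It assumes the epoch counts from 1970-01-01 (Unix time), which is not stated above, and timestamp_to_datetime is a hypothetical helper operating on one number rather than on an Instrument.

```python
import datetime as dt

# Assumed reference: Unix epoch.
UNIX_EPOCH = dt.datetime(1970, 1, 1)


def timestamp_to_datetime(epoch_value, sec_mult=1.0):
    """Scale an epoch value to seconds and convert it to a datetime."""
    return UNIX_EPOCH + dt.timedelta(seconds=epoch_value * sec_mult)


# Epoch stored in seconds: the default multiplier applies.
print(timestamp_to_datetime(1230768000))

# Epoch stored in milliseconds: scale by 1e-3 to reach seconds.
print(timestamp_to_datetime(1230768000000, sec_mult=1.0e-3))
```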
-
pysat.instruments.methods.general.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None, supported_tags=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, file_cadance=datetime.timedelta(days=1))¶ Return a Pandas Series of every file for chosen satellite data.
This routine provides a standard interface for pysat instrument modules.
Deprecated since version 2.3.0: The fake_daily_files_from_monthly kwarg has been deprecated and replaced with file_cadance in pysat 3.0.0.
Parameters: - tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
- data_path (string or NoneType) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
- format_str (string or NoneType) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
- supported_tags (dict or NoneType) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day. This keyword arg has been deprecated. In pysat 2.3.0, setting file_cadance=dt.timedelta(days=1) is equivalent to setting this to False, while using file_cadance=pds.DateOffset(months=1) is equivalent to setting this to True. (default=False)
- two_digit_year_break (int) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
- file_cadence (dt.timedelta or pds.DateOffset) – pysat assumes a daily file cadence, but some instrument data files contain longer periods of time. This parameter allows the specification of regular file cadences greater than or equal to a day (e.g., weekly, monthly, or yearly). In pysat 2.3.0, only daily and monthly cadences are supported. (default=dt.timedelta(days=1))
Returns: pysat.Files.from_os – A class containing the verified available files
Return type: (pysat._files.Files)
Examples
    fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
    supported_tags = {'dc_b': fname}
    list_files = functools.partial(nasa_cdaweb.list_files,
                                   supported_tags=supported_tags)

    fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
    supported_tags = {'': fname}
    list_files = functools.partial(mm_gen.list_files,
                                   supported_tags=supported_tags)
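The two_digit_year_break rule described in the parameter list can be written out directly. This is a minimal sketch of the stated rule only; expand_two_digit_year is a hypothetical helper and is not taken from pysat's source.

```python
def expand_two_digit_year(year, two_digit_year_break):
    """Expand a two-digit year using the documented break rule.

    Years at or above the break are placed in the 1900s, and years
    below the break are placed in the 2000s.
    """
    if year >= two_digit_year_break:
        return 1900 + year
    return 2000 + year


# With a break of 50: 99 maps to 1999, while 9 maps to 2009.
print(expand_two_digit_year(99, 50), expand_two_digit_year(9, 50))
```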
-
pysat.instruments.methods.general.
remove_leading_text
(inst, target=None)¶ Removes leading text on variable names
Parameters: - inst (pysat.Instrument) – associated pysat.Instrument object
- target (str or list of strings) – Leading string to remove. If none supplied, returns unmodified
Returns: Modifies Instrument object in place
Return type: None
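The effect of removing leading text can be sketched on a plain list of variable names. The helper strip_leading_text is hypothetical: it returns a new list instead of modifying an Instrument in place, and removing only the first matching prefix per name is an assumption made for this illustration.

```python
def strip_leading_text(names, target=None):
    """Return names with a leading target string removed.

    target may be a single string or a list of strings. If target is
    None, the names are returned unmodified.
    """
    if target is None:
        return list(names)
    targets = [target] if isinstance(target, str) else list(target)

    cleaned = []
    for name in names:
        for prefix in targets:
            if name.startswith(prefix):
                # Assumption: only the first matching prefix is removed.
                name = name[len(prefix):]
                break
        cleaned.append(name)
    return cleaned


# Hypothetical variable names, used only for illustration.
print(strip_leading_text(['ICON_L2_alt', 'ICON_L2_lat', 'time'],
                         target='ICON_L2_'))
```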
NASA CDAWeb
Provides default routines for integrating NASA CDAWeb instruments into pysat. Adding new CDAWeb datasets should only require minimal user intervention.
-
pysat.instruments.methods.nasa_cdaweb.
load
(fnames, tag=None, sat_id=None, fake_daily_files_from_monthly=False, flatten_twod=True)¶ Load NASA CDAWeb CDF files.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - fnames ((pandas.Series)) – Series of filenames
- tag ((str or NoneType)) – tag or None (default=None)
- sat_id ((str or NoneType)) – satellite id or None (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, parses the daily dates that the list_files routine, when flagged, appended internally to monthly files. These dates are used here to provide data by day.
- flatten_twod (bool) – Flattens 2D data into different columns of root DataFrame rather than produce a Series of DataFrames
Returns: - data ((pandas.DataFrame)) – Object containing satellite data
- meta ((pysat.Meta)) – Object containing metadata such as column names and units
Examples
    # within the new instrument module, at the top level define
    # a new variable named load, and set it equal to this load method
    # code below taken from cnofs_ivm.py.

    # support load routine
    # use the default CDAWeb method
    load = cdw.load
-
pysat.instruments.methods.nasa_cdaweb.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None, supported_tags=None, fake_daily_files_from_monthly=False, two_digit_year_break=None)¶ Return a Pandas Series of every file for chosen satellite data.
Deprecated since version 2.2.0: list_files will be removed in pysat 3.0.0, it will be replaced by the copy in instruments.methods.general
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. Not used. (default=None)
- data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
- format_str ((string or NoneType)) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
- supported_tags ((dict or NoneType)) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
- fake_daily_files_from_monthly ((bool)) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day.
- two_digit_year_break ((int)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
Returns: pysat.Files.from_os – A class containing the verified available files
Return type: (pysat._files.Files)
Examples
    fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
    supported_tags = {'dc_b': fname}
    list_files = functools.partial(nasa_cdaweb.list_files,
                                   supported_tags=supported_tags)

    fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
    supported_tags = {'': fname}
    list_files = functools.partial(cdw.list_files,
                                   supported_tags=supported_tags)
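The examples above rely on functools.partial to preset supported_tags so that an instrument module exposes a list_files function with the shared signature. The self-contained sketch below shows the mechanism with a stand-in function. shared_list_files is hypothetical and only returns the chosen template, and the nested sat_id then tag layout follows the supported_tags parameter description rather than the flat dictionaries in the examples.

```python
import functools


def shared_list_files(tag=None, sat_id=None, data_path=None,
                      format_str=None, supported_tags=None):
    """Stand-in for a shared list_files routine.

    Returns the file template that would be used, instead of searching
    the file system.
    """
    if format_str is None:
        format_str = supported_tags[sat_id][tag]
    return format_str


fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'': {'dc_b': fname}}

# Preset supported_tags; callers supply the remaining arguments.
list_files = functools.partial(shared_list_files,
                               supported_tags=supported_tags)

print(list_files(tag='dc_b', sat_id=''))
print(list_files(tag='dc_b', sat_id='', format_str='custom_{year:04d}.cdf'))
```

Presetting the dictionary this way keeps the per-instrument module to a few lines while the shared routine carries the search logic.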
-
pysat.instruments.methods.nasa_cdaweb.
list_remote_files
(tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', supported_tags=None, user=None, password=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, delimiter=None, year=None, month=None, day=None)¶ Return a Pandas Series of every file for chosen remote data.
Deprecated since version 2.3.0: This routine will be removed in pysat 3.0.0, it will be moved to the pysatNASA repository. Also, as of 2.2.0 the year/month/day keywords will be removed in pysat 3.0.0, they will be replaced with a start/stop syntax consistent with the download routine
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. (default=None)
- remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
- supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
- user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
- password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame. (default=False)
- two_digit_year_break ((int or NoneType)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. (default=None)
- delimiter ((string or NoneType)) – If filename is delimited, then provide delimiter alone e.g. ‘_’ (default=None)
- year ((int or NoneType)) – Selects a given year to return remote files for. None returns all years. (default=None)
- month ((int or NoneType)) – Selects a given month to return remote files for. None returns all months. Requires year to be defined. (default=None)
- day ((int or NoneType)) – Selects a given day to return remote files for. None returns all days. Requires year and month to be defined. (default=None)
Returns: pysat.Files.from_os – A class containing the verified available files
Return type: (pysat._files.Files)
Examples
    fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
    supported_tags = {'dc_b': fname}
    list_remote_files = functools.partial(nasa_cdaweb.list_remote_files,
                                          supported_tags=supported_tags)

    fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
    supported_tags = {'': fname}
    list_remote_files = functools.partial(cdw.list_remote_files,
                                          supported_tags=supported_tags)
-
pysat.instruments.methods.nasa_cdaweb.
download
(supported_tags, date_array, tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', data_path=None, user=None, password=None, fake_daily_files_from_monthly=False, multi_file_day=False)¶ Routine to download NASA CDAWeb CDF data.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Inteded to be pre-set with functools.partial then assigned to new instrument code.
- date_array (array_like) – Array of datetimes to download data for. Provided by pysat.
- tag (str or NoneType (None)) – tag or None
- sat_id ((str or NoneType)) – satellite id or None (default=None)
- remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
- data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
- user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
- password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame.
Returns: Void – Downloads data to disk.
Return type: (NoneType)
Examples
    # download support added to cnofs_vefi.py using code below
    rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
          + '_v05.cdf')
    ln = 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf'
    dc_b_tag = {'dir': '/pub/data/cnofs/vefi/bfield_1sec',
                'remote_fname': rn,
                'local_fname': ln}
    supported_tags = {'dc_b': dc_b_tag}
    download = functools.partial(nasa_cdaweb.download,
                                 supported_tags=supported_tags)
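To show how the three entries of a supported_tags value relate to one another, the sketch below fills the remote and local templates for a single date and assembles a remote address. How the download routine actually joins remote_site, 'dir', and 'remote_fname' is an assumption here, the resulting address is not checked against the server, and build_paths is a hypothetical helper.

```python
import datetime as dt

remote_site = 'https://cdaweb.gsfc.nasa.gov'

# Tag dictionary copied from the example above.
dc_b_tag = {
    'dir': '/pub/data/cnofs/vefi/bfield_1sec',
    'remote_fname': ('{year:4d}/cnofs_vefi_bfield_1sec_'
                     '{year:4d}{month:02d}{day:02d}_v05.cdf'),
    'local_fname': 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf',
}


def build_paths(tag_info, date, site=remote_site):
    """Fill the remote and local filename templates for one date."""
    parts = {'year': date.year, 'month': date.month, 'day': date.day}
    remote_name = tag_info['remote_fname'].format(**parts)
    local_name = tag_info['local_fname'].format(**parts)
    # Assumption: the pieces are joined with single forward slashes.
    remote_url = site + tag_info['dir'] + '/' + remote_name
    return remote_url, local_name


url, local = build_paths(dc_b_tag, dt.datetime(2009, 1, 1))
print(url)
print(local)
```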
NASA ICON
Provides non-instrument specific routines for ICON data
Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatNASA (https://github.com/pysat/pysatNASA)
-
pysat.instruments.methods.icon.
list_remote_files
(tag, sat_id, user=None, password=None, supported_tags=None, year=None, month=None, day=None, start=None, stop=None)¶ Return a Pandas Series of every file for chosen remote data.
This routine is intended to be used by pysat instrument modules supporting a particular UC-Berkeley SSL dataset related to ICON.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.icon
Parameters: - tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
- user (string or NoneType) – Username to be passed along to resource with relevant data. (default=None)
- password (string or NoneType) – User password to be passed along to resource with relevant data. (default=None)
- start (dt.datetime or NoneType) – Starting time for file list. A None value will start with the first file found. (default=None)
- stop (dt.datetime or NoneType) – Ending time for the file list. A None value will stop with the last file found. (default=None)
Returns: A Series formatted for the Files class (pysat._files.Files) containing filenames and indexed by date and time
Return type: pandas.Series
-
pysat.instruments.methods.icon.
ssl_download
(date_array, tag, sat_id, data_path=None, user=None, password=None, supported_tags=None)¶ Download ICON data from public area of SSL ftp server
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0. It is replaced by the pysatNASA.instruments.methods.cdaweb.download method.
Parameters: - date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
- tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
- sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
- data_path (string) – Path to directory to download data to. (default=None)
- user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password (string) – Password for data download. (default=None)
- **kwargs (dict) – Additional keywords supplied by user when invoking the download routine attached to a pysat.Instrument object are passed to this routine via kwargs.
Madrigal¶
Provides default routines for integrating CEDAR Madrigal instruments into pysat, reducing the amount of user intervention.
Deprecated since version 2.3.0: This module has been removed from pysat in the 3.0.0 release and can now be found in pysatMadrigal (https://github.com/pysat/pysatMadrigal)
-
pysat.instruments.methods.madrigal.
cedar_rules
()¶ General acknowledgement statement for Madrigal data.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal
Returns: ackn – String with general acknowledgement for all CEDAR Madrigal data
Return type: string
-
pysat.instruments.methods.madrigal.
load
(fnames, tag=None, sat_id=None, xarray_coords=[])¶ Loads data from Madrigal into Pandas.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal
This routine is called as needed by pysat. It is not intended for direct user interaction.
Parameters: - fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- tag (string ('')) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. While tag defaults to None here, pysat provides ‘’ as the default tag unless specified by user at Instrument instantiation.
- sat_id (string ('')) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
- xarray_coords (list) – List of keywords to use as coordinates if xarray output is desired instead of a Pandas DataFrame (default=[])
Returns: - data (pds.DataFrame or xr.DataSet) – A pandas DataFrame or xarray DataSet holding the data from the HDF5 file
- metadata (pysat.Meta) – Metadata from the HDF5 file, as well as default values from pysat
Examples
inst = pysat.Instrument('jro', 'isr', 'drifts')
inst.load(2010, 18)
-
pysat.instruments.methods.madrigal.
download
(date_array, inst_code=None, kindat=None, data_path=None, user=None, password=None, url='http://cedar.openmadrigal.org', file_format='hdf5')¶ Downloads data from Madrigal.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal
Parameters: - date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
- inst_code (string (None)) – Madrigal instrument code(s), cast as a string. If multiple are used, separate them with commas.
- kindat (string (None)) – Experiment instrument code(s), cast as a string. If multiple are used, separate them with commas.
- data_path (string (None)) – Path to directory to download data to.
- user (string (None)) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied.
- password (string (None)) – Password for data download.
- url (string (’http://cedar.openmadrigal.org’)) – URL for Madrigal site
- file_format (string ('hdf5')) – File format for Madrigal data. Load routines currently only accept ‘hdf5’, but any of the Madrigal options may be used here.
Returns: Void – Downloads data to disk.
Return type: (NoneType)
Notes
The user’s name should be provided in the user field, with spaces replaced by plus signs. For example, Ruby Payne-Scott should be entered as Ruby+Payne-Scott.
The password field should be the user’s email address. These parameters are passed to Madrigal when downloading.
The affiliation field is set to pysat to enable tracking of pysat downloads.
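As a small illustration of the name formatting described in these notes, the sketch below joins the parts of a name with plus signs before it would be passed as the user value. The helper name format_madrigal_user is hypothetical and is not part of pysat.

```python
def format_madrigal_user(full_name):
    """Join the parts of a name with '+' as the notes above describe.

    Hypothetical helper for illustration; not part of pysat.
    """
    # split() with no arguments also collapses repeated whitespace
    return '+'.join(full_name.split())


user = format_madrigal_user('Ruby Payne-Scott')
print(user)
```

The hyphen inside the surname is left untouched; only whitespace is replaced.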
-
pysat.instruments.methods.madrigal.
filter_data_single_date
(self)¶ Filters data to a single date.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatMadrigal.instruments.methods.madrigal
Parameters: self (pysat.Instrument) – This object
Note
Madrigal serves multiple days within a single JRO file. To counter this, each loaded day is filtered so that it only contains the relevant day of data. This filter is only applied when loading by date. It is not applied when supplying pysat with a specific filename to load, nor when data padding is enabled. Note that when data padding is enabled, the final data available within the instrument will be downselected by pysat to only include the date specified.
This routine is intended to be added to the Instrument nanokernel processing queue via
inst = pysat.Instrument()
inst.custom.add(filter_data_single_date, 'modify')
This function will then be automatically applied to the Instrument object data on every load by the pysat nanokernel.
Warning
For the best performance, this function should be added first in the queue. This may be ensured by setting the default function in a pysat instrument file to this one.
within platform_name.py set
default = pysat.instruments.methods.madrigal.filter_data_single_date
at the top level
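The filtering described for filter_data_single_date amounts to keeping only the samples whose timestamps fall on the requested day. The following is a minimal sketch of that idea on a plain pandas DataFrame. The function name keep_single_day and the 'drift' column are assumptions for illustration; the pysat routine itself operates on an Instrument object.

```python
import datetime as dt

import pandas as pd


def keep_single_day(data, date):
    """Return only the rows of `data` whose index falls on `date`.

    Hypothetical stand-in for the filtering step; `data` must have a
    DatetimeIndex.
    """
    start = pd.Timestamp(date).normalize()
    stop = start + pd.Timedelta(days=1)
    # Half-open interval [start, stop): midnight of the next day is excluded
    mask = (data.index >= start) & (data.index < stop)
    return data[mask]


# Three samples spanning two days, mimicking a multi-day file
index = pd.to_datetime(['2010-01-18 23:00', '2010-01-19 00:00',
                        '2010-01-19 12:00'])
data = pd.DataFrame({'drift': [1.0, 2.0, 3.0]}, index=index)
filtered = keep_single_day(data, dt.datetime(2010, 1, 19))
```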
Space Weather¶
Provides default routines for solar wind and geospace indices
Deprecated since version 2.3.0: This Instrument module has been removed from pysat in the 3.0.0 release and can now be found in pysatSpaceWeather (https://github.com/pysat/pysatSpaceWeather)
-
pysat.instruments.methods.sw.
calc_daily_Ap
(ap_inst, ap_name='3hr_ap', daily_name='Ap', running_name=None)¶ Calculate the daily Ap index from the 3hr ap index
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.calc_daily_Ap
Parameters: - ap_inst ((pysat.Instrument)) – pysat instrument containing 3-hourly ap data
- ap_name ((str)) – Column name for 3-hourly ap data (default=’3hr_ap’)
- daily_name ((str)) – Column name for daily Ap data (default=’Ap’)
- running_name ((str or NoneType)) – Column name for daily running average of ap, not output if None (default=None)
Returns: Void – Updates instrument to include daily Ap index under daily_name
Return type: (NoneType)
Notes
Ap is the mean of the 3hr ap indices measured for a given day
Option for running average is included since this information is used by MSIS when running with sub-daily geophysical inputs
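The daily mean described in these notes can be sketched with pandas alone. The example below uses synthetic 3-hourly values and a bare Series rather than a pysat.Instrument, so the variable names and data are assumptions, not the routine's actual implementation.

```python
import pandas as pd

# Two days of synthetic 3-hourly ap values (8 samples per day)
index = pd.date_range('2019-01-01', periods=16, freq=pd.Timedelta(hours=3))
ap_3hr = pd.Series([4, 5, 6, 7, 9, 12, 15, 22,   # first day
                    2, 2, 3, 3, 4, 4, 5, 5],     # second day
                   index=index, name='3hr_ap', dtype=float)

# Daily Ap is the mean of the eight 3-hourly ap values for each day
daily_ap = ap_3hr.resample('D').mean().rename('Ap')

# Repeat each daily value on the original 3-hourly time index
daily_on_3hr = daily_ap.reindex(ap_3hr.index, method='ffill')
```

Placing the daily value back on the 3-hourly index keeps both quantities aligned in a single table, which is one reasonable way to store the result alongside the input.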
-
pysat.instruments.methods.sw.
combine_f107
(standard_inst, forecast_inst, start=None, stop=None)¶ Combine the output from the measured and forecasted F10.7 sources
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.f107.combine_f107
Parameters: - standard_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘f107’ name, and ‘’, ‘all’, ‘prelim’, or ‘daily’ tag
- forecast_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘f107’ name, and ‘prelim’, ‘45day’ or ‘forecast’ tag
- start ((dt.datetime or NoneType)) – Starting time for combining data, or None to use earliest loaded date from the pysat Instruments (default=None)
- stop ((dt.datetime or NoneType)) – Ending time for combining data, or None to use the latest loaded date from the pysat Instruments (default=None)
Returns: f107_inst – Instrument object containing F10.7 observations for the desired period of time, merging the standard, 45day, and forecasted values based on their reliability
Return type: pysat.Instrument
Notes
Merging prioritizes the standard data, then the 45day data, and finally the forecast data
Will not attempt to download any missing data, but will load data
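The priority order stated in these notes can be illustrated with pandas' combine_first, which keeps the caller's values and fills only its gaps from the argument. This is a sketch on synthetic Series, not the routine's implementation; the values are made up.

```python
import numpy as np
import pandas as pd

days = pd.date_range('2019-01-01', periods=5, freq='D')
# Synthetic series: each source covers a different part of the window
standard = pd.Series([70.0, 71.0, np.nan, np.nan, np.nan], index=days)
day45 = pd.Series([np.nan, 99.0, 72.0, 73.0, np.nan], index=days)
forecast = pd.Series([np.nan, np.nan, np.nan, 98.0, 74.0], index=days)

# Standard data win, then the 45-day data, and finally the forecast
combined = standard.combine_first(day45).combine_first(forecast)
```

On the second day both the standard and 45-day sources have a value, and the standard value (71.0) is the one retained.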
-
pysat.instruments.methods.sw.
combine_kp
(standard_inst=None, recent_inst=None, forecast_inst=None, start=None, stop=None, fill_val=nan)¶ Combine the output from the different Kp sources for a range of dates
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.combine_kp
Parameters: - standard_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘’ tag or None to exclude (default=None)
- recent_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘recent’ tag or None to exclude (default=None)
- forecast_inst ((pysat.Instrument or NoneType)) – Instrument object containing data for the ‘sw’ platform, ‘kp’ name, and ‘forecast’ tag or None to exclude (default=None)
- start ((dt.datetime or NoneType)) – Starting time for combining data, or None to use earliest loaded date from the pysat Instruments (default=None)
- stop ((dt.datetime or NoneType)) – Ending time for combining data, or None to use the latest loaded date from the pysat Instruments (default=None)
- fill_val ((int or float)) – Desired fill value (since the standard instrument fill value differs from the other sources) (default=np.nan)
Returns: kp_inst – Instrument object containing Kp observations for the desired period of time, merging the standard, recent, and forecasted values based on their reliability
Return type: pysat.Instrument
Notes
Merging prioritizes the standard data, then the recent data, and finally the forecast data
Will not attempt to download any missing data, but will load data
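Because the sources may use different fill values, a merge has to mask each source's fill value before applying the priority order, and only then apply the single fill_val requested by the caller. The sketch below shows one way to do that with pandas. The helper merge_with_fill, the per-source fill values, and the data are all assumptions for illustration.

```python
import numpy as np
import pandas as pd


def merge_with_fill(sources, source_fills, fill_val=np.nan):
    """Merge Series in priority order after masking each source's fill value.

    Hypothetical helper; `sources` is ordered from most to least reliable.
    """
    merged = None
    for series, src_fill in zip(sources, source_fills):
        # Mask the source-specific fill value so it cannot hide real data
        masked = series.where(series != src_fill)
        merged = masked if merged is None else merged.combine_first(masked)
    # Apply the single requested fill value to whatever is still missing
    return merged if np.isnan(fill_val) else merged.fillna(fill_val)


times = pd.date_range('2019-01-01', periods=4, freq=pd.Timedelta(hours=3))
standard = pd.Series([1.0, 2.0, -1.0, -1.0], index=times)    # fill is -1
recent = pd.Series([1.3, 2.3, 3.0, np.nan], index=times)     # fill is NaN
forecast = pd.Series([np.nan, np.nan, 3.3, np.nan], index=times)

kp = merge_with_fill([standard, recent, forecast],
                     [-1.0, np.nan, np.nan], fill_val=-1)
```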
-
pysat.instruments.methods.sw.
convert_ap_to_kp
(ap_data, fill_val=-1, ap_name='ap')¶ Convert Ap into Kp
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and has been replaced with pysatSpaceWeather.instruments.methods.kp_ap.convert_ap_to_kp
Parameters: - ap_data (array-like) – Array-like object containing Ap data
- fill_val (int, float, NoneType) – Fill value for the data set (default=-1)
- ap_name (str) – Name of the input ap (default=’ap’)
Returns: - kp_data (array-like) – Array-like object containing Kp data
- meta (Metadata) – Metadata object containing information about transformed data
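One way to picture the conversion is a lookup against the standard table of ap equivalents for each Kp level. The sketch below is not the pysat implementation: the function name ap_to_kp, the choice to assign values between table entries to the lower Kp level, and the treatment of negative or NaN input as missing are all assumptions.

```python
from bisect import bisect_right
import math

# Standard ap equivalents for Kp = 0, 0+, 1-, 1, ..., 9-, 9 (steps of 1/3)
AP_STEPS = [0, 2, 3, 4, 5, 6, 7, 9, 12, 15, 18, 22, 27, 32, 39, 48, 56, 67,
            80, 94, 111, 132, 154, 179, 207, 236, 300, 400]


def ap_to_kp(ap_values, fill_val=-1):
    """Map ap values to Kp, using fill_val for missing or negative input.

    Values between table entries are assigned the lower Kp level; that
    rounding rule is an assumption of this sketch.
    """
    kp_values = []
    for ap in ap_values:
        if ap is None or math.isnan(ap) or ap < 0:
            kp_values.append(fill_val)
        else:
            # Index of the largest table entry that does not exceed ap
            level = bisect_right(AP_STEPS, ap) - 1
            kp_values.append(round(level / 3.0, 1))
    return kp_values


kp = ap_to_kp([0, 5, 27, 400, -1, float('nan')])
```

Kp is reported here in tenths (0.3 for 0+, 0.7 for 1-), a common numeric convention for the thirds scale.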
Instrument Templates¶
General Instrument¶
This is a template for a pysat.Instrument support file. Modify this file as needed when adding a new Instrument to pysat.
This is a good area to introduce the instrument, provide background on the mission, operations, instrumentation, and measurements.
Also a good place to provide contact information. This text will be included in the pysat API documentation.
Properties¶
- platform
- List platform string here
- name
- List name string here
- sat_id
- List supported sat_ids here
- tag
- List supported tag strings here
Note
- Optional section, remove if no notes
Warning
- Optional section, remove if no warnings
- Two blank lines needed afterward for proper formatting
Examples
Example code can go here
Authors¶
Author name and institution
-
pysat.instruments.templates.template_instrument.
init
(self)¶ Initializes the Instrument object with instrument specific values.
Runs once upon instantiation. Object modified in place. Optional.
Parameters: self (pysat.Instrument) – This object
-
pysat.instruments.templates.template_instrument.
default
(self)¶ Default customization function.
This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue). Object modified in place.
Parameters: self (pysat.Instrument) – This object
-
pysat.instruments.templates.template_instrument.
load
(fnames, tag=None, sat_id=None, custom_keyword=None)¶ Loads PLATFORM data into (PANDAS/XARRAY).
This routine is called as needed by pysat. It is not intended for direct user interaction.
Parameters: - fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. While tag defaults to None here, pysat provides ‘’ as the default tag unless specified by user at Instrument instantiation. (default=’’)
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- custom_keyword (type to be set) – Developers may include any custom keywords, with default values defined in the method signature. This is included here as a place holder and should be removed.
Returns: Data and Metadata are formatted for pysat. Data is a pandas DataFrame or xarray Dataset while metadata is a pysat.Meta instance.
Return type: data, metadata
Note
Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine.
Examples
inst = pysat.Instrument('ucar', 'tiegcm')
inst.load(2019, 1)
-
pysat.instruments.templates.template_instrument.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None)¶ Produce a list of files corresponding to PLATFORM/NAME.
This routine is invoked by pysat and is not intended for direct use by the end user. Arguments are provided by pysat.
Parameters: - tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
- format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns: Series of filename strings, including the path, indexed by datetime.
Return type: pandas.Series
Examples
If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' + 'v{version:02d}r{revision:04d}.NC'
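The relationship between the template and the filename can be checked with Python's built-in string formatting. The snippet below fills the template for a single date; the version and revision values are taken from the example filename.

```python
import datetime as dt

template = ('SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_'
            + 'v{version:02d}r{revision:04d}.NC')

date = dt.datetime(2019, 1, 1)
# Each keyword fills the matching zero-padded field in the template
filename = template.format(year=date.year, month=date.month, day=date.day,
                           version=1, revision=0)
print(filename)
```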
Note
The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.
Multiple data levels may be supported via the ‘tag’ input string. Multiple instruments via the sat_id string.
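The requirement that the returned Series hold no duplicate datetimes, with the most recent version kept, can be sketched in pandas as follows. This is an illustration on made-up filenames, not the behaviour of pysat.Files.from_os itself, and it assumes version and revision fields are zero-padded so that sorting filenames as text orders them from oldest to newest.

```python
import pandas as pd

# Two revisions exist for 2019-01-01
files = pd.Series(
    ['SPORT_L2_IVM_2019-01-01_v01r0001.NC',
     'SPORT_L2_IVM_2019-01-01_v01r0000.NC',
     'SPORT_L2_IVM_2019-01-02_v01r0000.NC'],
    index=pd.to_datetime(['2019-01-01', '2019-01-01', '2019-01-02']))

# Sort by filename, then by date with a stable sort so that, within a
# date, the highest version/revision comes last
ordered = files.sort_values().sort_index(kind='mergesort')

# Keep only the last (most recent) entry for each datetime
unique_files = ordered[~ordered.index.duplicated(keep='last')]
```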
-
pysat.instruments.templates.template_instrument.
list_remote_files
(tag, sat_id, user=None, password=None)¶ Return a Pandas Series of every file for chosen remote data.
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - tag (string or NoneType) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id (string or NoneType) – Specifies the satellite ID for a constellation. Not used. (default=None)
- user (string or NoneType) – Username to be passed along to resource with relevant data. (default=None)
- password (string or NoneType) – User password to be passed along to resource with relevant data. (default=None)
Returns: A Series formatted for the Files class (pysat._files.Files) containing filenames and indexed by date and time
Return type: pandas.Series
-
pysat.instruments.templates.template_instrument.
download
(date_array, tag, sat_id, data_path=None, user=None, password=None, custom_keywords=None)¶ Placeholder for PLATFORM/NAME downloads.
This routine is invoked by pysat and is not intended for direct use by the end user.
Parameters: - date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
- tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
- sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
- data_path (string) – Path to directory to download data to. (default=None)
- user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password (string) – Password for data download. (default=None)
- custom_keywords (placeholder) – Additional keywords supplied by user when invoking the download routine attached to a pysat.Instrument object are passed to this routine. Use of custom keywords here is discouraged.
-
pysat.instruments.templates.template_instrument.
clean
(inst)¶ Routine to return PLATFORM/NAME data cleaned to the specified level
Cleaning level is specified in inst.clean_level and pysat will accept user input for several strings. The clean_level is specified at instantiation of the Instrument object.
‘clean’ : All parameters should be good, suitable for statistical and case studies
‘dusty’ : All parameters should generally be good, though some may not be great
‘dirty’ : There are data areas that have issues, data should be used with caution
‘none’ : No cleaning applied, routine not called in this case.
Parameters: inst (pysat.Instrument) – Instrument class object, whose attribute clean_level is used to return the desired level of data selectivity.
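A cleaning routine typically maps each clean_level to a threshold on some quality indicator and masks the samples that fail it. The sketch below shows that pattern under stated assumptions: the 'quality_flag' and 'density' columns and the thresholds are hypothetical, and a SimpleNamespace stands in for a pysat.Instrument so the example runs without pysat.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd


def clean(inst):
    """Mask suspect samples according to inst.clean_level.

    Sketch only: the 'quality_flag' column and its thresholds are
    hypothetical and will differ for a real data set.
    """
    # Highest quality_flag value tolerated at each cleaning level
    max_flag = {'clean': 0, 'dusty': 1, 'dirty': 2}
    if inst.clean_level in max_flag:
        bad = inst.data['quality_flag'] > max_flag[inst.clean_level]
        # Replace flagged measurements with NaN, modifying data in place
        inst.data.loc[bad, 'density'] = np.nan
    return


# Stand-in for a pysat.Instrument with loaded data
inst = SimpleNamespace(
    clean_level='dusty',
    data=pd.DataFrame({'density': [1.0, 2.0, 3.0, 4.0],
                       'quality_flag': [0, 1, 2, 3]}))
clean(inst)
```

With clean_level set to 'dusty', the samples flagged 2 and 3 are masked while those flagged 0 and 1 are kept.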
netCDF Pandas¶
Generic module for loading netCDF4 files into the pandas format within pysat.
This file may be used as a template for adding pysat support for a new dataset based upon netCDF4 files, or other file types (with modification).
This routine may also be used to add quick local support for a netCDF4 based dataset without having to define an instrument module for pysat. Relevant parameters may be specified when instantiating this Instrument object to support the relevant file location and naming schemes. This presumes the pysat-developed utils.load_netcdf4 routine is able to load the file. See the load routine docstring in this module for more.
The routines defined within may also be used when adding a new instrument to pysat by importing this module and using the functools.partial methods to attach these functions to the new instrument module. See pysat/instruments/cnofs_ivm.py for more. NASA CDAWeb datasets, such as C/NOFS IVM, use the methods within pysat/instruments/methods/nasa_cdaweb.py to make adding new CDAWeb instruments easy.
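The functools.partial pattern mentioned here pre-sets the instrument-specific arguments of a shared routine, so pysat can call the result with only its standard arguments. The sketch below uses a stand-in function, generic_list_files, which is hypothetical and only returns the template it would search with. It is not the pysat routine.

```python
import functools


def generic_list_files(tag=None, sat_id=None, data_path=None,
                       format_str=None, supported_tags=None):
    """Stand-in for a shared list_files routine (not the pysat function).

    Returns the filename template that would be used to search data_path.
    """
    if format_str is None:
        # Fall back on the template registered for this sat_id and tag
        format_str = supported_tags[sat_id][tag]
    return format_str


# In a new instrument module, pre-set the module-specific templates
fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'': {'dc_b': fname}}
list_files = functools.partial(generic_list_files,
                               supported_tags=supported_tags)

# The attached routine is called without any mention of supported_tags
template = list_files(tag='dc_b', sat_id='', data_path='./')
```

A user-supplied format_str still takes precedence, since the pre-set dictionary is only consulted when format_str is None.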
-
pysat.instruments.templates.netcdf_pandas.
init
(self)¶ Initializes the Instrument object with instrument specific values.
Runs once upon instantiation. This routine provides a convenient location to print Acknowledgements or restrictions from the mission.
-
pysat.instruments.templates.netcdf_pandas.
load
(fnames, tag=None, sat_id=None, **kwargs)¶ Loads data using pysat.utils.load_netcdf4 .
This routine is called as needed by pysat. It is not intended for direct user interaction.
Parameters: - fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
- **kwargs (extra keywords) – Passthrough for additional keyword arguments specified when instantiating an Instrument object. These additional keywords are passed through to this routine by pysat.
Returns: Data and Metadata are formatted for pysat. Data is a pandas DataFrame while metadata is a pysat.Meta instance.
Return type: data, metadata
Note
Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine and through to the load_netcdf4 call.
Examples
inst = pysat.Instrument('sport', 'ivm')
inst.load(2019, 1)

# create quick Instrument object for a new, random netCDF4 file
# define filename template string to identify files
# this is normally done by instrument code, but in this case
# there is no built in pysat instrument support
# presumes files are named default_2019-01-01.NC
format_str = 'default_{year:04d}-{month:02d}-{day:02d}.NC'
inst = pysat.Instrument('netcdf', 'pandas', custom_kwarg='test',
                        data_path='./', format_str=format_str)
inst.load(2019, 1)
-
pysat.instruments.templates.netcdf_pandas.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None)¶ Produce a list of files corresponding to format_str located at data_path.
This routine is invoked by pysat and is not intended for direct use by the end user.
Multiple data levels may be supported via the ‘tag’ and ‘sat_id’ input strings.
Parameters: - tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
- format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns: Series of filename strings, including the path, indexed by datetime.
Return type: pandas.Series
Examples
If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' + 'v{version:02d}r{revision:04d}.NC'
Note
The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.
Normally the format_str for each supported tag and sat_id is defined within this routine. However, as this is a generic routine, those definitions can’t be made here. This method could be used in an instrument specific module where the list_files routine in the new package defines the format_str based upon inputs, then calls this routine passing both data_path and format_str.
Alternately, the list_files routine in methods.nasa_cdaweb may also be used and has more built in functionality. Supported tags and format strings may be defined within the new instrument module and passed as arguments to methods.nasa_cdaweb.list_files. For an example on using this routine, see pysat/instruments/cnofs_ivm.py or cnofs_vefi, cnofs_plp, omni_hro, timed_see, etc.
-
pysat.instruments.templates.netcdf_pandas.
download
(date_array, tag, sat_id, data_path=None, user=None, password=None)¶ Downloads data for supported instruments, however this is a template call.
This routine is invoked by pysat and is not intended for direct use by the end user.
Parameters: - date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
- tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
- sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
- data_path (string (None)) – Path to directory to download data to. (default=None)
- user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password (string) – Password for data download. (default=None)
NASA CDAWeb Instrument¶
This is a template for a pysat.Instrument support file that utilizes CDAWeb methods. Copy and modify this file as needed when adding a new Instrument to pysat.
This is a good area to introduce the instrument, provide background on the mission, operations, instrumentation, and measurements.
Also a good place to provide contact information. This text will be included in the pysat API documentation.
Properties¶
- platform
- List platform string here
- name
- List name string here
- sat_id
- List supported sat_ids here
- tag
- List supported tag strings here
Note
- Optional section, remove if no notes
Warning
- Optional section, remove if no warnings
- Two blank lines needed afterward for proper formatting
Examples
Example code can go here
Authors¶
Author name and institution
-
pysat.instruments.templates.template_cdaweb_instrument.
default
(self)¶ Default customization function.
This routine is automatically applied to the Instrument object on every load by the pysat nanokernel (first in queue).
Parameters: self (pysat.Instrument) – This object
-
pysat.instruments.templates.template_cdaweb_instrument.
load
(fnames, tag=None, sat_id=None, fake_daily_files_from_monthly=False, flatten_twod=True)¶ Load NASA CDAWeb CDF files.
Deprecated since version 2.3.0: This routine has been deprecated in pysat 3.0.0, and will be accessible in pysatNASA.instruments.methods.cdaweb
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - fnames ((pandas.Series)) – Series of filenames
- tag ((str or NoneType)) – tag or None (default=None)
- sat_id ((str or NoneType)) – satellite id or None (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, parses the daily dates that the list_files routine, when given the same flag, appended internally to the monthly files. These dates are used here to provide data by day.
- flatten_twod (bool) – Flattens 2D data into different columns of root DataFrame rather than produce a Series of DataFrames
Returns: - data ((pandas.DataFrame)) – Object containing satellite data
- meta ((pysat.Meta)) – Object containing metadata such as column names and units
Examples
# within the new instrument module, at the top level define
# a new variable named load, and set it equal to this load method
# code below taken from cnofs_ivm.py.

# support load routine
# use the default CDAWeb method
load = cdw.load
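The fake_daily_files_from_monthly behaviour can be pictured as two steps: the file list is expanded so each day points at its monthly file with the date attached, and the load step splits that date off again. The sketch below illustrates the idea with pandas. The helper names and the '_YYYY-MM-DD' suffix are assumptions of this example, not a statement of how pysat encodes the date.

```python
import datetime as dt

import pandas as pd


def fake_daily_from_monthly(files):
    """Expand a Series of monthly files into one tagged entry per day."""
    # Extend the index to the end of the last month so it is covered too
    end = files.index[-1] + pd.DateOffset(months=1) - pd.DateOffset(days=1)
    daily_index = pd.date_range(files.index[0], end, freq='D')
    daily = files.reindex(daily_index, method='ffill')
    # Tag each entry with its date so a load routine can recover the day
    names = [name + '_' + day.strftime('%Y-%m-%d')
             for day, name in daily.items()]
    return pd.Series(names, index=daily_index)


def split_daily_name(name):
    """Recover the monthly filename and the date from a tagged entry."""
    fname, date_str = name.rsplit('_', 1)
    return fname, dt.datetime.strptime(date_str, '%Y-%m-%d')


monthly = pd.Series(['jan.cdf', 'feb.cdf'],
                    index=pd.to_datetime(['2019-01-01', '2019-02-01']))
daily_files = fake_daily_from_monthly(monthly)
fname, date = split_daily_name(daily_files.iloc[30])
```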
-
pysat.instruments.templates.template_cdaweb_instrument.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None, *, supported_tags={'': {'': 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'}}, fake_daily_files_from_monthly=False, two_digit_year_break=None)¶ Return a Pandas Series of every file for chosen satellite data.
Deprecated since version 2.2.0: list_files will be removed in pysat 3.0.0, it will be replaced by the copy in instruments.methods.general
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. Not used. (default=None)
- data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
- format_str ((string or NoneType)) – User specified file format. If None is specified, the default formats associated with the supplied tags are used. (default=None)
- supported_tags ((dict or NoneType)) – keys are sat_id, each containing a dict keyed by tag where the values are file format template strings. (default=None)
- fake_daily_files_from_monthly ((bool)) – Some CDAWeb instrument data files are stored by month, interfering with pysat’s functionality of loading by day. This flag, when true, appends daily dates to monthly files internally. These dates are used by load routine in this module to provide data by day.
- two_digit_year_break ((int)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
Returns: pysat.Files.from_os – A class containing the verified available files
Return type: (pysat._files.Files)
Examples
fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': fname}
list_files = functools.partial(nasa_cdaweb.list_files,
                               supported_tags=supported_tags)

fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
supported_tags = {'': fname}
list_files = functools.partial(cdw.list_files,
                               supported_tags=supported_tags)
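The two_digit_year_break rule from the parameter list above is simple enough to state as a few lines of Python. The function name expand_two_digit_year is hypothetical; only the comparison rule comes from the parameter description.

```python
def expand_two_digit_year(year, two_digit_year_break):
    """Convert a two-digit year to four digits using the break rule."""
    if year >= two_digit_year_break:
        # At or above the break: the year belongs to the 1900s
        return 1900 + year
    # Below the break: the year belongs to the 2000s
    return 2000 + year


# With a break of 50, '99' is read as 1999 and '08' as 2008
years = [expand_two_digit_year(yr, 50) for yr in (99, 8, 50)]
```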
-
pysat.instruments.templates.template_cdaweb_instrument.
list_remote_files
(tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', *, supported_tags={'': {'': {'dir': '/pub/data/cnofs/vefi/bfield_1sec', 'local_fname': 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf', 'remote_fname': '{year:4d}/cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'}}}, user=None, password=None, fake_daily_files_from_monthly=False, two_digit_year_break=None, delimiter=None, year=None, month=None, day=None)¶ Return a Pandas Series of every file for chosen remote data.
Deprecated since version 2.3.0: This routine will be removed in pysat 3.0.0, it will be moved to the pysatNASA repository. Also, as of 2.2.0 the year/month/day keywords will be removed in pysat 3.0.0, they will be replaced with a start/stop syntax consistent with the download routine
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - tag ((string or NoneType)) – Denotes type of file to load. Accepted types are <tag strings>. (default=None)
- sat_id ((string or NoneType)) – Specifies the satellite ID for a constellation. (default=None)
- remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
- supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
- user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
- password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame. (default=False)
- two_digit_year_break ((int or NoneType)) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break. (default=None)
- delimiter ((string or NoneType)) – If filename is delimited, then provide delimiter alone e.g. ‘_’ (default=None)
- year ((int or NoneType)) – Selects a given year to return remote files for. None returns all years. (default=None)
- month ((int or NoneType)) – Selects a given month to return remote files for. None returns all months. Requires year to be defined. (default=None)
- day ((int or NoneType)) – Selects a given day to return remote files for. None returns all days. Requires year and month to be defined. (default=None)
Returns: pysat.Files.from_os – A class containing the verified available files
Return type: (pysat._files.Files)
Examples
    fname = 'cnofs_vefi_bfield_1sec_{year:04d}{month:02d}{day:02d}_v05.cdf'
    supported_tags = {'dc_b': fname}
    list_remote_files = functools.partial(nasa_cdaweb.list_remote_files,
                                          supported_tags=supported_tags)

    fname = 'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf'
    supported_tags = {'': fname}
    list_remote_files = functools.partial(cdw.list_remote_files,
                                          supported_tags=supported_tags)
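The two_digit_year_break rule described in the parameter list can be restated as a minimal sketch. The helper name expand_two_digit_year is hypothetical and is not part of pysat; it only encodes the documented rule that 1900 is added for years at or above the break and 2000 is added for years below it.

```python
def expand_two_digit_year(year, two_digit_year_break):
    """Expand a two-digit year following the documented rule.

    Hypothetical helper for illustration, not part of pysat.
    """
    if year >= two_digit_year_break:
        # years at or above the break belong to the 1900s
        return 1900 + year
    # years below the break belong to the 2000s
    return 2000 + year


# with a break of 50, 98 becomes 1998 and 7 becomes 2007
print(expand_two_digit_year(98, 50))
print(expand_two_digit_year(7, 50))
```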
-
pysat.instruments.templates.template_cdaweb_instrument.
download
(date_array, tag, sat_id, remote_site='https://cdaweb.gsfc.nasa.gov', data_path=None, user=None, password=None, fake_daily_files_from_monthly=False, multi_file_day=False)¶ Routine to download NASA CDAWeb CDF data.
Deprecated since version 2.3.0: This routine will be removed in pysat 3.0.0 and will be accessible in pysatNASA.instruments.methods.cdaweb
This routine is intended to be used by pysat instrument modules supporting a particular NASA CDAWeb dataset.
Parameters: - supported_tags (dict) – dict of dicts. Keys are supported tag names for download. Value is a dict with ‘dir’, ‘remote_fname’, ‘local_fname’. Intended to be pre-set with functools.partial then assigned to new instrument code.
- date_array (array_like) – Array of datetimes to download data for. Provided by pysat.
- tag (str or NoneType (None)) – tag or None
- sat_id ((str or NoneType)) – satellite id or None (default=None)
- remote_site ((string or NoneType)) – Remote site to download data from (default=’https://cdaweb.gsfc.nasa.gov’)
- data_path ((string or NoneType)) – Path to data directory. If None is specified, the value previously set in Instrument.files.data_path is used. (default=None)
- user ((string or NoneType)) – Username to be passed along to resource with relevant data. (default=None)
- password ((string or NoneType)) – User password to be passed along to resource with relevant data. (default=None)
- fake_daily_files_from_monthly (bool) – Some CDAWeb instrument data files are stored by month. This flag, when true, accommodates this reality with user feedback on a monthly time frame.
Returns: Void – Downloads data to disk.
Return type: (NoneType)
Examples
    # download support added to cnofs_vefi.py using code below
    rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
          + '_v05.cdf')
    ln = 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf'
    dc_b_tag = {'dir':'/pub/data/cnofs/vefi/bfield_1sec',
                'remote_fname': rn,
                'local_fname': ln}
    supported_tags = {'dc_b': dc_b_tag}

    download = functools.partial(nasa_cdaweb.download,
                                 supported_tags=supported_tags)
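The functools.partial pattern used in the example can be tried without pysat installed. The sketch below assumes a hypothetical stand-in, generic_download, that only builds the remote paths it would request; it is not the CDAWeb routine and performs no network access.

```python
import datetime as dt
import functools


def generic_download(date_array, tag, sat_id, supported_tags=None):
    """Hypothetical stand-in for a shared download routine.

    Returns the remote paths that would be requested rather than
    downloading anything.
    """
    info = supported_tags[tag]
    paths = []
    for date in date_array:
        fname = info['remote_fname'].format(year=date.year,
                                            month=date.month,
                                            day=date.day)
        paths.append('/'.join((info['dir'], fname)))
    return paths


rn = ('{year:4d}/cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}'
      '_v05.cdf')
ln = 'cnofs_vefi_bfield_1sec_{year:4d}{month:02d}{day:02d}_v05.cdf'
supported_tags = {'dc_b': {'dir': '/pub/data/cnofs/vefi/bfield_1sec',
                           'remote_fname': rn,
                           'local_fname': ln}}

# pre-set supported_tags so callers only pass date_array, tag and sat_id
download = functools.partial(generic_download,
                             supported_tags=supported_tags)
paths = download([dt.datetime(2009, 1, 1)], 'dc_b', '')
print(paths[0])
```

Binding supported_tags once keeps the instrument module's download signature identical to what pysat calls, while the dataset-specific details stay in one dictionary.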
-
pysat.instruments.templates.template_cdaweb_instrument.
clean
(inst)¶ Routine to return PLATFORM/NAME data cleaned to the specified level
Cleaning level is specified in inst.clean_level and pysat will accept user input for several strings. The clean_level is specified at instantiation of the Instrument object.
- ‘clean’ : All parameters should be good, suitable for statistical and case studies
- ‘dusty’ : All parameters should generally be good though some may not be great
- ‘dirty’ : There are data areas that have issues, data should be used with caution
- ‘none’ : No cleaning applied, routine not called in this case.
Parameters: inst (pysat.Instrument) – Instrument class object, whose attribute clean_level is used to return the desired level of data selectivity.
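A clean routine typically branches on inst.clean_level. The sketch below is only an illustration under stated assumptions: ToyInstrument, the per-sample 'flag' field, and the MAX_FLAG thresholds are all hypothetical and do not correspond to any real instrument's quality flags.

```python
class ToyInstrument:
    """Hypothetical stand-in exposing clean_level and data."""

    def __init__(self, clean_level, data):
        self.clean_level = clean_level
        self.data = data


# hypothetical mapping: highest quality flag tolerated at each level
MAX_FLAG = {'clean': 0, 'dusty': 1, 'dirty': 2}


def clean(inst):
    """Drop samples whose flag exceeds the limit for inst.clean_level."""
    max_flag = MAX_FLAG[inst.clean_level]
    inst.data = [row for row in inst.data if row['flag'] <= max_flag]


inst = ToyInstrument('dusty', [{'value': 1.0, 'flag': 0},
                               {'value': 2.0, 'flag': 1},
                               {'value': 3.0, 'flag': 2}])
# the routine is not called at all when clean_level is 'none'
if inst.clean_level != 'none':
    clean(inst)
values = [row['value'] for row in inst.data]
print(values)
```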
netCDF Pandas¶
Generic module for loading netCDF4 files into the pandas format within pysat.
This file may be used as a template for adding pysat support for a new dataset based upon netCDF4 files, or other file types (with modification).
This routine may also be used to add quick local support for a netCDF4 based dataset without having to define an instrument module for pysat. Relevant parameters may be specified when instantiating this Instrument object to support the relevant file location and naming schemes. This presumes the pysat developed utils.load_netCDF4 routine is able to load the file. See the load routine docstring in this module for more.
The routines defined within may also be used when adding a new instrument to pysat by importing this module and using the functools.partial methods to attach these functions to the new instrument model. See pysat/instruments/cnofs_ivm.py for more. NASA CDAWeb datasets, such as C/NOFS IVM, use the methods within pysat/instruments/methods/nasa_cdaweb.py to make adding new CDAWeb instruments easy.
-
pysat.instruments.templates.netcdf_pandas.
init
(self) Initializes the Instrument object with instrument specific values.
Runs once upon instantiation. This routine provides a convenient location to print Acknowledgements or restrictions from the mission.
-
pysat.instruments.templates.netcdf_pandas.
load
(fnames, tag=None, sat_id=None, **kwargs) Loads data using pysat.utils.load_netcdf4.
This routine is called as needed by pysat. It is not intended for direct user interaction.
Parameters: - fnames (array-like) – iterable of filename strings, full path, to data files to be loaded. This input is nominally provided by pysat itself.
- tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself.
- **kwargs (extra keywords) – Passthrough for additional keyword arguments specified when instantiating an Instrument object. These additional keywords are passed through to this routine by pysat.
Returns: Data and Metadata are formatted for pysat. Data is a pandas DataFrame while metadata is a pysat.Meta instance.
Return type: data, metadata
Note
Any additional keyword arguments passed to pysat.Instrument upon instantiation are passed along to this routine and through to the load_netcdf4 call.
Examples
    import pysat

    inst = pysat.Instrument('sport', 'ivm')
    inst.load(2019, 1)

    # create quick Instrument object for a new, random netCDF4 file
    # define filename template string to identify files
    # this is normally done by instrument code, but in this case
    # there is no built in pysat instrument support
    # presumes files are named default_2019-01-01.NC
    format_str = 'default_{year:04d}-{month:02d}-{day:02d}.NC'
    inst = pysat.Instrument('netcdf', 'pandas',
                            custom_kwarg='test',
                            data_path='./',
                            format_str=format_str)
    inst.load(2019, 1)
-
pysat.instruments.templates.netcdf_pandas.
list_files
(tag=None, sat_id=None, data_path=None, format_str=None) Produce a list of files corresponding to format_str located at data_path.
This routine is invoked by pysat and is not intended for direct use by the end user.
Multiple data levels may be supported via the ‘tag’ and ‘sat_id’ input strings.
Parameters: - tag (string) – tag name used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- sat_id (string) – Satellite ID used to identify particular data set to be loaded. This input is nominally provided by pysat itself. (default=’’)
- data_path (string) – Full path to directory containing files to be loaded. This is provided by pysat. The user may specify their own data path at Instrument instantiation and it will appear here. (default=None)
- format_str (string) – String template used to parse the datasets filenames. If a user supplies a template string at Instrument instantiation then it will appear here, otherwise defaults to None. (default=None)
Returns: Series of filename strings, including the path, indexed by datetime.
Return type: pandas.Series
Examples
If a filename is SPORT_L2_IVM_2019-01-01_v01r0000.NC then the template is 'SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_' + 'v{version:02d}r{revision:04d}.NC'
Note
The returned Series should not have any duplicate datetimes. If there are multiple versions of a file the most recent version should be kept and the rest discarded. This routine uses the pysat.Files.from_os constructor, thus the returned files are up to pysat specifications.
Normally the format_str for each supported tag and sat_id is defined within this routine. However, as this is a generic routine, those definitions can’t be made here. This method could be used in an instrument specific module where the list_files routine in the new package defines the format_str based upon inputs, then calls this routine passing both data_path and format_str.
Alternatively, the list_files routine in methods.nasa_cdaweb may also be used and has more built in functionality. Supported tags and format strings may be defined within the new instrument module and passed as arguments to methods.nasa_cdaweb.list_files. For an example on using this routine, see pysat/instruments/cnofs_ivm.py or cnofs_vefi, cnofs_plp, omni_hro, timed_see, etc.
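The template from the Examples and the rule about keeping only the most recent version can be illustrated with the standard library alone. This is a sketch, not how pysat.Files.from_os is implemented: the regular expression and the plain dict used in place of a pandas Series are assumptions made for illustration.

```python
import datetime as dt
import re

template = ('SPORT_L2_IVM_{year:04d}-{month:02d}-{day:02d}_'
            'v{version:02d}r{revision:04d}.NC')

# build the filename quoted in the example from its components
fname = template.format(year=2019, month=1, day=1, version=1, revision=0)
print(fname)

# hypothetical regular expression mirroring the template, used to pull
# the date and version information back out of each filename
pattern = re.compile(r'SPORT_L2_IVM_(\d{4})-(\d{2})-(\d{2})_'
                     r'v(\d{2})r(\d{4})\.NC')

files = ['SPORT_L2_IVM_2019-01-01_v01r0000.NC',
         'SPORT_L2_IVM_2019-01-01_v01r0002.NC',
         'SPORT_L2_IVM_2019-01-02_v01r0001.NC']

latest = {}
for name in files:
    fields = [int(val) for val in pattern.match(name).groups()]
    year, month, day, version, revision = fields
    date = dt.datetime(year, month, day)
    key = (version, revision)
    # keep only the highest (version, revision) pair for each date
    if date not in latest or key > latest[date][0]:
        latest[date] = (key, name)

for date in sorted(latest):
    print(date.date(), latest[date][1])
```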
-
pysat.instruments.templates.netcdf_pandas.
download
(date_array, tag, sat_id, data_path=None, user=None, password=None) Downloads data for supported instruments, however this is a template call.
This routine is invoked by pysat and is not intended for direct use by the end user.
Parameters: - date_array (array-like) – list of datetimes to download data for. The sequence of dates need not be contiguous.
- tag (string) – Tag identifier used for particular dataset. This input is provided by pysat. (default=’’)
- sat_id (string) – Satellite ID string identifier used for particular dataset. This input is provided by pysat. (default=’’)
- data_path (string (None)) – Path to directory to download data to. (default=None)
- user (string) – User string input used for download. Provided by user and passed via pysat. If an account is required for downloads, this routine must raise an error if user is not supplied. (default=None)
- password (string) – Password for data download. (default=None)
Constellation¶
-
class
pysat.
Constellation
(instruments=None, name=None, const_module=None)¶ Manage and analyze data from multiple pysat Instruments.
Created as part of a Spring 2018 UTDesign project.
Deprecated since version 2.3.0: The name kwarg was changed to const_module in pysat 3.0.0
Constructs a Constellation given a list of instruments or the name of a file with a pre-defined constellation.
Parameters: - instruments (list) – a list of pysat Instruments
- name (string) – Name of a file in pysat/constellations containing a list of instruments.
- const_module (string or NoneType) – Name of a pysat constellation module (default=None)
Note
The name and instruments parameters should not both be set. If neither is given, an empty constellation will be created.
-
add
(bounds1, label1, bounds2, label2, bin3, label3, data_label)¶ Combines signals from multiple instruments within given bounds.
Deprecated since version 2.2.0: add will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: - bounds1 ((min, max)) – Bounds for selecting data on the axis of label1 Data points with label1 in [min, max) will be considered.
- label1 (string) – Data label for bounds1 to act on.
- bounds2 ((min, max)) – Bounds for selecting data on the axis of label2. Data points with label2 in [min, max) will be considered.
- label2 (string) – Data label for bounds2 to act on.
- bin3 ((min, max, #bins)) – Min and max bounds and number of bins for third axis.
- label3 (string) – Data label for third axis.
- data_label (array of strings) – Data label(s) for data product(s) to be averaged.
Returns: median – Dictionary indexed by data label, each value of which is a dictionary with keys ‘median’, ‘count’, ‘avg_abs_dev’, and ‘bin’ (the values of the bin edges.)
Return type: dictionary
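The selection and binning described by the parameters can be sketched with plain Python. This is not the pysat implementation: binned_median is a hypothetical helper, points are plain dicts rather than Instrument data, only a single data label is handled, and the 'avg_abs_dev' key is omitted because its exact definition is not given here.

```python
import statistics


def binned_median(points, bounds1, label1, bounds2, label2,
                  bin3, label3, data_label):
    """Hypothetical sketch: median of data_label in bins of label3."""
    low, high, nbins = bin3
    width = (high - low) / nbins
    edges = [low + i * width for i in range(nbins + 1)]
    binned = [[] for _ in range(nbins)]
    for point in points:
        # keep points inside the half-open intervals [min, max)
        if not bounds1[0] <= point[label1] < bounds1[1]:
            continue
        if not bounds2[0] <= point[label2] < bounds2[1]:
            continue
        if not low <= point[label3] < high:
            continue
        # guard against round-off placing a point past the last bin
        index = min(int((point[label3] - low) / width), nbins - 1)
        binned[index].append(point[data_label])
    medians = [statistics.median(vals) if vals else None
               for vals in binned]
    counts = [len(vals) for vals in binned]
    return {'median': medians, 'count': counts, 'bin': edges}


points = [{'lat': 1, 'lon': 5, 'alt': 10, 'dens': 1.0},
          {'lat': 2, 'lon': 6, 'alt': 20, 'dens': 3.0},
          {'lat': 3, 'lon': 7, 'alt': 60, 'dens': 5.0},
          {'lat': 50, 'lon': 7, 'alt': 60, 'dens': 100.0}]
result = binned_median(points, (0, 10), 'lat', (0, 20), 'lon',
                       (0, 100, 2), 'alt', 'dens')
print(result)
```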
-
data_mod
(*args, **kwargs)¶ Register a function to modify data of member Instruments.
The function is not partially applied to modify member data.
When the Constellation receives a function call to register a function for data modification, it passes the call to each instrument and registers it in the instrument’s pysat.Custom queue.
(Wraps pysat.Custom.add; documentation of that function is reproduced here.)
Parameters: - function (string or function object) – name of function or function object to be added to queue
- kind ({'add', 'modify', 'pass'}) –
- add
- Adds data returned from function to instrument object.
- modify
- pysat instrument object supplied to routine. Any and all changes to object are retained.
- pass
- A copy of pysat object is passed to function. No data is accepted from return.
- at_pos (string or int) – insert at position. (default, insert at end).
- args (extra arguments) –
Note
Allowed add function returns:
- {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
- pandas DataFrame, names of columns are used
- pandas Series, .name required
- (string/list of strings, numpy array/list of arrays)
-
difference
(instrument1, instrument2, bounds, data_labels, cost_function)¶ Calculates the difference in signals from multiple instruments within the given bounds.
Deprecated since version 2.2.0: difference will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: - instrument1 (Instrument) – Information must already be loaded into the instrument.
- instrument2 (Instrument) – Information must already be loaded into the instrument.
- bounds (list of tuples in the form (inst1_label, inst2_label, min, max, max_difference)) – inst1_label and inst2_label are labels for the data in instrument1 and instrument2; min and max are bounds on the data considered; max_difference is the maximum difference between two points for the difference to be calculated
- data_labels (list of tuples of data labels) – The first label is used to access data in instrument1 and the second to access data in instrument2.
- cost_function (function) – function that operates on two rows of the instrument data. used to determine the distance between two points for finding closest points
Returns: - data_df (pandas DataFrame) – Each row has a point from instrument1, with the keys preceded by 1_, and a point within bounds on that point from instrument2 with the keys preceded by 2_, and the difference between the instruments’ data for all the labels in data_labels
- Created as part of a Spring 2018 UTDesign project.
-
load
(*args, **kwargs)¶ Load instrument data into instrument object.data
(Wraps pysat.Instrument.load; documentation of that function is reproduced here.)
Parameters: - yr (integer) – Year for desired data
- doy (integer) – day of year
- date (datetime object) – date to load
- fname ('string') – filename to be loaded
- verifyPad (boolean) – if true, padding data not removed (debug purposes)
-
set_bounds
(start, stop)¶ Sets boundaries for all instruments in constellation
Custom¶
-
class
pysat.
Custom
¶ Applies a queue of functions when instrument.load called.
Deprecated since version 2.3.0: Custom will be removed in pysat 3.0.0, it is incorporated into Instrument
Nano-kernel functionality enables instrument objects that are ‘set and forget’. The functions are always run whenever the instrument load routine is called so instrument objects may be passed safely to other routines and the data will always be processed appropriately.
Examples
    def custom_func(inst, opt_param1=False, opt_param2=False):
        return None
    instrument.custom.attach(custom_func, 'modify', opt_param1=True)

    def custom_func2(inst, opt_param1=False, opt_param2=False):
        return data_to_be_added
    instrument.custom.attach(custom_func2, 'add', opt_param2=True)
    instrument.load(date=date)
    print(instrument['data_to_be_added'])
Note
User should interact with Custom through pysat.Instrument instance’s attribute, instrument.custom
-
add
(function, kind='add', at_pos='end', *args, **kwargs)¶ Add a function to custom processing queue.
Deprecated since version 2.2.0: Custom.add will be removed in pysat 3.0.0, it is replaced by Instrument.custom_attach to clarify the syntax
Custom functions are applied automatically to associated pysat instrument whenever instrument.load command called.
Parameters: - function (string or function object) – name of function or function object to be added to queue
- kind ({'add', 'modify', 'pass'}) –
- add
- Adds data returned from function to instrument object. A copy of pysat instrument object supplied to routine.
- modify
- pysat instrument object supplied to routine. Any and all changes to object are retained.
- pass
- A copy of pysat object is passed to function. No data is accepted from return.
- at_pos (string or int) – insert at position. (default, insert at end).
- args (extra arguments) – extra arguments are passed to the custom function (once)
- kwargs (extra keyword arguments) – extra keyword args are passed to the custom function (once)
Note
Allowed add function returns:
- {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
- pandas DataFrame, names of columns are used
- pandas Series, .name required
- (string/list of strings, numpy array/list of arrays)
-
attach
(function, kind='add', at_pos='end', *args, **kwargs)¶ Attach a function to custom processing queue.
Deprecated since version 2.3.0: Custom.attach will be removed in pysat 3.0.0, it is replaced by Instrument.custom_attach
Custom functions are applied automatically to associated pysat instrument whenever instrument.load command called.
Parameters: - function (string or function object) – name of function or function object to be added to queue
- kind ({'add', 'modify', 'pass'}) –
- add
- Adds data returned from function to instrument object. A copy of pysat instrument object supplied to routine.
- modify
- pysat instrument object supplied to routine. Any and all changes to object are retained.
- pass
- A copy of pysat object is passed to function. No data is accepted from return.
- at_pos (string or int) – insert at position. (default, insert at end).
- args (extra arguments) – extra arguments are passed to the custom function (once)
- kwargs (extra keyword arguments) – extra keyword args are passed to the custom function (once)
Note
Allowed attach function returns:
- {‘data’ : pandas Series/DataFrame/array_like, ‘units’ : string/array_like of strings, ‘long_name’ : string/array_like of strings, ‘name’ : string/array_like of strings (iff data array_like)}
- pandas DataFrame, names of columns are used
- pandas Series, .name required
- (string/list of strings, numpy array/list of arrays)
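The three kinds of custom function can be illustrated with a toy processing queue. ToyInstrument below is a hypothetical stand-in written for this sketch and is not pysat's Custom class; it only mimics the documented behaviour that 'add' stores returned data, 'modify' keeps changes made to the object, and 'pass' discards any return.

```python
import copy


class ToyInstrument:
    """Hypothetical stand-in for an instrument with a custom queue."""

    def __init__(self):
        self.data = {}
        self._queue = []

    def attach(self, function, kind='add', *args, **kwargs):
        # store the function with its kind and extra arguments
        self._queue.append((function, kind, args, kwargs))

    def load(self, values):
        self.data = {'raw': list(values)}
        # every queued function runs on each load, in order
        for function, kind, args, kwargs in self._queue:
            if kind == 'add':
                # function sees a copy, returned data is stored
                name, new = function(copy.deepcopy(self), *args, **kwargs)
                self.data[name] = new
            elif kind == 'modify':
                # function receives the object itself, changes are kept
                function(self, *args, **kwargs)
            elif kind == 'pass':
                # function sees a copy, nothing returned is kept
                function(copy.deepcopy(self), *args, **kwargs)
            else:
                raise ValueError('unknown kind: {}'.format(kind))


def scale(inst, factor=1.0):
    inst.data['raw'] = [factor * val for val in inst.data['raw']]


def doubled(inst):
    # (string, list) return, one of the allowed forms listed above
    return 'doubled', [2 * val for val in inst.data['raw']]


inst = ToyInstrument()
inst.attach(scale, 'modify', factor=10.0)
inst.attach(doubled, 'add')
inst.load([1, 2, 3])
print(inst.data)
```

Because the queue is replayed on every load, an object configured this way can be handed to other routines and still deliver processed data.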
-
clear
()¶ Clear custom function list.
Deprecated since version 2.3.0: Custom.clear will be removed in pysat 3.0.0, it is replaced by Instrument.custom_clear
-
Files¶
-
class
pysat.
Files
(sat, manual_org=False, directory_format=None, update_files=False, file_format=None, write_to_disk=True, ignore_empty_files=False)¶ Maintains collection of files for instrument object.
Uses the list_files functions for each specific instrument to create an ordered collection of files in time. Used by instrument object to load the correct files. Files also contains helper methods for determining the presence of new files and creating an ordered list of files.
-
base_path
¶ path to .pysat directory in user home
Type: string
-
start_date
¶ date of first file, used as default start bound for instrument object
Type: datetime
-
stop_date
¶ date of last file, used as default stop bound for instrument object
Type: datetime
-
data_path
¶ path to the directory containing instrument files, top_dir/platform/name/tag/
Type: string
-
manual_org
¶ if True, then Files will look directly in pysat data directory for data files and will not use /platform/name/tag
Type: bool
-
update_files
¶ updates files on instantiation if True
Type: bool
Note
User should generally use the interface provided by a pysat.Instrument instance. Exceptions are the classmethod from_os, provided to assist in generating the appropriate output for an instrument routine.
Examples
    # convenient file access
    inst = pysat.Instrument(platform=platform, name=name, tag=tag,
                            sat_id=sat_id)
    # first file
    inst.files[0]

    # files from start up to stop (exclusive on stop)
    start = pysat.datetime(2009,1,1)
    stop = pysat.datetime(2009,1,3)
    print(inst.files[start:stop])

    # files for date
    print(inst.files[start])

    # files by slicing
    print(inst.files[0:4])

    # get a list of new files
    # new files are those that weren't present the last time
    # a given instrument's file list was stored
    new_files = inst.files.get_new()

    # search pysat appropriate directory for instrument files and
    # update Files instance.
    inst.files.refresh()
Initialization for Files class object
Parameters: - sat (pysat._instrument.Instrument) – Instrument object
- manual_org (boolean) – If True, then pysat will look directly in pysat data directory for data files and will not use default /platform/name/tag (default=False)
- directory_format (string or NoneType) – directory naming structure in string format. Variables such as platform, name, and tag will be filled in as needed using python string formatting. The default directory structure would be expressed as ‘{platform}/{name}/{tag}’ (default=None)
- update_files (boolean) – If True, immediately query filesystem for instrument files and store (default=False)
- file_format (str or NoneType) – File naming structure in string format. Variables such as year, month, and sat_id will be filled in as needed using python string formatting. The default file format structure is supplied in the instrument list_files routine. (default=None)
- write_to_disk (boolean) – If True, the list of Instrument files will be written to disk. Setting this to False prevents a race condition when running multiple pysat processes.
- ignore_empty_files (boolean) – If True, the list of files found will be checked to ensure the file sizes are greater than zero. Empty files are removed from the stored list of files.
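The directory_format template is filled with ordinary Python string formatting. In the short sketch below the top-level directory is a hypothetical path chosen for illustration, and the platform, name, and tag values are example inputs.

```python
# default directory structure, as described for directory_format
directory_format = '{platform}/{name}/{tag}'

# fill the template with example instrument identifiers
sub_dir = directory_format.format(platform='cnofs', name='vefi',
                                  tag='dc_b')

# hypothetical top-level data directory
top_dir = '/home/user/pysat_data'
data_path = '/'.join((top_dir, sub_dir))
print(data_path)
```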
-
classmethod
from_os
(data_path=None, format_str=None, two_digit_year_break=None, delimiter=None)¶ Produces a list of files and formats it for Files class.
Requires fixed_width or delimited filename
Parameters: - data_path (string) – Top level directory to search files for. This directory is provided by pysat to the instrument_module.list_files functions as data_path.
- format_str (string with python format codes) – Provides the naming pattern of the instrument files and the locations of date information so an ordered list may be produced. Supports ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, ‘version’, and ‘revision’ Ex: ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v01.cdf’
- two_digit_year_break (int) – If filenames only store two digits for the year, then ‘1900’ will be added for years >= two_digit_year_break and ‘2000’ will be added for years < two_digit_year_break.
- delimiter (string (None)) – If set, then filename will be processed using delimiter rather than assuming a fixed width
Note
Does not produce a Files instance, but the proper output from instrument_module.list_files method.
The ‘?’ may be used to indicate a set number of spaces for a variable part of the name that need not be extracted. ‘cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf’
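One way to picture how a fixed-width template with ‘?’ placeholders identifies files is to translate it into a regular expression. The helper template_to_regex below is hypothetical and is not how from_os is implemented; it assumes integer fields written as {name:Nd} or {name:0Nd} and treats each ‘?’ as exactly one arbitrary character.

```python
import datetime as dt
import re


def template_to_regex(format_str):
    """Translate a fixed-width filename template into a regex.

    Hypothetical helper for illustration, not pysat's implementation.
    """
    pieces = []
    pos = 0
    # locate integer fields such as {year:4d} or {month:02d}
    for match in re.finditer(r'\{(\w+):0?(\d+)d\}', format_str):
        literal = format_str[pos:match.start()]
        # '?' stands for exactly one arbitrary character
        pieces.append(re.escape(literal).replace(r'\?', '.'))
        pieces.append(r'(?P<{}>\d{{{}}})'.format(match.group(1),
                                                match.group(2)))
        pos = match.end()
    pieces.append(re.escape(format_str[pos:]).replace(r'\?', '.'))
    return re.compile('^' + ''.join(pieces) + '$')


regex = template_to_regex(
    'cnofs_cindi_ivm_500ms_{year:4d}{month:02d}{day:02d}_v??.cdf')
found = regex.match('cnofs_cindi_ivm_500ms_20090101_v03.cdf')
date = dt.datetime(int(found.group('year')),
                   int(found.group('month')),
                   int(found.group('day')))
print(date)
```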
-
get_file_array
(start, end)¶ Return a list of filenames between and including start and end.
Parameters: - start (array_like or single string) – filenames for start of returned filelist
- stop (array_like or single string) – filenames inclusive end of list
Returns: list of filenames between and including start and end over all intervals.
-
get_index
(fname)¶ Return index for a given filename.
Parameters: fname (string) – filename
Note
If fname not found in the file information already attached to the instrument.files instance, then a files.refresh() call is made.
-
get_new
()¶ List new files since last recorded file state.
pysat stores filenames in the user_home/.pysat directory. Returns a list of all new filenames since the last known change to files. Filenames are stored if there is a change and either update_files is True at instrument object level or files.refresh() is called.
Returns: files are indexed by datetime
Return type: pandas.Series
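The idea behind get_new, comparing the current file list against the last stored state, can be sketched with plain dictionaries. This is an illustration only: pysat returns a pandas.Series, and the filenames used here are hypothetical.

```python
import datetime as dt

# file list stored the last time the instrument files were recorded
stored = {dt.datetime(2009, 1, 1): 'vefi_20090101.cdf',
          dt.datetime(2009, 1, 2): 'vefi_20090102.cdf'}

# file list found on disk now
current = {dt.datetime(2009, 1, 1): 'vefi_20090101.cdf',
           dt.datetime(2009, 1, 2): 'vefi_20090102.cdf',
           dt.datetime(2009, 1, 3): 'vefi_20090103.cdf'}

# new files are those present now but absent from the stored state
known = set(stored.values())
new_files = {date: name for date, name in sorted(current.items())
             if name not in known}
print(new_files)
```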
-
refresh
()¶ Update list of files, if there are changes.
Calls underlying list_rtn for the particular science instrument. Typically, these routines search in the pysat provided path, pysat_data_dir/platform/name/tag/, where pysat_data_dir is set by pysat.utils.set_data_dir(path=path).
-
Meta¶
-
class
pysat.
Meta
(metadata=None, units_label='units', name_label='long_name', notes_label='notes', desc_label='desc', plot_label='label', axis_label='axis', scale_label='scale', min_label='value_min', max_label='value_max', fill_label='fill', export_nan=[])¶ Stores metadata for Instrument instance, similar to CF-1.6 netCDF data standard.
Parameters: - metadata (pandas.DataFrame) – DataFrame should be indexed by variable name that contains at minimum the standard_name (name), units, and long_name for the data stored in the associated pysat Instrument object.
- units_label (str) – String used to label units in storage. Defaults to ‘units’.
- name_label (str) – String used to label long_name in storage. Defaults to ‘long_name’.
- notes_label (str) – String used to label ‘notes’ in storage. Defaults to ‘notes’
- desc_label (str) – String used to label variable descriptions in storage. Defaults to ‘desc’
- plot_label (str) – String used to label variables in plots. Defaults to ‘label’
- axis_label (str) – Label used for axis on a plot. Defaults to ‘axis’
- scale_label (str) – string used to label plot scaling type in storage. Defaults to ‘scale’
- min_label (str) – String used to label typical variable value min limit in storage. Defaults to ‘value_min’
- max_label (str) – String used to label typical variable value max limit in storage. Defaults to ‘value_max’
- fill_label (str) – String used to label fill value in storage. Defaults to ‘fill’ per netCDF4 standard
-
data
¶ index is variable standard name, ‘units’, ‘long_name’, and other defaults are also stored along with additional user provided labels.
Type: pandas.DataFrame
-
units_label
¶ String used to label units in storage. Defaults to ‘units’.
Type: str
-
name_label
¶ String used to label long_name in storage. Defaults to ‘long_name’.
Type: str
-
notes_label
¶ String used to label ‘notes’ in storage. Defaults to ‘notes’
Type: str
-
desc_label
¶ String used to label variable descriptions in storage. Defaults to ‘desc’
Type: str
-
plot_label
¶ String used to label variables in plots. Defaults to ‘label’
Type: str
-
axis_label
¶ Label used for axis on a plot. Defaults to ‘axis’
Type: str
-
scale_label
¶ string used to label plot scaling type in storage. Defaults to ‘scale’
Type: str
-
min_label
¶ String used to label typical variable value min limit in storage. Defaults to ‘value_min’
Type: str
-
max_label
¶ String used to label typical variable value max limit in storage. Defaults to ‘value_max’
Type: str
-
fill_label
¶ String used to label fill value in storage. Defaults to ‘fill’ per netCDF4 standard
Type: str
-
export_nan
¶ List of labels that should be exported even if their value is nan. By default, metadata with a value of nan will be excluded from export.
Type: list
Notes
Meta object preserves the case of variables and attributes as it first receives the data. Subsequent calls to set new metadata with the same variable or attribute will use case of first call. Accessing or setting data thereafter is case insensitive. In practice, use is case insensitive but the original case is preserved. Case preservation is built in to support writing files with a desired case to meet standards.
Metadata for higher order data objects, those that have multiple products under a single variable name in a pysat.Instrument object, are stored by providing a Meta object under the single name.
Supports any custom metadata values in addition to the expected metadata attributes (units, name, notes, desc, plot_label, axis, scale, value_min, value_max, and fill). These base attributes may be used to programmatically access and set types of metadata regardless of the string values used for the attribute. String values for attributes may need to be changed depending upon the standards of code or files interacting with pysat.
Meta objects returned as part of pysat loading routines are automatically updated to use the same values of plot_label, units_label, etc. as found on the pysat.Instrument object.
Examples
    # instantiate Meta object, default values for attribute labels are used
    meta = pysat.Meta()
    # set a couple base units
    # note that other base parameters not set below will
    # be assigned a default value
    meta['name'] = {'long_name':string, 'units':string}
    # update 'units' to new value
    meta['name'] = {'units':string}
    # update 'long_name' to new value
    meta['name'] = {'long_name':string}
    # attach new info with partial information, 'long_name' set to 'name2'
    meta['name2'] = {'units':string}
    # units are set to '' by default
    meta['name3'] = {'long_name':string}

    # assigning custom meta parameters
    meta['name4'] = {'units':string, 'long_name':string,
                     'custom1':string, 'custom2':value}
    meta['name5'] = {'custom1':string, 'custom3':value}

    # assign multiple variables at once
    meta[['name1', 'name2']] = {'long_name':[string1, string2],
                                'units':[string1, string2],
                                'custom10':[string1, string2]}

    # assigning metadata for n-Dimensional variables
    meta2 = pysat.Meta()
    meta2['name41'] = {'long_name':string, 'units':string}
    meta2['name42'] = {'long_name':string, 'units':string}
    meta['name4'] = {'meta':meta2}
    # or
    meta['name4'] = meta2
    meta['name4'].children['name41']

    # mixture of 1D and higher dimensional data
    meta = pysat.Meta()
    meta['dm'] = {'units':'hey', 'long_name':'boo'}
    meta['rpa'] = {'units':'crazy', 'long_name':'boo_whoo'}
    meta2 = pysat.Meta()
    meta2[['higher', 'lower']] = {'meta':[meta, None],
                                  'units':[None, 'boo'],
                                  'long_name':[None, 'boohoo']}

    # assign from another Meta object
    meta[key1] = meta2[key2]

    # access fill info for a variable, presuming default label
    meta[key1, 'fill']
    # access same info, even if 'fill' not used to label fill values
    meta[key1, meta.fill_label]

    # change a label used by Meta object
    # note that all instances of fill_label
    # within the meta object are updated
    meta.fill_label = '_FillValue'
    meta.plot_label = 'Special Plot Variable'
    # this feature is useful when converting metadata within pysat
    # so that it is consistent with externally imposed file standards
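The case handling described in the Notes, case-insensitive access with the first spelling preserved, can be illustrated with a small mapping. CasePreservingDict is a hypothetical class written for this sketch and is far simpler than pysat.Meta.

```python
class CasePreservingDict:
    """Hypothetical mapping illustrating case-insensitive access."""

    def __init__(self):
        self._names = {}   # lower-case name -> name as first supplied
        self._values = {}  # lower-case name -> stored value

    def __setitem__(self, name, value):
        key = name.lower()
        # the first spelling seen is the one that is preserved
        self._names.setdefault(key, name)
        self._values[key] = value

    def __getitem__(self, name):
        return self._values[name.lower()]

    def keys(self):
        return list(self._names.values())


meta = CasePreservingDict()
meta['Density'] = 'cm^-3'
# same variable with a different case: value updates, spelling does not
meta['DENSITY'] = 'm^-3'
print(meta.keys())
print(meta['density'])
```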
-
accept_default_labels
(other)¶ Applies labels for default meta labels from other onto self.
Parameters: other (Meta) – Meta object to take default labels from
Return type: Meta
-
apply_default_labels
(other)¶ Applies labels for default meta labels from self onto other.
Parameters: other (Meta) – Meta object to have default labels applied
Return type: Meta
-
attr_case_name
(name)¶ Returns preserved case name for case insensitive value of name.
Checks first within standard attributes. If not found there, checks attributes for higher order data structures. If not found, returns supplied name as it is available for use. Intended to be used to help ensure that the same case is applied to all repetitions of a given variable name.
Parameters: name (str) – name of variable to get stored case form Returns: name in proper case Return type: str
-
attrs
()¶ Yields metadata products stored for each variable name
-
concat
(other, strict=False)¶ Concatenates two metadata objects.
Parameters: - other (Meta) – Meta object to be concatenated
- strict (bool) – if True, ensure there are no duplicate variable names
Notes
Uses units and name label of self if other is different
Returns: Concatenated object Return type: Meta
-
drop
(names)¶ Drops variables (names) from metadata.
-
empty
¶ Return boolean True if there is no metadata
-
classmethod
from_csv
(name=None, col_names=None, sep=None, **kwargs)¶ Create instrument metadata object from csv.
Parameters: - name (string) – absolute filename for csv file or name of file stored in pandas instruments location
- col_names (list-like collection of strings) – column names in csv and resultant meta object
- sep (string) – column separator for supplied csv filename
Note
column names must include at least [‘name’, ‘long_name’, ‘units’], assumed if col_names is None.
-
has_attr
(name)¶ Returns boolean indicating presence of given attribute name
Case-insensitive check
Notes
Does not check higher order meta objects
Parameters: name (str) – name of variable to get stored case form Returns: True if case-insensitive check for attribute name is True Return type: bool
-
keep
(keep_names)¶ Keeps variables (keep_names) while dropping other parameters
Parameters: keep_names (list-like) – variables to keep
-
keys
()¶ Yields variable names stored for 1D variables
-
keys_nD
()¶ Yields keys for higher order metadata
-
merge
(other)¶ Adds metadata variables to self that are in other but not in self.
Parameters: other (pysat.Meta) –
-
pop
(name)¶ Remove and return metadata about variable
Parameters: name (str) – variable name Returns: Series of metadata for variable Return type: pandas.Series
-
transfer_attributes_to_instrument
(inst, strict_names=False)¶ Transfer non-standard attributes in Meta to Instrument object.
Pysat’s load_netCDF and similar routines are only able to attach netCDF4 attributes to a Meta object. This routine identifies these attributes and removes them from the Meta object. Intent is to support simple transfers to the pysat.Instrument object.
Will not transfer names that conflict with pysat default attributes.
Parameters: - inst (pysat.Instrument) – Instrument object to transfer attributes to
- strict_names (boolean (False)) – If True, produces an error if the Instrument object already has an attribute with the same name to be copied.
Returns: pysat.Instrument object modified in place with new attributes
Return type: None
-
var_case_name
(name)¶ Provides stored name (case preserved) for case insensitive input
If name is not found (case-insensitive check) then name is returned, as input. This function is intended to be used to help ensure the case of a given variable name is the same across the Meta object.
Parameters: name (str) – variable name in any case Returns: string with case preserved as in metaobject Return type: str
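The case-preserving lookup described for var_case_name and attr_case_name can be pictured with a small stand-alone sketch. The class and method names below are hypothetical and are not part of pysat; the sketch only illustrates returning the stored spelling for a case-insensitive match, and returning the input unchanged when nothing matches.

```python
class CasePreservingNames:
    """Hypothetical helper (not pysat API): remembers the case of stored names."""

    def __init__(self):
        # Maps the lower-cased name to the name as it was first stored
        self._stored = {}

    def add(self, name):
        # Keep the first spelling seen for a given case-insensitive name
        self._stored.setdefault(name.lower(), name)

    def case_name(self, name):
        # Return the stored spelling if present, otherwise the input as given
        return self._stored.get(name.lower(), name)


names = CasePreservingNames()
names.add('dB_mer')
print(names.case_name('DB_MER'))   # spelling as stored
print(names.case_name('New_Var'))  # not stored, returned as supplied
```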
Orbits¶
-
class
pysat.
Orbits
(sat=None, index=None, kind=None, period=None)¶ Determines orbits on the fly and provides orbital data in .data.
Determines the locations of orbit breaks in the loaded data in inst.data and provides iteration tools and convenient orbit selection via inst.orbit[orbit num].
Parameters: - sat (pysat.Instrument instance) – instrument object to determine orbits for
- index (string) – name of the data series to use for determining orbit breaks
- kind ({'local time', 'longitude', 'polar', 'orbit'}) –
kind of orbit, determines how orbital breaks are determined
- local time: negative gradients in lt or breaks in inst.data.index
- longitude: negative gradients or breaks in inst.data.index
- polar: zero crossings in latitude or breaks in inst.data.index
- orbit: uses unique values of orbit number
- period (np.timedelta64) – length of time for orbital period, used to gauge when a break in the datetime index (inst.data.index) is large enough to consider it a new orbit
Note
This class should not be called directly by the user; use the interface provided by inst.orbits, where inst = pysat.Instrument()
Warning
This class is still under development.
Examples
info = {'index':'longitude', 'kind':'longitude'}
vefi = pysat.Instrument(platform='cnofs', name='vefi', tag='dc_b',
                        clean_level=None, orbit_info=info)
start = pysat.datetime(2009,1,1)
stop = pysat.datetime(2009,1,10)
vefi.load(date=start)
vefi.bounds = (start, stop)

# iterate over orbits
for vefi in vefi.orbits:
    print('Next available orbit ', vefi['dB_mer'])

# load fifth orbit of first day
vefi.load(date=start)
vefi.orbits[5]

# less convenient load
vefi.orbits.load(5)

# manually iterate orbit
vefi.orbits.next()
# backwards
vefi.orbits.prev()
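For the 'local time' and 'longitude' kinds, orbit breaks are described as negative gradients in the chosen series. The following is a minimal sketch of that idea only; the function name is hypothetical, it is not the pysat implementation, and it ignores the additional check for gaps in the datetime index that the period parameter controls.

```python
def find_orbit_breaks(values):
    """Return indices where a new orbit starts, based on negative gradients.

    Hypothetical helper for illustration; index 0 always starts an orbit.
    """
    breaks = [0] if values else []
    for i in range(1, len(values)):
        # A decrease relative to the previous sample marks a wrap-around
        if values[i] - values[i - 1] < 0:
            breaks.append(i)
    return breaks


# Longitude in degrees, wrapping past 360 twice
longitude = [300.0, 330.0, 355.0, 10.0, 40.0, 350.0, 5.0]
orbit_starts = find_orbit_breaks(longitude)
print(orbit_starts)
```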
-
current
¶ Current orbit number.
Returns: None if no orbit data. Otherwise, returns orbit number, beginning with zero. The first and last orbit of a day is somewhat ambiguous. The first orbit for day n is generally also the last orbit on day n - 1. When iterating forward, the orbit will be labeled as first (0). When iterating backward, orbit labeled as the last. Return type: int or None
-
load
(orbit=None)¶ Load a particular orbit into .data for loaded day.
Parameters: orbit (int) – orbit number, 1 indexed Note
A day of data must be loaded before this routine functions properly. If the last orbit of the day is requested, it will automatically be padded with data from the next day. The orbit counter will be reset to 1.
-
next
(*arg, **kwarg)¶ Load the next orbit into .data.
Note
Forms complete orbits across day boundaries. If no data loaded then the first orbit from the first date of data is returned.
-
prev
(*arg, **kwarg)¶ Load the previous orbit into .data.
Note
Forms complete orbits across day boundaries. If no data loaded then the last orbit of data from the last day is loaded into .data.
Seasonal Analysis¶
Occurrence Probability¶
Occurrence probability routines, daily or by orbit.
Routines calculate the occurrence of an event greater than a supplied gate occurring at least once per day, or once per orbit. The probability is calculated as the (number of times with at least one hit in bin)/(number of times in the bin). The data used to determine the occurrence must be 1D. If a property of a 2D or higher dataset is needed, attach a custom function that performs the check and returns a 1D Series.
Deprecated since version 2.2.0: ssnl.occur_prob will be removed in pysat 3.0.0, it will be added to pysatSeasons: https://github.com/pysat/pysatSeasons
Note
The included routines use the bounds attached to the supplied instrument object as the season of interest.
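The probability definition above can be worked through with a 1D sketch. This is an illustration under stated assumptions, not the pysat routine: the function name and the input layout (one list of (bin value, data value) samples per day or orbit) are hypothetical, and only a single binning dimension is used.

```python
def occurrence_probability(periods, bin_edges, gate):
    """Occurrence probability per bin for a list of periods (days or orbits).

    periods: one list of (bin_value, data_value) samples per period.
    A period counts once per bin it visits, and counts as a hit for that bin
    if any sample there exceeds the gate.
    """
    nbins = len(bin_edges) - 1
    hits = [0] * nbins
    total = [0] * nbins
    for samples in periods:
        seen, hit = set(), set()
        for bin_value, data_value in samples:
            for k in range(nbins):
                if bin_edges[k] <= bin_value < bin_edges[k + 1]:
                    seen.add(k)
                    if data_value > gate:
                        hit.add(k)
                    break
        for k in seen:
            total[k] += 1
        for k in hit:
            hits[k] += 1
    # Bins that were never visited have no defined probability
    return [hits[k] / total[k] if total[k] else float('nan')
            for k in range(nbins)]


days = [[(5, 1.0), (5, 3.0), (15, 0.5)],   # day 1: one hit in the first bin
        [(5, 0.1), (15, 0.2)]]             # day 2: no hits
prob = occurrence_probability(days, bin_edges=[0, 10, 20], gate=2.0)
print(prob)
```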
-
pysat.ssnl.occur_prob.
by_orbit2D
(inst, bin1, label1, bin2, label2, data_label, gate, returnBins=False)¶ 2D Occurrence Probability of data_label orbit-by-orbit over a season.
Deprecated since version 2.2.0: by_orbit2D will be removed in pysat 3.0.0, it will be added to pysatSeasons
If data_label is greater than gate at least once per orbit, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)
Parameters: - inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
- binx (list) – [min value, max value, number of bins]
- labelx (string) – identifies data product for binx
- data_label (list of strings) – identifies data product(s) to calculate occurrence probability
- gate (list of values) – values that data_label must achieve to be counted as an occurrence
- returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns: occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of orbits with any data; ‘bin_x’ and ‘bin_y’ are also returned if requested. Note that arrays are organized for direct plotting, y values along rows, x along columns.
Return type: dictionary
Note
Season delineated by the bounds attached to Instrument object.
-
pysat.ssnl.occur_prob.
by_orbit3D
(inst, bin1, label1, bin2, label2, bin3, label3, data_label, gate, returnBins=False)¶ 3D Occurrence Probability of data_label orbit-by-orbit over a season.
Deprecated since version 2.2.0: by_orbit3D will be removed in pysat 3.0.0, it will be added to pysatSeasons
If data_label is greater than gate at least once per orbit, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)
Parameters: - inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
- binx (list) – [min value, max value, number of bins]
- labelx (string) – identifies data product for binx
- data_label (list of strings) – identifies data product(s) to calculate occurrence probability
- gate (list of values) – values that data_label must achieve to be counted as an occurrence
- returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns: occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of orbits with any data; ‘bin_x’, ‘bin_y’, and ‘bin_z’ are also returned if requested. Note that arrays are organized for direct plotting, z,y,x.
Return type: dictionary
Note
Season delineated by the bounds attached to Instrument object.
-
pysat.ssnl.occur_prob.
daily2D
(inst, bin1, label1, bin2, label2, data_label, gate, returnBins=False)¶ 2D Daily Occurrence Probability of data_label > gate over a season.
Deprecated since version 2.2.0: daily2D will be removed in pysat 3.0.0, it will be added to pysatSeasons
If data_label is greater than gate at least once per day, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)
Parameters: - inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
- binx (list) – [min, max, number of bins]
- labelx (string) – name for data product for binx
- data_label (list of strings) – identifies data product(s) to calculate occurrence probability e.g. inst[data_label]
- gate (list of values) – values that data_label must achieve to be counted as an occurrence
- returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns: occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of days with any data; ‘bin_x’ and ‘bin_y’ are also returned if requested. Note that arrays are organized for direct plotting, y values along rows, x along columns.
Return type: dictionary
Note
Season delineated by the bounds attached to Instrument object.
-
pysat.ssnl.occur_prob.
daily3D
(inst, bin1, label1, bin2, label2, bin3, label3, data_label, gate, returnBins=False)¶ 3D Daily Occurrence Probability of data_label > gate over a season.
Deprecated since version 2.2.0: daily3D will be removed in pysat 3.0.0, it will be added to pysatSeasons
If data_label is greater than gate at least once per day, then a 100% occurrence probability results. Season delineated by the bounds attached to Instrument object. Prob = (# of times with at least one hit)/(# of times in bin)
Parameters: - inst (pysat.Instrument()) – Instrument to use for calculating occurrence probability
- binx (list) – [min, max, number of bins]
- labelx (string) – name for data product for binx
- data_label (list of strings) – identifies data product(s) to calculate occurrence probability
- gate (list of values) – values that data_label must achieve to be counted as an occurrence
- returnBins (Boolean) – if True, return arrays with values of bin edges, useful for pcolor
Returns: occur_prob – A dict of dicts indexed by data_label. Each entry is dict with entries ‘prob’ for the probability and ‘count’ for the number of days with any data; ‘bin_x’, ‘bin_y’, and ‘bin_z’ are also returned if requested. Note that arrays are organized for direct plotting, z,y,x.
Return type: dictionary
Note
Season delineated by the bounds attached to Instrument object.
Average¶
Instrument independent seasonal averaging routine. Supports averaging 1D and 2D data.
Deprecated since version 2.2.0: ssnl.avg will be removed in pysat 3.0.0, it will be added to pysatSeasons: https://github.com/pysat/pysatSeasons
-
pysat.ssnl.avg.
mean_by_day
(inst, data_label)¶ Mean of data_label by day over Instrument.bounds
Deprecated since version 2.2.0: mean_by_day will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: data_label (string) – string identifying data product to be averaged Returns: mean – simple mean of data_label indexed by day Return type: pandas Series
-
pysat.ssnl.avg.
mean_by_file
(inst, data_label)¶ Mean of data_label by file over Instrument.bounds
Deprecated since version 2.2.0: mean_by_file will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: data_label (string) – string identifying data product to be averaged Returns: mean – simple mean of data_label indexed by start of each file Return type: pandas Series
-
pysat.ssnl.avg.
mean_by_orbit
(inst, data_label)¶ Mean of data_label by orbit over Instrument.bounds
Deprecated since version 2.2.0: mean_by_orbit will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: data_label (string) – string identifying data product to be averaged Returns: mean – simple mean of data_label indexed by start of each orbit Return type: pandas Series
-
pysat.ssnl.avg.
median1D
(const, bin1, label1, data_label, auto_bin=True, returnData=False)¶ Return a 1D median of data_label over a season and label1
Deprecated since version 2.2.0: median1D will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: - const (Constellation or Instrument) – Constellation or Instrument object
- bin1 (array-like) – List holding [min, max, number of bins] or array-like containing bin edges
- label1 (string) – data column name that the binning will be performed over (e.g., lat)
- data_label (list-like) – contains strings identifying data product(s) to be averaged
- auto_bin (boolean) – If True, the function will create bins from the min, max, and number of bins. If False, bin edges must be manually entered
- returnData (boolean) – Return data in output dictionary as well as statistics
Returns: median – 1D median accessed by data_label as a function of label1 over the season delineated by bounds of passed instrument objects. Also includes ‘count’ and ‘avg_abs_dev’ as well as the values of the bin edges in ‘bin_x’
Return type: dictionary
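The binned-median output described above (median, 'count', 'avg_abs_dev', and 'bin_x') can be sketched with the standard library. This is not the pysat implementation: the function name is hypothetical, the bins are built from [min, max, number of bins] as in the auto_bin case, and 'avg_abs_dev' is assumed here to mean the average absolute deviation about the bin median.

```python
import statistics


def binned_median(bin_values, data_values, bin_min, bin_max, num_bins):
    """Median, count, and average absolute deviation of data in equal bins."""
    width = (bin_max - bin_min) / num_bins
    edges = [bin_min + k * width for k in range(num_bins + 1)]
    groups = [[] for _ in range(num_bins)]
    for b, d in zip(bin_values, data_values):
        k = int((b - bin_min) // width)
        if 0 <= k < num_bins:  # samples outside the bins are ignored
            groups[k].append(d)

    out = {'median': [], 'count': [], 'avg_abs_dev': [], 'bin_x': edges}
    for group in groups:
        if group:
            med = statistics.median(group)
            dev = sum(abs(v - med) for v in group) / len(group)
        else:
            med = dev = float('nan')
        out['median'].append(med)
        out['count'].append(len(group))
        out['avg_abs_dev'].append(dev)
    return out


lat = [-5, -2, 3, 4, 8]
density = [1, 3, 10, 20, 30]
result = binned_median(lat, density, bin_min=-10, bin_max=10, num_bins=2)
print(result['median'], result['count'])
```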
-
pysat.ssnl.avg.
median2D
(const, bin1, label1, bin2, label2, data_label, returnData=False, auto_bin=True)¶ Return a 2D median of data_label over a season and label1, label2.
Deprecated since version 2.2.0: median2D will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: - const (Constellation or Instrument) –
- bin# (array-like) – [min, max, number of bins], or array-like containing bin edges
- label# (string) – identifies data product for bin#
- data_label (list-like) – contains strings identifying data product(s) to be averaged
- auto_bin (boolean) – If True, the function will create bins from the min, max, and number of bins. If False, bin edges must be manually entered
Returns: median – 2D median accessed by data_label as a function of label1 and label2 over the season delineated by bounds of passed instrument objects. Also includes ‘count’ and ‘avg_abs_dev’ as well as the values of the bin edges in ‘bin_x’ and ‘bin_y’.
Return type: dictionary
Plot¶
-
pysat.ssnl.plot.
scatterplot
(inst, labelx, labely, data_label, datalim, xlim=None, ylim=None)¶ Return scatterplot of data_label(s) as functions of labelx,y over a season.
Deprecated since version 2.2.0: scatterplot will be removed in pysat 3.0.0, it will be added to pysatSeasons
Parameters: - labelx (string) – data product for x-axis
- labely (string) – data product for y-axis
- data_label (string, array-like of strings) – data product(s) to be scatter plotted
- datalim (numpy array) – plot limits for data_label
Returns: A list of scatter plots of data_label as a function of labelx and labely over the season delineated by start and stop datetime objects.
Utilities¶
pysat.utils - utilities for running pysat¶
pysat.utils contains a number of functions used throughout the pysat package. This includes conversion of formats, loading of files, and user-supplied info for the pysat data directory structure.
Coordinates¶
pysat.utils.coords - coordinate transformations for pysat¶
pysat.utils.coords contains a number of coordinate-transformation functions used throughout the pysat package.
-
pysat.utils.coords.
adjust_cyclic_data
(samples, high=6.283185307179586, low=0.0)¶ Adjust cyclic values such as longitude to a different scale
Parameters: - samples (array_like) – Input array
- high (float or int) – Upper boundary of the cyclic range (default=2 pi)
- low (float or int) – Lower boundary of the cyclic range (default=0)
- axis (int or NoneType) – Axis along which standard deviations are computed. The default is to compute the standard deviation of the flattened array
Returns: out_samples – Samples adjusted to lie within the specified cyclic range
Return type: float
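Adjusting cyclic values to a different scale amounts to wrapping each sample into the interval between low and high. A minimal sketch follows, using a modulo operation on plain Python lists; the function name is hypothetical and this is not presented as the pysat implementation. The interval is taken as closed at low and open at high.

```python
def wrap_cyclic(samples, high=360.0, low=0.0):
    """Wrap each sample into the range [low, high)."""
    span = high - low
    return [(value - low) % span + low for value in samples]


# Longitudes expressed on 0 to 360, then on -180 to 180
print(wrap_cyclic([370.0, -10.0, 180.0]))
print(wrap_cyclic([190.0, 350.0], high=180.0, low=-180.0))
```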
-
pysat.utils.coords.
calc_solar_local_time
(inst, lon_name=None, slt_name='slt')¶ Append solar local time to an instrument object
Parameters: - inst (pysat.Instrument instance) – instrument object to be updated
- lon_name (string) – name of the longitude data key (assumes data are in degrees)
- slt_name (string) – name of the output solar local time data key (default=’slt’)
Returns: Return type: updates instrument data in column specified by slt_name
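Solar local time is commonly approximated as universal time in hours plus longitude divided by 15 degrees per hour, wrapped to a 24 hour range. The sketch below shows that relation for a single timestamp; the function name is hypothetical and the formula is an assumption about the general approach rather than a copy of the pysat routine, which operates on an Instrument object.

```python
import datetime as dt


def solar_local_time(time, longitude_deg):
    """Approximate solar local time in hours from UT and longitude in degrees."""
    ut_hours = time.hour + time.minute / 60.0 + time.second / 3600.0
    # 15 degrees of longitude correspond to one hour of local time
    return (ut_hours + longitude_deg / 15.0) % 24.0


noon_ut = dt.datetime(2009, 1, 1, 12, 0, 0)
print(solar_local_time(noon_ut, 90.0))
print(solar_local_time(noon_ut, -90.0))
```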
-
pysat.utils.coords.
geodetic_to_geocentric
(lat_in, lon_in=None, inverse=False)¶ Converts position from geodetic to geocentric or vice-versa.
Deprecated since version 2.2.0: geodetic_to_geocentric will be removed in pysat 3.0.0, it will be added to pysatMadrigal
Parameters: - lat_in (float) – latitude in degrees.
- lon_in (float or NoneType) – longitude in degrees. Remains unchanged, so does not need to be included. (default=None)
- inverse (bool) – False for geodetic to geocentric, True for geocentric to geodetic. (default=False)
Returns: - lat_out (float) – latitude [degree] (geocentric/detic if inverse=False/True)
- lon_out (float or NoneType) – longitude [degree] (geocentric/detic if inverse=False/True)
- rad_earth (float) – Earth radius [km] (geocentric/detic if inverse=False/True)
Notes
Uses WGS-84 values
References
Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro
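For a point on the surface of the WGS-84 ellipsoid, the geodetic-to-geocentric latitude conversion can be sketched with the standard relation tan(geocentric) = (b/a)^2 tan(geodetic). The function names below are hypothetical, and the code covers only the forward direction for latitude and the ellipsoid radius; it is not the pysat implementation.

```python
import math

# WGS-84 ellipsoid constants
WGS84_A = 6378.137                 # equatorial radius in km
WGS84_F = 1.0 / 298.257223563      # flattening
WGS84_B = WGS84_A * (1.0 - WGS84_F)  # polar radius in km


def geodetic_to_geocentric_lat(lat_deg):
    """Geocentric latitude in degrees for a surface point at geodetic lat_deg."""
    lat = math.radians(lat_deg)
    ratio = (WGS84_B / WGS84_A) ** 2
    return math.degrees(math.atan(ratio * math.tan(lat)))


def ellipsoid_radius(geocentric_lat_deg):
    """Distance in km from Earth's center to the ellipsoid surface."""
    lat = math.radians(geocentric_lat_deg)
    return WGS84_A * WGS84_B / math.sqrt((WGS84_A * math.sin(lat)) ** 2
                                         + (WGS84_B * math.cos(lat)) ** 2)


lat_c = geodetic_to_geocentric_lat(45.0)
print(lat_c, ellipsoid_radius(lat_c))
```

The largest difference between the two latitudes occurs near 45 degrees and is roughly 0.19 degrees.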
-
pysat.utils.coords.
geodetic_to_geocentric_horizontal
(lat_in, lon_in, az_in, el_in, inverse=False)¶ Converts from local horizontal coordinates in a geodetic system to local horizontal coordinates in a geocentric system
Deprecated since version 2.2.0: geodetic_to_geocentric_horizontal will be removed in pysat 3.0.0, it will be added to pysatMadrigal
Parameters: - lat_in (float) – latitude in degrees of the local horizontal coordinate system center
- lon_in (float) – longitude in degrees of the local horizontal coordinate system center
- az_in (float) – azimuth in degrees within the local horizontal coordinate system
- el_in (float) – elevation in degrees within the local horizontal coordinate system
- inverse (bool) – False for geodetic to geocentric, True for inverse (default=False)
Returns: - lat_out (float) – latitude in degrees of the converted horizontal coordinate system center
- lon_out (float) – longitude in degrees of the converted horizontal coordinate system center
- rad_earth (float) – Earth radius in km at the geocentric/detic (False/True) location
- az_out (float) – azimuth in degrees of the converted horizontal coordinate system
- el_out (float) – elevation in degrees of the converted horizontal coordinate system
References
Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro
-
pysat.utils.coords.
global_to_local_cartesian
(x_in, y_in, z_in, lat_cent, lon_cent, rad_cent, inverse=False)¶ Converts a position from global to local cartesian or vice-versa
Deprecated since version 2.2.0: global_to_local_cartesian will be removed in pysat 3.0.0, it will be added to pysatMadrigal
Parameters: - x_in (float) – global or local cartesian x in km (inverse=False/True)
- y_in (float) – global or local cartesian y in km (inverse=False/True)
- z_in (float) – global or local cartesian z in km (inverse=False/True)
- lat_cent (float) – geocentric latitude in degrees of local cartesian system origin
- lon_cent (float) – geocentric longitude in degrees of local cartesian system origin
- rad_cent (float) – distance from center of the Earth in km of local cartesian system origin
- inverse (bool) – False to convert from global to local cartesian coordinates, and True for the inverse (default=False)
Returns: - x_out (float) – local or global cartesian x in km (inverse=False/True)
- y_out (float) – local or global cartesian y in km (inverse=False/True)
- z_out (float) – local or global cartesian z in km (inverse=False/True)
Notes
The global cartesian coordinate system has its origin at the center of the Earth, while the local system has its origin specified by the input latitude, longitude, and radius. The global system has x intersecting the equatorial plane and the prime meridian, z pointing North along the rotational axis, and y completing the right-handed coordinate system. The local system has z pointing up, y pointing North, and x pointing East.
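The axis conventions in the note above correspond to the standard Earth-centered to east/north/up rotation. The sketch below applies that rotation in the forward direction only (global to local); the function name is hypothetical and the code is an illustration of the geometry, not the pysat routine.

```python
import math


def global_to_local(x, y, z, lat_cent, lon_cent, rad_cent):
    """Convert a global cartesian position (km) to local east, north, up (km).

    lat_cent and lon_cent are the geocentric latitude and longitude in degrees
    of the local origin; rad_cent is its distance from Earth's center in km.
    """
    lat = math.radians(lat_cent)
    lon = math.radians(lon_cent)

    # Global position of the local origin
    x0 = rad_cent * math.cos(lat) * math.cos(lon)
    y0 = rad_cent * math.cos(lat) * math.sin(lon)
    z0 = rad_cent * math.sin(lat)
    dx, dy, dz = x - x0, y - y0, z - z0

    # Project the offset onto the local East (x), North (y), and up (z) axes
    east = -math.sin(lon) * dx + math.cos(lon) * dy
    north = (-math.sin(lat) * math.cos(lon) * dx
             - math.sin(lat) * math.sin(lon) * dy
             + math.cos(lat) * dz)
    up = (math.cos(lat) * math.cos(lon) * dx
          + math.cos(lat) * math.sin(lon) * dy
          + math.sin(lat) * dz)
    return east, north, up


# A point 10 km above an origin on the equator at the prime meridian
print(global_to_local(6381.0, 0.0, 0.0, 0.0, 0.0, 6371.0))
```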
-
pysat.utils.coords.
local_horizontal_to_global_geo
(az, el, dist, lat_orig, lon_orig, alt_orig, geodetic=True)¶ Convert from local horizontal coordinates to geodetic or geocentric coordinates
Deprecated since version 2.2.0: local_horizontal_to_global_geo will be removed in pysat 3.0.0, it will be added to pysatMadrigal
Parameters: - az (float) – Azimuth (angle from North) of point in degrees
- el (float) – Elevation (angle from ground) of point in degrees
- dist (float) – Distance from origin to point in km
- lat_orig (float) – Latitude of origin in degrees
- lon_orig (float) – Longitude of origin in degrees
- alt_orig (float) – Altitude of origin in km from the surface of the Earth
- geodetic (bool) – True if origin coordinates are geodetic, False if they are geocentric. Will return coordinates in the same system as the origin input. (default=True)
Returns: - lat_pnt (float) – Latitude of point in degrees
- lon_pnt (float) – Longitude of point in degrees
- rad_pnt (float) – Distance to the point from the centre of the Earth in km
References
Based on J.M. Ruohoniemi’s geopack and R.J. Barnes radar.pro
-
pysat.utils.coords.
scale_units
(out_unit, in_unit)¶ Determine the scaling factor between two units
Deprecated since version 2.2.0: utils.coords.scale_units will be removed in pysat 3.0.0, it will be moved to utils.scale_units
Parameters: - out_unit (str) – Desired unit after scaling
- in_unit (str) – Unit to be scaled
Returns: unit_scale – Scaling factor that will convert from in_unit to out_unit
Return type: float
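A scaling factor between two units can be derived from a table giving each unit's size relative to a common reference. The sketch below does this for angular units only; the table contents, unit strings, and function name are assumptions made for illustration and do not describe which units the pysat function accepts.

```python
import math

# Hypothetical table: how many of each unit make up one full cycle
UNITS_PER_CYCLE = {'deg': 360.0, 'rad': 2.0 * math.pi, 'h': 24.0}


def angular_scale(out_unit, in_unit):
    """Factor that converts a value in in_unit to out_unit by multiplication."""
    try:
        return UNITS_PER_CYCLE[out_unit] / UNITS_PER_CYCLE[in_unit]
    except KeyError as err:
        raise ValueError('unknown unit: {}'.format(err))


# 180 degrees of longitude expressed in hours of local time
print(180.0 * angular_scale('h', 'deg'))
```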
-
pysat.utils.coords.
spherical_to_cartesian
(az_in, el_in, r_in, inverse=False)¶ Convert a position from spherical to cartesian, or vice-versa
Deprecated since version 2.2.0: spherical_to_cartesian will be removed in pysat 3.0.0, it will be added to pysatMadrigal
Parameters: - az_in (float) – azimuth/longitude in degrees or cartesian x in km (inverse=False/True)
- el_in (float) – elevation/latitude in degrees or cartesian y in km (inverse=False/True)
- r_in (float) – distance from origin in km or cartesian z in km (inverse=False/True)
- inverse (bool) – False to go from spherical to cartesian and True for the inverse
Returns: - x_out (float) – cartesian x in km or azimuth/longitude in degrees (inverse=False/True)
- y_out (float) – cartesian y in km or elevation/latitude in degrees (inverse=False/True)
- z_out (float) – cartesian z in km or distance from origin in km (inverse=False/True)
Notes
This transform is the same for local or global spherical/cartesian transformations.
Returns elevation angle (angle from the xy plane) rather than zenith angle (angle from the z-axis)
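Because the transform uses elevation (angle from the xy plane) rather than zenith angle, the cartesian components follow x = r cos(el) cos(az), y = r cos(el) sin(az), z = r sin(el). The sketch below implements both directions with hypothetical function names; it illustrates the convention in the notes and is not the pysat routine.

```python
import math


def sph_to_cart(az_deg, el_deg, r):
    """Spherical (azimuth, elevation in degrees, radius) to cartesian x, y, z."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    x = r * math.cos(el) * math.cos(az)
    y = r * math.cos(el) * math.sin(az)
    z = r * math.sin(el)
    return x, y, z


def cart_to_sph(x, y, z):
    """Cartesian x, y, z to azimuth, elevation (degrees) and radius (r > 0)."""
    r = math.sqrt(x * x + y * y + z * z)
    az = math.degrees(math.atan2(y, x))
    el = math.degrees(math.asin(z / r))
    return az, el, r


x, y, z = sph_to_cart(30.0, 45.0, 100.0)
print(cart_to_sph(x, y, z))
```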
-
pysat.utils.coords.
update_longitude
(inst, lon_name=None, high=180.0, low=-180.0)¶ Update longitude to the desired range
Parameters: - inst (pysat.Instrument instance) – instrument object to be updated
- lon_name (string) – name of the longitude data
- high (float) – Highest allowed longitude value (default=180.0)
- low (float) – Lowest allowed longitude value (default=-180.0)
Returns: Return type: updates instrument data in column ‘lon_name’
Statistics¶
pysat.utils.stats - statistical operations in pysat¶
pysat.utils.stats contains a number of statistical functions used throughout the pysat package.
-
pysat.utils.stats.
median1D
(self, bin_params, bin_label, data_label)¶ Calculates the median for a series of binned data.
Deprecated since version 2.2.0: median1D will be removed in pysat 3.0.0, a similar function will be added to pysatSeasons
Parameters: - bin_params (array_like) – Input array defining the bins in which the median is calculated
- bin_label (string) – Name of data parameter which the bins cover
- data_label (string) – Name of data parameter to take the median of in each bin
Returns: medians – The median data value in each bin
Return type: array_like
-
pysat.utils.stats.
nan_circmean
(samples, high=6.283185307179586, low=0.0, axis=None)¶ NaN insensitive version of scipy’s circular mean routine
Deprecated since version 2.1.0: nan_circmean will be removed in pysat 3.0.0, this functionality has been added to scipy 1.4
Parameters: - samples (array_like) – Input array
- high (float or int) – Upper boundary for circular mean range (default=2 pi)
- low (float or int) – Lower boundary for circular mean range (default=0)
- axis (int or NoneType) – Axis along which means are computed. The default is to compute the mean of the flattened array
Returns: circmean – Circular mean
Return type: float
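A NaN-insensitive circular mean can be sketched by dropping NaN samples, mapping the remaining values onto the unit circle, and taking the angle of the summed sine and cosine components. The function below is a stand-alone illustration for flat lists with a hypothetical name; it omits the axis argument and is neither the pysat nor the scipy implementation.

```python
import math


def circmean_ignoring_nan(samples, high=2.0 * math.pi, low=0.0):
    """Circular mean of samples on the range low to high, skipping NaN values."""
    valid = [s for s in samples if not math.isnan(s)]
    if not valid:
        return float('nan')

    # Scale the samples so that the full range maps onto 0 to 2 pi
    scale = 2.0 * math.pi / (high - low)
    sin_sum = sum(math.sin((v - low) * scale) for v in valid)
    cos_sum = sum(math.cos((v - low) * scale) for v in valid)

    angle = math.atan2(sin_sum, cos_sum)
    if angle < 0.0:
        angle += 2.0 * math.pi
    return angle / scale + low


# 350 and 30 degrees average to 10 degrees on a circle, not 190
print(circmean_ignoring_nan([350.0, 30.0, float('nan')], high=360.0))
```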
-
pysat.utils.stats.
nan_circstd
(samples, high=6.283185307179586, low=0.0, axis=None)¶ NaN insensitive version of scipy’s circular standard deviation routine
Deprecated since version 2.1.0: nan_circstd will be removed in pysat 3.0.0, this functionality has been added to scipy 1.4
Parameters: - samples (array_like) – Input array
- high (float or int) – Upper boundary for circular standard deviation range (default=2 pi)
- low (float or int) – Lower boundary for circular standard deviation range (default=0)
- axis (int or NoneType) – Axis along which standard deviations are computed. The default is to compute the standard deviation of the flattened array
Returns: circstd – Circular standard deviation
Return type: float
Time¶
pysat.utils.time - date and time operations in pysat¶
pysat.utils.time contains a number of functions used throughout the pysat package, including interactions with datetime objects, seasons, and calculation of solar local time
-
pysat.utils.time.
calc_freq
(index)¶ Determine the frequency for a time index
Parameters: index (array-like) – Datetime list, array, or Index Returns: freq – Frequency string as described in Pandas Offset Aliases Return type: str Notes
Calculates the minimum time difference and sets that as the frequency.
To reduce the number of calculations done, the returned frequency is either in seconds (if no sub-second resolution is found) or nanoseconds.
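The approach in the notes, taking the minimum time difference and reporting it in seconds or nanoseconds, can be sketched with standard datetime objects. The function name is hypothetical, the input is assumed to be sorted with at least two entries, and the 'S' and 'N' suffixes are assumed to follow the pandas offset aliases mentioned above; recent pandas versions may prefer lower-case forms.

```python
import datetime as dt


def min_step_frequency(times):
    """Frequency string built from the smallest step in a sorted time list."""
    steps = [later - earlier for earlier, later in zip(times[:-1], times[1:])]
    step = min(steps)

    if step.microseconds == 0:
        # No sub-second resolution: report whole seconds
        return '{:d}S'.format(int(step.total_seconds()))

    # Otherwise report nanoseconds, using integer arithmetic throughout
    nanoseconds = ((step.days * 86400 + step.seconds) * 10 ** 9
                   + step.microseconds * 1000)
    return '{:d}N'.format(nanoseconds)


start = dt.datetime(2009, 1, 1)
times = [start, start + dt.timedelta(seconds=10),
         start + dt.timedelta(seconds=15)]
print(min_step_frequency(times))
```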
-
pysat.utils.time.
create_date_range
(start, stop, freq='D')¶ Return array of datetime objects using input frequency from start to stop
Supports single datetime object or list, tuple, ndarray of start and stop dates.
freq codes correspond to pandas date_range codes, D daily, M monthly, S secondly
-
pysat.utils.time.
create_datetime_index
(year=None, month=None, day=None, uts=None)¶ Create a timeseries index using supplied year, month, day, and ut in seconds.
Parameters: - year (array_like of ints) –
- month (array_like of ints or None) –
- day (array_like of ints) – for day (default) or day of year (use month=None)
- uts (array_like of floats) –
Returns: Return type: Pandas timeseries index.
Note
Leap seconds have no meaning here.
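The way year, month, day, and seconds of day combine, including the day-of-year case when month is None, can be sketched with plain datetime objects. The function below is hypothetical and returns a list rather than a pandas index; it only illustrates the arithmetic.

```python
import datetime as dt


def build_datetimes(year, day, uts, month=None):
    """Combine year, day (or day of year if month is None), and UT seconds."""
    times = []
    for i in range(len(year)):
        if month is None:
            # Interpret day as day of year, where 1 January is day 1
            start = dt.datetime(year[i], 1, 1) + dt.timedelta(days=day[i] - 1)
        else:
            start = dt.datetime(year[i], month[i], day[i])
        times.append(start + dt.timedelta(seconds=uts[i]))
    return times


print(build_datetimes([2009, 2009], [32, 60], [0.0, 3600.0]))
```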
-
pysat.utils.time.
getyrdoy
(date)¶ Return a tuple of year, day of year for a supplied datetime object.
Parameters: date (datetime.datetime) – Datetime object Returns: - year (int) – Integer year
- doy (int) – Integer day of year
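The same year and day-of-year pair can be obtained from a standard datetime object, as in this short sketch. The function name is hypothetical and the code is shown only to make the returned values concrete.

```python
import datetime as dt


def year_and_day_of_year(date):
    """Return (year, day of year) for a datetime, with 1 January as day 1."""
    return date.year, date.timetuple().tm_yday


print(year_and_day_of_year(dt.datetime(2009, 12, 31)))
```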
-
pysat.utils.time.
parse_date
(str_yr, str_mo, str_day, str_hr='0', str_min='0', str_sec='0', century=2000)¶ Basic date parser for file reading
Parameters: - str_yr (string) – String containing the year (2 or 4 digits)
- str_mo (string) – String containing month digits
- str_day (string) – String containing day of month digits
- str_hr (string ('0')) – String containing the hour of day
- str_min (string ('0')) – String containing the minutes of hour
- str_sec (string ('0')) – String containing the seconds of minute
- century (int (2000)) – Century, only used if str_yr is a 2-digit year
Returns: out_date – Pandas datetime object
Return type: pds.datetime
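The role of the century argument for 2-digit years can be illustrated with a stand-alone sketch. The function name is hypothetical, it returns a standard datetime rather than a pandas object, and it assumes that a year string of exactly two characters is the trigger for adding the century.

```python
import datetime as dt


def parse_date_sketch(str_yr, str_mo, str_day, str_hr='0', str_min='0',
                      str_sec='0', century=2000):
    """Build a datetime from string fields, expanding 2-digit years."""
    year = int(str_yr)
    if len(str_yr) == 2:
        # Only 2-digit years are offset by the century
        year += century
    return dt.datetime(year, int(str_mo), int(str_day),
                       int(str_hr), int(str_min), int(str_sec))


print(parse_date_sketch('09', '01', '02'))
print(parse_date_sketch('98', '1', '1', century=1900))
```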
-
pysat.utils.time.
season_date_range
(start, stop, freq='D')¶ Deprecated Function, will be removed in future version.
Deprecated since version 2.1.0: season_date_range will be removed in pysat 3.0.0, this will be replaced by create_date_range