Quickstart Guide
Installation
obsarray is installable via pip.
pip install obsarray
Dependencies
obsarray is an extension to xarray that supports defining, storing and interfacing with measurement data. It is designed to work well with netCDF files, using the netCDF4 library.
The pip installation will also automatically install any dependencies.
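To confirm that the installation picked up the package and its dependencies, a small check like the following can be run. This is only a sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of obsarray, and the list of packages queried is an assumption based on the dependencies named above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names are assumed from the dependencies mentioned in this guide.
    for name in ("obsarray", "xarray", "netCDF4"):
        print(name, installed_version(name) or "not installed")
```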
Example Usage
First we build an example dataset that represents a time series of temperatures (for more on how to do this, see the xarray documentation).
In [1]: import numpy as np
In [2]: import xarray as xr
In [3]: import obsarray
# build an xarray Dataset that represents a time series of temperatures
In [4]: temps = np.array([20.2, 21.1, 20.8])
In [5]: times = np.array([0, 30, 60])
In [6]: ds = xr.Dataset(
...: {"temperature": (["time"], temps, {"units": "degC"})},
...: coords = {"time": (["time"], times, {"units": "s"})}
...: )
...:
Uncertainty and error-covariance information for observation variables can be defined using the dataset’s unc accessor, which is provided by obsarray.
# add random component uncertainty
In [7]: ds.unc["temperature"]["u_r_temperature"] = (
...: ["time"],
...: np.array([0.5, 0.5, 0.6]),
...: {"err_corr": [{"dim": "time", "form": "random"}]}
...: )
...:
# add systematic component uncertainty
In [8]: ds.unc["temperature"]["u_s_temperature"] = (
...: ["time"],
...: np.array([0.3, 0.3, 0.3]),
...: {"err_corr": [{"dim": "time", "form": "systematic"}]}
...: )
...:
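The two `form` values used above can be read as statements about how errors correlate along the named dimension. The sketch below is not obsarray code; it assumes that "random" means uncorrelated errors (an identity correlation matrix) and "systematic" means fully correlated errors (a matrix of ones), which is consistent with the outputs shown later in this guide. The function name `corr_matrix` is hypothetical.

```python
def corr_matrix(form: str, n: int) -> list[list[float]]:
    """Error-correlation matrix along one dimension of length n.

    Assumption: "random" -> identity matrix (no correlation between points),
    "systematic" -> all ones (the same error is shared by every point).
    """
    if form == "random":
        return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if form == "systematic":
        return [[1.0] * n for _ in range(n)]
    raise ValueError(f"unsupported form: {form!r}")


# Correlation structure for the three time steps of the example dataset.
random_corr = corr_matrix("random", 3)
systematic_corr = corr_matrix("systematic", 3)
```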
Dataset structures can be defined separately using obsarray’s templating functionality. This is helpful for processing chains where you want to write files to a defined format.
The uncertainty information defined above can then be queried, for example:
# get total combined uncertainty of all components
In [9]: ds.unc["temperature"].total_unc()
Out[9]:
<xarray.DataArray (time: 3)> Size: 24B
array([0.58309519, 0.58309519, 0.67082039])
Coordinates:
* time (time) int64 24B 0 30 60
# get total error-covariance matrix for all components
In [10]: ds.unc["temperature"].total_err_cov_matrix()
Out[10]:
<xarray.DataArray (time: 3)> Size: 72B
array([[0.34, 0.09, 0.09],
[0.09, 0.34, 0.09],
[0.09, 0.09, 0.45]])
Dimensions without coordinates: time, time
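The numbers in these two outputs can be reproduced by hand, which may help when checking results. The sketch below uses only the standard library rather than obsarray itself. It assumes the components are combined in quadrature, that the random component contributes only to the diagonal of the covariance matrix, and that the systematic component contributes to every element.

```python
import math

u_r = [0.5, 0.5, 0.6]  # random component, values from the example above
u_s = [0.3, 0.3, 0.3]  # systematic component, values from the example above

# Total uncertainty: the two components summed in quadrature at each time step.
total_unc = [math.sqrt(r**2 + s**2) for r, s in zip(u_r, u_s)]

# Total error covariance: the random term appears only where i == j,
# while the systematic term u_s[i] * u_s[j] appears in every element.
n = len(u_r)
total_cov = [
    [(u_r[i] ** 2 if i == j else 0.0) + u_s[i] * u_s[j] for j in range(n)]
    for i in range(n)
]
```

The square roots of the diagonal of `total_cov` equal `total_unc`, which is a quick consistency check between the two outputs.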
This uncertainty information is preserved in metadata when the dataset is written to a netCDF file.
Similarly, data flags can be defined using the dataset’s flag accessor, which again is provided by obsarray. These flags are defined following the CF Conventions metadata standard.
A flag variable can be created to store data for a set of flags with defined meanings:
In [11]: ds.flag["quality_flags"] = (
....: ["time"],
....: {"flag_meanings": ["dubious", "invalid", "saturated"]}
....: )
....:
In [12]: print(ds.flag)
<FlagAccessor>
Dataset Flags:
* <FlagVariable>
FlagVariable: 'quality_flags'
['dubious', 'invalid', 'saturated']
These flag meanings can be indexed to get and set their values:
In [13]: print(ds.flag["quality_flags"]["dubious"].value)
<xarray.DataArray (time: 3)> Size: 3B
array([False, False, False])
Dimensions without coordinates: time
In [14]: ds.flag["quality_flags"]["dubious"][0] = True
In [15]: print(ds.flag["quality_flags"]["dubious"].value)
<xarray.DataArray (time: 3)> Size: 3B
array([ True, False, False])
Dimensions without coordinates: time
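For readers unfamiliar with CF-style flags, the following standard-library sketch illustrates the bit-mask encoding that the CF Conventions describe with the `flag_masks` and `flag_meanings` attributes, where each meaning occupies one bit of an integer. It is an illustration of that encoding only. Whether obsarray stores flags in exactly this way is an assumption here, and the helpers `set_flag` and `is_set` are hypothetical names, not obsarray functions.

```python
flag_meanings = ["dubious", "invalid", "saturated"]

# One bit per meaning: dubious -> 1, invalid -> 2, saturated -> 4.
flag_masks = {name: 1 << bit for bit, name in enumerate(flag_meanings)}


def set_flag(value: int, name: str, state: bool = True) -> int:
    """Return `value` with the bit for `name` switched on or off."""
    mask = flag_masks[name]
    return value | mask if state else value & ~mask


def is_set(value: int, name: str) -> bool:
    """Report whether the bit for `name` is set in `value`."""
    return bool(value & flag_masks[name])


# One integer per time step, all flags initially unset.
flags = [0, 0, 0]
flags[0] = set_flag(flags[0], "dubious")

# Boolean view of a single flag, analogous to the `.value` output above.
dubious = [is_set(v, "dubious") for v in flags]
```

Because each meaning has its own bit, several flags can be raised on the same data point without interfering with one another.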