Introduction

Eradiate ships, processes and produces data. This guide presents:

  • the rationale underlying data models used in Eradiate;

  • the components used to manipulate data shipped with Eradiate.

Formats

Most data sets used and produced by Eradiate are stored in the NetCDF format. Eradiate interacts with these data using the xarray library, whose data model is based on NetCDF. Xarray provides a comprehensive, robust and convenient interface to read, write, manipulate and visualise NetCDF data.

Accessing shipped data

Eradiate ships with a series of data sets managed by its global data store.

from eradiate.data import data_store

This global data store aggregates multiple subordinate data stores, grouped by the size and maturity level of the data files they manage. To list a data store's registered data sets, read its registry property, e.g.:

list(data_store.stores["small_files"].registry)

for small data files and:

list(data_store.stores["large_files_stable"].registry)

for large data files.

To open a specific data set, use eradiate.data.open_dataset():

import eradiate
ds = eradiate.data.open_dataset("spectra/solar_irradiance/thuillier_2003.nc")

To load a data set into memory, use eradiate.data.load_dataset():

ds = eradiate.data.load_dataset("spectra/solar_irradiance/thuillier_2003.nc")

Warning

The data module does not support concurrent download requests from multiple processes running Eradiate. If two processes request the same resource, e.g. with eradiate.data.load_dataset(), each may trigger its own download and the two downloads may overwrite each other, resulting in unpredictable (and almost certainly incorrect) behaviour.

If your use case requires running Eradiate from multiple processes, we strongly advise that you download all data in advance using the eradiate data fetch command (see Command-line interface reference).
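The advice above amounts to a "fetch once, then fan out" pattern. The sketch below is one possible way to arrange it. It assumes that the eradiate executable is on the PATH and that eradiate data fetch, called without further arguments, downloads what your runs need (check the command-line interface reference for its actual options). The names prefetch_data and runner are hypothetical and introduced only for illustration.

```python
import subprocess


def prefetch_data(runner=subprocess.run):
    """Download shipped data once, from a single process.

    `runner` is a hypothetical injection point (defaults to subprocess.run)
    so that the call can be swapped out, e.g. in tests.
    """
    # Command named in the warning above; extra arguments are deliberately
    # left out because this guide does not document them.
    cmd = ["eradiate", "data", "fetch"]
    # check=True raises CalledProcessError if the command fails, so worker
    # processes are never started against an incomplete data store.
    runner(cmd, check=True)
    return cmd
```

Calling prefetch_data() in the parent process before creating any worker processes means that workers only read files that already exist on disk.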

Working with angular data

Eradiate notably manipulates and produces what we refer to as angular data, which represent variables dependent on one or more directional parameters. Typical examples are BRDFs (\(f_\mathrm{r} (\theta_\mathrm{i}, \varphi_\mathrm{i}, \theta_\mathrm{o}, \varphi_\mathrm{o})\)) or top-of-atmosphere BRFs (\(\mathit{BRF}_\mathrm{TOA} (\theta_\mathrm{sun}, \varphi_\mathrm{sun}, \theta_\mathrm{view}, \varphi_\mathrm{view})\)): an xarray data array representing them has at least one angular dimension (and corresponding coordinates). Eradiate has specific functionality to make this sort of data easier to work with.

Angular dependencies and coordinate variable names

Angular variable naming in Earth observation and radiative transfer modelling may sometimes clash or be confusing. Eradiate clearly distinguishes between two types of angular dependencies for its variables:

  • Physical properties such as BRDFs and phase functions have intrinsic bidirectional dependencies which are referred to as incoming and outgoing directions. Data sets representing such quantities use coordinate variables phi_i, theta_i for the incoming direction’s azimuth and zenith angles, and phi_o, theta_o for their outgoing counterparts.

  • Observations are usually parametrised by illumination (or solar) and viewing (or sensor) directions. For data sets representing such results, Eradiate uses coordinate variables sza, saa for solar zenith/azimuth angle and vza, vaa for viewing zenith/azimuth angle. A typical example of such a variable is the top-of-atmosphere bidirectional reflectance factor (TOA BRF).

Under specific circumstances, one can directly convert an observation data set to a physical property data set. This applies, for instance, to top-of-atmosphere BRF data, but also to any BRF computed or measured in a vacuum. In such cases, illumination/viewing directions map directly onto incoming/outgoing directions. In general, however, no such direct conversion is possible.
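Where the conversion applies, it is essentially a relabelling of coordinates. The sketch below shows the idea in plain Python, using the coordinate names listed above. OBSERVATION_TO_INTRINSIC, to_intrinsic_names and the w coordinate are hypothetical names chosen for illustration, and the sketch assumes that both data sets use the same angle conventions.

```python
# Observation-type coordinate names mapped to their intrinsic counterparts,
# using the variable names listed above.
OBSERVATION_TO_INTRINSIC = {
    "sza": "theta_i",
    "saa": "phi_i",
    "vza": "theta_o",
    "vaa": "phi_o",
}


def to_intrinsic_names(coord_names):
    """Rename observation coordinates; leave every other name untouched."""
    return [OBSERVATION_TO_INTRINSIC.get(name, name) for name in coord_names]


# "w" stands for a hypothetical non-angular coordinate.
print(to_intrinsic_names(["sza", "saa", "vza", "vaa", "w"]))
# prints ['theta_i', 'phi_i', 'theta_o', 'phi_o', 'w']
```

With xarray, a mapping of this kind can be passed to Dataset.rename() to relabel the coordinates of an actual data set.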

Angular data set types

While one should clearly distinguish intrinsic and observation angular dependencies for correct physical interpretation of radiative data, both share an asymmetry between ‘incoming’ and ‘outgoing’ directions. Eradiate uses similar semantics to handle both angular data types, and the table below clarifies the nomenclature for the two types:

| Type | Incoming | Outgoing |
|---|---|---|
| Intrinsic | \(\varphi_\mathrm{i}\), \(\theta_\mathrm{i}\) | \(\varphi_\mathrm{o}\), \(\theta_\mathrm{o}\) |
| Observation | \(\varphi_\mathrm{s}\), \(\theta_\mathrm{s}\) | \(\varphi_\mathrm{v}\), \(\theta_\mathrm{v}\) |

Eradiate’s xarray containers do not explicitly keep track of the angular data set type. However, when relevant, coordinate naming is used to determine whether an angular data set is of intrinsic or observation type.

Angular data sets with a pair of angular dimensions \((\theta, \varphi)\) are called hemispherical. If they have two pairs of angular dimensions (incoming and outgoing), they are then called bi-hemispherical.
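The naming-based detection and the hemispherical / bi-hemispherical distinction described above can be sketched as follows. This is a simplified illustration and not Eradiate's implementation: classify_angular and the two lookup tables are hypothetical, and the function inspects names only, whereas a real check would look at the dimensions of an xarray object.

```python
# Coordinate name pairs for each angular data set type, as listed earlier.
INTRINSIC = {"incoming": {"phi_i", "theta_i"}, "outgoing": {"phi_o", "theta_o"}}
OBSERVATION = {"incoming": {"saa", "sza"}, "outgoing": {"vaa", "vza"}}
KNOWN_TYPES = (("intrinsic", INTRINSIC), ("observation", OBSERVATION))


def classify_angular(coord_names):
    """Return (type, coverage) guessed from coordinate names.

    type is "intrinsic", "observation" or None; coverage is
    "bi-hemispherical", "hemispherical" or None.
    """
    names = set(coord_names)
    for type_name, pairs in KNOWN_TYPES:
        # Count the (azimuth, zenith) pairs that are fully present.
        n_pairs = sum(pair <= names for pair in pairs.values())
        if n_pairs == 2:
            return type_name, "bi-hemispherical"
        if n_pairs == 1:
            return type_name, "hemispherical"
    return None, None
```

For example, classify_angular(["sza", "saa", "vza", "vaa"]) yields ("observation", "bi-hemispherical"), while a data set that only carries theta_o and phi_o is reported as ("intrinsic", "hemispherical").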

Measure data formats

Most measures in Earth observation radiative transfer modelling have angular dependencies. However, Eradiate uses storage data structures inherited from computer graphics technology, and measure results are usually mapped against film coordinates \((x, y) \in [0, 1]^2\). When those data represent hemispherical quantities, a mapping transformation associates angles with film coordinates. For convenience, Eradiate ships helpers to convert data from film coordinates to angular coordinates.
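To make the idea of a film-to-angle mapping concrete, here is a minimal sketch of one possible mapping from \([0, 1]^2\) to the hemisphere. It is an assumption chosen for simplicity and not the transformation Eradiate applies, and film_to_angles is a hypothetical name. Here \(x\) sets the azimuth and \(y\) sets the cosine of the zenith angle, so equal film areas correspond to equal solid angles.

```python
import math


def film_to_angles(x, y):
    """Map film coordinates in [0, 1]^2 to (zenith, azimuth) in degrees.

    Illustrative mapping only: Eradiate's own helpers may use a different
    transformation.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("film coordinates must lie in [0, 1]")
    phi = 360.0 * x        # azimuth in [0, 360] degrees
    cos_theta = 1.0 - y    # y = 0 -> zenith, y = 1 -> horizon
    theta = math.degrees(math.acos(cos_theta))
    return theta, phi
```

With this mapping, the film edge \(y = 0\) corresponds to the zenith and \(y = 1\) to the horizon. The helpers shipped with Eradiate remain the reference for actual results.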

Configuring data storage

Eradiate stores downloaded data in a configurable directory. We distinguish two cases:

  • You are working with a packaged version of Eradiate, i.e. you installed it to your Python environment from PyPI. In that case, the default download location is ./eradiate_downloads, i.e. a subdirectory of the current working directory.

  • You are working with a development setup, i.e. you checked out the sources. In that case, the default download location is set to $ERADIATE_SOURCE_DIR/resources/data/downloads.

In both cases, the download location can be controlled by setting the ERADIATE_DOWNLOAD_DIR environment variable.
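The precedence described above can be summarised with the following sketch. It is not Eradiate's configuration code: resolve_download_dir is a hypothetical helper, and it assumes that a development setup can be recognised by the ERADIATE_SOURCE_DIR environment variable being set, which may not match Eradiate's actual detection logic.

```python
import os
from pathlib import Path


def resolve_download_dir(env=None, cwd=None):
    """Return the download directory implied by the rules above."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # 1. An explicit setting always wins.
    explicit = env.get("ERADIATE_DOWNLOAD_DIR")
    if explicit:
        return Path(explicit)

    # 2. Assumption: a development setup is recognised by ERADIATE_SOURCE_DIR
    #    being set; Eradiate's actual detection logic may differ.
    source_dir = env.get("ERADIATE_SOURCE_DIR")
    if source_dir:
        return Path(source_dir) / "resources" / "data" / "downloads"

    # 3. Packaged install: subdirectory of the current working directory.
    return cwd / "eradiate_downloads"
```

Because the packaged default depends on the current working directory, setting ERADIATE_DOWNLOAD_DIR to an absolute path is a simple way to make runs started from different directories share the same downloads.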