.. _sec-user_guide-data-intro:

Introduction
============

Eradiate ships, processes and produces data. This guide presents:

* the rationale underlying data models used in Eradiate;
* the components used to manipulate data shipped with Eradiate.

Formats
-------

Most data sets used and produced by Eradiate are stored in the
`NetCDF format `_. Eradiate interacts with these data using the `xarray `_
library, whose data model is based on NetCDF. Xarray provides a comprehensive,
robust and convenient interface to read, write, manipulate and visualise NetCDF
data.

Accessing shipped data
----------------------

Eradiate ships with a series of data sets managed by its global data store:

.. code-block:: python

   from eradiate.data import data_store

This global data store aggregates multiple subordinate data stores, organised
by the size and maturity level of the data files they manage. To list the data
sets registered in a data store, read its ``registry`` property, *e.g.*:

.. code-block:: python

   # Data sets registered in the store for small files
   list(data_store.stores["small_files"].registry)

for small data files and:

.. code-block:: python

   # Data sets registered in the store for large, stable files
   list(data_store.stores["large_files_stable"].registry)

for large data files.

To open a specific data set, use :func:`eradiate.data.open_dataset`:

.. code-block:: python

   import eradiate

   ds = eradiate.data.open_dataset("spectra/solar_irradiance/thuillier_2003.nc")

To load a data set into memory, use :func:`eradiate.data.load_dataset`:

.. code-block:: python

   ds = eradiate.data.load_dataset("spectra/solar_irradiance/thuillier_2003.nc")

.. warning::

   The data module does not support concurrent download requests from multiple
   processes running Eradiate. If two processes request the same resource,
   *e.g.* through :func:`eradiate.data.load_dataset`, each may trigger its own
   download and the two downloads may overwrite each other, resulting in
   unpredictable (but surely incorrect) behaviour.

   If your use case requires running Eradiate from multiple processes, we
   strongly advise that you **download all data in advance** using the
   ``eradiate data fetch`` command (see :ref:`sec-reference_cli`).

.. _sec-user_guide-data_guide-working_angular_data:

Working with angular data
-------------------------

Eradiate notably manipulates and produces what we refer to as *angular data*,
which represent variables dependent on one or more directional parameters.
Typical examples are BRDFs
(:math:`f_\mathrm{r} (\theta_\mathrm{i}, \varphi_\mathrm{i}, \theta_\mathrm{o}, \varphi_\mathrm{o})`)
or top-of-atmosphere BRFs
(:math:`\mathit{BRF}_\mathrm{TOA} (\theta_\mathrm{sun}, \varphi_\mathrm{sun}, \theta_\mathrm{view}, \varphi_\mathrm{view})`):
an xarray data array representing them has at least one angular dimension (and
corresponding coordinates). Eradiate has specific functionality to deal more
easily with this sort of data.

Angular dependencies and coordinate variable names
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Angular variable names used in Earth observation and radiative transfer
modelling sometimes clash or cause confusion. Eradiate clearly distinguishes
between two types of angular dependencies for its variables:

* Physical properties such as BRDFs and phase functions have intrinsic
  bidirectional dependencies which are referred to as *incoming* and *outgoing*
  directions. Data sets representing such quantities use coordinate variables
  ``phi_i``, ``theta_i`` for the incoming direction's azimuth and zenith angles,
  and ``phi_o``, ``theta_o`` for their outgoing counterparts.
* Observations are usually parametrised by *illumination* (or *solar*) and
  *viewing* (or *sensor*) directions. For data sets representing such results,
  Eradiate uses coordinate variables ``sza``, ``saa`` for *solar zenith/azimuth
  angle* and ``vza``, ``vaa`` for *viewing zenith/azimuth angle*. A typical
  example of such a variable is the top-of-atmosphere bidirectional reflectance
  factor (TOA BRF).

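
As an illustration of the naming conventions above, the following sketch builds
two small xarray data arrays, one of each kind, and guesses their kind from
their coordinate names. The ``angular_type`` helper is hypothetical and is not
part of Eradiate; all values are synthetic placeholders.

.. code-block:: python

   import numpy as np
   import xarray as xr

   # Coordinate names taken from the conventions listed above
   INTRINSIC = {"phi_i", "theta_i", "phi_o", "theta_o"}
   OBSERVATION = {"sza", "saa", "vza", "vaa"}


   def angular_type(da):
       """Hypothetical helper: guess the angular type from coordinate names."""
       names = set(da.coords)
       if names & INTRINSIC:
           return "intrinsic"
       if names & OBSERVATION:
           return "observation"
       return None


   # Synthetic TOA BRF: constant placeholder value on a coarse angular grid
   sza = [0.0, 30.0, 60.0]
   vza = [0.0, 15.0, 30.0, 45.0]
   brf_toa = xr.DataArray(
       np.full((len(sza), 1, len(vza), 1), 0.5),
       coords={"sza": sza, "saa": [0.0], "vza": vza, "vaa": [0.0]},
       dims=("sza", "saa", "vza", "vaa"),
       name="brf_toa",
   )

   # Synthetic BRDF of a Lambertian surface with reflectance 0.5
   theta = [0.0, 45.0]
   brdf = xr.DataArray(
       np.full((len(theta), 1, len(theta), 1), 0.5 / np.pi),
       coords={"theta_i": theta, "phi_i": [0.0], "theta_o": theta, "phi_o": [0.0]},
       dims=("theta_i", "phi_i", "theta_o", "phi_o"),
       name="brdf",
   )

   # Label-based selection uses the coordinate names directly
   subset = brf_toa.sel(sza=30.0, vza=45.0)

Because the angles are stored as named coordinates, selecting a geometry reads
the same way whichever of the two conventions a data set follows.
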
Under specific circumstances, one can directly convert an observation data set
to a physical property data set. This, for instance, applies to
top-of-atmosphere BRF data, but also to any BRF computed or measured in a
vacuum. In such cases, incoming/outgoing directions can be directly converted
to illumination/viewing directions. **In general, however, this does not
work.**

Angular data set types
^^^^^^^^^^^^^^^^^^^^^^

While one should clearly distinguish intrinsic and observation angular
dependencies for correct physical interpretation of radiative data, both share
an asymmetry between 'incoming' and 'outgoing' directions. Eradiate uses
similar semantics to handle both angular data types, and the table below
clarifies the nomenclature for the two types:

.. list-table::
   :header-rows: 1

   * - Type
     - Incoming
     - Outgoing
   * - Intrinsic
     - :math:`\varphi_\mathrm{i}`, :math:`\theta_\mathrm{i}`
     - :math:`\varphi_\mathrm{o}`, :math:`\theta_\mathrm{o}`
   * - Observation
     - :math:`\varphi_\mathrm{s}`, :math:`\theta_\mathrm{s}`
     - :math:`\varphi_\mathrm{v}`, :math:`\theta_\mathrm{v}`

Eradiate's xarray containers do not explicitly keep track of the angular data
set type. However, when relevant, coordinate naming is used to determine
whether an angular data set is of intrinsic or observation type.

Angular data sets with a single pair of angular dimensions
:math:`(\theta, \varphi)` are called *hemispherical*. If they have two pairs of
angular dimensions (incoming and outgoing), they are called
*bi-hemispherical*.

Measure data formats
--------------------

Most measures in Earth observation radiative transfer modelling have angular
dependencies. However, Eradiate uses storage data structures inherited from
computer graphics technology, and measure results are usually mapped against
*film coordinates* :math:`(x, y) \in [0, 1]^2`. When those data represent
hemispherical quantities, a mapping transformation associates angles with film
coordinates.

For convenience, Eradiate ships helpers to convert data from film coordinates to angular coordinates.
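
To give an idea of what such a mapping can look like, the sketch below converts
film coordinates to zenith and azimuth angles and back. The mapping used here
(:math:`\cos \theta = 1 - x`, :math:`\varphi = 2 \pi y`) is an assumption
chosen for its simplicity; it is not necessarily the one implemented by
Eradiate. The ``film_to_angles`` and ``angles_to_film`` functions are
hypothetical names and are not Eradiate's helpers.

.. code-block:: python

   import numpy as np


   def film_to_angles(x, y):
       """Map film coordinates in [0, 1]^2 to (theta, phi) in degrees.

       Assumed mapping, for illustration only:
       cos(theta) = 1 - x and phi = 360 * y.
       """
       x = np.asarray(x, dtype=float)
       y = np.asarray(y, dtype=float)
       theta = np.degrees(np.arccos(1.0 - x))
       phi = 360.0 * y
       return theta, phi


   def angles_to_film(theta, phi):
       """Inverse of the assumed mapping; angles are given in degrees."""
       theta = np.radians(np.asarray(theta, dtype=float))
       phi = np.asarray(phi, dtype=float)
       x = 1.0 - np.cos(theta)
       y = (phi % 360.0) / 360.0
       return x, y


   # Film coordinates of three pixels, converted to angles
   theta, phi = film_to_angles([0.0, 0.5, 1.0], [0.0, 0.25, 0.5])

With this particular choice, equal areas on the film correspond to equal solid
angles on the hemisphere, with the zenith at :math:`x = 0` and the horizon at
:math:`x = 1`.
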